Machine Transcription

As of Imaginary Captions 2.1, we support machine transcription of your video content using either the Imaginary Captions Cloud or the Amazon Web Services (AWS) "Amazon Transcribe" service. As with our machine translation services, machine transcription requires either the purchase of in-app credits (Imaginary Captions) or a BYOC subscription (AWS).

Transcription Basics

If the primary dialog language (aka media language) of your video is a supported language, then you can get a head start creating captions using machine transcription. Just load your video like you normally would and select Video | Transcribe. A dialog box will pop up saying that your video has been sent to the cloud and that it may take some time before it is ready.

Go grab some lunch, because this will take a while. A 30-second video will likely take about 5 minutes. A feature film will take a very long time.

When the transcription is done, you will see another pop-up indicating that the transcription has completed and noting any errors that may have occurred. If all went well, your captions window will be populated with the transcribed captions at roughly the right points in the video.

Setting Things Up

Support for machine transcription is a premium feature of Imaginary Captions that requires an in-app purchase or subscription, depending on your approach.

For details on configuring Imaginary Captions to do machine transcription, see Cloud Service Options.

Quality

MACHINE TRANSCRIPTION IS THE START OF YOUR WORKFLOW, NOT THE END!!!

If you use the results of your machine transcription without reviewing and correcting errors, you are making a huge mistake. Machine transcription is designed to give you a head start, but it's definitely subject to errors.

Even if the machine transcription is absolutely perfect, it will capture only dialog. That's not sufficient for closed captions. You need to add in ambient sounds, voice-over directions, music, and other non-dialog information.

Once the machine transcription comes back, you should watch the video and make sure that each caption is accurate and that you are capturing any notes about the context of the dialog, ambient sounds, or music, so that everything makes sense to a viewer with hearing impairments. This is also your opportunity to fix any incorrect captions or timing issues.

Troubleshooting

The most likely issue you will encounter is that we cannot transcribe a remote asset (e.g., a video hosted on Vimeo). We need a local asset because we have to transcode your video into an audio file before uploading it to the cloud. We cannot currently do this with video streams.
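
If you are curious what "transcode your video into an audio file" looks like, here is a rough sketch of the idea in Python. It is not the code Imaginary Captions actually runs. It assumes the ffmpeg command-line tool as the transcoder, and the function names, the list of URL schemes, and the mono 16 kHz output settings are all hypothetical choices made for illustration.

```python
from typing import List
from urllib.parse import urlparse

# Assumption: these URL schemes are what we treat as "remote" in this sketch.
REMOTE_SCHEMES = {"http", "https", "rtmp", "rtsp"}


def is_remote_asset(location: str) -> bool:
    """Return True when the location looks like a streaming URL, not a local file."""
    return urlparse(location).scheme.lower() in REMOTE_SCHEMES


def build_audio_extract_command(video: str, audio_out: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that extracts audio from a local video."""
    if is_remote_asset(video):
        # Mirrors the limitation described above: remote assets are rejected.
        raise ValueError("Remote assets cannot be transcribed; use a local file.")
    return [
        "ffmpeg",
        "-i", video,     # input video file
        "-vn",           # drop the video stream, keep audio only
        "-ac", "1",      # downmix to mono (hypothetical setting)
        "-ar", "16000",  # resample to 16 kHz (hypothetical setting)
        audio_out,
    ]


if __name__ == "__main__":
    print(build_audio_extract_command("clip.mp4", "clip.wav"))
```

To actually run the extraction, you could pass the returned list to subprocess.run on a machine with ffmpeg installed. The resulting audio file is the kind of thing that gets uploaded to the cloud for transcription.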