Guest Post: Add Live Captioning to Your Wowza Streaming Engine Streams with iCap Falcon
By Dave Watts and Bill McLaughlin, EEG
In this post we’ll discuss captioning workflows in general, then cover how Wowza Streaming Engine and EEG’s iCap Falcon combine to provide high-quality TV-style closed captions through an end-to-end pure software workflow.
Closed captioning of live content delivered over the Web is a crucial component of the streaming video workflow: Captioning expands audiences, improves video search, and is now often required for FCC and ADA regulatory compliance.
Streaming video producers and engineers are often surprised to find that adding closed captions to live content involves a very different workflow from the relatively simple steps required for captioning on-demand (non-live) video.
- For on-demand videos, it is typical to obtain and upload a “sidecar” caption file, in a format like TTML, WebVTT, SRT, or SCC. The sidecar file contains caption text with timing cues for the entire video in a single small file.
- For live videos, each line of text must be generated (usually by a hired stenographer) and transmitted in real-time to maintain synchronization with the program. Workflows vary considerably with respect to both producing these live text segments and delivering them from a server to the end user.
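For contrast with the live case, the sidecar approach is simple enough to sketch in a few lines of Java. The example below writes cues in WebVTT, one of the formats listed above. The class and method names are illustrative and not part of any Wowza or EEG API.

```java
import java.time.Duration;
import java.util.Locale;

final class WebVttCue {
    // Format a duration as a WebVTT timestamp: HH:MM:SS.mmm
    static String timestamp(Duration d) {
        long ms = d.toMillis();
        return String.format(Locale.ROOT, "%02d:%02d:%02d.%03d",
                ms / 3_600_000, (ms / 60_000) % 60, (ms / 1_000) % 60, ms % 1_000);
    }

    // One cue: a timing line followed by the caption text
    static String cue(Duration start, Duration end, String text) {
        return timestamp(start) + " --> " + timestamp(end) + "\n" + text + "\n";
    }

    // A complete sidecar file: the WEBVTT header, a blank line, then the cues
    static String file(String... cues) {
        return "WEBVTT\n\n" + String.join("\n", cues);
    }
}
```

For example, WebVttCue.cue(Duration.ofSeconds(1), Duration.ofSeconds(4), "Hello") produces a cue whose timing line is 00:00:01.000 --> 00:00:04.000. Because every cue is known in advance, the whole file can be produced at once, which is exactly what a live workflow cannot do.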
One common approach requires a broadcast SDI hardware closed caption encoder (for example, EEG’s HD492 iCap encoder), and an SDI-input live streaming encoder (from vendors such as Elemental, Teradek, and Matrox). The closed-caption encoder puts CEA-708 captions on the SDI signal, and the streaming encoder embeds these captions in H.264, RTMP, or other formats for delivery to Wowza Streaming Engine, which then passes the captioned stream on to viewers.
While the hardware encoding approach still offers good compatibility with a variety of vendor systems, streaming media producers who are operating with little or no traditional SDI broadcast infrastructure require a smaller footprint and a less expensive solution for streaming live captions. For this market, EEG has leveraged Wowza Streaming Engine closed-captioning API features to create a plug-in that provides direct connectivity between Wowza servers and EEG’s iCap network of live caption service providers, by way of a new virtual caption encoding tool called Falcon.
The Wowza Plug-in Component
In order to enable direct connection between live caption service providers and Wowza Streaming Engine servers, EEG developed a unique plug-in using the Wowza Streaming Engine Java API. This Java API enables custom behaviors to be deployed on a per-application basis on an end user’s Wowza Streaming Engine installation.
Specifically, the plug-in is a subclass of the HTTP Provider class in the API. The HTTP Provider provides a customized URI path where HTTP requests are initially handled through the Wowza Streaming Engine web server, and then passed into the custom HTTP Provider. In our case, caption data passed from a transcription provider over iCap is reformatted by the Falcon service into an HTTP POST message. The body of the message carries an XML payload containing a small chunk of real-time transcription. The XML payload is unpacked, interpreted, and checked for validity in the custom HTTP Provider subclass.
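The unpack-and-validate step described above might look roughly like the following Java sketch, which uses only the JDK's built-in XML parser. The actual Falcon payload schema is not given in this post, so the caption element and its stream attribute are hypothetical stand-ins, as are the class and method names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

final class CaptionPayload {
    final String streamName;
    final String text;

    private CaptionPayload(String streamName, String text) {
        this.streamName = streamName;
        this.text = text;
    }

    // Returns empty if the body is not well-formed XML or lacks the expected fields.
    // Assumed (hypothetical) shape: <caption stream="myStream">caption text</caption>
    static Optional<CaptionPayload> parse(String body) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so untrusted input cannot pull in external entities
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            if (!"caption".equals(root.getTagName())) {
                return Optional.empty();
            }
            String stream = root.getAttribute("stream").trim();
            String text = root.getTextContent().trim();
            if (stream.isEmpty() || text.isEmpty()) {
                return Optional.empty();
            }
            return Optional.of(new CaptionPayload(stream, text));
        } catch (Exception e) {
            // Malformed XML or a parser configuration problem: treat as invalid
            return Optional.empty();
        }
    }
}
```

In a real HTTP Provider this parsing would run inside the request handler, with the request body read from the incoming POST. Returning an empty Optional for anything unexpected keeps the handler's decision simple: inject the caption or reject the request.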
Once a valid captioning string is recovered, the Wowza AMF data-insertion API is used to inject a new caption message live into the configured stream. Although the direct result of the API call is to create RTMP captions in the onTextData format, a correctly configured Wowza Streaming Engine instance will also be able to translate this data into equivalent captions for the output modules that feed HLS streams (which use the CEA-608 data format for captioning) and other supported captioning formats.
On successful completion of each POST, an HTTP OK is sent back through Falcon to confirm the message. Error responses are returned in other cases, for example when captions are uplinked for a stream that is not currently active on the Wowza Streaming Engine server, in which case AMF data insertion into the video would fail.
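As a rough illustration of that response logic, the Java sketch below maps the outcome of a caption POST to an HTTP status code. The post only specifies HTTP OK for success; the choice of 400 for an invalid payload and 404 for an inactive stream is an assumption made for this example, and the class name is hypothetical.

```java
import java.net.HttpURLConnection;
import java.util.Set;

final class CaptionPostResponder {
    // Decide which HTTP status to return for one caption POST.
    // The 400 and 404 codes are assumptions; only 200 OK is stated in the text.
    static int statusFor(boolean payloadValid, String streamName, Set<String> activeStreams) {
        if (!payloadValid) {
            return HttpURLConnection.HTTP_BAD_REQUEST; // 400: payload failed validation
        }
        if (!activeStreams.contains(streamName)) {
            return HttpURLConnection.HTTP_NOT_FOUND;   // 404: no such live stream to inject into
        }
        return HttpURLConnection.HTTP_OK;              // 200: caption accepted for injection
    }
}
```

Distinguishing the two failure cases lets the sender tell a malformed message apart from a stream that has not started yet, which is the more likely problem during pre-event testing.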
Logging from the caption-creation process appears in the end user’s Wowza system logs through the use of the WMSLoggerFactory class.
The EEG HTTP Provider plug-in is compiled as a JAR file, which the end user must install into their Wowza Streaming Engine installation. The applicable VHost.xml file needs a new section that activates the HTTP Provider for a set URI pattern and specifies HTTP authentication options. For the plug-in to work with all output formats, the Application.xml file for the target application must also include a section enabling the OnTextDataToCEA608 module, which is provided in the Wowza base installation but not enabled by default.
How Falcon Works
Accessed on-demand through the EEG Cloud Services website, Falcon offers immediate caption connectivity for live online presentation of news, sporting events, e-learning classrooms, corporate webinars, municipal meetings, and more. Connect to the caption agency of your choice, and the resulting live caption data will be routed directly into your live Wowza stream over the iCap network. (Think of it as a secure data tunnel into your live stream for the caption service provider.) This hardware-free alternative makes live closed captioning much more accessible and affordable for streaming-only media, as it eliminates the need for the previously mentioned SDI signal path.
Falcon sessions are easily generated by supplying the Falcon web application with login details for your Wowza stream, and the identity of your caption service provider. You can read specific instructions on how to get started in this detailed PDF.
Live captions injected into your video stream through the EEG plug-in will also be present in all recordings of the event, so the caption text remains available, for both accessibility and metadata purposes, in replays and long-tail content.
Sending captions to Wowza Streaming Engine through Falcon is very simple for experienced users, but it is always a good idea to test procedures with your caption service provider prior to a live event to ensure a smooth, high-quality result. Once all the steps are completed and testing is successful, go live and watch the captions roll in!