4 Technologies for Video Delivery at Scale
In my last blog post, I covered what cloud-based transcoding is and why it’s advantageous compared to on-premises transcoding. Beyond generic cloud transcoding, though, are there other practical technologies to pay attention to when scalability is a key decision point?
If your use case involves scaling content delivery to a global audience, then the short answer is yes.
The key components needed to scale live event delivery include the ability to seamlessly ingest content, including associated metadata, from each ingest point; the capability to transcode, transrate, transmux, and repackage a vision-mixed video stream for use on a variety of device types and network topologies; and the ability to deliver these streams to multiple content delivery networks (CDNs).
Similarly, for on-demand (VOD) delivery, scaling up requires converting a master or mezzanine file into a pre-defined list of on-demand data rates and resolutions called renditions; a capability to store these dozens of renditions; and the same ability to deliver these renditions via one or more CDNs.
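That pre-defined list of renditions is often called an encoding ladder. As a rough sketch (the names, resolutions, and bitrates below are hypothetical examples, not values from any particular service), a ladder and a quick storage estimate for the "dozens of renditions" problem might look like this:

```python
# Hypothetical encoding ladder: each rendition pairs a resolution
# with a target bitrate (kbps). Real ladders are tuned per content type.
RENDITIONS = [
    {"name": "1080p", "width": 1920, "height": 1080, "bitrate_kbps": 6000},
    {"name": "720p",  "width": 1280, "height": 720,  "bitrate_kbps": 3500},
    {"name": "480p",  "width": 854,  "height": 480,  "bitrate_kbps": 1500},
    {"name": "360p",  "width": 640,  "height": 360,  "bitrate_kbps": 800},
]

def storage_estimate_mb(duration_s: float) -> float:
    """Rough total storage needed for all renditions of one title, in MB."""
    total_kbits = sum(r["bitrate_kbps"] for r in RENDITIONS) * duration_s
    return total_kbits / 8 / 1000  # kilobits -> megabytes
```

For a one-hour title, even this small four-rung ladder adds up to several gigabytes, which is why the storage question below matters.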
Content Delivery Networks (CDNs)
Let’s cover the last portion first: CDNs and the at-scale delivery process.
For those CDNs geared towards delivering VOD or higher-latency live streams, the process works fairly consistently because CDNs are optimized to deliver the same streams to millions of end users.
If the entire delivery is on-demand content, the CDN solution can often be integrated directly into a content management system (CMS) that contains the very high-quality mezzanine file. This allows content management control to reside within the customer's CMS, with any initial VOD request resulting in a request to the CMS for the mezzanine file. That file is then turned into multiple renditions using CDN resources, so that they can be delivered using what's called an adaptive bitrate approach. Essentially, adaptive bitrate delivery allows a single end user to view the optimal rendition for any given period of time, based on what their device and current data network throughput allow.
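The core of that adaptive bitrate decision can be sketched in a few lines: the player picks the highest-bitrate rendition that fits within a safety margin of its measured throughput. This is a simplified illustration with hypothetical ladder values and a made-up 80% safety factor; real player heuristics also weigh buffer levels and screen size.

```python
# Hypothetical bitrate ladder, sorted highest bitrate first (kbps).
LADDER = [
    {"name": "1080p", "bitrate_kbps": 6000},
    {"name": "720p",  "bitrate_kbps": 3500},
    {"name": "480p",  "bitrate_kbps": 1500},
    {"name": "360p",  "bitrate_kbps": 800},
]

def pick_rendition(throughput_kbps: float, ladder=LADDER, safety: float = 0.8) -> dict:
    """Return the highest-bitrate rendition whose bitrate fits within a
    safety margin of the measured throughput; fall back to the lowest."""
    budget = throughput_kbps * safety
    for rendition in ladder:  # ladder is assumed sorted highest first
        if rendition["bitrate_kbps"] <= budget:
            return rendition
    return ladder[-1]  # network too slow for any rung: use the lowest
```

Run periodically against fresh throughput measurements, this is what lets a viewer shift between renditions as conditions change.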
Content Management System (CMS)
Other solutions require an upload into an online video platform’s CMS, where renditions are transcoded and transrated in the background. Some of these conversions occur immediately, resulting in a need for more storage space, while other solutions generate renditions of the mezzanine file only when an end user’s player requests the particular data rate and resolution that makes up a given rendition.
For this latter approach, there’s less need for storage, but there’s also a possibility that not every mezzanine file will be immediately playable. The reasoning is fairly simple: since the initial conversion to a specific rendition takes time, even in powerful cloud-based transcoding solutions, end-user playback is delayed accordingly.
Regardless of whether the renditions occur at time of upload or time of playback, the need to find and deliver renditions for long-tail content (content that hasn’t been played in weeks or months) may also occur, since these renditions are often stored in less-costly backup systems (hard drive or tape based) rather than SSDs that reside on the main delivery server.
Does live add any complexity to at-scale delivery? It does.
Low Latency Solutions
The first factor is latency, or the delay between a live stream being encoded and the time it takes to deliver that same stream to end users.
While higher-latency live streams rely on the same renditions model noted above, delivery begins immediately after the first three segments of video are encoded. Those segments are often six to 10 seconds in length, so most of these streams can be delivered within 30 seconds of encoding.
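The arithmetic behind that 30-second figure is straightforward: playback can't begin until the first few segments exist, so segment duration times segment count sets a floor on startup latency. A minimal sketch of that relationship:

```python
def startup_latency_s(segment_duration_s: float, segments_buffered: int = 3) -> float:
    """Minimum delay before playback can begin when the player waits
    for the first N segments to be encoded and published."""
    return segment_duration_s * segments_buffered

# With six- to 10-second segments and a three-segment buffer,
# the floor ranges from roughly 18 to 30 seconds.
```

This is also why the low-latency approaches mentioned below shrink or subdivide segments: cutting segment duration directly cuts this floor.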
The continued upswing in use cases that require much lower latencies means that newer approaches can be deployed to deliver low-latency live streams at scale. Rather than delving deeply into those solutions here, please see my recent review of the Wowza Real-Time Streaming at Scale solution.
Metadata Synchronization
In addition, there’s a growing need to synchronize audio, video, and data streams so that the experience is consistent across all devices within a given location. This is one of the reasons that metadata synchronization across multiple ingest points is so important: if two cameras are out of sync with their associated metadata, what’s displayed on an end user’s device may not match the video and audio actually being played.
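One simple way to reason about this is a drift check: compare each ingest point's latest metadata timestamp against a shared reference clock and flag any feed outside a tolerance. The function, feed names, and 100 ms tolerance below are all hypothetical illustrations, not part of any specific product.

```python
def find_drifted_feeds(feed_timestamps_ms: dict, reference_ms: int,
                       tolerance_ms: int = 100) -> list:
    """Return the IDs of feeds whose latest metadata timestamp differs
    from the shared reference clock by more than tolerance_ms."""
    return sorted(
        feed_id for feed_id, ts in feed_timestamps_ms.items()
        if abs(ts - reference_ms) > tolerance_ms
    )
```

A monitoring process running a check like this can trigger re-alignment before viewers ever see mismatched video and metadata.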
While this post only scratches the surface of the key components needed to scale live and VOD content delivery, take time to explore a few of the others: timed-text (subtitles) and pay-per-view authentication. Wowza has numerous resources that explain them in more detail, and feel free to contact me (insights@hmsrf.org) with any questions on these topics.