WebRTC vs. CMAF: Which is Best for Your Use Case
Emerging Technologies for Low-Latency Streaming
Low-latency CMAF is the new kid on the streaming block. Much like WebRTC, it aims to overcome a key stumbling block in the industry: reducing the delay between video capture and playback. So which technology is better suited for your live-streaming use case? With the CMAF vs. WebRTC debate heating up, we decided to illuminate this not-exactly-apples-to-apples comparison.
Update July 5, 2019: Apple’s recent announcement about Low-Latency HLS has impacted low-latency CMAF. To learn more about this, please read our post, Apple Low-Latency HLS: What It Is and How It Relates to CMAF.
Why Do We Need Low-Latency Streaming?
From within a burning building, an emergency responder communicates with her commander via a live-streaming bodycam. While video enables better information sharing, any lag could mean the difference between life and death.
Meanwhile, crowds pack into Churchill Downs to watch the Kentucky Derby. Gamblers across the world also participate via their mobile devices and computers. To ensure legal online wagering — especially in case of a controversial post-race disqualification — the stream must be delivered in near real-time.
And let’s say that recently discovered Caravaggio painting goes up for auction via a live stream. Selling to the highest bidder, no matter where in the world they’re bidding from, starts with a low-latency streaming solution.
Benefits of CMAF and WebRTC
Low latency underpins any streaming scenario that requires two-way participation. But depending on which technologies you’re using to deliver a stream, video lag could increase by up to 45 seconds. That’s where the low-latency CMAF vs. WebRTC consideration comes into play.
Both of these technologies represent the latest-and-greatest methods for speeding up stream delivery. While WebRTC performs better on the latency spectrum, it’s not without its own drawbacks.
To unpack this comparison further, let’s define the two technologies and explore how each achieves low-latency streaming.
What Is CMAF?
The Common Media Application Format, or CMAF, is a streaming format intended to simplify the delivery of HTTP-based streaming media. CMAF is not a protocol, but rather a format that can be referenced by both DASH and HLS.
Prior to CMAF, any content distributor wanting to reach users on both Apple and Microsoft devices had to encode and store the same data twice. That’s because the .ts format was used to deliver content to Apple devices, whereas Microsoft devices accepted .mp4 containers based on ISOBMFF. This made reaching viewers across iPhones, smart TVs, Xboxes, and PCs both costly and inefficient.
CMAF helps streamline the process by acting as a standardized transport container. While CMAF itself is nothing more than a media format, leading organizations are moving the industry forward by incorporating it into a larger system aimed at reducing latency.
What Is Low-Latency CMAF?
To be classified as ‘low latency,’ a CMAF stream must be delivered with two key technologies:
Chunked encoding
Chunked transfer encoding
Chunked encoding breaks the video into smaller chunks of a set duration, which can be published immediately upon encoding. Chunked transfer encoding then allows delivery to take place while later chunks are still processing.
When done right, chunked-encoded and chunked-transferred CMAF enables sub-three-second latency — no matter the scale.
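As a rough illustration of the transfer half of that pairing: HTTP/1.1 chunked transfer encoding frames each piece of a response body with its length in hexadecimal, so a server can start sending before it knows the total size. The Python sketch below frames a few placeholder byte strings this way. The names encode_chunked and media_chunks are hypothetical, and real packagers do not necessarily map one CMAF chunk to exactly one HTTP chunk.

```python
def encode_chunked(parts):
    """Frame each bytes object as one HTTP/1.1 chunk: hex length, CRLF, data, CRLF."""
    for part in parts:
        if part:  # a zero-length chunk would signal the end of the body, so skip empties
            yield f"{len(part):X}".encode("ascii") + b"\r\n" + part + b"\r\n"
    yield b"0\r\n\r\n"  # terminating chunk: the body is complete


# Hypothetical stand-ins for encoded media chunks as they leave the encoder.
media_chunks = [b"chunk-one", b"chunk-two!", b"last"]
wire_bytes = b"".join(encode_chunked(media_chunks))
print(wire_bytes)
```

Because encode_chunked is a generator, each framed chunk can be written to the connection as soon as it is produced, which is the property low-latency delivery relies on.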
How Low-Latency CMAF Works
In traditional HTTP-based streaming workflows, the encoder waits to create a full segment before sending it to the CDN. With low-latency CMAF, individual chunks making up the segment are pushed out for delivery while the segment itself is still being encoded.
Whereas reducing the segment size of HLS or DASH streams is one way to minimize delay, low-latency CMAF decouples latency from segment size entirely.
[Figure: standard CMAF delivery compared with low-latency CMAF. Source: Akamai, Ultra-Low-Latency Streaming Using Chunked-Encoded and Chunked Transferred CMAF, 2018]
The above graphic juxtaposes standard CMAF delivery with low-latency CMAF. You’ll notice that each chunk in the bottom graphic includes a Movie Fragment Box (moof) accompanied by a Media Data Box (mdat). By including a complete pairing within each chunk, the player is able to add each encoded chunk directly to its buffer without requiring an entire segment.
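To make the moof and mdat pairing concrete, here is a minimal Python sketch that walks the top-level boxes of a segment and groups each moof with the mdat that follows it. It assumes the basic ISOBMFF box layout (a 32-bit big-endian size followed by a four-character type) and ignores 64-bit and open-ended box sizes. The payloads are placeholders rather than valid media, and make_box, iter_boxes, and pair_chunks are hypothetical helper names.

```python
import struct


def make_box(box_type, payload):
    """Build a basic ISOBMFF box: 32-bit big-endian size, 4-byte type, payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_boxes(data):
    """Yield (type, payload) for each complete top-level box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8 or offset + size > len(data):
            break  # incomplete box, or a special size (0 or 1) this sketch skips
        yield box_type.decode("ascii"), data[offset + 8 : offset + size]
        offset += size


def pair_chunks(data):
    """Group each moof with the mdat that follows it into one chunk."""
    chunks, pending_moof = [], None
    for box_type, payload in iter_boxes(data):
        if box_type == "moof":
            pending_moof = payload
        elif box_type == "mdat" and pending_moof is not None:
            chunks.append((pending_moof, payload))
            pending_moof = None
    return chunks


# Placeholder segment: a segment-type box followed by three moof/mdat pairs.
segment = make_box(b"styp", b"") + b"".join(
    make_box(b"moof", f"meta{i}".encode()) + make_box(b"mdat", f"frames{i}".encode())
    for i in range(3)
)
chunks = pair_chunks(segment)
partial_chunks = pair_chunks(segment[:-3])  # simulate a segment still arriving
print(len(chunks), len(partial_chunks))
```

The second call shows the point of the pairing: when the tail of the segment has not arrived yet, the chunks that are already complete can still be handed to the buffer.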
What Is WebRTC?
WebRTC is a combination of standards, protocols, and JavaScript APIs that enables real-time communications (RTC, hence its name). Users connecting via Chrome, Firefox, or Safari can communicate directly through their browsers, with latency under 500 milliseconds.
How WebRTC Works
The WebRTC framework creates a near-simultaneous exchange of communication, utilizing a peer-to-peer connection between browsers without requiring plug-ins. Specifically, it employs three HTML5 APIs built into Chrome, Firefox, and Safari to allow direct browser-based communication.
Low-Latency CMAF vs. WebRTC: Delivery Speed
Without a doubt, WebRTC comes in first place in the latency race. It takes less than 500 milliseconds to get the video and audio data from one browser to another, enabling the real-time communications that give WebRTC its name.
CMAF comes in at sub-three-second delivery, but again, that’s only when deployed with chunked encoding and chunked transfer encoding.
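One way to see why chunking matters for that figure is to look at how much media the encoder must finish before anything can be published. The Python sketch below models only that packaging wait and ignores encoding, network, and player-buffer delays, so it is one component of end-to-end latency rather than the whole. The function name packaging_wait and the segment and chunk durations are illustrative assumptions, not values from the text.

```python
from typing import Optional


def packaging_wait(segment_seconds: float, chunk_seconds: Optional[float] = None) -> float:
    """Seconds of media that must be encoded before the first bytes can be published."""
    if chunk_seconds is None:
        return segment_seconds  # whole-segment delivery waits for the full segment
    return min(chunk_seconds, segment_seconds)  # chunked delivery waits for one chunk


# Illustrative durations only.
for segment in (6.0, 2.0):
    print(segment, packaging_wait(segment), packaging_wait(segment, chunk_seconds=0.5))
```

With chunking, the wait stays at the chunk duration whether the segment is six seconds or two, which is the sense in which latency is decoupled from segment size.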
Low-Latency CMAF vs. WebRTC: Scalability
Low-latency CMAF is the best route to take any time scalability is a concern. For one thing, the format is optimized for single-encoding delivery to any device that supports the HLS or DASH protocols. This streamlines server efficiency and allows you to reach a broad audience.
WebRTC, on the other hand, simply wasn’t designed with scalability in mind. This bandwidth-intensive option requires each participating browser to connect with every other participant via a peer connection. To put that into perspective, WebRTC expert Tsahi Levent-Levi recommends no more than 50 concurrent peer connections.
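The arithmetic behind that limit can be sketched in a few lines of Python, assuming a full mesh in which every browser holds a direct connection to every other browser and no media server sits in between. The function names are hypothetical.

```python
def connections_per_peer(participants):
    """Peer connections each browser must maintain in a full mesh."""
    return max(participants - 1, 0)


def total_connections(participants):
    """Distinct peer connections across the whole mesh: n * (n - 1) / 2."""
    return participants * (participants - 1) // 2 if participants > 1 else 0


for n in (2, 10, 51):
    print(n, connections_per_peer(n), total_connections(n))
```

Under this assumption the per-browser load grows linearly with audience size while the total grows quadratically, so a 51-person mesh already puts every browser at 50 concurrent connections.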
A WebRTC stream can be transcoded using media server software to solve for scalability, but you’ll introduce latency in the process. And while a vast network of live-repeating servers can also be leveraged to support viral spikes, this tactic will likely break the bank.
Low-Latency CMAF vs. WebRTC: Broadcast Quality
Low-latency CMAF supports high-quality features such as 4K and high-frame-rate streaming. While these features can increase encode time, CMAF is still your best option for high-quality low-latency broadcasting.
Because WebRTC was designed for video conferencing and similar use cases, quality wasn’t a primary goal. WebRTC drops B-frames from the GOP structure to enable real-time delivery, which can impact quality. And with bandwidth concerns top of mind when it comes to WebRTC, limiting the frame rate is also a good idea.
Low-Latency CMAF vs. WebRTC: Use Cases
If you’re looking for speed, WebRTC reigns supreme. But when it comes to quality, scale, and affordability, low-latency CMAF is a much better option. For that reason, we recommend identifying what’s best for your specific use case based on these criteria.
The table below lists which technology we recommend for several streaming scenarios.
Low-Latency CMAF | WebRTC |
One-to-many interactive streaming | One-to-few interactive streaming |
Live sports and e-gaming | Video conferencing for small groups |
Online gambling/auctions | Audio/video calling |
Large-scale product demos | Small-scale product demos |
To summarize, let’s take an example from above: “an emergency responder communicates with her commander via a live-streaming bodycam.”
If this line of communication is limited to a small team and urgency takes priority over anything else, WebRTC would be the best route. Alternatively, if the bodycam broadcast needs to be streamed concurrently to an ecosystem of viewers, then low-latency CMAF would be better suited.
When weighing your options between WebRTC and CMAF, it all comes down to what you’re trying to achieve. Luckily, Wowza’s platform can be used to build hybrid workflows that leverage the benefits of different protocols along the path from capture to playback.