Review: Real-Time Streaming at Scale for Wowza Video
Tim Siglin is a product reviewer for streaming solutions and the founder of Help Me Stream Research Foundation
Wowza adds feature that merges benefits of legacy and newer delivery solutions
Wowza contacted me recently to review a new feature of Wowza Video that enables real-time streaming at scale. I’ve had the chance to look at several real-time solutions over the past 25 years, both in my role as product reviewer for a number of magazines and online newsletters (including Streaming Media magazine) and more recently as the founding executive director of the industry’s first charity-based research firm, the 501(c)(3) Help Me Stream Research Foundation, so I was delighted to put Wowza’s solution through its paces.
Readers of this review have invariably felt the pain points of interactive video over the past 18 months: inconsistent delivery of content that’s intended to be very low latency, unpredictable audience size, and an overall lack of interactivity that plagues traditional streaming.
Why Did Streaming Switch to HTTP?
One of the main reasons that most of today’s streaming has significant latency is that it was purpose-built that way. About a decade ago, give or take, the industry decided to replace traditional streaming server software with generic HTTP-based web server software.
There were four main benefits to taking this HTTP approach:
- More generic, less robust hardware could be used for HTTP delivery
- HTTP traffic was already open on most networks (no need to open special streaming only ports)
- On-demand content could be cached closer to large pockets of end users
- Ability to scale at levels that streaming servers (both hardware and software) were unable to achieve at anywhere near a reasonable cost.
Unfortunately, all the benefits of HTTP segment-based delivery came at the expense of latency. Latencies increased significantly to accommodate the desired scale, which worked beautifully for on-demand content delivery and allowed the streaming industry to grow. But those of us who had been around since the inception of streaming (or who, like me, came from a background in zero-latency video conferencing) knew it was a matter of time before HTTP streaming ran into the brick wall of low-latency needs.
I remember writing an article in 2018 bemoaning the fact that we weren’t able to get to scale at low latencies:
“In the short run, the industry needs innovation that helps get to scale, at lower latencies,” I wrote. “Everyone agrees on this. But what we don’t need is another scenario like the one we’ve faced over the past 6 years, where the expedient scalable solution (HLS) fundamentally painted the whole industry into a corner.”
Interactivity Is the Future. And the Past. But It Works This Time.
Skip forward to early 2020, and the streaming industry didn’t just need low-latency at scale to thrive, we needed it to address the global scale of interactivity that hit without warning as pandemic lockdowns took effect.
There were interactive solutions available to meet the challenge, most notably from the video conferencing industry I’d left in the late 1990s to join the streaming revolution. Yet, while they were interactive, they didn’t really scale, as anyone who tried last year to do Zoom webinars above 100 attendees could attest. This lack of scale at low latencies was mainly due to a combined lack of innovation in traditional video conferencing and the inherent limitations of scaling interactive video, both of which required specialized and costly servers to get to scale. In other words, the same issues that moved the streaming industry towards HTTP-based delivery in the first place.
Anyone who has attempted to use these video conferencing or web conferencing solutions quickly finds, to their chagrin, that low latency at scale is a relative term. For those attempting to host all-hands meetings or webinars during the 2020 lockdowns, the first 100 people could join with full access to polling, chat, Q&A and other features, but the remaining attendees were relegated to watching on non-interactive platforms like YouTube or watching with latencies ranging from five to forty-five seconds.
To say many solutions available at the outset of the pandemic yielded poor user experience would be an understatement. For users to truly interact with others and experience the stream “in real time” requires glass-to-glass delivery of well under a second — and that includes all the steps in the delivery chain, from the encoding time at the presenter’s camera to the decoding and playback on the screens of hundreds, thousands, or even millions of viewers.
Why Live HTTP Streaming Isn’t Fast Enough
Anyone who has attempted to use emerging low-latency solutions such as Low-Latency HLS (LL-HLS) realizes that HTTP-based delivery still faces a greater than one-second barrier; these modified HTTP solutions still run in the 2-3 second range based on our testing here at Help Me Stream Research Foundation.
WebRTC to the Rescue?
So we were curious, when Wowza approached Help Me Stream about a review of the Real-Time Streaming at Scale feature in Wowza Video, whether we’d face similar latencies, or a lack of scale, or both.
Our basic finding is that, with a few pre-release quirks that Wowza is looking into, the Real-Time Streaming at Scale solution works as advertised.
We were able to successfully view a single published stream at latencies that averaged about 400 milliseconds (ms) across a wide variety of devices, using only a browser-based publisher and browser-based viewer.
While we didn’t test to the scale of a million viewers that Wowza advertises, we did test on three different types of networks, using multiple devices accessing the same live low-latency stream across at least two different networks — typically WiFi to a cable modem as well as a cellular data network — and two different browsers (mobile and desktop versions of Chrome and Safari) each time we performed a test.
And to make sure the low-latency offering didn’t need any additional applications, which many low-latency solutions require, we tested both publishing and playback strictly in a browser-native environment, without the aid of any specialized plugins or downloaded publisher or player app. The fact that everything worked in a browser was in itself impressive.
Latencies were also low enough, even across both LAN and cellular networks, that we found the audio “feedback loop” sweet spot that only comes with analog sound systems or low-latency digital audio broadcasts. One area of interest was the fact that, while we experienced some video latency shifting (where browser playback in the background might slowly lose synch with browser playback in the foreground), the audio latency always remained consistent enough to generate this audio feedback loop.
Beyond our initial browser-based tests, we were also able to broadcast via WebRTC from a customized version of OBS (the well-known open-source video mixer solution) that’s been modified to use WebRTC as the output. In this instance, latencies averaged around 600 ms.
Finally, Wowza offers RTMP, a legacy low-latency technology, as the publishing engine for its Real-Time Streaming at Scale feature.
RTMP is almost as old as the underlying protocols that power WebRTC, such as RTSP and SIP, but comes with the benefit of being a TCP-based transport protocol — in contrast to legacy UDP transport, which isn’t implemented on as many routers as TCP — that’s withstood the test of time from the early days of the streaming industry.
If the last paragraph has you thinking “everything old is new again” you’re not alone, because we’re not only still using RTMP long after its original required server (think Flash Media Server) was deprecated, but the streaming industry is also finding new uses for it. One of those is RTMP’s ability to generate low-latency encodes that can now be scaled by solutions like Wowza’s, something the original RTMP streaming servers were incapable of doing.
The downside of RTMP and its TCP-based streaming is that there’s about an additional half-second of latency added on average to the publishing side, which means an average of about one second of latency. According to Wowza’s Barry Owen, who is the company’s vice president of solutions engineering, the ability to use RTMP at scale without requiring a specialized player is very attractive to a number of its potential customers.
Workflow: Real-Time Streaming at Scale for Wowza Video
“It’s worth noting that so far all of the customers in our pipeline have RTMP workflows they wish to use,” said Owen. “None have expressed an interest in broadcasting from the browser, so while some of those customers are candidates for the WebRTC version of OBS, all are currently using either hardware or other software RTMP encoders in their workflows.”
It’s a valid point, but it also shows just how entrenched RTMP continues to be in the streaming industry. Even those who have done at-scale live HTTP delivery often start with an RTMP stream, which is then transrated, transmuxed, or both before being segmented into HLS or DASH fragments for delivery 10-30 seconds later.
RTMP’s continued popularity also shows how WebRTC may find more traction on the delivery side, as a way to truly deliver low latency at scale (even if that means around a second of latency), rather than on the ingest side.
In our tests of OBS for WebRTC, we found consistent latency across networks, including consistency of playback on the same LAN that OBS was broadcasting from. Interestingly, our cellular data playback test exhibited slightly lower latencies than the playback via a WiFi access point attached to a cable modem. But the difference appeared to be less than 250 ms between all the devices on each network, meaning overall latency from publish glass to playback glass fell between 400–650 ms when using the OBS WebRTC ingest.
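To put the latency figures quoted in this review side by side, here is a minimal Python sketch. The numbers are the approximate values reported above, not new measurements. The one-second cutoff and all names in the code are assumptions made for illustration; the review itself asks for "well under a second" rather than naming an exact threshold.

```python
# Approximate latencies quoted in this review, as (low, high) in milliseconds.
REPORTED_LATENCY_MS = {
    "Browser WebRTC publish": (400, 400),     # "about 400 ms" on average
    "OBS WebRTC publish": (400, 650),         # glass-to-glass range
    "RTMP publish": (1000, 1000),             # "about one second" on average
    "Low-Latency HLS": (2000, 3000),          # "2-3 second range"
    "HLS or DASH segments": (10000, 30000),   # "10-30 seconds later"
}

# Assumption: treat one second as the cutoff for sub-second delivery.
SUB_SECOND_LIMIT_MS = 1000


def is_sub_second(high_ms: int) -> bool:
    """Return True when the worst reported latency stays under the cutoff."""
    return high_ms < SUB_SECOND_LIMIT_MS


def sub_second_paths(latencies: dict) -> list:
    """List the delivery paths whose worst reported latency is sub-second."""
    return [name for name, (_, high) in latencies.items() if is_sub_second(high)]


for name, (low, high) in REPORTED_LATENCY_MS.items():
    label = "sub-second" if is_sub_second(high) else "one second or more"
    print(f"{name}: {low}-{high} ms ({label})")
```

Under these assumptions only the two WebRTC publishing paths stay below the cutoff, which matches the pattern described in the testing above.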
For customers wanting to implement the Real-Time Streaming at Scale feature in Wowza Video, the company is currently offering Professional Services assistance to take playback integration from pilot project to production scale.
Currently, too, the only way to call the service is via the Wowza Video API.
“A customer can call a Wowza Video API call to provision a stream,” said Owen, noting that UI access is pending. “The API call returns you a stream key, a publish token, and an optional RTMP URL.”
A typical API call might return something like this:
{
  "real_time_stream": {
    "id": "2kbkl5yp",
    "name": "TSTestRealTimeStream",
    "stream_name": "<<STREAM-NAME>>",
    "token": "<<STREAM-TOKEN>>",
    "rtmp_url": "<<RTMP-URL>>",
    "state": "active",
    "created_at": "2021-09-13T22:18:52.000Z",
    "updated_at": "2021-09-13T22:18:52.000Z"
  }
}
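As a rough illustration of how a customer might consume a response like the one above, here is a minimal Python sketch that parses the JSON and pulls out the publishing details. It does not call the Wowza Video API; it only reads a response body shaped like the sample. The mapping of "stream_name" to the stream key and "token" to the publish token is an assumption based on Owen's description, and the class and function names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class RealTimeStream:
    """Publishing details pulled from a provisioning response."""
    stream_id: str
    stream_key: str          # assumed to be the "stream_name" field
    publish_token: str       # assumed to be the "token" field
    rtmp_url: Optional[str]  # described as optional, so it may be missing
    state: str


def parse_real_time_stream(body: str) -> RealTimeStream:
    """Parse a JSON response shaped like the sample in this review."""
    stream = json.loads(body)["real_time_stream"]
    return RealTimeStream(
        stream_id=stream["id"],
        stream_key=stream["stream_name"],
        publish_token=stream["token"],
        rtmp_url=stream.get("rtmp_url"),  # None when the URL is absent
        state=stream["state"],
    )


# The sample response from this review, with its placeholders left untouched.
SAMPLE_RESPONSE = """
{
  "real_time_stream": {
    "id": "2kbkl5yp",
    "name": "TSTestRealTimeStream",
    "stream_name": "<<STREAM-NAME>>",
    "token": "<<STREAM-TOKEN>>",
    "rtmp_url": "<<RTMP-URL>>",
    "state": "active",
    "created_at": "2021-09-13T22:18:52.000Z",
    "updated_at": "2021-09-13T22:18:52.000Z"
  }
}
"""

stream = parse_real_time_stream(SAMPLE_RESPONSE)
print(stream.stream_id, stream.state)
```

Reading "rtmp_url" with `.get()` keeps the sketch from failing when the optional RTMP URL is not returned, for example in a browser-only WebRTC workflow.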
Next Steps
So what quirks did we find, and how are they being addressed?
Because we tested the pre-release version of Wowza’s real-time streaming publisher, which allows WebRTC streaming to Wowza Video directly from a browser, we worked closely with Wowza to identify and address any major issues.
Over the course of the 10 days of testing, we tested three versions of the publisher, using both Apple’s Safari (version 15.0 build 17612.1.28.5) and Google Chrome (version 93.0.4577.82) on a MacBook Air using the Big Sur macOS operating system.
One impressive finding to note is that multiple players (one each on multiple devices) used on intermittent networks (such as a WiFi signal that’s exhibiting intermittent connectivity issues) all maintained viewing synchronization between players. This was true even if the latencies increased above the 600 ms range, meaning all players were equally delayed.
Testing anomaly | Status | Notes |
Integrated laptop camera and microphone on MacBook Air would not connect to publisher. | Resolved | Wowza updated publisher security settings to accommodate Apple’s newest Safari browser settings. |
Latency shifting, when browser window with publisher is in background. Results in longer-than-average latencies. | Ongoing | Chrome exhibits lowest latency shifting, while Safari latency could exceed 2.0 seconds; subsequent publisher versions lowered Safari latency shift to ~1.5 seconds. |
Viewer count inconsistencies in publisher. | Ongoing | If a published stream abruptly ends and is restarted, viewer counts don’t accurately reflect those who were already viewing the stream. Wowza says this will be addressed in an upcoming version of the SDK. |
Players remain “live” even after a stream ends. | Ongoing | For those who use a permanent or hard-coded stream key in publisher, any player left open after a published stream ends will be able to view the next stream once it begins to publish. Wowza is addressing this issue, noting “the ideal scenario is probably to send a signal from the platform when the publisher stops for a certain period of time that the players can see as an event and reset the stream.” |
Auto-view | Ongoing | Chrome automatically begins viewing when URL is entered, but Safari viewing inconsistently requires clicking the “play” button at the bottom of the browser window. This very well could be a Safari version mismatch, but Wowza is looking into the issue as it appears across all three versions of the publisher we tested. |
Hand-off between networks | Ongoing | Viewing content on one network (e.g., Wi-Fi) then switching to another network (e.g., LTE cellular) results in playback continuing for about 5 seconds, with a portion of the live published stream playing back on the new network, but then viewing freezes and browser viewer needs to be refreshed. Wowza is exploring this issue. |
There’s one other note from our testing: Wowza is using a version of OBS for WebRTC that not only allows customers who already have RTMP workflows to use OBS as usual, but also lets them explore the option of using the lower-latency WebRTC on the publishing side.
The version of OBS that Wowza is using works quite well. There’s only one issue we encountered: an NDI error dialog box continues to appear at the launch of OBS, even if an NDI driver is not present. Wowza is addressing it, but it appears to be based on an issue within the GitHub OBS distribution that Wowza is currently using, as that version on GitHub is a bit long in the tooth.
Conclusion
So what’s our bottom line? With the exception of the testing anomalies noted above, the Real-Time Streaming at Scale feature in Wowza Video works well, and appears ideal for the one-to-many or few-to-many low-latency publishing needs of auction houses, sportsbooks, interactive education, and other key low-latency markets.
Wowza also plans to offer a software development kit (SDK) that will allow customers to customize both the publisher and, more importantly, the player, to better integrate both into their workflow. If an RTMP workflow is already in place, the SDK would only be required for customizing the player to fit into existing applications.