Live Streaming System Design

aditya goel
6 min read · Jan 26, 2023

Question:- Why is live streaming of video content challenging?

Answer → It is challenging for the following reasons :-

  • The video content has to be sent over the Internet in near real time.
  • Video processing is compute-intensive, and sending a large volume of video over the Internet takes time.

Question:- How does a video leave the streamer’s end?

Answer → Here is the process involved in uploading the video :-

Step #1.) The streamer starts the video stream. The source can be any video and audio feed wired up to an encoder, such as the popular open-source OBS software.

Note: Some platforms, like YouTube, provide easy-to-use software for streaming from a browser with a webcam or directly from a mobile phone camera.

Question:- What is the purpose of an encoder?

Answer → The job of the encoder is to package the video stream and send it over a transport protocol that the live-streaming platform can receive for further processing. The most popular protocol is RTMP (Real-Time Messaging Protocol).

Question:- What is RTMP?

  • RTMP is a TCP-based protocol. It started a long time ago as the video-streaming protocol for Adobe Flash.
  • Encoders can readily speak RTMP or its secure variant, RTMPS.
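
To make the encoder's role concrete, here is a minimal sketch that builds an ffmpeg command line for pushing a file to an RTMP ingest endpoint. The ingest URL, stream key and bitrates are hypothetical placeholders, and ffmpeg is used here only as one example of an encoder; OBS and other encoders configure the same things through their own settings.

```python
import shlex


def build_rtmp_push_command(source, ingest_url, stream_key,
                            video_kbps=3000, audio_kbps=128):
    """Return an ffmpeg argument list that pushes `source` to an RTMP ingest.

    ingest_url and stream_key are hypothetical values that a platform
    would hand to the streamer.
    """
    return [
        "ffmpeg",
        "-re",                      # read the input at its native frame rate
        "-i", source,
        "-c:v", "libx264",          # encode video as H.264
        "-preset", "veryfast",
        "-b:v", f"{video_kbps}k",
        "-c:a", "aac",              # encode audio as AAC
        "-b:a", f"{audio_kbps}k",
        "-f", "flv",                # RTMP carries an FLV-muxed stream
        f"{ingest_url.rstrip('/')}/{stream_key}",
    ]


if __name__ == "__main__":
    cmd = build_rtmp_push_command(
        "input.mp4", "rtmp://ingest.example.com/live", "STREAM_KEY")
    print(shlex.join(cmd))  # the command is only printed, not executed
```

The function only assembles the arguments, so it can be inspected without ffmpeg being installed.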

Question:- What is SRT?

  • Another popular option is SRT (Secure Reliable Transport).
  • SRT is a UDP-based protocol. It promises lower latency and better resilience to poor network conditions.
  • Many streaming platforms do not support SRT yet.

Question:- How do the various protocols for broadcasting video from the client’s end compare?

Question:- How does a platform provide the best upload conditions for the streamer?

Answer → To provide the best upload conditions for the streamer :-

  • Most live-streaming platforms operate point-of-presence (POP) servers worldwide.
  • The streamer connects to the closest POP server.

Question:- How does the streamer connect to the closest POP server?

Answer → This happens automatically through either :-

  • DNS latency-based routing.
  • An anycast network.
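
The idea behind latency-based routing can be sketched as a lookup that returns the POP with the lowest measured latency for the client's region. The region names, POP names and latency figures below are made up for illustration; a real DNS service maintains and updates such measurements itself.

```python
# Hypothetical latency measurements (milliseconds) per client region.
LATENCY_TABLE_MS = {
    "eu-west": {"pop-london": 12, "pop-frankfurt": 25, "pop-virginia": 85},
    "ap-south": {"pop-mumbai": 15, "pop-singapore": 60, "pop-london": 130},
}


def resolve_ingest_pop(client_region, table=LATENCY_TABLE_MS):
    """Return the POP with the lowest measured latency for a region."""
    candidates = table.get(client_region)
    if not candidates:
        raise LookupError(f"no latency data for region {client_region!r}")
    # min() over the dict keys, ordered by their latency value
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    print(resolve_ingest_pop("eu-west"))
    print(resolve_ingest_pop("ap-south"))
```

With anycast the selection is made by network routing instead of by a lookup like this, but the outcome for the streamer is similar: the connection lands on a nearby POP.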

Question:- What happens after the video reaches the closest POP server?

Answer → Once the stream reaches the closest POP server, it is transmitted over a fast and reliable backbone network to the platform’s data centre for further processing.

Question:- What is the main goal of transmitting the video to the platform?

Answer → The main goal is to offer the video stream in different qualities and bit-rates.

Note: The exact processing steps vary from platform to platform and with the output streaming formats.

Question:- What is adaptive bit-rate streaming?

Answer → Modern video players automatically choose the best video resolution and bit-rate based on the quality of the viewer’s internet connection. They adjust on the fly by requesting a different bit-rate as network conditions change.
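
A simplified version of the player's decision can be sketched as follows. The quality ladder and the 0.8 safety factor are assumptions chosen for illustration; real players use more elaborate heuristics that also look at buffer level.

```python
# Hypothetical quality ladder, sorted by ascending bit-rate (kbps).
RENDITIONS = [
    ("240p", 400),
    ("480p", 1200),
    ("720p", 2800),
    ("1080p", 5000),
]


def choose_rendition(throughput_kbps, renditions=RENDITIONS, safety=0.8):
    """Pick the highest rendition that fits within the measured throughput.

    Only a fraction (`safety`) of the throughput is budgeted so that
    small dips in bandwidth do not immediately stall playback.
    """
    budget = throughput_kbps * safety
    best = renditions[0]  # fall back to the lowest quality
    for rendition in renditions:
        if rendition[1] <= budget:
            best = rendition
    return best[0]


if __name__ == "__main__":
    for measured in (10000, 4000, 3000, 100):
        print(measured, "kbps ->", choose_rendition(measured))
```

The player would call something like this after each downloaded chunk, using the throughput it just observed, and request the next chunk at the chosen quality.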

Question:- What general steps does the platform perform?

Answer → The platform generally follows these steps :-

Step #1.) The incoming video stream is transcoded to different resolutions and bit-rates. These are the different quality levels of the video.

Step #2.) The transcoded stream is divided into smaller video segments, each a few seconds long.

Question:- What does transcoding look like from a CPU perspective?

Answer → Transcoding and segmentation are highly compute-intensive, so the input stream is usually transcoded to the different formats in parallel.
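
The fan-out can be sketched with one job per quality level. The `transcode` function here is a stand-in that only returns a label, and the ladder values are hypothetical. A thread pool is used purely to show the parallel structure; genuinely CPU-bound transcoding would normally run in separate processes, on dedicated workers or on hardware encoders.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical quality ladder: (name, bit-rate in kbps).
LADDER = [("1080p", 5000), ("720p", 2800), ("480p", 1200)]


def transcode(source, name, bitrate_kbps):
    """Placeholder for a real transcode job; returns an output label."""
    return f"{source}_{name}_{bitrate_kbps}k"


def transcode_all(source, ladder=LADDER):
    """Start one transcode job per rendition and collect the results."""
    with ThreadPoolExecutor(max_workers=len(ladder)) as pool:
        futures = {
            name: pool.submit(transcode, source, name, kbps)
            for name, kbps in ladder
        }
        # result() blocks until the corresponding job has finished
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    print(transcode_all("stream42"))
```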

Question:- What happens after transcoding?

Answer → Packaging. The video segments produced by transcoding are packaged into the different live-streaming formats that video players can understand.

Question:- What is the most popular format for live streaming?

Answer → The most common live-streaming format is HTTP Live Streaming (HLS). HLS was invented by Apple in 2009 and remains the most popular streaming format to date.

Question:- What does an HLS stream consist of?

Answer → An HLS stream consists of a manifest file and a series of video chunks :-

  • Each chunk contains a video segment that is only a few seconds long.
  • The manifest file is a directory that tells the video player which output formats are available and where to load the video chunks from over HTTP.
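
To show what such a manifest looks like, the sketch below generates a master playlist (listing the available quality levels) and a media playlist (listing the chunks of one quality level) using standard HLS tags. The bit-rates, resolutions and file names are hypothetical.

```python
import math


def master_playlist(variants):
    """variants: list of (bandwidth_bps, "WIDTHxHEIGHT", uri) tuples."""
    lines = ["#EXTM3U"]
    for bandwidth, resolution, uri in variants:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={resolution}")
        lines.append(uri)
    return "\n".join(lines) + "\n"


def media_playlist(first_sequence, segment_seconds, segment_uris):
    """Build a live media playlist; no end tag, since the stream is ongoing."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:3",
        f"#EXT-X-TARGETDURATION:{math.ceil(segment_seconds)}",
        f"#EXT-X-MEDIA-SEQUENCE:{first_sequence}",
    ]
    for uri in segment_uris:
        lines.append(f"#EXTINF:{segment_seconds:.3f},")
        lines.append(uri)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(master_playlist([
        (5000000, "1920x1080", "1080p/index.m3u8"),
        (2800000, "1280x720", "720p/index.m3u8"),
    ]))
    print(media_playlist(101, 6, ["seg101.ts", "seg102.ts", "seg103.ts"]))
```

The player first loads the master playlist, picks a quality level, and then keeps reloading that level's media playlist to discover new chunks as the live stream progresses.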

Question:- What is DASH?

Answer → DASH stands for Dynamic Adaptive Streaming over HTTP, and it is another popular streaming format. Apple devices do not natively support DASH.

Question:- How do we reduce the last-mile latency?

Answer → The resulting HLS manifest file and video chunks are cached by the CDN layer. This reduces the last-mile latency for viewers.
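
The effect of the CDN layer can be illustrated with a toy edge cache: the first request for a chunk goes back to the origin, and later requests for the same chunk are served from the edge. The class and function names are hypothetical, and the sketch deliberately ignores expiry; a real CDN also expires entries, which matters for a live manifest because it changes as new chunks appear.

```python
class EdgeCache:
    """Toy CDN edge node that caches whatever it fetches from the origin."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: path -> bytes
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self._store:
            self.hits += 1               # served from the edge
        else:
            self.misses += 1             # one round trip to the origin
            self._store[path] = self._fetch(path)
        return self._store[path]


if __name__ == "__main__":
    def origin(path):
        return b"chunk-bytes-for:" + path.encode()

    edge = EdgeCache(origin)
    for _ in range(1000):                # 1000 viewers request the same chunk
        edge.get("/720p/seg101.ts")
    print("hits:", edge.hits, "misses:", edge.misses)
```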

Question:- What happens next in the video-delivery process?

Answer → The video now starts to arrive at the viewer’s video player, which completes the end-to-end live-streaming process.

Question:- What is the overall latency of the end-to-end video-delivery process, and how can we optimise it?

Answer → A “glass-to-glass” latency of around 20 seconds is normal.

  • There are several factors that a streamer or a live-streaming platform can tune to improve latency, at the cost of some aspects of overall video quality.
  • The best thing a streamer can do to improve latency is to optimise their local setup for the lowest latency from the camera to the streaming platform.
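
A rough latency budget shows where a figure like 20 seconds can come from. Every number below is an illustrative assumption, not a measurement, and the breakdown into stages is a simplification: it assumes the packager waits for one full segment and the player buffers a few segments before starting playback.

```python
def glass_to_glass_seconds(encode, ingest, transcode,
                           segment_seconds, buffered_segments, cdn):
    """Sum a simplified latency budget (all values in seconds)."""
    packaging_wait = segment_seconds            # a segment must be complete
    player_buffer = segment_seconds * buffered_segments
    return encode + ingest + transcode + packaging_wait + player_buffer + cdn


if __name__ == "__main__":
    # Assumed values: 4-second segments, player buffers 3 segments.
    baseline = glass_to_glass_seconds(
        encode=1.0, ingest=0.5, transcode=2.0,
        segment_seconds=4.0, buffered_segments=3, cdn=0.5)
    # Same pipeline with 2-second segments.
    tuned = glass_to_glass_seconds(
        encode=1.0, ingest=0.5, transcode=2.0,
        segment_seconds=2.0, buffered_segments=3, cdn=0.5)
    print(baseline, tuned)
```

Under these assumptions the baseline comes to 20 seconds and the shorter segments bring it down to 12, which illustrates why segment length and player buffering are among the main things a platform can tune.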

Question:- What are the different challenges for a live-streaming platform?

That’s all for this blog. If you liked reading this, please do clap on this page. We shall see you in the next post. Thanks and take care till then.
