How Streaming Video on the Internet Works
Hey Everyone,
Today we’ll be talking about
How Notion sharded their Postgres databases
Notion sharded their Postgres monolith into a horizontally-partitioned database fleet.
What the Postgres VACUUM process is and why it influenced Notion’s decision to shard.
Application-level sharding and how to pick a shard key.
How Video Works on the Internet
What is HLS and how does it work
The MP4 and WebM formats
How video files are delivered and the origin server vs. the CDN
Plus, a couple awesome tech snippets on
How a Single Line of Code Made a 24-core Server Slower Than a Laptop
Understanding the Power of LISP
Papers We Love - a community for going through academic computer science papers
A free introductory textbook on Algorithmic Graph Theory
We also have a solution to our last coding interview question on backtracking to find IP addresses, plus a new question from Google.
Questions? Please contact me at arpan@quastor.org.
Quastor Daily is a free Software Engineering newsletter that sends out FAANG Interview questions (with detailed solutions), Technical Deep Dives, and summaries of Engineering Blog Posts.
How Video Works
Leandro Moreira is a Lead Software Engineer at Globo where he works on their live video streaming platform and infrastructure.
He wrote a great blog post on all the engineering behind how videos play on your computer (adaptive bitrate streaming, HLS, etc.), how videos are delivered to your computer (CDNs, Multi-CDNs, etc.) and how film is processed into digital video (Codecs, Containers, FFMPEG, etc.).
Here’s a summary of some parts from the post
Playback
When you come across a website that has a video player embedded in it, there’s quite a bit going on behind the scenes.
You have the player UI, with the pause/play button, subtitle controls, video speed and other options.
Players will support different options around DRM, ad injection, thumbnail previews, etc.
Behind the scenes, modern video platforms will use adaptive bitrate streaming to stream the video from the server.
Adaptive bitrate streaming means that the server has several different versions of the video (known as renditions) and each version differs in display size (resolution) and file size (bitrate).
The video player will dynamically choose the best rendition based on the user’s screen size and bandwidth. It will choose the rendition that minimizes buffering and gives the best user experience.
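As a rough illustration of that selection logic (the rendition names and bitrates below are made up for the example, not taken from the post), a player's choice boils down to something like this Python sketch:

# A minimal sketch of adaptive bitrate selection. Real players also factor in
# buffer health, screen size, and smoothed bandwidth estimates.
RENDITIONS = [
    ("1080p", 5_000_000),   # (name, bitrate the rendition needs, in bits/sec)
    ("720p", 2_800_000),
    ("480p", 1_400_000),
    ("360p", 800_000),
]

def pick_rendition(measured_bandwidth_bps, safety_factor=0.8):
    # Pick the highest-quality rendition we can sustain without buffering.
    usable = measured_bandwidth_bps * safety_factor
    for name, required in RENDITIONS:          # ordered best to worst
        if required <= usable:
            return name
    return RENDITIONS[-1][0]                   # fall back to the lowest rendition

print(pick_rendition(4_000_000))               # -> "720p"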
HLS
HTTP Live Streaming (HLS) is a protocol designed by Apple for HTTP-based adaptive bitrate streaming. It’s the most popular streaming format on the internet.
The basic concept is that you take your video file and break it up into small segments, where each segment is 2-12 seconds long.
If you have a 2 hour long video, you could break it up into segments that are 10 seconds long and end up with 720 segments.
Each of the segments is a file that ends with a .ts extension. The files are numbered sequentially, so you get a directory that looks like this:
segments/
00001.ts
00002.ts
00003.ts
00004.ts
00005.ts
The player will then download and play each segment as the user is streaming. It will also keep a buffer of segments in case the user loses network connection.
Again, HLS is an adaptive bitrate streaming protocol, so the web server will have several different renditions (versions) of the video that is being played.
All of the renditions will be broken into segments of the same length. So, going back to our example with the 2 hour long video, it could have 720 segment files at 1080p, 720 segment files at 720p, 720 segment files at 480p.
All the segment files are ordered and are each 10 seconds in length.
The player will then look at the amount of bandwidth available and make the best guess as to which rendition’s segment file it should download next.
If your network connection slows down while you’re watching a video, the player can downgrade you to a lower quality rendition for the next segment files.
When your connection gets faster, the player can upgrade your rendition.
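Putting the pieces together, a toy playback loop that downloads numbered segments and upgrades or downgrades the rendition per segment might look like the sketch below (the URL, rendition names, and bitrates are placeholders; a real player parses the .m3u8 playlists instead of guessing file names):

import time
import urllib.request

BASE_URL = "https://example.com/video"     # placeholder origin/CDN location
RENDITIONS = ["480p", "720p", "1080p"]     # ordered worst to best
BITRATES = {"480p": 1_400_000, "720p": 2_800_000, "1080p": 5_000_000}  # illustrative
SEGMENT_COUNT = 720                        # 2 hours of 10-second segments

level = 0                                  # start conservatively at the lowest rendition
for i in range(1, SEGMENT_COUNT + 1):
    url = f"{BASE_URL}/{RENDITIONS[level]}/{i:05d}.ts"
    start = time.monotonic()
    segment = urllib.request.urlopen(url).read()
    throughput = len(segment) * 8 / (time.monotonic() - start)   # bits per second

    # ...hand `segment` to the decoder / playback buffer here...

    # Upgrade or downgrade the rendition for the *next* segment.
    if level + 1 < len(RENDITIONS) and throughput > 1.5 * BITRATES[RENDITIONS[level + 1]]:
        level += 1
    elif level > 0 and throughput < BITRATES[RENDITIONS[level]]:
        level -= 1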
MP4 & WebM
An alternative is taking an HTML <video> element and adding an "src" attribute that points directly to an MP4 or WebM file.
This is called pseudo-streaming or progressive download, where the video file is downloaded to a physical drive on the user’s device.
Typically, the video is stored in the temporary directory of the web browser and the user can start watching while the file is being downloaded in the background.
The user can also jump to specific points in the video and the player will use byte-range requests to estimate which part of the file corresponds to the place in the video that the user is attempting to seek.
What makes MP4 and WebM playback inefficient is that there is no adaptive bitrate: every viewer gets the same file.
Every user who wants to watch the file buffer-free must have an internet connection fast enough to download the file faster than it plays back.
Therefore, when you use these formats, you have to trade off serving a higher quality video file against keeping the connection speed requirements low.
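As a concrete example of the byte-range requests mentioned above, this is roughly what fetching a slice out of the middle of a video file looks like (the URL is a placeholder; any server that supports ranges will respond with 206 Partial Content and only the requested bytes):

import urllib.request

url = "https://example.com/video.mp4"      # placeholder video file
req = urllib.request.Request(url, headers={"Range": "bytes=1000000-1999999"})
with urllib.request.urlopen(req) as resp:
    print(resp.status)                     # 206 if the server honored the range
    chunk = resp.read()                    # roughly 1 MB from the middle of the file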
Delivery
When delivering video to your users, there are two primary components:
the origin server
the content delivery network
The origin server is the source of truth. It’s where the developer uploads the original video files.
The CDN will then pull files from the origin server and cache that file on a bunch of interconnected servers around the world (in locations that are close to your users).
That way, when users want to request the file, they can do so from a server in the CDN. This is way faster (and much more scalable) than your origin server sending the entire file to all your users.
Many enterprises will choose a Multi-CDN environment, where the load is distributed among multiple CDNs. This improves the user experience by giving users more servers to choose from and also improves the availability of your site.
This is a brief summary from the blog post.
You can read the full post here.
Tech Snippets
Understanding the Power of LISP - You’ll frequently hear programmers talk about the beauty of LISP. Eric Raymond said understanding LISP is a profound enlightenment experience.
This is a great blog post on LISP, what makes it cool and how it can make you a better programmer.
Papers We Love - This is an awesome community of computer scientists that hosts weekly meetups to discuss various papers in the academic CS field.
There are chapters in Seattle, New York, San Francisco, Pune, Berlin, Chicago, Mumbai, London, Singapore, and a ton of other cities/countries.
You can view their YouTube channel here.
A free introductory textbook on Algorithmic Graph Theory - Algorithms for traversing and manipulating trees and other types of graphs are incredibly important in CS. This textbook is an awesome introduction to graph theory and the important algorithms involved.
It covers the basics of Graphs like Graph transformations, identifying Isomorphic graphs, minimum spanning trees, etc. but also delves into more advanced topics.
You can download the full textbook for free here.
How to build a second brain as a software engineer - A second brain is a personal knowledge management system (PKM) that you can use to organize your thoughts and references. Two popular ways of doing this are Zettelkasten and The PARA method.
This article goes through how to build your own PKM so you can become a better engineer.
How Notion sharded their Postgres Database
Notion is an app that is meant to serve as a personal (or corporate) workspace.
You can store notes, tasks, wikis, kanban boards and other things in a Notion workspace and you can easily share it with other users.
If you’ve been a Notion user for a while, you probably noticed that the app got extremely slow in late 2019 and 2020.
Last year, Notion sharded their Postgres monolith into a fleet of horizontally scalable databases. The resulting performance boost was pretty big.
Sharding a database means partitioning your data across multiple database instances.
This allows you to run your database on multiple computers and scale horizontally instead of vertically.
When to Shard?
Sharding your database prematurely can be a big mistake. It can result in an increased maintenance burden, new constraints in application code and little to no performance improvement (so a waste of engineering time).
However, Notion was growing extremely quickly, so they knew they’d have to implement sharding at some point.
The breaking point came when the Postgres VACUUM process began to stall consistently.
The VACUUM process clears storage occupied by dead tuples in your database.
When you update data in Postgres, the existing data is not modified. Instead, a new (updated) version of that data is added to the database.
This is because it’s not safe to directly modify existing data, as other transactions could be reading it.
This is called Multiversion Concurrency Control (MVCC).
At a later point, you can run the VACUUM process to delete the old, outdated data and reclaim disk space.
If you don’t regularly vacuum your database (or have Postgres run autovacuum, where it does this for you), you’ll eventually reach a transaction ID wraparound failure.
So, you must vacuum your database or it will eventually fail.
Having the VACUUM process consistently stall is not an issue that can be ignored.
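As a rough illustration of what this maintenance looks like in practice (the connection string and table name below are placeholders, not from Notion's post), Postgres exposes dead-tuple counts per table and lets you vacuum manually:

import psycopg2

conn = psycopg2.connect("dbname=app user=postgres")   # placeholder connection
conn.autocommit = True        # VACUUM can't run inside a transaction block

with conn.cursor() as cur:
    # How much garbage has MVCC left behind, and when did autovacuum last run?
    cur.execute("""
        SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
        FROM pg_stat_user_tables
        ORDER BY n_dead_tup DESC
        LIMIT 10
    """)
    for row in cur.fetchall():
        print(row)

    # Reclaim the space occupied by dead tuples in a specific (placeholder) table.
    cur.execute("VACUUM (VERBOSE, ANALYZE) blocks")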
Application-Level vs. Managed
Sharding can be divided into two approaches
Application-Level Sharding - You implement the data partitioning scheme in your application code. You might direct all American users to one database and all Asian users to another database.
Third-Party Sharding - You rely on a third party to handle the sharding for you. An example is Citus, an open source extension for Postgres.
Notion decided to go with Application-Level sharding.
They didn’t want to go with a third party solution because they felt its sharding logic would be opaque and hard to debug.
Shard Key
In order to shard a database, you have to pick a shard key. This determines how your data will be split up amongst the shards.
You want to pick a shard key that distributes load evenly across all the shards.
If one shard is getting a lot more reads/writes than the others, that can make scaling very difficult.
Notion decided to partition their database by workspace. Workspaces are the folders that contain all the pages, tasks, notes, etc.
So, if you’re a student using Notion, you might have separate Workspaces for all your classes.
Each workspace is assigned a UUID upon creation, and the UUID space is partitioned into uniform buckets.
Each bucket maps to a different logical shard.
How many Shards?
Notion ended up going with 480 logical shards distributed across 32 physical databases (with 15 logical shards per database).
This allows them to handle their existing data and scale for the next two years (based on their projected growth).
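The post doesn't show the routing code itself, but partitioning the UUID space into uniform buckets and packing 15 logical shards onto each physical database could look something like this sketch (the modulo-based bucketing and the packing scheme are assumptions for illustration):

import uuid

NUM_LOGICAL_SHARDS = 480
NUM_PHYSICAL_DATABASES = 32
SHARDS_PER_DATABASE = NUM_LOGICAL_SHARDS // NUM_PHYSICAL_DATABASES   # 15

def logical_shard_for(workspace_id: uuid.UUID) -> int:
    # Partition the 128-bit UUID space into uniform buckets.
    return workspace_id.int % NUM_LOGICAL_SHARDS

def physical_database_for(logical_shard: int) -> int:
    # Pack 15 consecutive logical shards onto each physical database.
    return logical_shard // SHARDS_PER_DATABASE

workspace = uuid.uuid4()
shard = logical_shard_for(workspace)
print(f"workspace {workspace} -> logical shard {shard} -> database {physical_database_for(shard)}")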
Database Migration
After establishing how the sharded database works, you still have to migrate from the old database to the new distributed database.
Here’s the four-step process Notion used for the migration:
Double-write: Incoming writes are applied to both the old and new databases (see the sketch after this list).
Backfill: Migrate the old data to the new database.
Verification: Ensure the integrity of data in the new database.
Switch-over: Actually switch to the new database. This can be done incrementally, e.g. double-reads, then migrate all reads.
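Here's a minimal sketch of the double-write step (the save_block helper, table name, and use of two synchronous writes are assumptions, not Notion's actual code; writes could also be mirrored via an audit log with a catch-up script or logical replication):

def save_block(block, old_db, shard_dbs):
    insert_sql = (
        "INSERT INTO blocks (id, workspace_id, data) VALUES (%s, %s, %s) "
        "ON CONFLICT (id) DO UPDATE SET data = EXCLUDED.data"
    )
    params = (block["id"], block["workspace_id"], block["data"])

    # 1. The old monolith stays the source of truth during the migration.
    old_db.execute(insert_sql, params)

    # 2. Mirror the write to the correct shard in the new fleet,
    #    using the same workspace-based routing sketched above.
    shard_dbs[logical_shard_for(block["workspace_id"])].execute(insert_sql, params)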
Read the full details in the blog post!
Interview Question
Given an integer, convert it to a Roman Numeral.
Example
Input: 3
Output: “III”
Input: 4
Output: “IV”
Input: 1994
Output: “MCMXCIV”
Here’s the question in LeetCode
Previous Solution
As a reminder, here’s our last question
You are given a string s that contains only digits.
Return all possible valid IP addresses that can be obtained from s (without changing the order of the digits in s or removing any of the digits).
Example
Input: “25525511135”
Output: ["255.255.11.135","255.255.111.35"]
Input: "0000"
Output: ["0.0.0.0"]
Input: "010010"
Output: ["0.10.0.10","0.100.1.0"]
Input: "101023"
Output:["1.0.10.23","1.0.102.3","10.1.0.23","10.10.2.3", "101.0.2.3"]
Here’s the question in LeetCode
Solution
This question can be solved with backtracking, where you can imagine a search tree of all the possible IP addresses.
With backtracking, we explore the search tree in a depth-first manner and we backtrack whenever we reach a candidate that cannot lead to a valid IP address.
Remember that an IP address consists of 4 integers that are each between 0 and 255, written with no leading zeros.
We start off with our result array, which keeps track of all the IP addresses that can be formed.
Then, we call our backtrack function with an empty candidate and pass in s.
When our candidate has 4 integers and we’ve used up all the digits in s, we add the candidate to result.
Otherwise, if we still need more integers, we continue consuming digits from s.
We run a for loop that tries the next 1 - 3 digits of s as the next integer in our IP address (skipping anything greater than 255 or with a leading zero).
Then, we remove those digits from s (since we’ve used them) and recurse on the rest of s with our updated candidate, undoing the choice afterwards so we can try the other options.
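Here's a short Python implementation of that approach (a sketch of the idea above; the function name is just illustrative):

def restore_ip_addresses(s):
    result = []

    def backtrack(start, parts):
        # A valid address has exactly 4 parts and uses every digit in s.
        if len(parts) == 4:
            if start == len(s):
                result.append(".".join(parts))
            return

        # Try using the next 1 to 3 digits as the next integer.
        for length in range(1, 4):
            if start + length > len(s):
                break
            chunk = s[start:start + length]
            # Reject leading zeros (e.g. "01") and values above 255.
            if (len(chunk) > 1 and chunk[0] == "0") or int(chunk) > 255:
                continue
            parts.append(chunk)
            backtrack(start + length, parts)
            parts.pop()    # undo the choice (backtrack)

    backtrack(0, [])
    return result

print(restore_ip_addresses("25525511135"))   # ['255.255.11.135', '255.255.111.35']
print(restore_ip_addresses("0000"))          # ['0.0.0.0']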