Robinhood's Tech Stack
Hey Everyone,
Today we’ll be talking about
Robinhood’s Tech Stack
Robinhood has been cloud-native since Day 1
Going the PaaS route vs. DIY
How they implemented Sharding in their backend
Plus, a couple awesome tech snippets on
MIT’s Distributed Systems class
Behind the scenes of CPython
The different kinds of tech debt
We also have a solution to yesterday’s difficult interview question. Plus, a new question from Microsoft.
If you’re not signed up for Quastor, join here!
Tech Snippets
Bracket Pair Colorizer is an extension that colors matching bracket pairs.
Due to the asynchronous way in which extensions communicate with the VSCode renderer, a massive speed up in the extension itself was impossible.
Therefore, the extension developer (CoenraadS) worked with engineers at Microsoft to reimplement the feature as part of the core of VSCode.
Now, the Bracket Pair colorizer uses a recursive descent parser (a top-down parsing technique) to build an Abstract Syntax Tree to describe the structure of all bracket pairs.
It also accesses VSCode’s token information so it can skip brackets that are part of comments or in a string.
The data structure chosen to build the Abstract Syntax Tree are 2-3 trees and the time complexity of colorizing brackets after user input went from linear to polylogarithmic.
Just a warning but this article is super technical and dives into quite a few areas of CS. It’s a great reminder of how seemingly simple tasks (colorize all the matching brackets) can be quite complex to implement efficiently.
A fantastic series of lectures on Distributed Systems by Robert Morris (co-founder of YCombinator).
The lectures cover actual applications, so there are lectures on ZooKeeper, Google Cloud Spanner, Apache Spark and Bitcoin!
Behind the scenes of CPython - A series of blog posts that digs into how CPython (the default and most widely used implementation of Python) works under the hood.
In software engineering, you’ll often have to choose between shipping a feature quickly (and later refactoring your code) vs. building the feature slowly (and implementing it in a more robust manner).
In many cases, it makes sense to take on technical debt, because you don’t know if the feature is something the customer wants.
It would be pointless to spend a huge amount of engineering time on a feature the customer didn’t even want!
Technical debt can be categorized into 3 buckets
Code
This is the most common type of tech debt. This refers to suboptimal code that has been committed.
An example is writing one big function that does everything. Later, you’ll have to refactor that function down into smaller components that are well designed.
Code technical debt is also the easiest to fix. If you have good tests set up, then refactoring the code base isn’t too bad.
Data
This refers to tech debt around the data model you’ve chosen.
Teams can sometimes pick a very flexible data model for early iteration and flexibility
Example - Let’s just put everything in one JSON object and then do all the filtering client side!
Your data models is one of the hardest things to get right, and also one of the hardest things to change.
Architecture
This refers to design considerations around the system architecture, runtime choices, interfaces, etc.
Changing your database from SQL to NoSQL (or vice-versa) can be very painful!
If you’re not signed up for Quastor, join here!
Robinhood
Robinhood is a Stock Brokerage app founded in 2013. It charges zero commissions on trades of stocks and ETFs and is also mobile-first. The majority of users place trades from their mobile devices.
Robinhood’s Tech Stack
Robinhood is cloud-native, based on AWS.
Robinhood paid $60 million dollars to AWS in 2020 for cloud hosting fees. This is a 5x increase from their 2019 cloud bill of $12 million dollars.
If you’d like to learn more about Robinhood’s engineering, check out their interview on the Software Engineering Daily podcast. We’ll summarize the interesting bits here.
When Robinhood was first getting started, they were a Python/Django shop, however they’ve been shifting towards Go.
They’ve also been microservices oriented, and most of their APIs are written in Python and Go. There is some Java and Rust in their codebase however.
Robinhood is built on AWS, so they use Amazon RDS (Relational Database Service) for their data store. They use Postgres as the database engine for RDS.
PaaS vs. DIY
Since they’re on AWS, an interesting choice that comes up is whether they should utilize AWS’s PaaS (Platform as a Service) offerings or if they should go the “Do It Yourself” route.
An example is Elasticsearch. At first, Robinhood utilized Amazon Elasticsearch Service and went the PaaS route. However, their specific workload wasn’t well suited for AWS’s PaaS offering.
Therefore, they made the decision to deploy Elasticsearch on an EC2 instance and modify the database so it suited their workload (Elasticsearch is open source).
Jaren Glover (tech lead at Robinhood) found that PaaS products work great when you play inside the guardrails in which they’re presented.
However, if you move from a generic compute to something domain specific, then you may start to bump against those guardrails and not have the quality of service that you want.
Another interesting challenge that comes up when building a Stock brokerage is how you manage time.
It’s very important to process orders in the same order that the customer placed them and also to route orders correctly relative to which customer placed which order first.
In order to do this, Robinhood relies on NTP - Network Time Protocol. NTP allows you to synchronize clocks between computers over a variable-latency network.
NTP works in a client-server type model, but can also be adapted for a peer-to-peer system.
NTP can usually maintain time to within tens of milliseconds over the public Internet, and can achieve better than one millisecond accuracy over LANs.
Robinhood (and other brokerages) are required to ensure their trading computers are synced with atomic clocks maintained by the National Institute of Standards and Technology (NIST), a non-regulating agency of the US Department of Commerce.
How Robinhood Scaled
Robinhood, despite being a brokerage, has experienced viral growth. The app is frequently ranked #1 in the iOS and Android app stores, a spot that is usually reserved for some type of social media app (Instagram, Snapchat, YouTube, etc.). It’s unheard of for a financial app to get that kind of growth.
In December 2019, Robinhood’s system received 100k requests per second at peak time. In June 2020, Robinhood’s system received 750k requests per second, a 7x load increase to the system in a 6 month period.
To respond to this, Robinhood set out to make their brokerage infastructure horizontally scalable. They accomplished this by implementing sharding into their system at the application layer (not just the database).
Sharding is where you split up a large dataset into smaller partitions (shards).
Robinhood created multiple shards where each shard held a subset of their users and every shard had its own application servers, database and deployment pipeline.
In order to make the divided system appear as one system (rather than independent shards), Robinhood built out several new layers.
Routing Layer - the routing layer handles routing external API requests to the correct shards. The layer first inspects and maps the request to a specific user. Then, it makes a synchronous API call to Robinhood’s shard mapping service to look up the shard ID for the user. After, it sends the request to the correct application server with that shard.
Aggregation Layer - the aggregation layer is an intermediary layer for internal API traffic. It enables other services to query data without knowing which shard it lives in, by joining data from multiple shards. It is a stateless service that fans our requests across shards and then merges the results and returns them.
Kafka Message Streaming - Robinhood uses Kafka streams for backend services to send messages to each other. However, Robinhood has to ensure that messages are consumed by the correct shard. To ensure this, Robinhood has every shard consume all messages, but only process messages that correspond to users that exist in that shard.
You can read more about Robinhood’s horizontal scaling efforts here.
If you’re not signed up for Quastor, join here!
Interview Question
You’re given an infinite complete binary tree.
Every level of the tree is filled with values from 1 to infinity.
However, the ordering of the nodes has been flipped for even numbered rows, where they go from largest to smallest.
Given the label of a node in this tree, return the labels in the path from the root of the tree to that row.
Example
Input: 14
Output: [1, 3, 4, 14]
Can you solve this in O(log n) time?
We’ll send the solution in our next email, so make sure you move our emails to primary, so you don’t miss them!
Gmail users—move us to your primary inbox
On your phone? Hit the 3 dots at the top right corner, click "Move to" then "Primary"
On desktop? Back out of this email then drag and drop this email into the "Primary" tab near the top left of your screen
A pop-up will ask you “Do you want to do this for future messages from quastor@substack.com” - please select yes
Apple mail users—tap on our email address at the top of this email (next to "From:" on mobile) and click “Add to VIPs”
Previous Solution
As a reminder, here’s our last question
You are given a string s.
You can convert s to a palindrome by adding characters in front of it.
Return the shortest palindrome that you can find by adding characters in front of s.
Example
Input: s = ”abcd”
Output: “dcbabcd”
Here’s the question in LeetCode
Solution
Another way to think about this question is that we want the longest palindromic prefix in s.
We want to look at all the substrings in s that start at index 0 (prefixes) and find the longest one that is a palindrome.
Once we do that, we can look at the corresponding suffix (all elements in s that come after the prefix) and put that in a new string.
Then, we reverse the new string and append it to the start of s. This makes s a palindrome by adding the fewest number of characters.
An example is if we have “aacecaaab” for s.
The longest palindromic prefix for s is “aacecaa”.
The suffix (the elements after our prefix up to the end) is just “ab”.
Now, we take a new string with our suffix and we reverse the new string, so we have “ba”.
Now, we add the new reversed string to the front of s which will give us "baaacecaaab".
"baaacecaaab" is our final answer!
Here’s the Python 3 code for this approach
The issue with this approach is that it’s too slow.
In the worst case, it takes quadratic time to find the longest palindromic prefix.
We need to speed it up.
One way is to use the Knuth Morris Pratt (KMP) Algorithm.
If you’re unfamiliar with the KMP algorithm, here’s a fantastic explanation of how it works and the time complexity.
The “building table” part of the KMP algorithm is exactly what we need to solve this question.
We’ll create a combined string that is just s, some separator, and then s reversed.
combinedString = s + “*” + s[::-1]
Now, we can run the KMP algorithm on combinedString.
The resulting KMP table will give us the longest matching prefix-suffix pair in combinedString.
We can find that by looking at the KMP table’s value for the last element in combinedString.
That value will represent the end of the longest palindromic prefix in s.
Now, we can just repeat the same process of finding the rest of the characters in s after the longest palindromic prefix, reversing them and appending them to the front of s.
Here’s the Python 3 code.