The Architecture of Uber's API Gateway
How an engineering manager at Reddit judges candidates, building great UIs and scaling from 10k RPMs to 1 million RPMs over 2 months
Hey Everyone,
Today we’ll be talking about
The Architecture of Uber’s API Gateway
How the API Gateway Works
How an HTTP request flows through
The different components of the Gateway
Plus, a couple awesome tech snippets on
How an engineering manager at Reddit judges candidates
Collisions generated for Apple’s NeuralHash function
Tesla’s AI Day
The science of building great UIs
How Clubhouse scaled from 10k RPM (requests per minute) to 1 million RPM over the course of 2 months
We also have a solution to our last coding interview question, plus a new question from Apple.
Don’t forget to move our emails to primary, so you don’t miss them!
Gmail users—move us to your primary inbox
On your phone? Hit the 3 dots at the top right corner, click "Move to" then "Primary"
On desktop? Back out of this email then drag and drop this email into the "Primary" tab near the top left of your screen
A pop-up will ask you “Do you want to do this for future messages from quastor@substack.com” - please select yes
Apple mail users—tap on our email address at the top of this email (next to "From:" on mobile) and click “Add to VIPs”
Tech Snippets
Collisions generated for NeuralHash - In an email last week, we gave a technical overview of Apple’s NeuralHash hash function, the core of Apple’s new Child Sexual Abuse Material (CSAM) detection efforts.
In the past week, researchers have been able to extract NeuralHash from Apple’s latest operating system update and they made it available for testing.
It became clear that it’s not too difficult to generate adversarial images that produce collisions, where Apple’s NeuralHash hash function will generate the same hash for two completely different images.
However, Apple says that adversarial collisions are a known limitation of perceptual hashing algorithms and that this was to be expected.
The company emphasized a secondary server-side hashing algorithm, separate from NeuralHash, the specifics of which are not public.
If an image that produced a NeuralHash collision were flagged by the system, it would be checked against the secondary system and identified as an error before reaching human moderators.
Tesla hosted their AI day last week and revealed the inner workings of their software and hardware infrastructure. You can view a replay here.
The event included discussions around
Reddit Interview Problems: The Game of Life - Alex Golec, an engineering manager at Reddit, goes through an interview problem he previously used to screen candidates. He talks about how he evaluated candidate responses to the question and why he thinks it’s a good question.
Funnily enough, the question Alex goes through in this blog post is already published on LeetCode. You can view it here.
It might be interesting to try and solve the question yourself and see how far you can get before reading Alex’s solution (and evaluation criteria).
The Science of Great UI - A great talk that delves into great design and the guidelines that you should be thinking about.
Specific points mentioned are
Emphasis & Importance
Contrast & Readability
Proximity & Layout
Borders & Spacing
Fill & Corners
Reining in the thundering herd ⛈ Getting to 80% CPU utilization with Django - In early 2021, Clubhouse, a social app for audio conversations, went from less than 10,000 backend requests per minute to over 1,000,000 requests per minute (over the course of 2 months).
This is a blog post on how they scaled their existing tech stack (Python/Django with Gunicorn and NGINX) to handle the 100x increase in load with only 2 backend engineers.
The Architecture of Uber's API gateway
This is an article on the technical components of Uber’s API gateway.
Summary
When Uber’s ride sharing app makes a request to the backend, the first point of contact is Uber’s API gateway.
The API gateway provides a single point of entry for all of Uber’s apps and gives a clean interface to access data, logic or functionality from back-end microservices.
The API gateway is the place to implement things like rate limiting, security auditing, user access blocking, protocol conversion, and more.
How the API Gateway Works
A backend engineer at Uber will be working on their own microservice (you can read about how Uber handles microservices here).
Their microservice will have an API with its own configuration parameters: path, type of request data, type of response, maximum calls allowed, apps allowed, observability, etc.
The engineer can then configure these parameters in a UI for Uber’s API gateway. The UI walks the user through a step-by-step process for creating their API endpoint.
The gateway infrastructure will then convert these configurations into valid and functional APIs that can serve traffic from Uber’s apps.
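To make the idea of per-endpoint configuration concrete, here is a minimal sketch of what such a record could look like in code. Everything in it is hypothetical: the class name EndpointConfig, its field names, the allows helper and the example values are assumptions for illustration, not Uber’s actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class EndpointConfig:
    """Hypothetical per-endpoint settings, loosely based on the parameters listed above."""
    path: str                      # URL path exposed by the gateway
    request_type: str              # type of request data, e.g. "json"
    response_type: str             # type of response data
    max_calls_per_minute: int      # maximum calls allowed
    allowed_apps: List[str] = field(default_factory=list)  # apps allowed to call

    def allows(self, app: str) -> bool:
        # An app may call the endpoint only if it was listed at configuration time.
        return app in self.allowed_apps


# Example values are made up for illustration.
config = EndpointConfig(
    path="/rider/profile",
    request_type="json",
    response_type="json",
    max_calls_per_minute=600,
    allowed_apps=["rider"],
)
```

In a setup like this, the gateway could read such records and generate the endpoint code from them, which is the role the configuration UI plays in the article.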
How a request flows through the API gateway
A request passes through four components in order: the Protocol Manager, Middleware, Endpoint Handler and finally the Client.
Each of the components operates on the request object on the way in and the same components are run in the reverse order on the response object’s way out.
Protocol Manager - This is the first layer of the stack. It contains a deserializer and serializer for all of the protocols supported by the gateway. It can ingest any type of relevant protocol payload, including JSON, Thrift, or Protobuf.
Middleware - This layer handles things like rate limiting, authentication and authorization, etc. Each endpoint can choose to configure one or more middleware. If a middleware fails execution, the call short circuits the remainder of the stack and the response from the middleware will be returned to the caller.
Middleware is configured in a YAML file.
Endpoint Handler - This layer is responsible for request validation, payload transformation and converting the endpoint request object to the client request object based on the configured schema and serialization.
Client - This layer performs the request to the specific backend microservice. Clients are protocol-aware and generated based on the protocol selected during configuration.
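The layering described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Uber’s implementation: the RateLimit class, the handle function, the on_request and on_response hook names, the user_id field and the use of JSON as the only protocol are all hypothetical.

```python
import json
from typing import Callable, List, Optional


class RateLimit:
    """Toy middleware (hypothetical): rejects calls once a fixed budget is used up."""

    def __init__(self, max_calls: int) -> None:
        self.remaining = max_calls

    def on_request(self, request: dict) -> Optional[dict]:
        # Returning a response here short-circuits the rest of the stack.
        if self.remaining <= 0:
            return {"status": 429, "body": "rate limit exceeded"}
        self.remaining -= 1
        return None

    def on_response(self, response: dict) -> dict:
        return response


def handle(raw: bytes, middlewares: List[RateLimit],
           client: Callable[[dict], dict]) -> bytes:
    request = json.loads(raw)                  # Protocol Manager: deserialize
    ran = []                                   # middleware that completed on the way in
    response = None
    for mw in middlewares:                     # Middleware: request path
        response = mw.on_request(request)
        if response is not None:
            break                              # failure: skip the remaining layers
        ran.append(mw)
    if response is None:
        if "user_id" not in request:           # Endpoint Handler: validate
            response = {"status": 400, "body": "missing user_id"}
        else:
            client_request = {"id": request["user_id"]}   # Endpoint Handler: transform
            response = {"status": 200, "body": client(client_request)}  # Client call
    for mw in reversed(ran):                   # Middleware: response path, reverse order
        response = mw.on_response(response)
    return json.dumps(response).encode()       # Protocol Manager: serialize


# Usage with a stand-in backend; the second call exceeds the budget of one.
stack = [RateLimit(max_calls=1)]
backend = lambda req: {"name": f"rider-{req['id']}"}
first = handle(b'{"user_id": 7}', stack, backend)
second = handle(b'{"user_id": 7}', stack, backend)
```

The first call reaches the stand-in backend, while the second is answered by the rate limiter without the Endpoint Handler or Client ever running.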
The full blog post delves more deeply into each of these layers (and how users configure settings in the API gateway) and also talks about challenges faced and lessons learned.
If you’d like to read about how Uber thinks about scaling this API gateway, here’s another interesting blog post on that.
Interview Question
You are given the roots of two binary trees.
Return true if the second tree is a subtree of the first.
Example
Input: [3,4,5,1,2], [4,1,2]
Output: True
We’ll send the solution in our next email, so make sure you move our emails to primary, so you don’t miss them!
Previous Solution
As a reminder, here’s our last question
You are given 3 strings: s1, s2 and s3.
Find whether s3 is formed by an interleaving of s1 and s2.
s3 is an interleaving of s1 and s2 if it contains all the characters of s1 and s2 (and only the characters from s1 and s2) and the order of all the characters in the individual strings is preserved.
Input: s1 = “aabcc”, s2 = “dbbca”, s3 = “aadbbcbcac”
Output: True
Here’s the question in LeetCode.
Solution
We can solve this question with Dynamic Programming.
We’ll create three counter variables (c1, c2, c3) that will iterate through the 3 given strings (s1, s2, s3).
We can then create a function _interweave(c1, c2, c3) that takes in the 3 counter variables and tries to find a possible interweaving of s1 and s2 that creates s3.
Our function will first check if c1, c2 and c3 are all equal to their respective strings’ lengths (meaning we’ve iterated through all 3 strings).
If this is the case, then that means we’ve found an interweaving and we can return True and terminate the function.
If this is not the case, then we’ll first check if s1[c1] == s3[c3].
If so, then we can continue down this possible solution. We’ll recursively call our function with the parameters (c1 + 1, c2, c3 + 1).
We’ll also check if s2[c2] == s3[c3].
If this is true, then we’ll check this possible solution. We recursively call our function with parameters (c1, c2 + 1, c3 + 1).
If either of those recursive function calls returned True, then we can return True. Otherwise, we return False.
This function will result in a lot of repeated computations, so we’ll add a memo table to cache the results.
Here’s the Python 3 code.
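What follows is a minimal sketch of the approach described above rather than a definitive listing. The wrapper name is_interleave, the up-front length check and the use of functools.lru_cache as the memo table are assumptions made for illustration; _interweave follows the steps in the explanation.

```python
from functools import lru_cache


def is_interleave(s1: str, s2: str, s3: str) -> bool:
    # If the lengths don't add up, no interleaving can exist.
    # This also guarantees c3 == c1 + c2 throughout the recursion.
    if len(s1) + len(s2) != len(s3):
        return False

    @lru_cache(maxsize=None)  # memo table: caches results per (c1, c2, c3)
    def _interweave(c1: int, c2: int, c3: int) -> bool:
        # All three strings fully consumed: we found an interweaving.
        if c1 == len(s1) and c2 == len(s2) and c3 == len(s3):
            return True
        # Try taking the next character of s3 from s1.
        if c1 < len(s1) and s1[c1] == s3[c3] and _interweave(c1 + 1, c2, c3 + 1):
            return True
        # Try taking the next character of s3 from s2.
        if c2 < len(s2) and s2[c2] == s3[c3] and _interweave(c1, c2 + 1, c3 + 1):
            return True
        return False

    return _interweave(0, 0, 0)


result = is_interleave("aabcc", "dbbca", "aadbbcbcac")
```

With memoization, each pair (c1, c2) is solved at most once, so the running time is proportional to len(s1) * len(s2).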