How Airbnb does Continuous Delivery
Howdy!
Today we’ll be talking about
Continuous Delivery at Airbnb
Airbnb recently shifted from a Ruby on Rails monolith to a microservices architecture, and as a result they had to change their deployment platform.
They decided to adopt Spinnaker, an open source continuous delivery platform developed at Netflix and Google.
We'll cover why they picked Spinnaker and how they changed deployment platforms with a globally distributed team of thousands of engineers.
Plus, some tech snippets on
The Chop the Monolith Architecture Pattern
Building an Operating System in Rust
PyCon 2022 Highlights
Facebook transfers Jest over to OpenJS
We have a new question from Microsoft and a solution to our last question on dynamic programming. We give the top down and the bottom up DP solutions; the bottom up solution is more space efficient (constant rather than linear space).
Continuous Delivery at Airbnb
Airbnb has recently migrated from a Ruby on Rails monolith to a Services-Oriented Architecture (SOA).
This migration helped Airbnb scale their application, but it also introduced new challenges.
One of these challenges was around Airbnb's Continuous Delivery process and how they had to adapt it to the new services architecture.
Jens Vanderhaeghe is a senior software engineer at Airbnb and Manish Maheshwari is a lead product manager. They wrote a great blog post on Airbnb's new continuous delivery process and how they migrated.
Here’s a summary
Previously, Airbnb used an internal service called Deployboard to handle their deploys. Deployboard worked great when Airbnb was using a Ruby on Rails monolith but over the past few years the company has shifted to a Microservices-oriented Architecture.
A microservices architecture means decentralized deployments where individual teams have their own pipeline.
Airbnb needed something more templated, so that each team could quickly get a standard, best-practices pipeline rather than building their own pipeline from scratch.
Spinnaker is an open source continuous delivery platform that was developed internally at Netflix and further extended by Google.
Airbnb decided to adopt Spinnaker because it
had been proven to work at Airbnb's scale by Netflix and Google
lets you easily plug in custom logic, so you can add or change functionality without forking the core codebase
automates canary analysis. Canary deployments let you expose the new version of the app to a small portion of your production traffic and analyze its behavior for errors. Spinnaker automates this analysis (a simplified sketch follows below).
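To make the idea concrete, here's a minimal sketch of what automated canary analysis checks, assuming a single error-rate metric. This is an illustration only; Spinnaker's actual canary analysis component (Kayenta) does far more sophisticated statistical comparisons.

```python
# Minimal sketch of canary analysis, not Spinnaker's actual logic.
# We compare an error-rate metric between the canary (new version) and a
# baseline (current version) serving a similar slice of traffic.
def canary_passes(baseline_error_rates, canary_error_rates, max_degradation=0.01):
    """Return True if the canary's mean error rate hasn't degraded more
    than max_degradation relative to the baseline's mean error rate."""
    baseline_mean = sum(baseline_error_rates) / len(baseline_error_rates)
    canary_mean = sum(canary_error_rates) / len(canary_error_rates)
    return canary_mean - baseline_mean <= max_degradation
```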
Migrating to Spinnaker
Airbnb has a globally distributed team of thousands of software engineers. Getting all of them to shift to Spinnaker would be a challenge.
They were particularly worried about the Long-tail Migration Problem, where they could get 80% of teams to switch over to the new deployment system but then struggle to get the remaining 20% to switch over.
Being forced to maintain two deployment systems can become very costly and is a reliability/security risk because the legacy system gets less and less maintenance/attention over time.
To prevent this, Airbnb had a migration strategy that focused on 3 pillars.
Focus on Benefits
Automated Onboarding
Provide Data
Focus on Benefits
Airbnb started by encouraging teams to adopt Spinnaker voluntarily.
They did this by first onboarding a small group of early adopters. They identified a set of services that were prone to causing incidents and switched those teams over to Spinnaker.
The automated canary analysis, along with Spinnaker's other features, quickly demonstrated its value to those teams.
These early adopters ended up becoming evangelists for Spinnaker and spread the word to other teams at Airbnb organically. This helped increase voluntary adoption.
Automated Onboarding
As more teams started adopting Spinnaker, the Continuous Delivery team at Airbnb could no longer keep up with demand. Therefore, they started building tooling to automate the onboarding process to Spinnaker.
They created an abstraction layer on top of Spinnaker that let engineers make changes to the CD platform with code (Infrastructure as Code). This allowed all continuous delivery configuration to be source controlled and managed with Airbnb's existing tools and processes. A sketch of the pipelines-as-code idea follows below.
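Airbnb hasn't published the details of this abstraction layer, so the following is only a sketch of what pipelines-as-code can look like; every name in it (Pipeline, Stage, standard_pipeline) is invented for illustration.

```python
# Hypothetical pipelines-as-code sketch. Airbnb's actual abstraction layer
# over Spinnaker is not public, so all of these names are invented.
from dataclasses import dataclass, field

@dataclass
class Stage:
    name: str
    config: dict = field(default_factory=dict)

@dataclass
class Pipeline:
    service: str
    stages: list

def standard_pipeline(service: str) -> Pipeline:
    """A templated, best-practices pipeline that a team gets by default,
    checked into source control alongside the service's code."""
    return Pipeline(
        service=service,
        stages=[
            Stage("deploy-canary", {"traffic_percent": 5}),
            Stage("canary-analysis", {"duration_minutes": 30}),
            Stage("deploy-production"),
        ],
    )
```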
Provide Data
The Continuous Delivery team also put a great amount of effort into clearly communicating the value-add of adopting Spinnaker.
They created dashboards for every service that adopted Spinnaker to show metrics like number of regressions prevented, increase in deploy frequency, etc.
Final Hurdle
With this 3 pillar strategy, the vast majority of teams at Airbnb had organically switched over to Spinnaker.
However, adoption began to tail off as the company reached ~85% of deployments on Spinnaker.
At this point, the team decided to switch strategy to avoid the long-tail migration problem described above.
Their new plan consisted of
1. Stop the bleeding - Stop any new services/teams from being deployed using the old continuous delivery platform.
2. Announce deprecation date - Announce a deprecation date for the old continuous delivery platform and add a warning banner to the top of its UI.
3. Send out automated PRs - Airbnb has an in-house refactoring tool called Refactorator that they used to send automated pull requests, making the switch to Spinnaker easier.
4. Deprecate and post-deprecation - On the deprecation date, they had code in place that blocked deploys from the old continuous delivery platform. However, they kept exemptions in place for emergencies where the old system had to be used.
Conclusion
With this strategy, Airbnb was able to get to the 100% finish line in the migration.
This migration serves as the blueprint for how other infrastructure-related migrations will be done at Airbnb.
Read the full article for more details.
RocaNews
News as it should be: No fearmongering, no spin
In 2020, the 3 co-founders of RocaNews quit their jobs because they hated the news. Don't we all?
The news was negative, partisan, and alarmist. They wanted news that didn't obsess over politics. They wanted to know what was going on in the world, but wanted facts – not opinions. So they created RocaNews.
The free newsletter covers the most interesting events happening in the world, and is designed to fit into your day.
Join the 1.2 million+ people who start their morning with RocaNews!
sponsored
Tech Snippets
Chopping the Monolith - This is a great post that is a bit critical of all the hype around microservices. Instead, the author advocates a strategy of "chopping the monolith": find the part of the monolith that changes most frequently and chop it off into its own service. That part can then be deployed independently, which increases developer productivity.
Writing an OS in Rust - This is a great series of blog posts that builds a small operating system with Rust. You can view all the code in this GitHub repo.
PyCon 2022 Highlights - PyCon US 2022 happened last week and there were a ton of great talks about the past, present and future of Python. Eric Matthes wrote a great blog post on his experience attending, with an overview of all the talks he went to. He also gave some useful advice on attending conferences and getting the most out of them, so this is a great read if you plan on going to any developer conferences in the future.
Facebook transferred Jest to the OpenJS Foundation - Jest is the most popular JavaScript testing framework out there and has consistently received extremely high user satisfaction ratings. It was created in 2011 at Facebook and was previously part of Facebook’s open source collective. Now, Jest is joining the OpenJS foundation, which is also host to other projects like jQuery, Node.js, Electron and Webpack.
Interview Question
Write a function that sorts the elements in a stack so that the smallest elements are on top.
You can use an additional temporary stack, but you cannot copy the elements into any other data structure.
The stack supports the following operations
push
pop
peek
isEmpty
Previous Question
As a refresher, here’s the previous question
You are walking up a staircase with n steps.
You can hop either 1, 2, or 3 steps at a time (you have long legs).
Write a function to count how many possible ways you can walk up the stairs.
Solution
We can solve this question with Dynamic Programming.
Suppose we have a function numSteps(n) that tells us the number of possible ways to walk up a staircase with n steps.
Our last hop was either 1, 2, or 3 steps, so the number of ways to walk up the stairs is numSteps(n - 1) + numSteps(n - 2) + numSteps(n - 3).
Therefore, with top down DP, we can recursively compute each of those subproblems, memoize the results so each subproblem is solved only once, and add them together to get our result.
Our base cases: if n == 0 we return 1 (we landed exactly on the top step), and if n < 0 we return 0 (we overshot the top).
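Here's a minimal Python version of the top down approach (the inner helper ways is our own name):

```python
from functools import lru_cache

def num_steps(n):
    """Count the ways to climb n steps taking 1, 2, or 3 steps at a time."""
    @lru_cache(maxsize=None)  # memoize so each subproblem is computed once
    def ways(remaining):
        if remaining == 0:
            return 1  # landed exactly on the top step
        if remaining < 0:
            return 0  # overshot the top, so this path doesn't count
        return ways(remaining - 1) + ways(remaining - 2) + ways(remaining - 3)
    return ways(n)
```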
For Bottom Up DP, we’d typically use a memo table.
However, in this question we’re just accessing the last 3 values (n - 1, n - 2, n - 3).
Therefore, we can solve this question using constant space by just using 3 variables to keep track of those values. We don’t have to use an entire array to maintain our memo table.
Here's a minimal Python version of the Bottom Up DP code.
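```python
def num_steps(n):
    """Bottom up DP using three variables instead of a full memo table."""
    # ways_2, ways_1, ways hold numSteps(i - 2), numSteps(i - 1), and
    # numSteps(i) as i walks from 0 up to n.
    ways_2, ways_1, ways = 0, 0, 1  # numSteps(0) == 1
    for _ in range(n):
        ways_2, ways_1, ways = ways_1, ways, ways_2 + ways_1 + ways
    return ways
```

For example, num_steps(3) returns 4, counting 1+1+1, 1+2, 2+1, and 3.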
The time complexity is linear and the space complexity is constant.
Here’s one of our past tech dives in case you missed it!
Google File System
In 1998, the first Google index had 26 million pages. In 2000, the Google index reached a billion web pages. By 2008, Google was processing more than 1 trillion web pages.
As you might imagine, the storage needs required for this kind of processing were massive and rapidly growing.
To solve this, Google built Google File System (GFS), a scalable distributed file system written in C++. Even in 2003, the largest GFS cluster provided hundreds of terabytes of storage across thousands of machines and it was serving hundreds of clients concurrently.
GFS is a proprietary distributed file system, so you’ll only encounter it if you work at Google. However, Doug Cutting and Mike Cafarella implemented Hadoop Distributed File System (HDFS) based on Google File System and HDFS is used widely across the industry.
LinkedIn recently published a blog post on how they store 1 exabyte of data across their HDFS clusters. An exabyte is 1 billion gigabytes.
In this post, we’ll be talking about the goals of GFS and its design. If you’d like more detail, you can read the full GFS paper here.
Goals of GFS
The main goal for GFS was that it be big and fast. Google wanted to store extremely large amounts of data and also wanted clients to be able to quickly access that data.
In order to accomplish this, Google wanted to use a distributed system built of inexpensive, commodity machines.
Using commodity machines is great because then you can quickly add more machines to your distributed system (as your storage needs grow). If Google relied on specialized hardware, then there may be limits on how quickly they can acquire new machines.
To achieve the scale Google wanted, GFS would have to use thousands of machines. When you’re using that many servers, you’re going to have constant failures. Disk failures, network partitions, server crashes, etc. are an everyday occurrence.
Therefore, GFS needed to have systems in place for automatic failure recovery. An engineer shouldn’t have to get involved every time there’s a failure. The system should be able to handle common failures on its own.
The individual files that Google wanted to store in GFS are quite big. Individual files are typically multiple gigabytes and so this affected the block sizes and I/O operation assumptions that Google made for GFS.
GFS is designed for big, sequential reads and writes of data. Most files are mutated by appending new data rather than overwriting existing data and random writes within a file are rare. Because of that access pattern, appending new data was the focus of performance optimization.
Design of GFS
A GFS cluster consists of a single master node and multiple chunkserver nodes.
The master node maintains the file system’s metadata and coordinates the system. The chunkserver nodes are where all the data is stored and accessed by clients.
Files are divided into 64 megabyte chunks and assigned a 64 bit chunk handle by the master node for identification. The chunks are then stored on the chunkservers with each chunk being replicated across several chunkservers for reliability and speed (the default is 3 replicas).
The master node keeps track of the file namespace, the mappings from files to chunks and the locations of all the chunks. It also handles garbage collection of orphaned chunks and chunk migration between the chunkservers. The master periodically communicates with all the chunkservers through HeartBeat messages to collect their state and give them instructions.
An interesting design choice is the decision to use a single master node. Having a single master greatly simplified the design since the master could make chunk placement and replication decisions without coordinating with other master nodes.
However, Google engineers had to make sure that the single master node doesn’t become a bottleneck in the system.
Therefore, clients never read or write file data through the master node. Instead, the client asks the master which chunkservers it should contact. Then, the client caches this information for a limited time so it doesn’t have to keep contacting the master node.
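To make that flow concrete, here's a minimal sketch of the client-side read path, assuming hypothetical master and chunkserver stubs (the real GFS client library is proprietary, and this ignores reads that span chunk boundaries):

```python
CHUNK_SIZE = 64 * 1024 * 1024  # GFS uses 64 MB chunks

class GFSClientSketch:
    """Illustrative sketch of the GFS read path; all names are invented."""

    def __init__(self, master):
        self.master = master
        self.chunk_cache = {}  # (filename, chunk_index) -> (handle, replicas)

    def read(self, filename, offset, length):
        # Translate the byte offset in the file into a chunk index.
        chunk_index = offset // CHUNK_SIZE
        key = (filename, chunk_index)
        # Contact the master only on a cache miss, so the single master
        # doesn't become a bottleneck.
        if key not in self.chunk_cache:
            self.chunk_cache[key] = self.master.lookup(filename, chunk_index)
        handle, replicas = self.chunk_cache[key]
        # Read the data directly from a chunkserver replica (a real client
        # would pick the closest replica).
        return replicas[0].read_chunk(handle, offset % CHUNK_SIZE, length)
```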
GFS Mutations
A mutation is an operation that changes the contents or the metadata of a chunk (so a write or an append operation).
In order to guarantee consistency amongst the replicas after a mutation, GFS performs mutations in a certain order.
As stated before, each chunk will have multiple replicas. The master will designate one of these replicas as the primary replica.
Here are the steps for performing a mutation to a chunk (a sketch of the flow follows the list):
The client asks the master which chunkserver holds the primary replica of the chunk and for the locations of the other chunkservers that hold replicas of that chunk.
The master replies with the identity of the primary chunkserver and the other replicas. The client caches this information.
The client pushes data directly to all the chunkserver replicas. Each chunkserver will store the data in an internal LRU buffer cache.
Once all the replicas have acknowledged receiving the data, the client sends a write request to the primary chunkserver. The primary chunkserver then applies the mutations to its state.
The primary chunkserver forwards the write request to the other chunkservers. Each chunkserver then applies the mutation to its state.
The secondary chunkservers all reply to the primary chunkserver indicating that they’ve completed the operation.
The primary chunkserver replies to the client informing the client that the write was successful (or if there were errors).
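Putting those steps together, here's a minimal sketch of the mutation flow from the client's point of view. All the names are invented stubs; this illustrates the protocol described above rather than GFS's real API.

```python
def write_chunk(data, master, filename, chunk_index):
    """Illustrative sketch of the GFS mutation order; not a real API."""
    # Steps 1-2: learn which replica is the primary and where the
    # secondary replicas live (clients cache this in practice).
    primary, secondaries = master.get_replicas(filename, chunk_index)

    # Step 3: push the data to every replica; each buffers it in an
    # internal LRU cache without applying it yet.
    for replica in [primary] + secondaries:
        replica.push_data(data)

    # Steps 4-6: once every replica has acknowledged the data, ask the
    # primary to commit; it applies the mutation, forwards the write to
    # the secondaries, and waits for their acknowledgements.
    success = primary.commit_write(filename, chunk_index, secondaries)

    # Step 7: the primary's reply tells the client whether the write
    # succeeded on all replicas.
    return success
```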
GFS Interface
GFS organizes files hierarchically in directories and identifies them by pathnames, like a standard file system. The master node keeps track of the mappings between files and chunks.
GFS provides the usual operations to create, delete, open, close, read and write files.
It also has snapshot and record append operations.
Snapshot lets you create a copy of a file or directory tree at low cost.
Record append allows multiple clients to append data to a file concurrently and it guarantees the atomicity of each individual client’s append.
To learn more about Google File System, read the full paper here.
If you’d like to read about the differences between GFS and HDFS, you can check that out here.
A browsable Petascale Reconstruction of the Human Cortex
A connectome is a map of all the neural connections in an organism’s brain. It’s useful for understanding the organization of neural interactions inside the brain.
Producing a full mapping of all the neurons and synapses in a brain is incredibly complicated. In January 2020, Google Research released a "hemibrain" connectome of a fruit fly - an online database with the structure and synaptic connectivity of roughly half the brain of a fruit fly.
The connectome for the fruit fly has completely transformed neuroscience, with Larry Abbott, a theoretical neuroscientist at Columbia, saying “the field can now be divided into two epochs: B.C. and A.C. — Before Connectome and After Connectome”.
You can read more about the fruit fly connectome’s influence here.
Google Research is now releasing the H01 dataset, a 1.4 petabyte (a petabyte is 1024 terabytes) rendering of a small sample of human brain tissue.
The sample covers one cubic millimeter of human brain tissue, and it includes tens of thousands of reconstructed neurons, millions of neuron fragments and 130 million annotated synapses.
The initial brain imaging generated 225 million individual 2D images. The Google AI team then computationally stitched and aligned that data to produce a single 3D volume.
Google did this using a recurrent convolutional neural network. You can read more about how this is done here.
You can view the results of H01 (the imaging data and the 3D model) here.
The 3D visualization tool linked above was written with WebGL and is completely open source. You can view the source code here.
H01 is a petabyte-scale dataset, but it is only one-millionth the volume of an entire human brain. The next challenge is a synapse-level mapping of an entire mouse brain (500x bigger than H01), but serious technical challenges remain.
One challenge is data storage - a mouse brain could generate an exabyte of data, so Google AI is working on image compression techniques for connectomics that lose a negligible amount of reconstruction accuracy.
Another challenge is that the imaging process (collecting images of the slices of the mouse brain) is not perfect. There is image noise that has to be dealt with.
Google AI addressed the imaging noise by imaging the same piece of tissue in both a "fast" acquisition regime (leading to higher amounts of noise) and a "slow" acquisition regime (leading to low amounts of noise). They then trained a neural network to infer the "slow" scans from the "fast" scans, and can now use that network as part of the connectomics process.