DoorDash's Migration from Python to Kotlin
Hey Everyone,
Today we’ll be talking about
Why DoorDash migrated from Python to Kotlin - Matt Anger wrote a great blog post on DoorDash’s migration from Python 2 to Kotlin for their backend.
DoorDash migrated from a Python 2 / Django monolith to a microservices architecture
They considered the pros/cons of Kotlin, Java, Go, Rust and Python 3. They picked Kotlin.
Some of the migration pains were educating engineers on the language, developing best practices for using coroutines, Java interoperability and dependency management.
Best Books on Managing Software Complexity? - This is from a thread yesterday on the front page of Hacker News. The top three recommendations were
John K Ousterhout, A Philosophy of Software Design
Titus Winters (et al), Software Engineering at Google
Hanson and Sussman, Software Design for Flexibility
Plus some tech snippets on
Which Programming Paradigm Gives the Most Expressive Code?
Free Software Licenses Explained: MIT
Robert Morris’ class on Distributed Systems
Picking Your Tech Stack for Dummies
Questions? Please contact me at arpan@quastor.org.
Quastor is a free Software Engineering newsletter that sends out summaries of technical blog posts, deep dives on interesting tech and FAANG interview questions and solutions.
Why DoorDash migrated from Python to Kotlin
DoorDash is the largest food delivery app in the United States with more than 450 thousand restaurants, 20 million customers and 1 million deliverers.
Matt Anger is a Senior Staff Engineer at DoorDash where he works on the Core Platform and Performance teams.
He published a great blog post (May 2021) on DoorDash’s migration from Python 2 to Kotlin. Here’s a summary.
Summary
DoorDash was quickly approaching the limits of what their Django-based monolithic codebase could support.
With their legacy system, the number of nodes that needed to be updated added significant time to releases. Debugging bad deploys with bisection became harder and slower because of the number of commits in each deploy. The monolith was also built with Python 2, which was rapidly approaching end-of-life.
Engineers at DoorDash decided to transition from the monolith to a microservices architecture. They also looked for a new tech stack to replace Python 2 and Django.
One of their goals was to only use one language for the backend.
Having one language would let them
Promote Best Practices - Having one language makes it easier for teams to share development best practices across the entire company.
Build Common Libraries - All engineers can share common libraries and tooling.
Change Teams - Engineers can change teams with minimal friction, which encourages more collaboration.
Picking the Right Coding Language
First, DoorDash engineers looked at the parts of their tech stack that would not change.
They had a lot of experience with Postgres and Apache Cassandra, so they would continue to use those technologies as data stores.
They would use gRPC for synchronous service-to-service communication, with Apache Kafka as a message queue.
In terms of the programming language, the choices in contention were Kotlin, Java, Go, Rust and Python 3.
Here’s the comparison they did…
After doing the comparison, they went with Kotlin. They had already done some testing around the language and it worked well.
Kotlin mitigated some of Java's pain points with features like null safety and coroutines.
Some of the growing pains they faced with Kotlin were
Educating DoorDash engineers on the language - Much of the online community around Kotlin is specific to Android dev, and there isn’t as much content on backend engineering.
To help engineers learn the language, they regularly held Lunch and Learn sessions and set up a Slack channel for questions.
Avoiding coroutine gotchas - DoorDash used gRPC for service-to-service communication, but gRPC Kotlin wasn't available when they first made the switch. They used gRPC-Java instead, which lacked support for coroutines.
gRPC Kotlin is now generally available, so they have migrated to it.
There are several other gotchas around coroutines that are discussed in the article.
Getting around Java interoperability pain points - There were some pain points with Java interop. Many libraries claiming to implement modern Java Non-blocking I/O standards did so in an unscalable manner. This caused issues when using coroutines. Check the article for full details.
Making dependency management easier - The build system and dependency management are a lot less intuitive than more recent solutions like Rust’s Cargo or Go’s modules. Some dependencies are particularly sensitive to version upgrades and can lead to issues where compilation succeeds but the app fails on boot-up with odd, seemingly irrelevant stack traces.
DoorDash engineers learned which projects tend to cause these issues most often and have guidelines for how to catch and bypass them.
For more details, read the full article
Tech Snippets
Which Programming Paradigm Gives the Most Expressive Code? - Jonathan Boccara is a Principal Engineer at Doctolib (the largest e-health company in Europe). He wrote a very interesting blog post talking about trends in programming paradigms. He discusses the increasing popularity of functional programming and the declarative paradigm trend.
Free software licenses explained: MIT - Drew DeVault wrote a great blog post explaining the full text of the MIT software license. The MIT license is the most common open source software license out there, so it’s good to have an understanding of what it means and how it allows you to use the code.
Picking Your Tech Stack for Dummies - This is a great blog post on some general guidelines you should be thinking about when picking a tech stack for your project.
When it’s really early, just use what you know - There’s no point in learning a new stack just for a prototype.
Pick Winners - Look for technologies that have been popular for more than a couple of years. Having several major companies use the technology is a good sign.
Be Boring - Boring, well-tested technologies are great.
Don’t be too Boring - If you’re too boring, it gets hard to recruit new developers. Good developers have a healthy interest in new technology.
Robert Morris’ class on Distributed Systems - A fantastic series of lectures on Distributed Systems by Robert Morris (co-founder of Y Combinator).
The lectures cover actual applications, so there are lectures on ZooKeeper, Google Cloud Spanner, Apache Spark and more!
Ask HN: Best books on managing software complexity?
Someone on Hacker News posted a thread yesterday asking for the best books on managing software complexity, from both an architectural and an organizational perspective.
These three books were recommended quite a bit:
John K Ousterhout, A Philosophy of Software Design
Titus Winters (et al), Software Engineering at Google
Hanson and Sussman, Software Design for Flexibility
Other books that were recommended were:
Peter Naur, Programming as Theory Building
Scott Wlaschin, Domain Modeling Made Functional
Scott Millett and Nick Tune, Patterns, Principles, and Practices of Domain-Driven Design
Robert L. Glass, Facts and Fallacies of Software Engineering
Donald Reinertsen, The Principles of Product Development Flow
Eric Normand, Grokking Simplicity
Interview Question
Given two binary trees, write a function that checks if they are the same or not.
Two binary trees are considered the same if they are structurally identical and corresponding nodes have the same values.
Here’s the question on LeetCode
Previous Solution
As a reminder, here’s our last question
You are given an array that contains an expression in Reverse Polish Notation.
Return the result from evaluating the expression. You can assume the expression will always be valid.
Input - ["2", "3", "+", "3", "*"]
Output - 15 because ((2 + 3) * 3)
Input - ["4", "5", "/", "30", "+"]
Output - 30.8 because ((4 / 5) + 30)
Here’s the question on LeetCode.
Solution
The key to solving this question is a stack.
Remember, stacks are Last In, First Out.
So they work perfectly for Reverse Polish Notation.
We iterate through the input array and check each element:
If the current element is an integer, push it onto the stack.
If the current element is an operation (+, -, * or /), pop the last 2 numbers off the stack, apply the operation with the first number popped as the right-hand operand, and push the result back onto the stack.
After iterating through the array, we should be left with only 1 number in our stack.
Here’s the Python 3 code…
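This is a minimal sketch of the stack approach described above, not necessarily the exact code from the original issue. It assumes every token is a string, every operand is an integer, and "/" means true division (which is what the 30.8 example implies). The names `eval_rpn` and `OPERATIONS` are illustrative.

```python
import operator

# Map each operator token to the function that applies it.
# Assumption: "/" is true division, matching the 30.8 example above.
OPERATIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(tokens):
    """Evaluate a valid Reverse Polish Notation expression given as a list of strings."""
    stack = []
    for token in tokens:
        if token in OPERATIONS:
            # The first value popped is the right-hand operand.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATIONS[token](left, right))
        else:
            # Anything that is not an operator is assumed to be an integer.
            stack.append(int(token))
    # A valid expression leaves exactly one value on the stack.
    return stack.pop()


print(eval_rpn(["2", "3", "+", "3", "*"]))   # 15
print(eval_rpn(["4", "5", "/", "30", "+"]))  # 30.8
```

Each token is pushed or popped a constant number of times, so the function runs in linear time and uses at most linear extra space for the stack.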