
Design a Ride-Hailing Service Like Uber

Let us design a ride-hailing App like Uber from scratch. We will only design the core flow of servicing a ride. This includes both the rider's and the driver's experience: the rider requests a ride, our service finds a driver, and the driver serves it. We will focus on the hard and meaty design problems specific to a ride-hailing App. For readers unfamiliar with the building blocks and basic concepts, we will provide references to the corresponding materials.

Design Uber

Functional Requirements

Below is a high-level overview of the life of a ride.

  1. The rider sends the ride request to our backend service (a payload sketch follows this list). The request consists of:

    • The rider's pickup location
    • The rider's destination
  2. The backend service finds a driver to serve the ride and sends the ride information to the driver.

    2.1 The service first finds a list of nearby available drivers

    2.2 Then it sends the ride information to all of them

    2.3 It picks the first driver who confirms

    2.4 It sends the final ride information to that driver

  3. The rider and the driver both go to the pickup location. In the meantime, the rider receives real-time updates of the driver's location.
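To make the request concrete, here is a minimal sketch of what the ride request payload could look like in Python; the field names and types are assumptions for illustration, not a fixed API.

```python
import uuid
from dataclasses import dataclass


@dataclass
class RideRequest:
    """Hypothetical payload sent by the rider's App to the backend."""
    ride_id: str      # client-generated, e.g. a UUID (see "Ride ID generation" below)
    rider_id: int
    pickup_x: float   # pickup location
    pickup_y: float
    dest_x: float     # destination
    dest_y: float


request = RideRequest(
    ride_id=str(uuid.uuid4()),
    rider_id=42,
    pickup_x=3.2, pickup_y=7.9,
    dest_x=12.5, dest_y=1.4,
)
```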

Design Uber Functional Requirement

Problem decomposition

Problem decomposition is a key aspect of system design. It involves breaking down a complex problem into smaller, more manageable subproblems. One way to break a problem down is to list the operation flow and identify the complex or ambiguous steps; each such step is a subproblem. Let's review the functional requirements from the last section; the flow can be specified as below.

  1. The rider sends a ride request to the backend.
  2. The backend receives the ride request.
  3. The backend finds nearby drivers and sends each of them the ride request.
  4. Many drivers confirm the request.
  5. Among those who confirmed, the backend selects one and only one driver to serve the ride.
  6. The selected driver goes to the pickup location.
  7. The rider receives the driver's location updates in real time.

The obviously complex steps are #3, #5, and #7. Therefore, we turn them into three subproblems.

[S1] Locating nearby drivers: Efficiently finding available drivers near the rider's pickup location.

[S2] Managing driver confirmations: Handling multiple drivers confirming the same ride request simultaneously.

[S3] Real-time location updates: Providing the rider with accurate and up-to-date information about the driver's location.

The other steps are commonplace, simple, and clearly defined. Once S1, S2, and S3 are solved, they will be plugged into the above flow to form the final solution. In the following sections, we will delve into the implementation details of each subproblem and highlight how we overcome the key challenges involved.

Building blocks

The design uses simple building blocks from our basics class. First, we will explain what each element does on its own. Then we will show how they work together to implement each subproblem and handle the scalability challenges. We call the elements [A], [B], and so on; the endpoints where they receive requests are called [X1], [X2], ..., [Xn].

[Z] Driver and rider mobile Apps

Design Uber Mobile App

Mobile apps perform three functions: they accept user input, display status updates to the users, and automatically manage failures, retries, and other details on behalf of the users. As clients, they do not have any endpoints, but they connect to the endpoints of WebSocket servers, as described below.

[A] Driver WebSocket service

Design Uber Driver Websocket Service

[A] is a WebSocket (WS) service that acts as the bridge between the driver's App and our backend services. Our backend services need to push new ride requests to the driver and get real-time responses. Meanwhile the driver App provides information about the driver's status (location, app closed, etc.). WebSocket is ideal for such constant bidirectional communication. [A] has two endpoints.

  • [A1] accepts WebSocket connections from the driver's App.
  • [A2] accepts ride requests from the Ride Request Service (component [D]).

[B] Rider WebSocket service

Design Uber Rider Websocket Service

[B]'s role is similar to [A]'s, but for riders. There is two-way communication between the rider App and the servers: the rider App needs to receive asynchronous responses about the request result, the driver's new locations, payment results, etc. [B] has two endpoints.

  • [B1] accepts WebSocket connections from the rider's App.
  • [B2] accepts driver location updates from the Location Update Service (component [C]).

[C] Location Update Service

Design Uber Location Update Service

This service receives the driver's location updates from [Z], the driver's App, and processes them. Therefore it has one endpoint:

  • [C1] receives location updates from the driver's App.

Upon receiving each update, it responds 200 OK to the App immediately. The actual processing of the location update is done by a background process/thread pool running on each server. The processing has two parts:

  • Update the Driver Location Realtime Service (component [E]) with the new location
  • Query the Ongoing Ride DB (component [F]) to see if the given driver_id is serving a ride. If so, send the update to the rider's WS service, which in turn pushes the new location to the rider.

Each request is served on a best-effort basis. If there are any failures during processing, we simply log the error and drop the request. This is acceptable because each individual update is non-critical, and transient errors won't stop later location updates from the driver from being processed correctly.

[D] Ride Request Service

Design Uber Ride Request Service

The Ride Request Service acts as the broker between the rider and drivers. It receives the ride request from the rider's App, then queries the Driver Location Realtime Service (component [E]) to find available drivers close to the pickup location. After that, it sends the ride request to those drivers. The driver who confirms first will serve the ride. [D] then sends the ride status update to [B] and registers the <driver_id, rider_id> pair in the Ongoing Ride DB (component [F]). Therefore it has two endpoints.

  • [D1] accepts new ride requests from [B], the rider's WS service.
  • [D2] accepts ride-serving confirmations from [A], the driver's WS service.

[D] needs to consider edge cases, including race conditions among drivers and random failures. We will discuss those cases below.

[E] Driver Location Realtime Service

Design Uber Driver Location Realtime Service

This service serves internal queries about drivers' locations. The server has an in-memory map from driver_id to a coordinate <x, y> that denotes the driver's last reported location. This service needs to be sharded to store the large number of online drivers. It also needs an efficient index structure to serve queries like "find all drivers within 5 miles of location <x, y>." [E] has two endpoints for the reads and writes described above.

  • [E1] answers queries about nearby driver IDs for an input location and radius limit.
  • [E2] updates a driver's new location.

We will discuss its sharding and indexing later on.

[F] Ongoing Ride DB

Design Uber On-going DB

This is the DB that stores ongoing rides. Each ride has a driver_id and a rider_id, among other fields. This table is used in the following cases (a schema sketch follows the list):

  • [F1] appends a new ride to the table in [F]. This is used by [D], the Ride Request Service.
  • [F2] returns the ride status for a given driver_id. For example, after a driver accepts a new ride, the corresponding WS service (component [A]) needs to query this DB to ensure that the driver's confirmation is effective and that she is expected to serve the ride.
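For concreteness, below is a minimal sketch of the ongoing-ride table and the [F1]/[F2] accesses, using SQLite purely for illustration; the table name, columns, and status values are assumptions, not the actual schema.

```python
import sqlite3

db = sqlite3.connect("ongoing_rides.db")
db.execute("""
CREATE TABLE IF NOT EXISTS ongoing_ride (
    ride_id   TEXT PRIMARY KEY,   -- client-generated ride ID
    rider_id  INTEGER NOT NULL,
    driver_id INTEGER,            -- NULL while the ride is pending confirmation
    status    TEXT NOT NULL       -- e.g. 'pending', 'ongoing'
)""")


# [F1]: append a new ride, still pending a driver.
def add_ride(ride_id: str, rider_id: int) -> None:
    db.execute(
        "INSERT INTO ongoing_ride (ride_id, rider_id, driver_id, status) "
        "VALUES (?, ?, NULL, 'pending')",
        (ride_id, rider_id),
    )
    db.commit()


# [F2]: look up the ride status for a given driver.
def ride_for_driver(driver_id: int):
    return db.execute(
        "SELECT ride_id, status FROM ongoing_ride WHERE driver_id = ?",
        (driver_id,),
    ).fetchone()
```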

Complete architecture

Below is the integration of the aforementioned building blocks.

Design Uber Complete Design Architecture

Let's now dive into the details of how they cooperate to solve the three subproblems we defined previously.

Subproblem 1: Finding nearby drivers

The hard part is [E1], which answers queries for nearby drivers. In API form, [E] needs to implement an RPC with the following signature. The return value is a list of driver IDs.

GetNearbyDrivers(x: Int, y: Int, radius: Float) → List[Int]

Design Uber Find Nearby Drivers

The flow is straightforward.

  1. The driver App periodically updates the driver's location in [E]. (See the section Subproblem 3: Driver location realtime update for more details.)
  2. [D] queries [E1] to get a list of nearby drivers.

As we mentioned in the high-level design section, this API needs to solve two challenges.

  • Accommodate the large number of online drivers.
  • Provide a reasonable speed for each API call.

The large number of drivers

[E] needs to store all drivers' locations. When the number of drivers is large, a single server can't hold all the data. Sharding is the typical solution and our choice. The question is how to shard. Intuitively, we should take advantage of the fact that GetNearbyDrivers returns drivers near <x, y>. Therefore, one way of sharding goes like this:

  1. Model the physical world as a 2D plane and divide it into squares of 2 miles by 2 miles.
  2. Estimate the average number of drivers in each square. This can be done with historical data as our service ramps up, or with public data.
  3. Specify a range, [low, high], for the number of drivers in each shard. The range depends on the serving capacity of each physical server. Each server should be able to hold multiple shards: if our service grows and the number of drivers per shard grows, re-sharding will be easier. For squares with too many drivers, divide them until the number in each square is small enough. For squares with too few, combine adjacent ones until the resulting number is big enough.
  4. For each query of <x, y, radius>, the service finds all shards that could contain nearby drivers. (The algorithm for answering the query is in the next section.) Then it queries each shard and combines the results as the response. In the illustration below, the example query is served by asking three shards; a sketch of the cell-to-shard lookup follows the illustration.

Illustration of sharding for the GetNearbyDrivers API. Each red dot is a driver; the green circle is an example query for drivers near <x, y>.
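To make the query routing concrete, here is a minimal sketch that maps a query circle onto candidate grid cells and their shards. It assumes fixed 2-mile cells and a precomputed cell-to-shard map, and it leaves out the splitting and merging of cells described above.

```python
import math

CELL_MILES = 2.0  # side length of each grid square (assumed unit: miles)
cell_to_shard: dict[tuple[int, int], int] = {}  # precomputed (cell -> shard_id) map


def cell_of(x: float, y: float) -> tuple[int, int]:
    """Grid cell containing the point <x, y>."""
    return (math.floor(x / CELL_MILES), math.floor(y / CELL_MILES))


def shards_for_query(x: float, y: float, radius: float) -> set[int]:
    """All shards whose cells overlap the query circle's bounding box."""
    cx_min, cy_min = cell_of(x - radius, y - radius)
    cx_max, cy_max = cell_of(x + radius, y + radius)
    return {cell_to_shard[(cx, cy)]
            for cx in range(cx_min, cx_max + 1)
            for cy in range(cy_min, cy_max + 1)
            if (cx, cy) in cell_to_shard}
```

In the illustration above, this lookup would return the three shards touched by the green circle.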

Notice that there are more optimized and sophisticated sharding methods, which are also more complicated, but the method above is sufficient and possibly best for bootstrapping a new product or for job interviews.

<aside> 💡 **Design pattern knowledge card: Sharding**

Sharding is the "standard" solution for a dataset too large to be stored or queried by one machine. It divides the dataset in some way so that the whole dataset is spread across many machines.

</aside>

Fast API speed

To respond fast, each shard needs to find drivers near <x, y> quickly enough. When the shard is small enough, we can do a linear search over its drivers. Otherwise it requires an efficient indexing structure.

As with sharding, the indexing below is not optimal for mature businesses that need to squeeze out the last bit of performance, but it is sufficient for early-stage products and job interviews. A minimal sketch appears after the list.

  1. Use a binary search tree to store the x coordinates and the y coordinates separately. Each driver location has an entry in both indexes.
  2. For each query, translate it into two ranges, [Xmin, Xmax] and [Ymin, Ymax], such that drivers inside the specified box could be close enough to <x, y>.
  3. Do an [Xmin, Xmax] range search on the x-axis index and a [Ymin, Ymax] range search on the y-axis index. Take the intersection of the two results.
  4. Filter each location in the intersection and return those whose distance to <x, y> is less than the given radius.
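Below is a minimal sketch of such a per-shard index. It substitutes two sorted lists with binary search for the binary search trees, and it builds the index from a snapshot of locations instead of supporting incremental updates; both are simplifications for illustration.

```python
import bisect
import math


class DriverIndex:
    """Per-shard index: drivers sorted by x and by y, queried by range intersection."""

    def __init__(self, locations: dict[int, tuple[float, float]]):
        self.locations = locations  # driver_id -> (x, y)
        self.by_x = sorted((x, driver_id) for driver_id, (x, _) in locations.items())
        self.by_y = sorted((y, driver_id) for driver_id, (_, y) in locations.items())

    @staticmethod
    def _ids_in_range(index: list, lo: float, hi: float) -> set:
        start = bisect.bisect_left(index, (lo, -math.inf))
        end = bisect.bisect_right(index, (hi, math.inf))
        return {driver_id for _, driver_id in index[start:end]}

    def nearby(self, x: float, y: float, radius: float) -> list[int]:
        # 1) Range search on each axis, 2) intersect, 3) exact distance filter.
        box = (self._ids_in_range(self.by_x, x - radius, x + radius)
               & self._ids_in_range(self.by_y, y - radius, y + radius))
        return [d for d in box if math.dist(self.locations[d], (x, y)) <= radius]
```

For example, `DriverIndex({7: (1.0, 2.0), 9: (5.0, 5.0)}).nearby(0.0, 0.0, 3.0)` returns `[7]`: driver 9 falls outside the box, and driver 7 passes the exact distance filter.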

High API throughput

Another challenge is that the incoming query volume can become too high for a single server to handle. This results in denial of service even when each individual query can be served very fast. Just as sharding is the classic solution for data size, the classic solution here is replication.

We will have multiple servers that answer queries for the same shard. This replication will result in inconsistencies between the copies: the same query may get different results, some with more up-to-date driver locations and some more outdated. This is acceptable given the approximate nature of this API and the ever-changing nature of each driver's location.

<aside> 💡 **Design pattern knowledge card: Replication** When the query speed is fast enough but the request volume is too high to be handled by one machine, the design pattern is replication. It's an intuitive idea: replicate the server so that the request volume is shared among the copies. Replication appears in caching, load balancing, master-slave setups, etc.; they are different forms of the same idea. </aside>

Subproblem 2: Drivers’ confirmation for a new ride

Design Uber Driver Confirmation

After getting the driver ID list from [E], the Ride Request Service [D] sends the ride information to each driver's App. Then each driver decides whether to accept the ride request. Those who accept send their confirmation to the [D2] endpoint. This can be described as follows.

  1. [D] sends the new ride's data to each driver's connected WebSocket via the [A2] endpoint.

  2. Some drivers choose to accept the ride on their App.

  3. Their [A] WS service calls the [D2] endpoint to deliver the driver’s confirmation.

  4. [D] processes each driver's confirmation:

    4.1 Query [F], via the [F1] endpoint, to see if the ride is still pending driver confirmation.

    4.2 If so, update [F] and set the driver_id for the given ride.

  5. For those drivers who confirmed, their WS service periodically calls [F2] to check the ride's status. When the status is no longer pending driver confirmation, it notifies the driver to serve the ride if the confirmed driver's ID matches the current driver. Otherwise, it notifies the driver that the ride is no longer available.

While the flow is easy to define, several issues arise because of the distributed nature of the coordinating servers.

  1. Multiple drivers could confirm concurrently, yet only one driver can serve the ride.
  2. [D] could crash during the process or lose its connection to [A], [B], [E], or [F]. Left unhandled, this could result in unacceptable user experiences, such as the driver arriving at the pickup location only to find the rider has left or canceled the ride.

Race conditions from drivers

Let's look at the first one. First, we use a ride ID generated by the rider's App, such as a UUID, to distinguish each ride. Throughout the life of the ride, the ID is unique.

To serve each confirmation arriving at [D2], we run a transaction on [F] (the [F1] endpoint). The transaction has two parts (a sketch follows the list).

  • Query [F] to find the status of the given ride ID.
  • If the ride is not yet being served by any driver, change the status to ongoing and set the driver_id; otherwise, abort the transaction.
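Here is a minimal sketch of that conditional write, reusing the hypothetical SQLite table from the earlier schema sketch. The key point is that a single conditional UPDATE decides the winner; everything else is an assumption for illustration.

```python
import sqlite3


def try_confirm(db: sqlite3.Connection, ride_id: str, driver_id: int) -> bool:
    """Attempt to claim the ride for driver_id; returns True only for the winner."""
    with db:  # one transaction; concurrent confirmations are serialized by the DB
        cur = db.execute(
            "UPDATE ongoing_ride "
            "SET driver_id = ?, status = 'ongoing' "
            "WHERE ride_id = ? AND status = 'pending' AND driver_id IS NULL",
            (driver_id, ride_id),
        )
        return cur.rowcount == 1  # 0 means another driver already claimed the ride
```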

In essence, the racing requests are serialized by the DB's writer. While this solves the race condition, too many concurrent writes can overwhelm the DB. To avoid that, we can put a message queue or some other buffering writer in front of the DB. We won't go into detail here because it should be straightforward after reading our master template.

After sending the confirmation via [D2], [A] queries [F2] to check the status of the ride until it knows whether the ride should be served by the given driver, then it notifies the driver's App accordingly.

Fault recovery with eventual consistency

Let's look at the second challenge: fault recovery. Fault recovery is a common topic in system design and often the real question behind interview problems.

The rider App must have retry logic after sending the ride request. We won't go into the details of how it works because it's not critical to this design; devoted readers should work it out as an exercise after finishing this material. In a nutshell, after sending the ride request to [D], the rider App [Z] will repeatedly query [F] to make sure the ride is being served. If there is no entry in [F] for the ride ID, [Z] will reconnect to [D] and submit the identical ride (same ride ID). In this way, the ride will eventually be served. But such retries could result in inconsistencies, like two drivers trying to serve the same ride, each confirmed by a different instance of [D].

There are other failure scenarios. For example, the rider App will retry with the same ride ID when it loses its connection to [B], and [B] will also retry when it loses its connection to [D]. Therefore it's possible that multiple [D] instances are trying to serve the same ride ID.

Our setup handles this case easily because [F] serves as the single source of truth. Only one driver will be in the DB table for a given ride ID. For this reason, only that driver's App will direct its driver to the pickup location.

<aside> 💡 **Design pattern knowledge card: Write-process-read** The driver confirmation implementation is a classic example of the write-process-read design pattern. The general form is below.

  1. The client sends an update request to a server.
  2. That server responds immediately that the request has been registered.
  3. A backend server different from the one in #2 begins processing the request.
  4. In the meantime, the client periodically queries another server to read the result of the processing.

Notice there are four roles:

  • The client
  • The request receiving server
  • The request processing server
  • The query server for the read requests

This pattern is the solution for most write requests because it beautifully breaks down a data-critical and potentially complex, slow process into simple and reliable parts. It makes it easy for the system to achieve the following requirements:

  • Fast response to the client.
  • Eventual consistency from fault recovery.
  • Separation of concerns, and therefore separate scaling.

This pattern is harder to grasp, but you will feel confident about it after reading more solution examples.

</aside>

Subproblem 3: Driver location realtime update

Design Uber Driver Location Realtime Update

The flow of the real-time driver location update (a worker sketch follows the list):

  1. The driver App periodically sends the driver's location to the connected WS service (the [A1] endpoint), for example as <x, y, driver_id>.

  2. [A] relays the update to [C], the Location Update Service, via the [C1] endpoint.

  3. [C] responds 200 OK to [A] immediately after receiving the data.

  4. [C] then puts the location data on a queue. A pool of workers picks it up and processes it with the following routine:

    4.1 Send the data to the [E2] endpoint so that the Driver Location Realtime Service can update the driver's location.

    4.2 Query [F2] to find out if the given driver_id is on an ongoing ride. If so, send a request to the [B2] endpoint so that the rider's WebSocket server will push the new location to the rider's App.
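Below is a minimal sketch of steps 3 and 4, assuming an in-process queue and a worker thread; the stand-in functions for [E2], [F2], and [B2] are placeholders for the real RPC calls, not actual APIs.

```python
import logging
import queue
import threading

updates: "queue.Queue[tuple[int, float, float]]" = queue.Queue()


# --- Stand-ins for calls to other components (assumptions for illustration) ---
def update_driver_location(driver_id, x, y):   # [E2]
    pass

def ride_for_driver(driver_id):                # [F2]; returns rider_id or None
    return None

def push_to_rider(rider_id, x, y):             # [B2]
    pass
# ------------------------------------------------------------------------------


def handle_c1(driver_id: int, x: float, y: float) -> int:
    """[C1] handler: enqueue the update and respond 200 OK immediately."""
    updates.put((driver_id, x, y))
    return 200


def worker() -> None:
    while True:
        driver_id, x, y = updates.get()
        try:
            update_driver_location(driver_id, x, y)        # step 4.1
            rider_id = ride_for_driver(driver_id)          # step 4.2
            if rider_id is not None:
                push_to_rider(rider_id, x, y)
        except Exception:
            logging.exception("location update dropped")   # best effort: log and drop
        finally:
            updates.task_done()


threading.Thread(target=worker, daemon=True).start()
```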

Async update processing

In step 4, we use a queue to process the updates so that receiving requests at [C1] won't be blocked by the time-consuming processing. This allows [C1] to sustain a very high throughput when the incoming volume is high. The processing may lag, which means the rider's updates will have higher latency. But without this buffering, the alternative is to drop updates, which means some unlucky rider may see no driver movement for a long time; that is a worse experience than higher latency.

Errors are ignored

In the whole location update flow, errors are logged so that the monitoring system can pick up systematic error patterns. However, we don't try to recover or retry when an error happens; transient errors are simply dropped. This is because the updates in step 1 are periodic, so in effect they are "retried" already.

Other essential questions

The subproblems above are the core content of the design, and we addressed the hard questions in each one. They make up the majority of the discussion during a system design interview. Below are some other essential questions that could come up in an interview.

Location update service scalability

The location update service itself can be load balanced with any method, such as round-robin.

Ride Request Service scalability

Load balancing works fine here as well: instances of [D] are identical and not sharded. The service is stateless “in nature,” so it can be scaled just like the location update service above. It may seem counterintuitive to call it stateless, because its flow is a multi-step process that involves writing to the DB. However, the DB update of appending the new ride row is the last step of the processing and the only data write, so the service itself keeps no state between requests.

In fact, most Internet-scale services are built from purely stateless components. That's why serverless cloud computing like AWS Lambda has gained so much traction.

WebSocket server scalability

What if there are more drivers or riders than one WebSocket server can handle?

We can load balance and distribute user connections by sharding on user_id. Notice that load balancing without sharding won't work, because our servers need to push messages to the user's App and therefore must know which WS server the user is connected to, if she is online.

How to add more shards?

The clean solution is to have a sharding service that stores the map from user_id to shard_id. Other methods depend on keeping the client-side and server-side sharding algorithms in sync; while that saves one service, in the long run it is more expensive to maintain and makes other parts of the system harder to extend.
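Here is a minimal sketch of such a sharding service. The explicit map plus a hash-based default is an assumption for illustration; a real implementation would persist the map and handle rebalancing.

```python
class ShardingService:
    """Maps each user_id to the WebSocket shard holding that user's connection."""

    def __init__(self, num_shards: int):
        self.num_shards = num_shards
        self.assignments: dict[int, int] = {}  # explicit user_id -> shard_id map

    def shard_for(self, user_id: int) -> int:
        # Explicit assignment wins; a hash-based default covers unassigned users.
        return self.assignments.get(user_id, user_id % self.num_shards)

    def reassign(self, user_id: int, shard_id: int) -> None:
        # Adding shards only requires updating this service, not every client.
        self.assignments[user_id] = shard_id
```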

Ride ID generation

We let the rider App generate the ID because the rider App queries the ride status and initiates retries when necessary. If the ID were generated by our backend, there would be two main disadvantages:

  • ID generation becomes a single point of failure; obviously, the whole system can't function without ride IDs.
  • A new service adds complexity to maintain and operate.

Generating the ID on the client has disadvantages too. For example, it's prone to security attacks, and we need to balance ID collision risk against ID length; usually a UUID works, but it's long. We choose client-generated IDs for simplicity, which is usually the priority when starting a new project. However, the choice is debatable.
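As a small illustration, a client-generated ride ID could simply be a UUID created on the rider's App; server-side validation of the ID is omitted here.

```python
import uuid

# Generated once on the rider's App; stays the same across retries of this ride.
ride_id = str(uuid.uuid4())
```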

DB scalability

To simplify the already complex design, our solution reads from and writes to the DB directly. In a real production environment, this is uncommon, to say the least. We usually have a message queue to buffer and serialize write requests and a cache to scale read requests. That's part of our master template.

A more realistic [F] would have the following structure.

Design Uber Realistic F

FAQ

Why not process driver location updates in [A], the driver's WebSocket service?

It's a good practice (principle) to separate request accepting from request processing. This principle makes system failures less likely and maintenance easier. For example, if the location update processing lived inside the WS service, then every time a bug in the processing logic crashed the server, drivers would lose their connections to our backend! It would also make deployment much harder. Our original design doesn't have this issue: bugs are local to the Location Update Service, and the rest of the system continues to function.

