Resource Estimation
Queries per second (QPS)
Assuming each sync involves a read and a write operation, let's calculate the number of operations per second. We note that there are 86,400 seconds in a day.
Using the resource estimator, we get the following results:
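The estimator's inputs and output are not reproduced in this text, so the arithmetic can only be sketched with placeholder numbers. In the sketch below, `daily_active_users` and `syncs_per_user_per_day` are hypothetical values chosen for illustration, not figures from this design. Only the two operations per sync and the 86,400 seconds per day come from the text above.

```python
SECONDS_PER_DAY = 86_400  # from the text above
OPS_PER_SYNC = 2          # one read plus one write per sync, as assumed above


def operations_per_second(daily_active_users: int, syncs_per_user_per_day: float) -> float:
    """Average number of read and write operations per second over one day."""
    daily_ops = daily_active_users * syncs_per_user_per_day * OPS_PER_SYNC
    return daily_ops / SECONDS_PER_DAY


# Hypothetical inputs, for illustration only (not taken from the source).
qps = operations_per_second(daily_active_users=1_000_000, syncs_per_user_per_day=10)
print(f"{qps:.0f} operations per second on average")
```

This yields a daily average; peak traffic is typically higher, so a real estimate would multiply by some peak factor.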
High Level Design
How Dropbox Works
Before getting into the details of the high-level design, let's cover some background on how files are stored in a cloud storage service like Dropbox.
File Chunking
In the beginning, one might think that the simplest way to handle file uploads and downloads in a cloud storage system would be to treat each file as a singular entity. Let's call this the naïve approach. In this scenario, whenever a user wants to upload a file, the entire file would be sent to the server. Similarly, when downloading, the entire file is fetched all at once. At first glance, this seems straightforward and uncomplicated.
However, problems arise when we delve deeper into real-world scenarios. Imagine a professional video editor working on a 10GB video file. After making a tiny edit, such as adjusting the color on a single frame, they need to save and back up their work. In the naïve approach, this minuscule change would necessitate re-uploading the entire 10GB file. Such a method is clearly inefficient, consuming unnecessary bandwidth and time.
This is where the chunk-based approach comes into play. Instead of treating a file as one large block, it's divided into smaller, manageable pieces or chunks. Let's visualize this with the video editor's dilemma. Using the chunk-based approach, the 10GB video might be split into 1000 chunks of 10MB each. When the editor makes that small change, perhaps only one or two of these chunks get altered. Now, instead of re-uploading 10GB, at most 20MB needs to be transmitted. The savings in time and bandwidth are evident.
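As a rough sketch of the idea, the snippet below splits two versions of a file into fixed-size chunks and compares a digest per chunk to find which ones need re-uploading. The function names and the use of SHA-256 at this stage are assumptions made for illustration, and the sizes are scaled down (1000 chunks of 10 bytes rather than 10MB).

```python
import hashlib


def split_into_chunks(data: bytes, chunk_size: int) -> list[bytes]:
    """Split data into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def changed_chunk_indices(old: bytes, new: bytes, chunk_size: int) -> list[int]:
    """Return indices of chunks in `new` that differ from `old` at the same position."""
    old_digests = [hashlib.sha256(c).digest() for c in split_into_chunks(old, chunk_size)]
    new_digests = [hashlib.sha256(c).digest() for c in split_into_chunks(new, chunk_size)]
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or digest != old_digests[i]
    ]


# Scaled-down stand-in for the 10GB video: 1000 chunks of 10 bytes each.
CHUNK_SIZE = 10
original = bytes(10_000)
edited = bytearray(original)
edited[2_505] = 0xFF  # a tiny in-place edit that falls inside chunk 250
to_upload = changed_chunk_indices(original, bytes(edited), CHUNK_SIZE)
print(to_upload, f"{len(to_upload) * CHUNK_SIZE} of {len(original)} bytes to send")
```

Note that fixed-offset chunking like this only pays off for in-place edits. Inserting or deleting bytes shifts every later chunk boundary, so all following chunks would appear changed.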
But the benefits don't end there. The chunking strategy has multiple advantages:
- Resilience: If our video editor had an unstable internet connection and the upload was interrupted, the naïve method would potentially start over, causing frustration. In contrast, with chunking, only the interrupted chunks need to be retransmitted. This resilience becomes a boon in unreliable network conditions.
- Deduplication: Over time, our editor might have multiple projects with some shared footage. Rather than storing duplicate data, the system can recognize identical chunks and store them just once. This not only saves storage space but also upload time.
- Parallelism: Uploading or downloading several chunks simultaneously can maximize bandwidth utilization. For our editor, this means quicker backup and retrieval times, especially vital when deadlines are looming.
- Streaming: If our editor wanted to preview a clip from the cloud, chunking allows them to stream just the part they need. They don't have to wait for the entire file; they can start viewing as soon as the relevant chunks are loaded.
- Security: Each chunk can be encrypted individually. If there were a security breach and a chunk's encryption was compromised, the entire file wouldn't necessarily be vulnerable.
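The resilience and parallelism points above can be sketched together: chunks are uploaded concurrently, and only the chunks that failed are retried in the next round. This is a minimal sketch under stated assumptions. `upload_all`, `flaky_upload`, and the round-based retry policy are hypothetical names and choices for illustration, and the uploader is a local stand-in rather than a real network call.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def upload_all(
    chunks: list[bytes],
    upload_chunk: Callable[[int, bytes], bool],
    max_workers: int = 4,
    max_rounds: int = 3,
) -> set[int]:
    """Upload chunks in parallel, retrying only failed ones.

    Returns the set of chunk indices that still failed after max_rounds.
    """
    pending = set(range(len(chunks)))
    for _ in range(max_rounds):
        if not pending:
            break
        indices = sorted(pending)
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            outcomes = list(pool.map(lambda i: upload_chunk(i, chunks[i]), indices))
        # Keep only the chunks whose upload reported failure.
        pending = {i for i, ok in zip(indices, outcomes) if not ok}
    return pending


# Hypothetical uploader: every third chunk fails on its first attempt.
attempts: dict[int, int] = {}
attempts_lock = threading.Lock()


def flaky_upload(index: int, chunk: bytes) -> bool:
    with attempts_lock:
        attempts[index] = attempts.get(index, 0) + 1
        first_try = attempts[index] == 1
    return not (index % 3 == 0 and first_try)


chunks = [bytes([i]) * 4 for i in range(10)]
remaining = upload_all(chunks, flaky_upload)
print("chunks still pending:", remaining)
print("attempts per chunk:", dict(sorted(attempts.items())))
```

Chunks that succeeded in the first round are never sent again, which is the property that makes an interrupted upload cheap to resume.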
This progressive shift from the naïve to the chunk-based approach showcases a classic evolution in system design. What starts as a straightforward solution soon reveals its shortcomings under real-world pressures. By addressing these issues with a more sophisticated method, like chunking, systems can provide improved performance, efficiency, and user experience.
How Files are Stored on Dropbox
Now that we have a good idea of how file chunking works, let's take a look at how Dropbox divides and stores a file.
Dividing the File into 4MB Blocks
Imagine you've just completed a high-definition video project that's about 40MB in size. When you decide to back up this video to the cloud, the system doesn't just take the video as one large data blob. Instead, it slices the file into ten distinct blocks, each 4MB in size. This segmentation is the first step in ensuring an efficient storage and retrieval process.
Hashing Each Block for Uniqueness
Now, with ten blocks ready, the system must ensure that each block can be uniquely identified. For this purpose, it uses a cryptographic hash function, such as SHA-256. This function takes the data from each 4MB block (the block size is a rather arbitrary choice, according to a Dropbox team tech talk) and outputs a fixed-size hash value that is, for practical purposes, unique to that content. Even a minor change in the block's data results in a drastically different hash, so each block's content can be reliably identified and verified.
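The two steps so far, dividing and hashing, can be sketched in a few lines. This is an illustrative sketch rather than Dropbox's actual client code; `iter_hashed_blocks` is a hypothetical helper name, and an in-memory buffer of zero bytes stands in for the 40MB video.

```python
import hashlib
import io
from typing import BinaryIO, Iterator

BLOCK_SIZE = 4 * 1024 * 1024  # 4MB, the block size described above


def iter_hashed_blocks(
    stream: BinaryIO, block_size: int = BLOCK_SIZE
) -> Iterator[tuple[str, bytes]]:
    """Read a file-like object block by block, yielding (sha256_hex, block)."""
    while True:
        block = stream.read(block_size)
        if not block:  # end of file
            break
        yield hashlib.sha256(block).hexdigest(), block


# Stand-in for the 40MB video file from the example above.
video = io.BytesIO(bytes(40 * 1024 * 1024))
hashes = [block_hash for block_hash, _ in iter_hashed_blocks(video)]
print(len(hashes), "blocks")
```

Reading block by block keeps memory use bounded by the block size, so the same loop works for files far larger than available memory when given a real file opened in binary mode.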
Storing Blocks on the Block Server
With each of the ten blocks hashed, they're ready to be sent to the block server. Here, the previously calculated hash serves a dual purpose: it acts as a unique identifier (like a fingerprint) and a lookup key. The block server, in its vast digital vault, stores each block in a manner reminiscent of a key-value storage system. The hash (key) provides a direct path to the block's data (value). This methodology ensures swift retrieval and also offers benefits like deduplication. For instance, if another user uploads a file with a block identical to one already in the system, there's no need to store that block again. The system can simply reference the existing block using its hash.
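A minimal in-memory sketch of this content-addressed idea follows. The `BlockStore` class is hypothetical and uses a plain dictionary as the key-value store; a real block server would persist blocks durably.

```python
import hashlib


class BlockStore:
    """Toy key-value block store: key = SHA-256 hex digest, value = block bytes."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        """Store a block and return its hash; identical blocks are stored once."""
        key = hashlib.sha256(data).hexdigest()
        self._blocks.setdefault(key, data)  # no-op if the block already exists
        return key

    def get(self, key: str) -> bytes:
        """Look a block up directly by its hash."""
        return self._blocks[key]

    def __len__(self) -> int:
        return len(self._blocks)


store = BlockStore()
key_from_alice = store.put(b"shared footage")
key_from_bob = store.put(b"shared footage")  # a second user uploads the same block
store.put(b"unique footage")
print(key_from_alice == key_from_bob, len(store))
```

Because the key is derived from the content, deduplication needs no separate comparison step: two identical blocks always map to the same key.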
Metadata Management and Its Storage
While the block server is busy safeguarding the data blocks, another equally crucial process unfolds: the organization and storage of metadata. Metadata, in this context, is the information about the file that doesn't include its actual content. This encompasses the file's name, its path, and a specially designated 'namespace'. A namespace is a unique identifier, often used to differentiate files from different users or different directories.
As the video file's blocks are stored, the system crafts a comprehensive metadata record. This record, beyond the file's name and path, includes an ordered list of the blocks (or their hashes) that constitute the complete file. This list is pivotal, as it maps out the sequence to reconstruct the original file from its constituent blocks.
This metadata is then sent to its dedicated storage space, often managed by a specialized metadata server. This server acts as the system's directory or index, guiding it on how to piece together files from the blocks when necessary.
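To make the role of the ordered block list concrete, here is a small sketch of a metadata record and of reconstructing a file from it. The `FileMetadata` field names and the `reconstruct` helper are assumptions for illustration; the source only states that the record holds the name, path, namespace, and an ordered list of block hashes.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMetadata:
    """Hypothetical metadata record: everything about a file except its content."""
    namespace: str
    path: str
    filename: str
    block_hashes: tuple[str, ...]  # ordered: defines how to reassemble the file


def reconstruct(meta: FileMetadata, blocks: dict[str, bytes]) -> bytes:
    """Rebuild the file by concatenating blocks in the order the metadata lists them."""
    return b"".join(blocks[block_hash] for block_hash in meta.block_hashes)


# Tiny stand-ins for 4MB blocks.
parts = [b"first block, ", b"second block, ", b"third block"]
block_store = {hashlib.sha256(p).hexdigest(): p for p in parts}
meta = FileMetadata(
    namespace="ns-1",  # hypothetical namespace identifier
    path="/videos/project.mp4",
    filename="project.mp4",
    block_hashes=tuple(hashlib.sha256(p).hexdigest() for p in parts),
)
print(reconstruct(meta, block_store))
```

The block store alone cannot rebuild the file, since it has no notion of order or ownership. That information lives only in the metadata record.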
To encapsulate, this orchestrated dance between dividing files, hashing blocks, managing two distinct servers, and meticulously tracking metadata exemplifies the depth of sophistication present in modern cloud storage systems. It showcases the lengths to which technology goes to ensure efficient, reliable, and organized storage for users.
Design Diagram
Uploading a New File (Write Path)
When a user adds a new file to their Dropbox folder:
- 1. File Preparation: The Dropbox client on the user's device divides the file into 4MB blocks, calculating a hash for each block. These blocks are then transmitted to the Block Service.
- 2. Block Storage: The Block Service stores these blocks, using the block hashes as reference keys, in its Block Storage system.
- 3, 4. Metadata Transmission: The client sends the file's metadata, which includes information such as namespace, path, and the list of blocks, to the Metadata Service via its load balancer.
- 5, 6. Metadata Storage: The chosen Metadata Server then writes these details into the Metadata Database and simultaneously caches this information for quick access.
- 7. Notification: After recording the metadata, the Metadata Service alerts the Notification Service about the new file.
- 8. Syncing Alert: The Notification Service informs other client installations about the new file addition, prompting them to synchronize.
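The write path above can be sketched end to end with in-memory stand-ins for each service. All class and function names here are hypothetical, the load balancer is omitted, and a list of events stands in for pushing alerts to other clients.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4MB blocks, as described above


class BlockService:
    def __init__(self) -> None:
        self.storage: dict[str, bytes] = {}

    def upload(self, block_hash: str, data: bytes) -> None:
        self.storage[block_hash] = data  # Block Storage: hash is the key


class NotificationService:
    def __init__(self) -> None:
        self.events: list[tuple[str, str]] = []

    def notify(self, namespace: str, path: str) -> None:
        self.events.append((namespace, path))  # stand-in for the Syncing Alert


class MetadataService:
    def __init__(self, notifications: NotificationService) -> None:
        self.database: dict[tuple[str, str], dict] = {}
        self.cache: dict[tuple[str, str], dict] = {}
        self.notifications = notifications

    def save(self, namespace: str, path: str, block_hashes: list[str]) -> None:
        record = {"namespace": namespace, "path": path, "block_hashes": block_hashes}
        self.database[(namespace, path)] = record  # Metadata Storage: database
        self.cache[(namespace, path)] = record     # Metadata Storage: cache
        self.notifications.notify(namespace, path)  # Notification


def upload_file(data, namespace, path, block_service, metadata_service,
                block_size=BLOCK_SIZE) -> list[str]:
    """Client side of the write path: split, hash, upload blocks, send metadata."""
    block_hashes = []
    for offset in range(0, len(data), block_size):  # File Preparation
        block = data[offset:offset + block_size]
        block_hash = hashlib.sha256(block).hexdigest()
        block_service.upload(block_hash, block)
        block_hashes.append(block_hash)
    metadata_service.save(namespace, path, block_hashes)  # Metadata Transmission
    return block_hashes


notifications = NotificationService()
block_service = BlockService()
metadata_service = MetadataService(notifications)
sent = upload_file(b"0123456789", "ns-1", "/docs/a.txt",
                   block_service, metadata_service, block_size=4)
print(len(sent), "blocks uploaded;", notifications.events)
```

In this sketch the metadata is sent only after every block has been uploaded, so a metadata record never points at a block the Block Service does not yet hold. That ordering is an assumption of the sketch, not something the steps above state explicitly.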
Downloading the New File (Read Path)
When another device attempts to sync the new file from Dropbox:
- 1. Notification Receipt: The client on the user's secondary device receives the update from the Notification Service about the newly added file and initiates the syncing process.
- 2, 3, 4, 5. Metadata Retrieval: The client communicates with the Metadata Service's load balancer, which routes the request to an appropriate Metadata Server. This server checks its cache for the required metadata. If unavailable, it fetches the data from the Metadata Database.
- 6. Block Request: Armed with the metadata, the client sends a request to the Block Service to retrieve the associated file blocks.
- 7. Block Retrieval: The Block Service fetches the necessary blocks, identified by their hashes, and sends them to the client.
- 8. File Reconstruction: Upon receipt, the client assembles these blocks to reconstruct the original file and saves it to the device's storage.
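A compact sketch of the read path follows, using dictionaries as stand-ins for the cache, the Metadata Database, and Block Storage. The function names are hypothetical. Two details are assumptions of this sketch rather than statements from the text: the cache is populated on a miss, and each block is re-hashed on receipt to check its integrity.

```python
import hashlib


def fetch_metadata(key, cache: dict, database: dict) -> dict:
    """Metadata Retrieval: check the cache first, fall back to the database."""
    record = cache.get(key)
    if record is None:
        record = database[key]
        cache[key] = record  # assumption: fill the cache on a miss
    return record


def download_file(key, cache: dict, database: dict, block_storage: dict) -> bytes:
    """Fetch metadata, request each block by hash, and reassemble the file."""
    record = fetch_metadata(key, cache, database)
    blocks = []
    for block_hash in record["block_hashes"]:  # Block Request / Block Retrieval
        block = block_storage[block_hash]
        # Assumption: verify integrity by re-hashing the received block.
        if hashlib.sha256(block).hexdigest() != block_hash:
            raise ValueError(f"block {block_hash} failed verification")
        blocks.append(block)
    return b"".join(blocks)  # File Reconstruction


# Seed the stand-in services with one two-block file.
parts = [b"hello ", b"world"]
block_storage = {hashlib.sha256(p).hexdigest(): p for p in parts}
key = ("ns-1", "/docs/greeting.txt")
database = {key: {"block_hashes": [hashlib.sha256(p).hexdigest() for p in parts]}}
cache: dict = {}

content = download_file(key, cache, database, block_storage)
print(content, "| cached after first read:", key in cache)
```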
API Design
Now that we have a good idea of how the system works, let's flesh out the API endpoints.
Metadata Server
/save_metadata
- Request Type: POST
- Request Parameters: None
Request JSON:
{
  "filename": "example.txt",
  "path": "/user/directory/example.txt",
  "block_hashes": ["hash1", "hash2", ...]
}
Response JSON:
{
  "status": "success", // or "error"
  "message": "Metadata saved successfully." // or error message
}
/get_metadata
- Request Type: GET
- Request Parameters: path (e.g., /get_metadata?path=/user/directory/example.txt)
Response JSON:
{
  "filename": "example.txt",
  "path": "/user/directory/example.txt",
  "block_hashes": ["hash1", "hash2", ...]
}
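From the client's side, the two Metadata Server endpoints could be called roughly as follows. This sketch only builds the HTTP requests with the standard library and does not send them. `BASE_URL` is a hypothetical host, and the helper names are invented for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical host, not from the source


def build_save_metadata_request(filename: str, path: str, block_hashes: list[str]) -> Request:
    """POST /save_metadata with the JSON body described above."""
    body = json.dumps(
        {"filename": filename, "path": path, "block_hashes": block_hashes}
    ).encode("utf-8")
    return Request(
        f"{BASE_URL}/save_metadata",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_metadata_request(path: str) -> Request:
    """GET /get_metadata?path=... (slashes in the path are left unescaped)."""
    query = urlencode({"path": path}, safe="/")
    return Request(f"{BASE_URL}/get_metadata?{query}", method="GET")


save_request = build_save_metadata_request(
    "example.txt", "/user/directory/example.txt", ["hash1", "hash2"]
)
get_request = build_get_metadata_request("/user/directory/example.txt")
print(save_request.get_method(), save_request.full_url)
print(get_request.get_method(), get_request.full_url)
```

Passing either request to `urllib.request.urlopen` would send it. That step is left out here because the host is a placeholder.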
Block Server
/upload/<block_hash>
- Request Type: POST
- Path Parameters: block_hash (e.g., hash1)
Request Data: Binary data of the block
Response JSON:
{
  "status": "success", // or "error"
  "message": "Block uploaded successfully." // or error message
}
/download/<block_hash>
- Request Type: GET
- Path Parameters: block_hash (e.g., hash1)
- Response: Binary data of the block
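On the server side, the two Block Server endpoints might be handled along these lines. This is a framework-free sketch with hypothetical function names and an in-memory dictionary as storage. Rejecting an upload whose body does not hash to the `block_hash` in the URL is an assumption of this sketch, as is the wording of the error message; the API description above does not specify either.

```python
import hashlib
from typing import Optional

_blocks: dict[str, bytes] = {}  # in-memory stand-in for Block Storage


def handle_upload(block_hash: str, body: bytes) -> dict:
    """POST /upload/<block_hash>: store the block under its hash."""
    # Assumption: the server checks that the body matches the claimed hash.
    if hashlib.sha256(body).hexdigest() != block_hash:
        return {"status": "error", "message": "Block hash does not match uploaded data."}
    _blocks[block_hash] = body
    return {"status": "success", "message": "Block uploaded successfully."}


def handle_download(block_hash: str) -> Optional[bytes]:
    """GET /download/<block_hash>: return the block's bytes, or None if unknown."""
    return _blocks.get(block_hash)


block = b"binary data of the block"
block_hash = hashlib.sha256(block).hexdigest()
print(handle_upload(block_hash, block))
print(handle_upload(block_hash, b"different data"))
print(handle_download(block_hash) == block)
```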
Notification Server
/long_poll
- Request Type: GET
- Request Parameters: None
Response: Connection kept open until a change occurs.
Response JSON (upon change):
{
  "status": "change_detected",
  "message": "A change has been detected.",
  "changed_files": [
    {
      "filename": "example.txt",
      "path": "/user/directory/example.txt"
    }
    // ... potentially other changed files
  ]
}
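The long-polling behavior can be illustrated without any networking: each waiting client blocks on its own queue until a change is published or a timeout expires. `NotificationHub` and `long_poll` are hypothetical names, and the timeout is an assumption of this sketch (the endpoint description above only says the connection is kept open until a change occurs).

```python
import queue
import threading
from typing import Optional


class NotificationHub:
    """Toy notification server: one mailbox (queue) per waiting client."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._mailboxes: list[queue.Queue] = []

    def subscribe(self) -> queue.Queue:
        mailbox: queue.Queue = queue.Queue()
        with self._lock:
            self._mailboxes.append(mailbox)
        return mailbox

    def publish(self, changed_files: list[dict]) -> None:
        with self._lock:
            mailboxes = list(self._mailboxes)
        for mailbox in mailboxes:
            mailbox.put(changed_files)


def long_poll(mailbox: queue.Queue, timeout: float) -> Optional[dict]:
    """Block until a change arrives; return None if the timeout expires first."""
    try:
        changed_files = mailbox.get(timeout=timeout)
    except queue.Empty:
        return None
    return {
        "status": "change_detected",
        "message": "A change has been detected.",
        "changed_files": changed_files,
    }


hub = NotificationHub()
mailbox = hub.subscribe()
change = [{"filename": "example.txt", "path": "/user/directory/example.txt"}]
# Simulate another device saving a file shortly after this client starts waiting.
threading.Timer(0.1, hub.publish, args=[change]).start()
response = long_poll(mailbox, timeout=5)
print(response)
```

On a timeout the client would simply issue a new `/long_poll` request, which keeps idle connections from being held open indefinitely.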
Detailed Design
Database Design
The core function of the Metadata Database in Dropbox is to store essential details about the files without actually storing the file content itself. One of the core requirements is maintaining strong ACID properties and consistency across devices. We use a Relational Database Management System (RDBMS) structure to manage and query the metadata generated by users.
File Metadata Table:
This table tracks individual files and their metadata.
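The table's columns are not listed in this text. As a hedged sketch only, the example below assumes a schema built from the metadata fields described earlier (namespace, path, filename, and an ordered list of block hashes) and uses SQLite from the Python standard library as a stand-in RDBMS. The table name, column names, and JSON encoding of the block list are all assumptions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE file_metadata (
        namespace    TEXT NOT NULL,
        path         TEXT NOT NULL,
        filename     TEXT NOT NULL,
        block_hashes TEXT NOT NULL,  -- ordered list of hashes, JSON-encoded
        PRIMARY KEY (namespace, path)
    )
    """
)


def save_metadata(conn, namespace, path, filename, block_hashes):
    """Insert or update a file's record inside a single transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT OR REPLACE INTO file_metadata VALUES (?, ?, ?, ?)",
            (namespace, path, filename, json.dumps(block_hashes)),
        )


def get_block_hashes(conn, namespace, path):
    """Return the ordered block hashes for a file, or None if it is unknown."""
    row = conn.execute(
        "SELECT block_hashes FROM file_metadata WHERE namespace = ? AND path = ?",
        (namespace, path),
    ).fetchone()
    return json.loads(row[0]) if row else None


save_metadata(conn, "ns-1", "/user/directory/example.txt", "example.txt", ["hash1", "hash2"])
print(get_block_hashes(conn, "ns-1", "/user/directory/example.txt"))
```

Wrapping each write in a transaction is what gives readers on other devices an all-or-nothing view of a file's block list, which is the consistency property the paragraph above asks of the database.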
Grasping the building blocks ("the lego pieces")
This part of the guide will focus on the various components that are often used to construct a system (the building blocks), and the design templates that provide a framework for structuring these blocks.
Core Building Blocks
At a bare minimum, you should know the core building blocks of system design:
- Scaling stateless services with load balancing
- Scaling database reads with replication and caching
- Scaling database writes with partitioning (aka sharding)
- Scaling data flow with message queues
System Design Template
With these building blocks, you will be able to apply our template to solve many system design problems. We will dive into the details in the Design Template section. Here’s a sneak peek:
Additional Building Blocks
Additionally, you will want to understand these concepts:
- Processing large amounts of data (aka “big data”) with batch and stream processing
  - Particularly useful for solving data-intensive problems such as designing an analytics app
- Achieving consistency across services using distributed transactions or event sourcing
  - Particularly useful for solving problems that require strict transactions such as designing financial apps
- Full-text search with a full-text index
- Storing data for the long term with data warehousing
On top of these, there is ad hoc knowledge tailored to certain problems that you will want to know. For example, geohashing for designing location-based services like Yelp or Uber, or operational transformation for problems like designing Google Docs. You can learn these on a case-by-case basis. System design interviews are supposed to test your general design skills, not specific knowledge.
Working through problems and building solutions using the building blocks
Finally, we have a series of practical problems for you to work through. You can find the problems in /problems. This hands-on practice will not only help you apply the principles learned but will also enhance your understanding of how to use the building blocks to construct effective solutions. The list of questions is growing; we are actively adding more.