Solution

Design a Contest Platform Like LeetCode

Problem description

Design a coding contest platform like LeetCode and Hackerrank.

Functional requirements

  • Users should be able to solve coding questions by submitting their solution code and run it against built-in test cases and get results.
  • Users can participate daily contests which involves solving one or many coding questions. Users will be ranked based on the questions they solved and the time it took. A leaderboard will be display during the contest showing live scores of the top 50 user’s username and score.

Scalability requirements

  • Supporting 10k users participating contests concurrently
  • 1 contest a day
  • Each contest lasts 2 hours
  • Each user submits solutions 20 times on average
  • Each submission executes 20 test cases on average
  • User submitted code have a retention policy of 1 month after which they will be deleted.

Non-functional requirements

  • Low latency: Leaderboard should be updated live
  • High availability: Code submission should be available at all time
  • Security: User submitted code should not cause threat to the service

Resource Estimation

Assuming that the storage space required for the solution of each coding question is 1KB.

Assuming that the Read:write ratio is 100:1.

QPS: 463

Storage: 12GB

Using the resource estimator

QPS Calculations

To calculate Queries Per Second (QPS), we need to look at the peak load or the maximum number of operations that our system will need to support. Services we need to support

  • Executing user-submitted solutions
  • Updating leaderboard based on grading of user-submitted solutions
  • User profile creation and updates etc

During the contest, each submission needs to execute against 20 test cases on average so this is clearly the bottleneck of the entire service.

Let's start by understanding the total number of submissions a day: 10k users * 20 submissions/user = 200k submissions

Since each submission executes 20 test cases on average: 200k submissions * 20 test cases/submission = 4 million test case executions

Assuming that the contest runs for 2 hours (or 7200 seconds), then we can calculate the peak QPS:

4 million test case executions / 7200 seconds = ~556 QPS

This estimate assumes that the load is evenly distributed throughout the contest duration, but in real-world scenarios, we often see peak loads at the start and end of contests. Let’s add a 2x multiplier to the QPS and get 556*6 ~= 1200QPS

Storage Calculations

Let’s first take a look at the type of data we need to store:

  • User information: provided by the user upon signup such as email, username.
  • Contest information: Contest details, problem sets, test case inputs and expected outputs (in file formats).
  • User submissions: Based on the scalability requirements in the problem description, each user makes 20 submissions per contest and each submission is about 100KB (including the code, metadata, etc.), then the total data for submissions in a day would be 10k users * 20 submissions/user * 100KB/submission = 200MB/day. In a month of time, this will be 200MB * 30 = 6GB.
  • Test case execution results: Each submission has 20 test case results and each result is 10KB (including input, expected output, actual output, pass/fail status), the total data for test case results in a day would be 200k submissions * 20 test cases/submission * 10KB/test case = 40GB/day. In a month of time, this will be 1.2TB.

Clearly, the storage for test case executing results is the bottleneck and it’s about 1.2PB a month. Since the retention policy is only 1 month, we can assume this is the maximum storage our system needs.

API design

Code submission:

POST /problems/{problem_id}/submit

Request body:

{ user_id: string code: string language: string }

Response body:

{ status: success | fail test_case: [ { status: success | fail | timeout } … ] }

Fetching leaderboard

GET /contests/{contest_id}/leaderboard

Response body:

{ ranking: [ {user_id, score} ] }

High Level Design

Design LeetCode System Design Diagram

Services

  • User service: ① responsible for user sign up, update and retrieving user info to be displayed in the user profile page, backed by the user database.
  • Code submission service: the main workhorse of the platform. ② The service accepts user-submitted code and ③ enqueues them as messages into a message queue (see the Message Queue section for detailed examples). ④ The workers fetch from the queue, extract the user code and run them against the test cases inside docker containers and ⑤ writes the output to the test result database (or cloud storage). ⑥ The outputs are compared with the expected outputs and the result are sent back to the client. User score and ranking are also updated based on the output comparisons. If user code passes all test cases for a problem, the problem is marked as a success and its corresponding score is added to the user’s total. ⑦ User scores and ranking are stored in an in-memory data store such as Redis. Redis has a built-in data structure called sorted set that is tailored for the leaderboard purpose. We can use (contest_id, user_id) as the key and scores as the value and Redis sorted set will automatically ranks the users by their scores.
  • Contest service: ⑧ responsible for retrieving contest related information such as questions, date of contest etc from the contest database.

Detailed Design Q&A

Q1: How to prevent users from submitting malicious code that messes up our service internals?

A: Run each test case inside sandboxes like docker containers and only allow the containers to access temporary storage (e.g. linux’s /tmp folder).

Q2: In what format should the test cases be stored and how should they be used?

A: Each test case has an input file and an expected output file. Each problem has a driver code for each language that parses the input file. When user submitted code is executed by the docker container, an output file is created in a temporary folder mounted on the container. The output file can be compared to the expected output file for correctness.