Design Tinder

Introduction

Tinder is a location-based dating app where a user creates a profile with photos and preferences, updates their current location, and discovers nearby candidates that match their criteria. The user swipes right (like) or left (pass) on each candidate. When both users swipe right on each other, the system detects the mutual like, creates a match, and notifies both users in real-time.

Tinder Core Flows

The system handles nearby candidate discovery across millions of profiles using geospatial indexing, enforces mutual match detection with strong consistency (no duplicate or missed matches), delivers real-time notifications on match, and maintains scalable read and write paths for feed generation and swipe recording.

Functional Requirements

We extract verbs from the problem statement to identify core operations:

  • "creates a profile" and "updates preferences" → WRITE operation (Profile & Preferences)
  • "discovers nearby candidates" → READ + FILTER operation (Candidate Feed)
  • "swipes right/left" and "detects mutual like" → WRITE + CHECK operation (Swipe & Match)
  • "notifies both users" → PUSH operation (Match Notification)

Each verb maps to a functional requirement. Each requirement builds on the previous, progressively adding components to the architecture.

  1. Users create and update their dating profile (name, bio, photos) and set discovery preferences (age range, gender preference, search radius). The system records the user's current location for geo-based candidate discovery.

  2. Users fetch a paginated feed of candidate profiles that match their discovery preferences (age, gender, distance), excluding profiles they have already swiped on.

  3. Users swipe right (like) or left (pass) on a candidate. The system records the decision idempotently. On a right-swipe, it checks whether the other user has already liked the current user — and if so, creates a match atomically.

  4. When a mutual match is detected, both users receive a real-time notification. The user who just swiped sees the match in the API response; the other user receives a push notification.

Scale Requirements

  • Assume ~75 million daily active users (think: global dating app at scale)
  • Roughly 50 million swipes per day across all users
  • Roughly 5 feed fetches per user per day (roughly 4,300 feed QPS average, peaking at roughly 2-3× during evening hours)
  • Location updates on app open or background refresh (not continuous like ride-hailing)
  • Feed fetch latency target: sub-second at p95

Non-Functional Requirements

We extract adjectives and descriptive phrases from the problem statement to identify quality constraints:

  • "nearby" candidate discovery → Geospatial queries must be fast; the feed service needs a spatial index, not full-table scans on lat/lng columns
  • "mutual" match detection → Strong consistency on the match-creation path; two simultaneous right-swipes must produce exactly one match, not zero or two
  • "real-time" notifications → Low-latency push delivery when a match occurs; users expect to see the match within seconds
  • "scalable" read and write paths → Feed reads far outnumber swipe writes (roughly 375M feed fetches vs 50M swipes per day at the assumed scale); both paths must scale horizontally without coupling

Each adjective becomes a non-functional requirement that constrains design choices. These NFRs imply: a geospatial index for candidate discovery, transactional guarantees on the swipe/match write path, a push notification pipeline for match delivery, and independent scaling of read and write services.

  • Low Latency: Feed fetch under sub-second p95; swipe response near-instant
  • High Availability: Swipe and feed paths must remain available during peak hours (evenings, weekends)
  • Strong Consistency: Match creation must be atomic — no duplicate matches, no missed matches when both users swipe right
  • Horizontal Scalability: Feed reads and swipe writes grow at different rates and must scale independently

Data Model

The data model is derived from extracting nouns in the problem statement:

  • "profile" and "preferences" → Profile entity with preference fields (gender, age range, radius)
  • "location" → Profile fields for lat/lng and geohash for spatial indexing
  • "swipe" and "decision" → Swipe entity recording each like/pass action
  • "match" and "mutual like" → Match entity with canonical user ordering to prevent duplicates

The Profile Service owns user profiles and location data. The Swipe Service owns both Swipe and Match records, since match creation happens atomically within the swipe transaction.

Profile

Core entity representing a user's dating profile. Contains personal info, photos, discovery preferences, and current location. The geohash field enables spatial indexing for nearby search.

Swipe

Records a user's decision (like or pass) on a candidate. The unique constraint on (from_user_id, to_user_id) ensures idempotent writes. Used for match detection and "already swiped" filtering.

Match

Represents a mutual like between two users. Created atomically when the system detects that both users have swiped right on each other. The canonical ordering (user_a_id < user_b_id) prevents duplicate match records.

Profile and Swipe have a one-to-many relationship. A user creates many swipes over time. Each swipe references a from_user_id and to_user_id, both linking to Profile.

Swipe and Match have a many-to-one relationship. A match is created when two complementary swipes exist (A→B LIKE and B→A LIKE). The match links back to both users' profiles.

Data Model Diagram

API Endpoints

We derive API endpoints directly from the functional requirements (the verbs identified earlier):

  • CREATE operation: "creates a profile" → POST /profiles (initial profile setup)
  • UPDATE operation: "updates preferences" → PATCH /profiles/me (modify profile and preferences)
  • UPDATE operation: "updates location" → POST /location (report current coordinates)
  • READ operation: "discovers nearby candidates" → GET /feed (fetch swipe candidates)
  • CREATE operation: "swipes right/left" → POST /swipes/{targetUserId} (record decision)
  • READ operation: "match list" → GET /matches (list mutual matches)
  • READ operation: "notifications" → GET /notifications (list recent notifications)

Each endpoint maps to a core user action. The feed endpoint is the highest-traffic read path; the swipe endpoint is the primary write path.

POST
/profiles

Create a new user profile with photos, bio, and discovery preferences.

PATCH
/profiles/me

Update profile fields or discovery preferences. Partial updates supported.

POST
/location

Update the user's current location. Called on app open or background refresh. The server computes the geohash for indexing.

GET
/feed?cursor={cursor}

Fetch a paginated list of candidate profiles matching the user's preferences and location. Returns profiles the user has not yet swiped on.

POST
/swipes/{targetUserId}

Record a swipe decision on a candidate. Returns whether a match was created.

GET
/matches?cursor={cursor}

List the user's matches in reverse chronological order.

GET
/notifications?cursor={cursor}

List recent notifications (matches, likes received as a premium feature, etc.).

High Level Design

1. Profile & Preferences

Users create and update their dating profile (name, bio, photos) and set discovery preferences (age range, gender preference, search radius). The system records the user's current location for geo-based candidate discovery.

A user downloads the app, signs up, and fills in their profile: name, photos, a short bio, birth date, and gender. Then they set discovery preferences: "Show me women, ages 22-32, within 25km." Finally, the app reports the user's current GPS coordinates so the system knows where to search for candidates.

The Problem

Profile data is straightforward CRUD — the interesting design decision is how to handle location. Unlike ride-hailing where drivers stream GPS every few seconds, dating app users update location far less frequently. The app reports coordinates when the user opens it or during periodic background refreshes. The system must store this location in a way that supports fast nearby-user queries later in the feed.

Step 1: Profile Service and Storage

The Profile Service handles all profile operations: create, update, and read. It stores profiles in a relational database with standard indexes on user_id.

Profile photos are stored in object storage (a CDN serves them to mobile clients). The Profile DB stores only the photo URLs, not the binary data. This keeps the database lean — each profile row stores text fields and metadata (name, bio, preferences JSON, photo URLs, coordinates), on the order of a few KB.

Step 2: Location Update and Geohash Indexing

When the app sends a location update (POST /location), the Profile Service does two things:

  1. Store the raw coordinates — Update location_lat and location_lng on the user's profile record
  2. Compute and store the geohash — Convert (lat, lng) to a geohash string and update the geohash field

The geohash is the key to fast nearby queries. It converts 2D coordinates into a 1D string where nearby locations share a common prefix. A geohash at precision 6 covers roughly 1km × 1km cells. When the Feed Service later needs "users within 25km," it queries by geohash prefix rather than calculating distance against every profile.

(For the encoding algorithm, see Geohash.)
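To make the prefix property concrete, here is a from-scratch encoder sketch in Python (production systems would use an existing geohash library; the coordinates below are illustrative Manhattan and London points):

```python
# Minimal geohash encoder: alternately bisect the longitude and latitude
# ranges, then map each 5-bit group to a base32 character.
# Nearby points end up sharing a common prefix.
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat: float, lng: float, precision: int = 6) -> str:
    lat_range = [-90.0, 90.0]
    lng_range = [-180.0, 180.0]
    bits = []
    use_lng = True  # a geohash starts with a longitude bit
    while len(bits) < precision * 5:
        rng = lng_range if use_lng else lat_range
        val = lng if use_lng else lat
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid
        else:
            bits.append(0)
            rng[1] = mid
        use_lng = not use_lng
    return "".join(
        BASE32[int("".join(map(str, bits[i:i + 5])), 2)]
        for i in range(0, precision * 5, 5)
    )

# Two midtown-Manhattan points land in the same coarse cells; London does not.
times_square = geohash_encode(40.7580, -73.9855)   # → "dr5ru7"
empire_state = geohash_encode(40.7484, -73.9857)   # shares the "dr5ru" prefix
london       = geohash_encode(51.5074, -0.1278)    # different top-level cell
```

A range query for "nearby users" then becomes a prefix lookup: all users whose geohash starts with the viewer's precision-N prefix are in (or near) the same cell.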

The Profile Service also updates the geospatial index — an in-memory store keyed by geohash cell, where each cell contains a set of user_ids located in that area. This index is separate from the Profile DB and optimized for spatial lookups.

Step 3: Preference Storage

Preferences (age range, gender, radius) are stored as columns on the Profile record. When a user changes their preferences, the Profile Service updates the row. These preferences are read by the Feed Service during candidate generation — we cover that in the next requirement.

Profile & Preferences Architecture

The architecture at this stage: the User App sends requests through the API Gateway to the Profile Service. The Profile Service writes user data to the Profile DB and updates location in the Geo Index (in-memory). Photo binaries are stored in Object Storage and served through a CDN. This is the foundation — two services, two stores — that the feed and swipe paths build on.

2. Candidate Feed

Users fetch a paginated feed of candidate profiles that match their discovery preferences (age, gender, distance), excluding profiles they have already swiped on.

The user opens the app and starts swiping. Behind the scenes, the app calls GET /feed and receives a batch of candidate profiles — each matching the user's preferences, within their search radius, and not yet swiped on. This feed is the core product experience, so it must load fast and return relevant candidates.

The Problem

Consider a user in Manhattan with preferences "women, ages 22-32, within 10km." The system has 75 million profiles. A naive approach queries the Profile DB directly:

SELECT * FROM profiles
WHERE gender = 'FEMALE'
  AND age BETWEEN 22 AND 32
  AND distance(lat, lng, user_lat, user_lng) < 10
  AND user_id NOT IN (SELECT to_user_id FROM swipes WHERE from_user_id = ?)

This query has two problems. First, the distance calculation runs against every matching row — there is no way to index a radius efficiently with a standard B-tree. Second, the NOT IN subquery against the swipe table grows with the user's swipe history. A power user who has swiped 50,000 times creates a massive exclusion list.

Step 1: Geo-Filtered Candidate Retrieval

Instead of scanning all profiles, the Feed Service queries the Geo Index (populated by the Profile Service in FR1). The retrieval pipeline:

  1. Compute search cells — Take the user's geohash and their preferred radius. At precision 5 (roughly 5km × 5km cells), a 10km radius covers roughly 9-25 cells depending on position relative to cell boundaries
  2. Fetch user_ids from those cells — The Geo Index returns all user_ids in the matching cells
  3. Filter by preferences — For each candidate, check gender and age against the requesting user's preferences. Also apply bi-directional filtering: the candidate's preferences must also include the requesting user (if candidate wants men aged 25-35 and requesting user is a 40-year-old woman, skip). This is a key product requirement — candidates should only appear if both users could potentially match
  4. Exclude already-swiped — Remove any candidate the user has already swiped on (we cover the scaling challenge of this step in the Deep Dive on "Already-Swiped Filtering")

Feed Pipeline
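The preference-filtering steps above can be sketched in Python. This is an illustrative in-memory version; the field names (pref_genders, pref_age_min, and so on) are assumptions, not the actual schema:

```python
from dataclasses import dataclass

@dataclass
class Profile:
    user_id: str
    age: int
    gender: str
    pref_genders: set      # genders this user wants to see
    pref_age_min: int
    pref_age_max: int

def mutually_compatible(viewer: Profile, cand: Profile) -> bool:
    # Forward: the candidate fits the viewer's preferences...
    forward = (cand.gender in viewer.pref_genders
               and viewer.pref_age_min <= cand.age <= viewer.pref_age_max)
    # ...and backward: the viewer also fits the candidate's preferences.
    backward = (viewer.gender in cand.pref_genders
                and cand.pref_age_min <= viewer.age <= cand.pref_age_max)
    return forward and backward

def candidate_ids(viewer, nearby_ids, profiles, already_swiped):
    """Steps 2-4: take geo-index hits, apply bi-directional and swipe filters."""
    return [uid for uid in nearby_ids
            if uid != viewer.user_id
            and uid not in already_swiped
            and mutually_compatible(viewer, profiles[uid])]
```

The bi-directional check is the part that is easy to miss: a one-way filter would show the viewer profiles that could never swipe back.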

Step 2: Ranking

After filtering, the Feed Service ranks the remaining candidates. A simple approach sorts by last_active descending — recently active users appear first, since they are more likely to swipe back. Production systems use ML-based ranking that considers factors like profile completeness, historical match rates, and mutual interest signals. For this design, we treat the ranker as a black box that takes a candidate list and returns a scored, ordered list.

Step 3: Pagination with Cursor

The feed returns a page of roughly 20 candidates with a cursor for the next page. The cursor encodes the user's position in the candidate list (e.g., an offset or the last-seen candidate ID). When the user swipes through all 20, the app fetches the next page.

To avoid re-fetching the same candidates on each page, the Feed Service can cache the full candidate list for a session. This feed cache stores the pre-filtered, ranked list keyed by user_id with a short TTL (roughly 5-10 minutes). Page requests read from the cache. If the cache expires or is missing, the service runs the full pipeline again.
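A sketch of the cursor and session-cache mechanics (names are illustrative; the cursor here is a base64-encoded offset, the simplest of the options mentioned above):

```python
import base64
import json
import time

PAGE_SIZE = 20
CACHE_TTL_SECONDS = 600          # roughly 10 minutes
_feed_cache = {}                 # user_id -> {"built_at": ts, "candidates": [...]}

def _encode_cursor(offset: int) -> str:
    return base64.urlsafe_b64encode(json.dumps({"o": offset}).encode()).decode()

def _decode_cursor(cursor) -> int:
    if not cursor:
        return 0
    return json.loads(base64.urlsafe_b64decode(cursor))["o"]

def get_feed(user_id, cursor, run_pipeline):
    """Serve one page from the session cache, rebuilding it when stale."""
    entry = _feed_cache.get(user_id)
    if entry is None or time.time() - entry["built_at"] > CACHE_TTL_SECONDS:
        entry = {"built_at": time.time(), "candidates": run_pipeline(user_id)}
        _feed_cache[user_id] = entry
    offset = _decode_cursor(cursor)
    page = entry["candidates"][offset:offset + PAGE_SIZE]
    more = offset + PAGE_SIZE < len(entry["candidates"])
    return page, (_encode_cursor(offset + PAGE_SIZE) if more else None)
```

Because the cursor only points into the cached list, a cache miss mid-session simply rebuilds the list and restarts pagination, which is an acceptable UX for a swipe feed.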

Handling Preference Changes

When a user changes their preferences (wider radius, different age range), the feed cache must be invalidated. The Profile Service publishes a preference-change event, and the Feed Service evicts the cached candidate list. The next feed request triggers a fresh pipeline run.

Candidate Feed Architecture

The architecture now adds three components. The Feed Service receives feed requests and queries the Geo Index for nearby user_ids, filters by preferences using profile data, excludes already-swiped users by checking the Swipe DB, ranks candidates, and stores the result in a Feed Cache for pagination. All previous components (API Gateway, Profile Service, Profile DB, Object Storage) remain unchanged.

3. Swipe & Match

Users swipe right (like) or left (pass) on a candidate. The system records the decision idempotently. On a right-swipe, it checks whether the other user has already liked the current user — and if so, creates a match atomically.

So far we have profiles with location data and a feed pipeline that delivers filtered candidates. Now we need the core interaction: the user sees a candidate, swipes right or left, and the system records that decision. The critical challenge here is not recording the swipe itself — it is detecting and creating a mutual match correctly.

The Problem

Alice swipes right on Bob. The system needs to:

  1. Record Alice's swipe (LIKE on Bob)
  2. Check if Bob has already swiped right on Alice
  3. If yes, create a Match between Alice and Bob

This seems simple as a sequence, but consider what happens when Alice and Bob swipe right on each other at the same time. Two concurrent requests each execute steps 1-3. If both check for the inverse swipe before either writes, both see "no inverse swipe" and no match is created. Two mutual likes, zero matches — a missed match that neither user ever knows about.

The reverse is also possible: both requests see the other's write and both create a match — producing a duplicate. Either failure breaks the core product promise.

Race Condition Problem

Step 1: Record the Swipe

The Swipe Service receives POST /swipes/{targetUserId} with the decision (LIKE or PASS). It writes a row to the Swipe DB:

INSERT INTO swipes (swipe_id, from_user_id, to_user_id, decision, created_at)
VALUES (gen_id(), 'alice', 'bob', 'LIKE', now())

A unique constraint on (from_user_id, to_user_id) makes this idempotent. If Alice's app retries due to a network timeout, the second insert is a no-op (or an upsert). This prevents duplicate swipe records.

For PASS swipes, the write completes here. No match check is needed.
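The idempotent write can be demonstrated with SQLite (INSERT OR IGNORE is SQLite's spelling; PostgreSQL uses INSERT ... ON CONFLICT (from_user_id, to_user_id) DO NOTHING). Table and column names follow the data model above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE swipes (
        from_user_id TEXT NOT NULL,
        to_user_id   TEXT NOT NULL,
        decision     TEXT NOT NULL CHECK (decision IN ('LIKE', 'PASS')),
        PRIMARY KEY (from_user_id, to_user_id)   -- the idempotency key
    )
""")

def record_swipe(from_id: str, to_id: str, decision: str) -> None:
    # A retried request hits the primary key and becomes a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO swipes (from_user_id, to_user_id, decision) "
        "VALUES (?, ?, ?)",
        (from_id, to_id, decision),
    )
    conn.commit()

record_swipe("alice", "bob", "LIKE")
record_swipe("alice", "bob", "LIKE")   # network retry: silently ignored
```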

Step 2: Check for Mutual Like

On a LIKE swipe, the Swipe Service checks for the inverse:

SELECT decision FROM swipes
WHERE from_user_id = 'bob' AND to_user_id = 'alice'

If Bob has already swiped LIKE on Alice, we have a mutual like and need to create a match. If Bob hasn't swiped yet (no row) or swiped PASS, no match.

Step 3: Atomic Match Creation

To avoid the race condition described above, the swipe insert and match check must be serialized for a given pair. A plain READ COMMITTED transaction does not prevent missed matches — two concurrent transactions can each miss the other's uncommitted write and both finish without creating a match. The fix requires stronger isolation:

  1. Acquire a deterministic lock on the canonical user pair (e.g., SELECT ... FOR UPDATE on a row keyed by (min(alice, bob), max(alice, bob)), or use an advisory lock)
  2. Insert the swipe row
  3. Query the inverse swipe — now guaranteed to see any previously committed swipe for this pair
  4. If mutual like, insert a Match row with canonical user ordering (user_a_id < user_b_id)
  5. Commit the transaction

The deterministic lock serializes concurrent swipes for the same pair. The unique constraint on (user_a_id, user_b_id) in the Match table is a safety net against duplicates, but the lock is what prevents missed matches.

(The Deep Dive on "Swipe Consistency" explores how to handle this at scale with sharding.)

Swipe and Match Detection Flow

Step 4: Publish Match Event

If a match is created, the Swipe Service publishes a match.created event to an event queue. This event contains both user IDs and the match ID. Downstream consumers (notification service, analytics) subscribe to this event. Using an event queue decouples match creation from notification delivery — the swipe response returns immediately without waiting for push notifications to be sent.

The swipe API response tells the caller whether a match occurred:

{ "swipe_id": "swp_xyz789", "match": { "match_id": "mtc_abc123", "matched_user": { ... } } }

If no match, the match field is null. The swiping user sees the match animation instantly from the API response, while the other user gets notified asynchronously (covered in FR4).

Swipe & Match Architecture

The architecture now adds three components. The Swipe Service handles swipe writes and match detection, storing records in the Swipe/Match DB. On match creation, it publishes events to the Event Queue for downstream consumers. The Feed Service's already-swiped filter reads from the same Swipe DB. All previous components remain unchanged.

4. Match Notification

When a mutual match is detected, both users receive a real-time notification. The user who just swiped sees the match in the API response; the other user receives a push notification.

So far we have profile creation, a feed pipeline for candidate discovery, and a swipe path that detects mutual matches. The final piece: when a match occurs, the other user (who swiped right earlier and is no longer in the app's swipe flow) needs to know about it. This is the notification fanout.

The Problem

When Alice swipes right on Bob and a match is created, Alice sees the match instantly in the swipe API response. But Bob swiped right on Alice hours ago and may not have the app open. Bob needs a push notification: "You have a new match with Alice!" If Bob does have the app open, the notification should appear in real-time within the app as well.

Sending notifications synchronously in the swipe request would slow down the swipe response and couple the swipe path to push delivery infrastructure. If the push gateway is slow or down, swipes would fail — unacceptable for the core interaction loop.

Synchronous vs Asynchronous Match Detection

Step 1: Consume Match Events

The Notification Service subscribes to match.created events from the event queue (published by the Swipe Service in FR3). For each event, it determines what notifications to send and to whom.

For a match between Alice and Bob:

  • Alice already knows (she saw the match in the swipe response) — but the Notification Service still records the notification in the notification store for Alice's notification feed
  • Bob needs an active push notification

Step 2: Push Notification Delivery

The Notification Service sends a push notification to Bob through a Push Gateway — a service that interfaces with mobile push providers (APNs for iOS, FCM for Android). The push payload includes:

{ "type": "MATCH", "title": "New Match!", "body": "You and Alice liked each other", "match_id": "mtc_abc123" }

The Push Gateway handles device token management, delivery retries, and platform-specific formatting. If Bob's device is unreachable (phone off, no network), the push provider queues the notification and delivers it when the device reconnects.

Step 3: In-App Notification

If Bob has the app open, the notification should appear in real-time without waiting for the push notification round-trip. Two approaches:

  • WebSocket: a persistent, bidirectional connection between client and server. Flexible, but heavier to operate, and more than this use case needs — the client never pushes data back over the channel
  • Server-Sent Events (SSE): a one-directional, server-to-client stream over plain HTTP. Simpler to run behind standard load balancers and proxies

Our Choice: SSE for in-app delivery. The dating app's real-time needs are one-directional (server pushes match alerts to the client). SSE handles this with less infrastructure complexity than WebSocket. Push notifications via the Push Gateway handle the case where the app is backgrounded or closed.

Idempotent Notification Delivery

Push notifications may be delivered more than once (network retries, event queue redelivery). The Notification Service assigns each notification a unique ID. The client deduplicates by ID, showing each match notification exactly once.
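A minimal client-side sketch of ID-based deduplication (class and field names are illustrative, not the actual client code):

```python
class NotificationInbox:
    """Client-side inbox that shows each notification ID at most once."""

    def __init__(self):
        self._seen_ids = set()
        self.visible = []    # notifications actually shown to the user

    def deliver(self, notification: dict) -> bool:
        nid = notification["id"]
        if nid in self._seen_ids:
            return False     # redelivery (push retry, queue replay): drop silently
        self._seen_ids.add(nid)
        self.visible.append(notification)
        return True

inbox = NotificationInbox()
match_alert = {"id": "ntf_123", "type": "MATCH", "match_id": "mtc_abc123"}
inbox.deliver(match_alert)   # shown
inbox.deliver(match_alert)   # duplicate push: suppressed
```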

Match Notification Architecture

The final architecture adds two components. The Notification Service consumes match events from the Event Queue, writes notification records, and sends push notifications through the Push Gateway to mobile devices. For users with the app open, SSE connections deliver in-app notifications in real-time. This completes the end-to-end flow: profile creation → feed discovery → swipe and match → notification delivery.

Deep Dive Questions

How do you ensure that two users swiping right on each other at the same time produces exactly one match?

Senior

Alice and Bob both have the app open. Alice swipes right on Bob. At the same moment, Bob swipes right on Alice. Two concurrent requests arrive at the Swipe Service. Without careful handling, we get one of two failures:

  • Missed match: Both requests check for the inverse swipe before either writes. Both see "no inverse swipe exists." Both insert their swipe but neither creates a match. Two mutual likes, zero matches.
  • Duplicate match: Both requests insert their swipe, both find the inverse, and both create a match record. Two match records for the same pair.

Recommendation

Use pair-based partitioning with a transactional write. It gives synchronous match detection (the swipe response tells the user immediately) with strong consistency guarantees. The canonical pair key ensures co-location, and the unique constraint on the Match table prevents duplicates. Mention asynchronous match detection (deriving matches from a swipe-event stream) as an alternative if the interviewer asks about simplifying the write path, at the cost of slightly delayed match notification.

Idempotent Swipes

Regardless of approach, the unique constraint on (from_user_id, to_user_id) in the Swipe table makes retries safe. If Alice's app retries a swipe due to a network timeout, the duplicate insert is rejected (or treated as an upsert). The match check runs again but finds the match already exists — another no-op.

How do you optimize feed latency when the geo-filter and preference-filter pipeline is too slow for real-time serving?

Senior

The feed pipeline from FR2 runs on every feed request: query the Geo Index, filter by preferences, exclude already-swiped, rank, return results. At roughly 10-13K feed QPS during peak evening hours (derived from 75M DAU × 5 fetches/day, with a 2-3× peak multiplier), running this full pipeline per request creates significant load on the Geo Index and Swipe DB. The standard remedy is a hybrid: a periodic batch job precomputes each active user's geo+preference candidate superset into the feed cache, leaving only already-swiped filtering and ranking for request time.

Cache Invalidation

When a user changes preferences or location significantly, the batch-generated list becomes stale. Two strategies:

  • Eager invalidation: The Profile Service publishes a preference-change event. The Feed Service evicts the cache entry. The next feed request falls through to the real-time pipeline as a fallback, then the batch job refreshes the cache on its next run.
  • Lazy refresh: The cached list includes a staleness flag. If the user's preferences changed since the cache was built, run the real-time pipeline and update the cache inline.

Eager invalidation is simpler. Preference changes are rare (users set preferences once and adjust infrequently), so the fallback to real-time is triggered rarely.

Recommendation

Use the hybrid approach with batch precomputation for the geo+preference candidate superset and real-time filtering for already-swiped exclusion. This keeps feed latency low (cache read + bloom filter check) while ensuring freshness on the dimension that changes most often (the user's swipe history). Mention the fully real-time approach as a simpler starting point if scale is lower.

How do you efficiently filter out profiles a user has already swiped on when the user has swiped 50,000+ times?

Senior

A power user swipes 200 times per day. After a year, they have 70,000+ swipe records. Every feed request must exclude all of these. Loading 70K user_ids from the database and checking each candidate against this set is expensive in both query time and memory. A bloom filter — a compact probabilistic set that answers "definitely not present" or "possibly present" — shrinks this membership check to a few bits per swipe; its false positives (wrongly excluding an unswiped candidate) merely hide a few profiles, an acceptable trade.

Bloom Filter Lifecycle

  1. Build: When a user becomes active, load their swipe history from the Swipe DB and build a bloom filter. Store it in the feed cache keyed by user_id.
  2. Update: On each new swipe, add the target user_id to the bloom filter (bloom filters support incremental insertion).
  3. Expire: Set a TTL matching the user's session length. When the user goes inactive, the bloom filter expires. Rebuild on next session start.
  4. Rebuild: Bloom filters can't remove entries (a user who "rewinds" a swipe in a premium feature would need a rebuild). Periodic rebuilds from the Swipe DB keep the filter accurate.
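A from-scratch bloom filter sketch (real systems would typically use Redis bitmaps or a library; the sizing constants here are illustrative):

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 7):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return ((h1 + i * h2) % self.size for i in range(self.k))

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False -> definitely never swiped; True -> probably swiped (exclude).
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

# Build from swipe history (step 1), then update incrementally (step 2).
seen = BloomFilter()
for target in ["user_17", "user_42"]:
    seen.add(target)
candidates = [c for c in ["user_17", "user_99"] if not seen.might_contain(c)]
```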

Feed Service Integration

During feed serving, the Feed Service reads the candidate list from cache (the geo+preference superset from DD2), then checks each candidate against the user's bloom filter. Candidates that "probably exist" in the filter are excluded. The remaining candidates are ranked and returned.

Bloom filter checks are O(1) per candidate — a handful of hash computations and memory lookups, taking microseconds each. For a batch of 500 candidates, the total filter time is negligible compared to any network or database call.

Recommendation

Use bloom filters cached per active user. They reduce storage by 10-12× compared to raw sets and provide O(1) membership checks with an acceptable false positive rate. The Swipe DB remains the source of truth; the bloom filter is a read-through cache for fast exclusion. Mention the database subquery approach as the starting point for low-scale systems.

How do you design the geospatial index for nearby profile discovery, given that location updates are infrequent compared to ride-hailing?

Senior

Unlike ride-hailing where drivers stream GPS every 4 seconds (roughly 500K writes/sec), dating app users update location only on app open — roughly 5 times per day. That is roughly 4,300 location writes per second at the 75M DAU scale. The read pattern is also different: ride-hailing needs "find nearest driver right now" with sub-second freshness; dating apps need "find all users within 25km" with freshness measured in minutes.

This changes the design constraints. Write throughput is moderate, not extreme. Read queries are broad (large radius, many results) rather than point queries (nearest single driver). Staleness of minutes is acceptable, since users don't move rapidly between swipe sessions.

Geohash Precision Selection

The user's search radius determines the geohash precision:

  • 5km radius → precision 5 (roughly 4.9km × 4.9km cells), search 9 cells (center + 8 neighbors)
  • 25km radius → precision 4 (roughly 39km × 20km cells), 9 cells cover roughly 117km × 59km
  • 50km radius → precision 3 (roughly 156km × 156km cells), often a single cell plus a few neighbors, at the cost of much broader results

The Feed Service selects the precision level based on the user's preference_radius_km and queries the appropriate cells. Coarser precision returns more candidates but requires more distance-based filtering afterward.

(For geohash encoding and the 9-cell query pattern, see Geohash.)

Index Structure

The Geo Index is an in-memory store where each key is a geohash cell and each value is a set of user_ids in that cell. When a user updates their location:

  1. Compute the new geohash at multiple precision levels (3, 4, 5, 6)
  2. If the geohash changed from the previous value, remove the user from the old cell and add to the new cell
  3. Most location updates don't change the geohash — the user is still in the same cell. These are no-ops on the index.

Storing at multiple precision levels avoids re-querying: a 25km search reads precision-4 cells directly, a 5km search reads precision-5 cells.
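The multi-precision update logic can be sketched with an in-memory dict-of-sets standing in for the real store (structure assumed from the description above; thresholds in precision_for_radius are illustrative):

```python
from collections import defaultdict

PRECISIONS = (3, 4, 5, 6)

class GeoIndex:
    def __init__(self):
        # one cell -> user_ids map per precision level
        self._cells = {p: defaultdict(set) for p in PRECISIONS}
        self._current = {}   # user_id -> last stored precision-6 geohash

    def update_location(self, user_id: str, geohash6: str) -> None:
        old = self._current.get(user_id)
        if old == geohash6:
            return           # same cell at every level: no-op (the common case)
        for p in PRECISIONS:
            if old is not None:
                self._cells[p][old[:p]].discard(user_id)
            self._cells[p][geohash6[:p]].add(user_id)
        self._current[user_id] = geohash6

    def users_in_cell(self, prefix: str):
        return self._cells[len(prefix)][prefix]

def precision_for_radius(radius_km: float) -> int:
    # coarser cells for wider searches
    if radius_km <= 5:
        return 5
    if radius_km <= 25:
        return 4
    return 3
```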

Density Variation

Manhattan might have 50,000 users per precision-5 cell. Rural Kansas might have 10. Querying a dense cell returns far more candidates than needed, while a sparse cell may need expansion to adjacent cells.

For dense areas, the Feed Service can use a finer precision (precision 6, roughly 1km cells) and query more cells. For sparse areas, it can use a coarser precision (precision 4) to cast a wider net. Selecting precision adaptively, based on the expected density of the user's region, keeps the candidate count per query in a workable range.

Sharding the Geo Index

Each user is stored at 4 precision levels (3, 4, 5, 6), so the index holds roughly 300M entries (75M × 4). At roughly 100 bytes per entry, the raw footprint is roughly 30GB. With hash-set overhead and metadata, expect roughly 50-80GB in practice. This fits on a large-memory instance (128-256GB RAM), but sharding improves availability and read throughput:

  • Shard by geohash prefix — Users in North America (geohash prefix "dr", "dq", etc.) on one shard, Europe on another. Most queries hit a single shard since nearby users share geohash prefixes.
  • Replicate for reads — Feed requests are read-heavy. Read replicas of each shard handle the 13K QPS peak without overwhelming the primary.

Handling Stale Locations

A user who hasn't opened the app in 30 days still has a location in the index. The Feed Service filters these out using the last_active field — profiles inactive beyond a threshold (e.g., 7 days) are excluded from feed results. A background job periodically removes truly stale entries (e.g., 30+ days inactive) from the index to keep memory lean.

Staff-Level Discussion Topics

Staff

The following topics contain open-ended architectural questions without prescriptive solutions. They are designed for staff+ conversations where you demonstrate systems thinking, trade-off analysis, and strategic decision-making.


Level Expectations

The following table summarizes what interviewers typically expect at each seniority level for dating app system design.

| Dimension | Mid-Level (L4) | Senior (L5) | Staff (L6) |
|---|---|---|---|
| Requirements | Identify core FRs: create profile, browse candidates, swipe, detect match. Basic scale math (swipes per day, feed QPS). | Define precise NFRs: strong consistency for match detection, feed latency targets, read/write ratio implications. Identify already-swiped filtering as a scaling challenge. | Challenge requirements: should matching optimize for match rate or engagement? How does recommendation quality affect user retention? What consistency level is actually needed for match detection vs feed freshness? |
| High-Level Design | Draw basic architecture: User App → API Gateway → Profile/Feed/Swipe Services → Database. Separate read (feed) from write (swipe) paths. | Geospatial index for candidate discovery. Atomic match detection with pair-based partitioning. Feed caching with invalidation. Event-driven notification pipeline. | Hybrid feed generation (batch precomputation + real-time filtering). Bloom filters for already-swiped exclusion. Cross-region discovery for traveling users. ELO-style ranking for match quality. |
| Geospatial Design | Explain that nearby search needs spatial indexing, not just lat/lng columns. Mention geohash or similar spatial encoding. | Geohash precision selection based on search radius. Multi-precision storage. Bi-directional preference filtering (both users' preferences must match). Stale location handling. | Adaptive precision based on user density. Sharding the geo index by geohash prefix. Cross-region index updates for traveling users. Cache-friendly query patterns at different precision levels. |
| Match Consistency | Understand that mutual likes should create exactly one match. Basic approach: check for inverse swipe, create match if found. | Identify the simultaneous-swipe race condition. Pair-based partitioning for co-located transactions. Unique constraint as safety net. Idempotent swipe writes. | Compare synchronous vs async match detection tradeoffs. Discuss partition strategies at global scale. Analyze failure modes when the event queue is unavailable. |
| Feed Optimization | Describe the filter pipeline: geo → preferences → already-swiped. Mention caching feed results. | Compare real-time vs precomputed vs hybrid feed generation. Bloom filters for already-swiped exclusion. Cache invalidation on preference changes. | Batch precomputation scheduling and resource allocation. Feed quality vs freshness tradeoffs. Candidate pool exhaustion handling in low-density areas. A/B testing feed ranking changes. |

Interview Cheatsheet

Core Architecture in 60 Seconds

"Profile + location → geospatial index." User creates profile with preferences, reports location on app open. Profile Service stores data and updates an in-memory geospatial index keyed by geohash cells.

"Feed → geo filter + preference filter + already-swiped exclusion." Feed Service queries Geo Index for nearby users, filters by both users' preferences (bi-directional), excludes previously swiped profiles, ranks by recency/relevance. Hybrid approach: batch precompute the geo+preference superset, real-time filter already-swiped at serve time.

"Swipe → atomic match detection." Swipe Service records the decision. On right-swipe, checks for inverse in a single transaction. Canonical pair ordering + unique constraint on Match table guarantees exactly one match per pair. Pair-based partitioning co-locates both swipes on the same shard.

"Match → event queue → push notification." Match event published to event queue. Notification Service consumes events, sends push via mobile gateway (APNs/FCM). In-app delivery via SSE for active users.

Key Trade-offs to Mention

  • Feed generation: real-time (fresh, expensive) vs precomputed (fast, stale) vs hybrid (recommended)
  • Already-swiped filtering: DB subquery (simple, slow) vs bloom filter (fast, false positives)
  • Match detection: synchronous in swipe path (instant feedback) vs async event (simpler consistency)
  • Notification: SSE (low latency, connection overhead) vs polling (simple, wasteful)
  • Geo index: single node (simple) vs sharded by prefix (scalable, cross-shard queries at boundaries)
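To make the bloom-filter trade-off concrete, here is a minimal sketch of already-swiped exclusion; sizes and hash count are illustrative assumptions, not tuned values. A false positive only hides a candidate the user could have seen, which is an acceptable failure mode for a feed.

```python
import hashlib

class SwipedBloomFilter:
    """Minimal bloom filter for already-swiped exclusion (illustrative sizing).

    k independent hash positions are derived by salting SHA-256; membership
    tests can return false positives but never false negatives.
    """
    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 5):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        for salt in range(self.k):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

The Feed Service keys entries as `f"{swiper_id}:{target_id}"`, checks the filter at serve time, and falls back to the swipes table only on rebuild.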
