
Design Tinder

Overview

Tinder lets users browse nearby profiles and swipe right (like) or left (pass). When two users like each other, they match and can chat. This design focuses on the matching backend: recording swipes, detecting matches, and generating the feed of potential matches.

The core technical challenges are data modeling (how to store swipes and matches), feed generation (efficiently finding candidates from millions of users), and scaling writes (handling high swipe volumes with skewed popularity).

This article walks through the design in layers: first the user flows, then the backend operations, then the data model choices, and finally scaling strategies.

Functional Requirements

Four user flows define what the system must support.

Core Flows

Swipe Feed. The user opens the app and sees a stack of profiles. Each profile represents a potential match. The feed must exclude people the user has already swiped on and people already matched.

Record Swipe. The user swipes right (like) or left (pass). The backend records this decision. On a right swipe, the system checks for a reverse swipe. If one exists, the two users match.

Match Detection. When Alice swipes right on Bob, the backend checks: did Bob swipe right on Alice? If yes, create a match. Both users see the match immediately.

View Matches. The user views their list of matches. Each entry shows when the match occurred. Matches persist until one user unmatches.

Out of scope: chat messaging, push notifications, and safety/reporting details beyond basic blocking.

The article includes extensions for premium features ("who liked me"), blocking, and multi-region deployment in the Feature Expansion section to show how the initial design accommodates growth.

Non-Functional Requirements

  • Consistency: A match must appear exactly once, never duplicated
  • Availability: Feed and swipe operations should remain responsive under load

High-Level Architecture

How do we go from user flows to a working system? Start with what users do, extract the verbs, then group those verbs into coherent services. This bottom-up approach ensures the architecture serves real needs rather than theoretical patterns.

From Flows to Operations

Each user action implies backend work. Walk through each flow, find the verbs, and name the operations.

Flow              Verb                        Operation
Swipe Feed        "show profiles not seen"    get_swipe_candidates(user_id)
Record Swipe      "record this decision"      record_swipe(swiper, swipee, liked)
Record Swipe      "check for reverse swipe"   has_swiped(Bob, Alice)
Record Swipe      "create match if mutual"    create_match(user_a, user_b)
Match Detection   "did Bob swipe right?"      is_match(Alice, Bob)
View Matches      "list all matches"          get_matches(user_id)

Grouping into Services

Now cluster operations by what data they touch. Operations that read and write the same tables belong together. This minimizes cross-service calls and keeps data ownership clear.

Domain             Operations
Swipe              record_swipe, has_swiped
Match              create_match, is_match, get_matches
Candidate / Feed   get_swipe_candidates
User / Profile     update_profile, get_profile

Swipe and Match could merge into one service, since match detection happens during swipe handling. However, keeping them separate lets each scale independently: Swipe is write-heavy, taking a write on every user interaction, while Match reads are occasional and can tolerate caching.

Service Definitions

Each service owns specific tables and exposes operations to others.

Service             Operations                             Owns
User Service        update_profile, get_profile            users, profiles, photos
Swipe Service       record_swipe, has_swiped               swipes table
Match Service       create_match, is_match, get_matches    matches table
Candidate Service   get_swipe_candidates                   read-only across tables

The Candidate Service is unusual—it owns no tables. It reads from users, swipes, and matches to build the feed. This read-only pattern lets it use read replicas and aggressive caching without worrying about write consistency.

System Overview

High-Level Architecture

The API gateway routes requests to the appropriate service. Each service owns its data and exposes operations to other services as needed.

Sequence: Swipe Right with Match

When a user swipes right, the request flows through multiple services. The API routes to the Swipe Service, which records the decision and checks for a reverse swipe. If the other user already liked them, the Swipe Service calls the Match Service to create the match. The response returns immediately with the match status.

Swipe Sequence
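
As a rough sketch, the handler below wires that sequence together. The injected service clients, their method names, and the return shape are assumptions for illustration; they are not defined by the design above.

# Hypothetical API-layer handler for the swipe flow. The service clients and
# their method names are assumptions, not a specified interface.
class SwipeHandler:
    def __init__(self, swipe_service, match_service):
        self.swipes = swipe_service
        self.matches = match_service

    def handle_swipe(self, swiper_id, swipee_id, liked):
        # Record the decision (owned by the Swipe Service).
        self.swipes.record_swipe(swiper_id, swipee_id, liked)

        # On a right swipe, look for the reverse swipe.
        if liked and self.swipes.has_swiped(swipee_id, swiper_id, liked=True):
            # Mutual like: the Match Service creates the match.
            match = self.matches.create_match(swiper_id, swipee_id)
            return {"matched": True, "match_id": match.id}

        return {"matched": False}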

Sequence: Get Feed

Opening the app triggers a feed request. The Candidate Service queries the database for users matching location and preferences, then filters out already-swiped and already-matched users. It returns candidate IDs, fetches full profiles from the User Service, and sends the batch back to the client.

Feed Sequence
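
A similarly hedged sketch of the feed path, again with assumed client interfaces:

# Hypothetical feed handler: resolve candidate IDs, then hydrate profiles.
# The service clients and method names are assumptions for illustration.
class FeedHandler:
    def __init__(self, candidate_service, user_service):
        self.candidates = candidate_service
        self.users = user_service

    def get_feed(self, user_id, limit=20):
        # Candidate Service applies location/preference filters and removes
        # already-swiped and already-matched users, returning IDs only.
        candidate_ids = self.candidates.get_swipe_candidates(user_id)

        # Hydrate IDs into full profiles via the User Service.
        return [self.users.get_profile(cid) for cid in candidate_ids[:limit]]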

Data Model

The system needs three categories of data: user profiles, swipe decisions, and matches.

Users and photos are straightforward. A users table stores profile information. A photos table holds image references with ordering.

CREATE TABLE users (
    id          BIGINT PRIMARY KEY,
    name        VARCHAR(100),
    age         INT,
    gender      VARCHAR(20),
    bio         TEXT,
    city        VARCHAR(100),
    last_active TIMESTAMP
);

CREATE TABLE photos (
    id         BIGINT PRIMARY KEY,
    user_id    BIGINT REFERENCES users(id),
    url        VARCHAR(500),
    position   INT,                          -- display order
    created_at TIMESTAMP DEFAULT NOW()
);

These tables are standard. The interesting design decisions lie in how we store swipes and matches.

Core Tables: Swipes and Matches

The core data is simple: who swiped on whom, and what they decided. The question is how to structure this relationship.

Consider Alice and Bob. Alice swipes right on Bob. Later, Bob swipes right on Alice. How do we represent this?

One perspective: these are two separate actions. Alice made a decision about Bob. Bob made a decision about Alice. Each action deserves its own record. This leads to Directed Swipes - one row per swipe, direction matters.

Another perspective: there's one relationship between Alice and Bob. Both users contribute to this relationship over time. The relationship has a state (no swipes, one-sided, mutual). This leads to Pair Table - one row per unique pair of users.

Neither is objectively correct. Each optimizes for different access patterns.
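
For concreteness, here is a minimal sketch of the Pair Table variant in the same Python-plus-SQL style used elsewhere in this article. The table name, columns, UPSERT shape, and the db.query_one helper are assumptions; the point is simply that both users' decisions live in one row.

# Hypothetical Pair Table sketch: one row per unordered pair of users.
#
# CREATE TABLE pair_swipes (
#     user_a     BIGINT NOT NULL,     -- smaller ID
#     user_b     BIGINT NOT NULL,     -- larger ID
#     a_liked_b  BOOLEAN,
#     b_liked_a  BOOLEAN,
#     PRIMARY KEY (user_a, user_b),
#     CHECK (user_a < user_b)
# );

def record_swipe_pair(swiper_id, swipee_id, liked):
    user_a, user_b = min(swiper_id, swipee_id), max(swiper_id, swipee_id)
    # Which column does this swipe set?
    column = "a_liked_b" if swiper_id == user_a else "b_liked_a"

    row = db.query_one(   # hypothetical helper returning a single row
        f"""
        INSERT INTO pair_swipes (user_a, user_b, {column})
        VALUES (?, ?, ?)
        ON CONFLICT (user_a, user_b)
        DO UPDATE SET {column} = EXCLUDED.{column}
        RETURNING a_liked_b, b_liked_a
        """,
        user_a, user_b, liked,
    )
    # Match state lives in the same row: a mutual like means a match.
    return bool(row.a_liked_b and row.b_liked_a)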

Data Model Comparison

Directed Swipes Model

Each swipe gets its own row. Direction matters: Alice→Bob and Bob→Alice are separate rows.

CREATE TABLE swipes (
    swiper_id  BIGINT NOT NULL,
    swipee_id  BIGINT NOT NULL,
    liked      BOOLEAN NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (swiper_id, swipee_id)
);

-- Index for match detection: "Did Bob swipe right on Alice?"
CREATE INDEX idx_swipee_liked ON swipes(swipee_id, liked);

CREATE TABLE matches (
    user_a     BIGINT NOT NULL,
    user_b     BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_a, user_b),
    CHECK (user_a < user_b)                  -- canonical ordering
);

-- Index for "get my matches" query
CREATE INDEX idx_matches_user_b ON matches(user_b);

The CHECK (user_a < user_b) constraint ensures each match is stored in one canonical order: Alice-Bob and Bob-Alice both store as (Alice, Bob), where Alice's ID is smaller.
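
In application code, that means normalizing the pair before every write. A minimal helper (an assumption, not spelled out above, including the db.execute call) might look like:

# Hypothetical helper: write matches in canonical order so that
# (Alice, Bob) and (Bob, Alice) map to the same row.
def canonical_pair(id1, id2):
    return (id1, id2) if id1 < id2 else (id2, id1)

def insert_match(db, id1, id2):
    user_a, user_b = canonical_pair(id1, id2)
    db.execute(
        "INSERT INTO matches (user_a, user_b) VALUES (?, ?) "
        "ON CONFLICT (user_a, user_b) DO NOTHING",   # idempotent on retries
        user_a, user_b,
    )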

The choice between models ripples through match detection, feed queries, premium features, and sharding strategy. Understanding these effects is critical to choosing the right model.

Model Comparison

With both models defined, here's the complete comparison; the deep dives later in the article show where each row comes from:

Aspect             Directed Swipes           Pair Table
Writing a swipe    INSERT one row            UPSERT, determine which column
Match check        Query separate row        Already in same row
Feed query         Single column scan        OR across two columns
"Who liked me"     Single column scan        OR across two columns
Sharding (feed)    Local                     Scattered
Sharding (match)   Cross-shard               Local
Storage            2 rows per mutual swipe   1 row per pair

The right choice depends on your access patterns. Neither model is universally better. Directed Swipes optimize for per-user queries: feed filtering, "who liked me", and user history. Pair Table optimizes for per-pair queries: match state and relationship invariants.

For most dating apps, Directed Swipes fits better. Feed queries run on every app open. Match checks run once per swipe. Per-user operations dominate. But if your app centers on relationship state (collaborative features, mutual visibility controls), Pair Table may be worth the per-user query complexity.

Deep Dive: Feed Generation

The feed is the heaviest read path. Every app open triggers a feed request. What must the feed guarantee?

Feed Semantics

For user U, valid candidates V must satisfy:

Include criteria:

  • Location constraint (near U)
  • U's preferences (age, gender, etc.)
  • Optionally mutual compatibility (orientation matches)

Exclude criteria:

  • U themselves
  • Users already matched with U
  • Users U already swiped on
  • Blocked or banned users

Expressed as set operations:

Candidates(U) = EligibleProfiles
                ∩ Near(U)
                ∩ PrefsCompatible(U)
                - {U}
                - Matches(U)
                - SwipedBy(U)
                - Blocked(U)
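
The same semantics, written as Python set algebra purely for illustration (the inputs are assumed to be sets of user IDs computed elsewhere):

# Illustrative only: feed semantics as set operations over user IDs.
def candidates(u, eligible, near_u, prefs_ok, matches_u, swiped_by_u, blocked_u):
    return (
        (eligible & near_u & prefs_ok)
        - {u}
        - matches_u
        - swiped_by_u
        - blocked_u
    )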

Basic Implementation

The naïve approach queries the users table with all filters:

SELECT u.id, u.name, u.age, u.bio
FROM users u
WHERE u.id != :user_id
  AND u.city = :city
  AND u.age BETWEEN :min_age AND :max_age
  AND u.gender = :preferred_gender
  AND NOT EXISTS (SELECT 1 FROM swipes
                  WHERE swiper_id = :user_id AND swipee_id = u.id)
  AND NOT EXISTS (SELECT 1 FROM matches
                  WHERE (user_a = :user_id AND user_b = u.id)
                     OR (user_a = u.id AND user_b = :user_id))
  AND NOT EXISTS (SELECT 1 FROM blocks WHERE ...)
ORDER BY u.last_active DESC
LIMIT 20;

This works at small scale. At millions of users, the NOT EXISTS subqueries become expensive. Each feed request re-executes the same filtering logic.

The expensive parts (location, preferences) change slowly. The cheap parts (already swiped) change quickly. Separate precomputation from live filtering.

Precomputed Candidate Pools

Offline jobs precompute the expensive filtering. Live requests handle the fast-changing exclusions.

Feed Pipeline

The straightforward approach: build a pool per user.

CREATE TABLE candidate_pools (
    user_id      BIGINT,
    candidate_id BIGINT,
    score        FLOAT,
    generated_at TIMESTAMP,
    PRIMARY KEY (user_id, candidate_id)
);

A batch job runs for each user:

-- Precompute Alice's candidate pool
INSERT INTO candidate_pools (user_id, candidate_id, score, generated_at)
SELECT :alice_id, u.id, compute_score(u, :alice_id), NOW()
FROM users u
WHERE u.city = :alice_city
  AND u.age BETWEEN :alice_min_age AND :alice_max_age
  AND u.gender = :alice_preferred_gender
  AND NOT blocked(:alice_id, u.id);

Feed requests become simple lookups:

def get_feed(user_id):
    candidate_ids = db.query(
        "SELECT candidate_id FROM candidate_pools WHERE user_id = ? LIMIT 1000",
        user_id
    )
    # Filtering and ranking happens next

The storage explosion problem: At 50M daily active users, this creates 50 billion rows.

Metric                               Calculation                Result
Users                                50M DAU                    50,000,000
Candidates per user                  1,000 profiles             1,000
Total rows                           50M × 1,000                50,000,000,000
Storage (8-byte ID + 4-byte score)   50B × 12 bytes             600 GB (just data)
Index size                           Typically 2-3x data size   1.2-1.8 TB

The index doesn't fit in memory. Updates are slow. This doesn't scale.

The key observation: Most users in the same city, age range, and gender preference see nearly identical pools. Alice (NYC, 25-30, seeking males) and Carol (NYC, 25-30, seeking males) get 95% overlapping candidates. Why store this twice?

Shared segment pools deduplicate this redundancy:

CREATE TABLE segment_pools (
    segment_id   VARCHAR,                    -- e.g. 'nyc:male:25-30'
    candidate_id BIGINT,
    score        FLOAT,
    generated_at TIMESTAMP,
    PRIMARY KEY (segment_id, candidate_id)
);

A batch job maintains pools per segment instead of per user:

-- Precompute for segment 'nyc:male:25-30'
INSERT INTO segment_pools (segment_id, candidate_id, score, generated_at)
SELECT 'nyc:male:25-30', u.id, compute_base_score(u), NOW()
FROM users u
WHERE u.city = 'nyc'
  AND u.gender = 'male'
  AND u.age BETWEEN 25 AND 30
  AND u.last_active > NOW() - INTERVAL '30 days';

Feed requests resolve user to segment:

def get_feed(user_id):
    # Resolve user to segment
    user = load_profile(user_id)   # hypothetical profile lookup
    segment = f"{user.city}:{user.gender}:{user.age_bucket}"

    # Pull candidate IDs from shared segment pool
    candidate_ids = db.query(
        "SELECT candidate_id FROM segment_pools WHERE segment_id = ?",
        segment
    )
    # Filtering happens next (covered in sections below)
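
One plausible shape for that live filtering step, using the same hypothetical db helper as the other snippets; a production system would likely serve the exclusion sets from a cache rather than the primary tables.

# Sketch of the live exclusion step (details are assumptions): subtract the
# user's own swipe history and matches from the shared segment pool.
def filter_candidates(user_id, candidate_ids):
    swiped = set(db.query(
        "SELECT swipee_id FROM swipes WHERE swiper_id = ?", user_id))
    matched = set(db.query(
        "SELECT CASE WHEN user_a = ? THEN user_b ELSE user_a END "
        "FROM matches WHERE user_a = ? OR user_b = ?",
        user_id, user_id, user_id))

    return [cid for cid in candidate_ids
            if cid != user_id and cid not in swiped and cid not in matched]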

Storage cost comparison:

Approach         Rows   Storage            Reason
Per-user pools   50B    1.8 TB (indexes)   50M users × 1,000 candidates
Segment pools    10M    360 MB             10,000 segments × 1,000 candidates

Segment pools are 5,000x smaller. Updates are cheaper. The entire index fits in memory.

Trade-off: Segment pools lose personalization. The score is generic (computed without knowing the specific user). This is acceptable for most users. Premium subscribers can pay for per-user pools with personalized scoring, keeping storage tractable by limiting it to roughly 5% of users (2.5M users, about 2.5B rows instead of 50B).

Deep Dive: Scaling Swipe Writes

A dating app with 20M daily active users generates roughly 2 billion swipes per day—23,000 writes per second on average, spiking to 100,000+ during evening peaks. To scale to this volume, we must shard our database.
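
The arithmetic behind those figures, assuming roughly 100 swipes per active user per day and a 4-5x evening peak (both assumptions, not stated above):

# Back-of-envelope estimate; per-user swipe rate and peak factor are assumptions.
dau               = 20_000_000
swipes_per_user   = 100
swipes_per_day    = dau * swipes_per_user        # 2,000,000,000
avg_writes_per_s  = swipes_per_day / 86_400      # ~23,000
peak_writes_per_s = avg_writes_per_s * 4.5       # ~100,000+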

The sharding strategy depends on which data model you chose. Each model has a natural shard key that aligns with its structure.

The Core Tension

Every swipe involves two parties: the swiper (the person acting) and the swipee (the person being seen). Our two critical operations query different dimensions of this relationship.

Feed filtering asks "Who has Alice already seen?" This is a per-user query. To show Alice fresh candidates, we must exclude everyone she's already swiped on. This query runs on every app open—the most frequent read operation in the system.

Match detection asks "Did Bob already like Alice back?" This is a per-pair query. When Alice swipes right on Bob, we check if Bob previously swiped right on Alice. The check runs once per swipe and touches a single row: cheap in isolation, but critical for the instant match reveal.

These two operations pull in opposite directions. One groups data by user; the other groups by pair. We can't optimize both with a single shard key.

Deducing the Shard Key

We must choose one way to split the data.

The Verdict

We shard by swiper ID. Feed filtering must exclude everyone a user has already swiped on, which means reading that user's entire swipe history, and it runs on every app open. A match check touches a single row. Keeping each user's swipes together on one shard makes the heavy feed query local, while the match check becomes a single cross-shard point read.

Optimizing the heaviest read path (feed) at the cost of a network hop on the write path (match check) is the right trade-off. A cross-shard read adds 5-10ms of latency—acceptable when the alternative is scatter-gather across 256 shards on every feed load.

Bonus: writes distribute evenly. Each active user generates roughly similar swipe volumes. A popular user like Bob receives swipes from many different users—Alice on shard 7, Carol on shard 3, Dave on shard 12. Those writes scatter across many shards because each swiper controls their own shard. Bob's popularity doesn't concentrate writes; it distributes them.
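
A sketch of how routing might look under this choice. The shard function, the shard count (256, matching the number above), and the db interface are assumptions:

# Hypothetical shard routing: swipes live on the swiper's shard.
NUM_SHARDS = 256

def shard_for(user_id):
    return user_id % NUM_SHARDS              # placeholder hash

def record_swipe(swiper_id, swipee_id, liked):
    # Write lands on the swiper's shard: one local write per swipe.
    db[shard_for(swiper_id)].insert_swipe(swiper_id, swipee_id, liked)

    if liked:
        # The reverse swipe (Bob -> Alice) lives on Bob's shard:
        # one cross-shard point read per right swipe.
        return db[shard_for(swipee_id)].has_swiped(swipee_id, swiper_id, liked=True)
    return False

def already_swiped(user_id):
    # Feed exclusion stays local: the user's whole swipe history is on one shard.
    return db[shard_for(user_id)].query(
        "SELECT swipee_id FROM swipes WHERE swiper_id = ?", user_id)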

Deep Dive: Race Conditions and Match Consistency

The non-functional requirements specify "A match must appear exactly once, never duplicated." This constraint seems straightforward until you consider concurrent swipes. What happens when Alice and Bob swipe right on each other at the exact same moment?

The answer depends entirely on which data model you chose.

Directed Swipes Model: Race Condition Exists

Concurrent swipes create a race condition you must handle.

def record_swipe(swiper_id, swipee_id, liked):
    with db.transaction():                   # All in one transaction
        db.insert_swipe(swiper_id, swipee_id, liked)
        if liked:
            reverse = db.query(
                "SELECT 1 FROM swipes WHERE swiper_id=? AND swipee_id=? AND liked=TRUE",
                swipee_id, swiper_id
            )
            if reverse:
                db.insert_match(swiper_id, swipee_id)

The race condition: When Alice and Bob swipe simultaneously, both transactions' SELECT statements can execute before either transaction's INSERT has committed. Under READ COMMITTED isolation (the default), uncommitted changes are invisible to other transactions.

Simultaneous Swipe Race

Timeline of the race:

t=0ms:  Alice's TXN begins: INSERT Alice→Bob (uncommitted)
t=0ms:  Bob's TXN begins: INSERT Bob→Alice (uncommitted)
t=5ms:  Alice's TXN: SELECT Bob→Alice → NO (Bob's INSERT not committed yet)
t=5ms:  Bob's TXN: SELECT Alice→Bob → NO (Alice's INSERT not committed yet)
t=10ms: Alice's TXN commits
t=10ms: Bob's TXN commits
Result: Two swipes recorded, zero matches created

Without handling this race, users experience "ghosted matches" - they liked someone who liked them back, but the system never created the match.

Prerequisite: Database Constraint

Regardless of which solution you choose, add a PRIMARY KEY constraint to prevent duplicate matches:

CREATE TABLE matches (
    user_a     BIGINT NOT NULL,
    user_b     BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_a, user_b),
    CHECK (user_a < user_b)                  -- Canonical ordering
);

The CHECK (user_a < user_b) ensures Alice-Bob and Bob-Alice both map to the same row. This prevents duplicates if the background job runs twice or if both transactions somehow create a match.

Solutions for Directed Swipes Model

Choosing the Right Solution

Use background reconciliation for high-scale systems:

  • Catches missed matches from race conditions
  • No performance impact on write path
  • Requires background job infrastructure

Use two-transaction approach for lower-scale systems:

  • Eliminates the race entirely - no missed matches
  • Simpler - no background jobs needed
  • Trade-off: 2x transaction overhead

Both approaches require the PRIMARY KEY constraint on the matches table (shown in prerequisite section above).
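
Hedged sketches of both options follow, reusing the hypothetical db helper from earlier. Neither is spelled out in the sections above, so treat the details as assumptions rather than the article's reference implementation.

# Option A: background reconciliation. A periodic job finds mutual right
# swipes with no match row; ON CONFLICT keeps repeated runs idempotent.
RECONCILE_SQL = """
INSERT INTO matches (user_a, user_b)
SELECT s1.swiper_id, s1.swipee_id
FROM swipes s1
JOIN swipes s2
  ON s1.swiper_id = s2.swipee_id AND s1.swipee_id = s2.swiper_id
WHERE s1.liked AND s2.liked
  AND s1.swiper_id < s1.swipee_id           -- canonical order, one row per pair
ON CONFLICT (user_a, user_b) DO NOTHING;
"""

# Option B: two transactions. Commit the swipe first, then check for the
# reverse swipe; at least one of two concurrent swipers is guaranteed to
# see the other's committed row, and the PRIMARY KEY absorbs duplicates.
def record_swipe_two_txn(swiper_id, swipee_id, liked):
    with db.transaction():                   # txn 1: record the swipe
        db.insert_swipe(swiper_id, swipee_id, liked)

    if liked:
        with db.transaction():               # txn 2: detect the match
            reverse = db.query(
                "SELECT 1 FROM swipes WHERE swiper_id=? AND swipee_id=? AND liked=TRUE",
                swipee_id, swiper_id)
            if reverse:
                user_a, user_b = sorted((swiper_id, swipee_id))
                db.execute(
                    "INSERT INTO matches (user_a, user_b) VALUES (?, ?) "
                    "ON CONFLICT (user_a, user_b) DO NOTHING",
                    user_a, user_b)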

Deep Dive: Feature Expansion

These extensions test whether your initial design accommodates new requirements. Each reveals different trade-offs.

Summary

The Tinder matching backend solves a deceptively simple problem: detect when two users like each other. The design choices ripple through every layer of the system.

Data Model: Choose between Directed Swipes (two rows per mutual swipe, optimizes per-user queries) and Pair Table (one row per pair, optimizes per-pair queries). The choice affects feed queries, match detection, sharding strategy, and race condition handling.

Feed Generation: Separate slow filtering (location, preferences) from fast filtering (already swiped). Precompute candidate pools offline, apply exclusions live. Lazy hydration fetches IDs before profiles. Bloom Filters compress exclusion checks. Hot users require fairness intervention to prevent feedback loops.

Scaling Swipe Writes: Shard by swiper_id to keep feed queries local. Accept cross-shard match checks as the less frequent operation. Hot swipees create read hot spots—handle with async match detection via Kafka.

Race Conditions: Simultaneous swipes create races in Directed Model. Fix with background reconciliation jobs or two-transaction approach. Pair Model eliminates races via row-level locking on UPSERT.

Feature Expansion: Premium features ("who liked me") reveal directional query requirements. Blocking affects multiple layers. Multi-region deployment uses home region for swipes with async cross-region match detection.

The architecture keeps services focused. Swipe service owns decisions. Match service owns matches. Candidate service owns feed generation. Each scales independently based on its access patterns.
