Overview
Tinder lets users browse nearby profiles and swipe right (like) or left (pass). When two users like each other, they match and can chat. This design focuses on the matching backend: recording swipes, detecting matches, and generating the feed of potential matches.
The core technical challenges are data modeling (how to store swipes and matches), feed generation (efficiently finding candidates from millions of users), and scaling writes (handling high swipe volumes with skewed popularity).
This article walks through the design in layers: first the user flows, then the backend operations, then the data model choices, and finally scaling strategies.
Functional Requirements
Four user flows define what the system must support.
Profile Feed. The user opens the app and sees a stack of profiles. Each profile represents a potential match. The feed must exclude people the user has already swiped on and people already matched.
Record Swipe. The user swipes right (like) or left (pass). The backend records this decision. On a right swipe, the system checks for a reverse swipe. If found, both users matched.
Match Detection. When Alice swipes right on Bob, the backend checks: did Bob swipe right on Alice? If yes, create a match. Both users see the match immediately.
View Matches. The user views their list of matches. Each entry shows when the match occurred. Matches persist until one user unmatches.
Out of scope: chat messaging, push notifications, and safety/reporting details beyond basic blocking.
The article includes extensions for premium features ("who liked me"), blocking, and multi-region deployment in the Feature Expansion section to show how the initial design accommodates growth.
Non-Functional Requirements
- Consistency: A match must appear exactly once, never duplicated
- Availability: Feed and swipe operations should remain responsive under load
High-Level Architecture
How do we go from user flows to a working system? We could start by drawing services and databases, but that risks building architecture for its own sake. Instead, let's work bottom-up: start with what users actually DO, extract the verbs, identify the data they touch, then let service boundaries emerge naturally.
This approach ensures we're solving real problems, not theoretical ones.
Core Operations
Before we draw any boxes, let's identify every operation the system needs to support. Each user action translates to backend operations. The three key operations are:
- get_feed(user_id) — fetch candidate profiles to swipe on
- record_swipe(swiper, swipee, liked) — save the user's decision
- create_match(user_a, user_b) — create a match when both swipe right
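As a rough sketch of these contracts (the signatures and types below are illustrative assumptions, not a final API), the operations look like:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Match:
    user_a: int
    user_b: int
    created_at: datetime

def get_feed(user_id: int, limit: int = 20) -> list[int]:
    """Return candidate profile IDs for the user to swipe on."""
    raise NotImplementedError

def record_swipe(swiper: int, swipee: int, liked: bool) -> None:
    """Persist the swiper's decision about the swipee."""
    raise NotImplementedError

def create_match(user_a: int, user_b: int) -> Match:
    """Create a match once both users have swiped right on each other."""
    raise NotImplementedError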
Grouping into Services
With operations identified, let's group them into services. The question: what's the right boundary?
We'll group operations by what data they touch. Operations that read and write the same tables belong together.
Why group by data instead of by user flow? Two reasons. First, cross-service calls add latency and failure modes. If record_swipe and has_swiped lived in different services, every swipe would require a network hop to check for matches. Second, shared data ownership creates consistency nightmares. If two services both write to the swipes table, they must coordinate to avoid conflicts. Grouping by data keeps each table owned by exactly one service.
| Domain | Operations |
|---|---|
| Swipe | record_swipe, has_swiped |
| Match | create_match, is_match, get_matches |
| Candidate / Feed | get_swipe_candidates |
| User / Profile | update_profile, get_profile |
Now we have operation groups. Should they be separate services? Let's apply decision criteria.
Swipe vs Match - should we split?
Swipe and Match could merge since match detection happens during swipe handling. But let's check our split criteria:
- Different scaling profile? Swipe is write-heavy (every user interaction writes). Match is read-light (users occasionally view their matches list). ✓ Split
- Different latency needs? Both need to be fast. ✗ No split needed
- Different reliability boundary? Both are core features. ✗ No split needed
I'm going to keep them separate. Here's why: swipes scale with total user activity (millions of writes per hour), while match reads scale with users actively checking their matches (much lower). Separating them lets each scale independently—we can add more Swipe Service instances during peak hours without over-provisioning Match Service.
Service Definitions
Each service owns specific tables and exposes operations to others.
| Service | Operations | Owns |
|---|---|---|
| User Service | update_profile, get_profile | users, profiles, photos |
| Swipe Service | record_swipe, has_swiped | swipes table |
| Match Service | create_match, is_match, get_matches | matches table |
| Candidate Service | get_swipe_candidates | read-only across tables |
The Candidate Service is unusual—it owns no tables. It reads from users, swipes, and matches to build the feed. This read-only pattern lets it use read replicas and aggressive caching without worrying about write consistency.
With services defined, let's build the system step by step through each functional requirement. We'll start simple and add complexity only when needed.
1. Profile Feed
Requirement: Users should see a stack of profiles to swipe on.
Let's start with the simplest approach: a single service handling everything.
The User Service stores profiles and generates the feed. When a user opens the app, we query the database for nearby users matching their preferences:
```sql
SELECT id, name, age, photos
FROM users
WHERE city = :user_city
  AND gender = :user_preferred_gender
  AND age BETWEEN :user_min_age AND :user_max_age
LIMIT 20;
```

This works for small scale. The User Service handles profile storage and feed generation in one place. No service calls, no latency overhead.
But what happens when we scale? This query runs on every app open. At 10M users with complex filtering (location, age, gender, preferences), full table scans become expensive. We'll address this in the Feed Generation deep dive. For now, the basic query works.
With profiles showing, users can now swipe. Let's add that capability.
2. Record Swipe
Requirement: Users should be able to swipe right (like) or left (pass). The system must record each decision.
Should we add swipe recording to the User Service? Let's check our split criteria:
- Different scaling profile? Swipes are write-heavy (every interaction writes a row). Profile views are read-heavy. ✓ Split
- Async pipeline? No, swipes are immediate. ✗ No split needed
We'll create a separate Swipe Service. This lets us scale swipe writes independently from profile reads.
Components added:
- Swipe Service: Receives swipe decisions, writes to swipes table
- Swipes table: (swiper_id, swipee_id, liked, created_at)
Flow:
- User swipes right on a profile
- Client calls POST /swipes with swiper, swipee, liked=true
- Swipe Service writes to swipes table: INSERT INTO swipes (swiper_id, swipee_id, liked) VALUES (:swiper, :swipee, TRUE)
- Returns success
The swipe is recorded, but we haven't detected matches yet. When Alice swipes right on Bob, we save the decision. But if Bob also swiped right on Alice, we need to detect that match. Let's add that logic.
3. Match Detection
Requirement: When both users swipe right on each other, create a match.
When Alice swipes right on Bob, we need to check: did Bob already swipe right on Alice? If yes, they matched.
Should this logic live in the Swipe Service? Let's think through it. Match detection reads the swipes table (which Swipe Service owns), so it could fit there. But let's check our criteria:
- Different scaling profile? Match queries are less frequent than swipe writes. Viewing the matches list is occasional. ✓ Split
- Different reliability boundary? Swipes and matches are both core state; neither can tolerate lower reliability than the other. ✗ No split needed
We'll create a Match Service. This keeps match state separate from swipe writes. Swipe Service focuses on recording decisions at high write volume. Match Service focuses on relationship state and match queries.
Components added:
- Match Service: Detects matches, stores them, handles match queries
- Matches table: (user_a, user_b, created_at) with CHECK (user_a < user_b) for canonical ordering
Flow:
- User swipes right on a profile
- Swipe Service writes the swipe
- Swipe Service calls Match Service: check_for_match(swiper, swipee)
- Match Service queries Swipe Service: "did swipee swipe right on swiper?"
- If yes, Match Service creates match: INSERT INTO matches (user_a, user_b) VALUES (LEAST(:a,:b), GREATEST(:a,:b))
- Returns match status to client
The canonical ordering (user_a < user_b) ensures Alice-Bob and Bob-Alice map to the same row, preventing duplicates.
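Putting the flow together, here is a minimal sketch of the swipe handler, assuming PostgreSQL and a psycopg-style connection, and collapsing the Swipe and Match services into one function for brevity (in the real design the match insert would go through the Match Service). Note the swipe is committed before the reverse check runs, which matters for the race condition discussed later:

```python
def handle_swipe(db, swiper: int, swipee: int, liked: bool) -> dict:
    """Record a swipe, then check whether the other user already liked us back."""
    # Transaction 1: persist the swipe and commit it.
    with db.transaction():
        db.execute(
            "INSERT INTO swipes (swiper_id, swipee_id, liked) VALUES (%s, %s, %s)"
            " ON CONFLICT (swiper_id, swipee_id) DO NOTHING",
            (swiper, swipee, liked),
        )
    if not liked:
        return {"matched": False}

    # Transaction 2: reverse check and, if mutual, match creation.
    with db.transaction():
        reverse = db.execute(
            "SELECT 1 FROM swipes WHERE swiper_id = %s AND swipee_id = %s AND liked",
            (swipee, swiper),
        ).fetchone()
        if reverse is None:
            return {"matched": False}
        # Canonical ordering plus the primary key make this insert idempotent.
        db.execute(
            "INSERT INTO matches (user_a, user_b)"
            " VALUES (LEAST(%s, %s), GREATEST(%s, %s)) ON CONFLICT DO NOTHING",
            (swiper, swipee, swiper, swipee),
        )
    return {"matched": True}
```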
Matches are detected. Let's complete the loop by showing them to users.
4. View Matches
Requirement: Users should see their list of matches.
This is straightforward now that matches are stored. The Match Service owns the matches table, so it handles this query:
```sql
SELECT m.user_a, m.user_b, m.created_at,
       u.name, u.photos
FROM matches m
JOIN users u ON (u.id = CASE WHEN m.user_a = :user_id THEN m.user_b ELSE m.user_a END)
WHERE m.user_a = :user_id OR m.user_b = :user_id
ORDER BY m.created_at DESC;
```

The client calls GET /matches and the Match Service returns the list with profile details fetched from User Service.
Checkpoint: The core flows work. Users can view profiles, swipe, get matches, and view their matches. Now we face a critical decision that will ripple through every other part of the system: how should we store swipes and matches?
System Overview
The API gateway routes requests to the appropriate service. Each service owns its data and exposes operations to other services as needed.
Sequence: Swipe Right with Match
When a user swipes right, the request flows through multiple services. The API routes to the Swipe Service, which records the decision and checks for a reverse swipe. If the other user already liked them, the Swipe Service calls the Match Service to create the match. The response returns immediately with the match status.
Sequence: Get Feed
Opening the app triggers a feed request. The Candidate Service queries the database for users matching location and preferences, then filters out already-swiped and already-matched users. It returns candidate IDs, fetches full profiles from the User Service, and sends the batch back to the client.
Data Model
The system needs three categories of data: user profiles, swipe decisions, and matches.
Users and photos are straightforward. A users table stores profile information. A photos table holds image references with ordering.
```sql
CREATE TABLE users (
    id BIGINT PRIMARY KEY,
    name VARCHAR(100),
    age INT,
    gender VARCHAR(20),
    bio TEXT,
    city VARCHAR(100),
    last_active TIMESTAMP
);

CREATE TABLE photos (
    id BIGINT PRIMARY KEY,
    user_id BIGINT REFERENCES users(id),
    url VARCHAR(500),
    position INT, -- display order
    created_at TIMESTAMP DEFAULT NOW()
);
```

These tables are standard. The interesting design decisions lie in how we store swipes and matches.
Core Tables: Swipes and Matches
So far we've designed the services and their operations. Now comes the critical question that will ripple through every other decision in this system: how do we store swipes and matches?
The core data seems simple: who swiped on whom, and what they decided. But there's a tension here that's easy to miss. Let's think through what we're actually modeling.
Consider Alice and Bob. Alice swipes right on Bob. Later, Bob swipes right on Alice. How should we represent this?
One perspective: these are two separate actions. Alice made a decision about Bob. Bob made a decision about Alice. Each action deserves its own record. This leads to Directed Swipes - one row per swipe, direction matters.
Another perspective: there's one relationship between Alice and Bob. Both users contribute to this relationship over time. The relationship has a state (no swipes, one-sided, mutual). This leads to Pair Table - one row per unique pair of users.
Neither is objectively correct. Each optimizes for different access patterns.
Directed Swipes Model
Each swipe gets its own row. Direction matters: Alice→Bob and Bob→Alice are separate rows.
```sql
CREATE TABLE swipes (
    swiper_id BIGINT NOT NULL,
    swipee_id BIGINT NOT NULL,
    liked BOOLEAN NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (swiper_id, swipee_id)
);

-- Index for match detection: "Did Bob swipe right on Alice?"
CREATE INDEX idx_swipee_liked ON swipes(swipee_id, liked);

CREATE TABLE matches (
    user_a BIGINT NOT NULL,
    user_b BIGINT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_a, user_b),
    CHECK (user_a < user_b) -- canonical ordering
);

-- Index for "get my matches" query
CREATE INDEX idx_matches_user_b ON matches(user_b);
```

The CHECK (user_a < user_b) constraint ensures each match is stored in one canonical order. Alice-Bob and Bob-Alice both store as (Alice, Bob) where Alice's ID is smaller.
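For contrast, here is a minimal sketch of the Pair Table alternative: one row per unordered pair, with each user's decision in its own column. The schema and the column-picking UPSERT are illustrative (PostgreSQL syntax, psycopg-style connection assumed), not a second schema the design commits to:

```python
PAIR_TABLE_DDL = """
CREATE TABLE pair_swipes (
    user_a     BIGINT NOT NULL,
    user_b     BIGINT NOT NULL,
    a_liked    BOOLEAN,                 -- NULL until user_a swipes
    b_liked    BOOLEAN,                 -- NULL until user_b swipes
    created_at TIMESTAMP DEFAULT NOW(),
    PRIMARY KEY (user_a, user_b),
    CHECK (user_a < user_b)             -- canonical ordering, as in the matches table
);
"""

def record_pair_swipe(db, swiper: int, swipee: int, liked: bool) -> None:
    """UPSERT the pair row, deciding which column the swiper's decision lands in."""
    a, b = min(swiper, swipee), max(swiper, swipee)
    column = "a_liked" if swiper == a else "b_liked"  # only two known values
    with db.transaction():
        db.execute(
            f"INSERT INTO pair_swipes (user_a, user_b, {column}) VALUES (%s, %s, %s)"
            f" ON CONFLICT (user_a, user_b) DO UPDATE SET {column} = EXCLUDED.{column}",
            (a, b, liked),
        )
```

A match is then simply a row where a_liked and b_liked are both true, which is what the "already in same row" entry in the comparison below refers to.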
The choice between models ripples through match detection, feed queries, premium features, and sharding strategy. Understanding these effects is critical to choosing the right model.
Model Comparison
With both models on the table, here's the complete comparison:
| Aspect | Directed Swipes | Pair Table |
|---|---|---|
| Writing a swipe | INSERT one row | UPSERT, determine which column |
| Match check | Query separate row | Already in same row |
| Feed query | Single column scan | OR across two columns |
| "Who liked me" | Single column scan | OR across two columns |
| Sharding (feed) | Local | Scattered |
| Sharding (match) | Cross-shard | Local |
| Storage | 2 rows per mutual swipe | 1 row per pair |
The right choice depends on your access patterns. Neither model is universally better. Directed Swipes optimize for per-user queries: feed filtering, "who liked me", and user history. Pair Table optimizes for per-pair queries: match state and relationship invariants.
For most dating apps, Directed Swipes fits better. Feed queries run on every app open. Match checks run once per swipe. Per-user operations dominate. But if your app centers on relationship state (collaborative features, mutual visibility controls), Pair Table may be worth the per-user query complexity.
Deep Dive: Feed Generation
With the data model decided, let's tackle the next major challenge: feed generation. The feed is our heaviest read path—every app open triggers a feed request, potentially 100x more frequent than swipes. Getting this right is critical for system performance.
What must the feed guarantee?
Feed Semantics
For user U, valid candidates V must satisfy:
Include criteria:
- Location constraint (near U)
- U's preferences (age, gender, etc.)
- Optionally mutual compatibility (orientation matches)
Exclude criteria:
- U themselves
- Users already matched with U
- Users U already swiped on
- Blocked or banned users
Expressed as set operations:
```
Candidates(U) = EligibleProfiles
              ∩ Near(U)
              ∩ PrefsCompatible(U)
              - {U}
              - Matches(U)
              - SwipedBy(U)
              - Blocked(U)
```
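The same semantics translate directly into set operations in code. This sketch over in-memory sets of user IDs just pins down the definition; the rest of this deep dive is about computing it without materializing those sets on every request:

```python
def candidates(user_id: int,
               eligible: set[int], near: set[int], prefs_compatible: set[int],
               matches: set[int], swiped_by: set[int], blocked: set[int]) -> set[int]:
    """Feed definition for one user: include filters intersected, exclusions subtracted."""
    return ((eligible & near & prefs_compatible)
            - {user_id} - matches - swiped_by - blocked)
```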
Basic Implementation
The naive approach queries users with all filters applied at once:
```
candidates = users
    .where(city = user.city)
    .where(age in user.preferred_range)
    .where(gender = user.preferred_gender)
    .exclude(already_swiped)
    .exclude(already_matched)
    .limit(20)
```
This works at small scale. At millions of users, the exclusion checks become expensive—each runs once per candidate row. With 100,000 potential candidates in NYC, that's 100,000 lookups per exclusion table.
Think about what happens at scale. Alice opens the app at 9am. We scan 10M users in NYC, applying expensive location and preference filters. Five minutes later, she opens the app again. We scan 10M users again with the exact same filters. What changed? Nothing. Her location didn't change. Her preferences didn't change. Yet we're recomputing everything.
This is wasteful. The key insight: different filters change at different rates. Location and preferences change weekly. The exclusion list (users already swiped) changes every swipe. Why recompute expensive location filtering when results haven't changed?
The solution: separate slow-changing filters (precompute offline) from fast-changing filters (apply live).
Precomputed Candidate Pools
Offline batch jobs precompute the expensive filtering (location, age, gender, preferences). Live requests only handle the fast-changing exclusions (already swiped, already matched).
The straightforward approach: build a pool per user.
```sql
CREATE TABLE candidate_pools (
    user_id BIGINT,
    candidate_id BIGINT,
    score FLOAT,
    generated_at TIMESTAMP,
    PRIMARY KEY (user_id, candidate_id)
);
```

A batch job runs for each user:
```sql
-- Precompute Alice's candidate pool
INSERT INTO candidate_pools (user_id, candidate_id, score, generated_at)
SELECT :alice_id, u.id, compute_score(u, :alice_id), NOW()
FROM users u
WHERE u.city = :alice_city
  AND u.age BETWEEN :alice_min_age AND :alice_max_age
  AND u.gender = :alice_preferred_gender
  AND NOT blocked(:alice_id, u.id);
```

Feed requests become simple lookups:
```python
def get_feed(user_id):
    candidate_ids = db.query(
        "SELECT candidate_id FROM candidate_pools WHERE user_id = ? LIMIT 1000",
        user_id
    )
    # Filtering and ranking happen next
```

The storage explosion problem: At 50M users × 1,000 candidates each = 50 billion rows. The index alone exceeds 1TB. This doesn't scale.
What went wrong? The per-user pool treats each user as unique. But consider Alice and Carol—both in NYC, both 25-30, both seeking males. Their candidate pools overlap 95%. We're storing nearly identical data millions of times.
The key observation: Users with identical preferences see nearly identical pools. Instead of pools per user, compute pools per preference segment.
Segment pools deduplicate this redundancy:
```sql
CREATE TABLE segment_pools (
    segment_id VARCHAR, -- e.g. 'nyc:male:25-30'
    candidate_id BIGINT,
    score FLOAT,
    generated_at TIMESTAMP,
    PRIMARY KEY (segment_id, candidate_id)
);
```

A batch job maintains pools per segment instead of per user:
```sql
-- Precompute for segment 'nyc:male:25-30'
INSERT INTO segment_pools (segment_id, candidate_id, score, generated_at)
SELECT 'nyc:male:25-30', u.id, compute_base_score(u), NOW()
FROM users u
WHERE u.city = 'nyc'
  AND u.gender = 'male'
  AND u.age BETWEEN 25 AND 30
  AND u.last_active > NOW() - INTERVAL '30 days';
```

Feed requests resolve user to segment:
```python
def get_feed(user_id):
    # Resolve user to segment
    segment = f"{user.city}:{user.gender}:{user.age_bucket}"
    # Pull candidate IDs from shared segment pool
    candidate_ids = db.query(
        "SELECT candidate_id FROM segment_pools WHERE segment_id = ?",
        segment
    )
    # Filtering happens next (covered in sections below)
```

Storage cost comparison:
| Approach | Rows | Storage | Reason |
|---|---|---|---|
| Per-user pools | 50B | 1.8 TB (indexes) | 50M users × 1,000 candidates |
| Segment pools | 10M | 360 MB | 10,000 segments × 1,000 candidates |
Segment pools are 5,000x smaller. Updates are cheaper. The entire index fits in memory.
Trade-off: Segment pools lose personalization. The score is generic (computed without knowing the specific user). This is acceptable for most users. Premium subscribers can pay for per-user pools with personalized scoring—keeping storage tractable by applying it to only 5% of users (roughly 2.5B rows instead of 50B).
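To round out the read path, here is a sketch of the live exclusion step that runs after the segment pool lookup. Service boundaries are collapsed for brevity, table and column names follow the schemas above, and in practice the already-swiped set is often compressed into a Bloom filter (as the summary notes) rather than queried in full per request:

```python
def get_feed(db, user, limit: int = 20) -> list[int]:
    """Shared segment candidates, filtered by this user's fast-changing exclusions."""
    segment = f"{user.city}:{user.gender}:{user.age_bucket}"
    candidate_ids = [row[0] for row in db.execute(
        "SELECT candidate_id FROM segment_pools"
        " WHERE segment_id = %s ORDER BY score DESC LIMIT 1000",
        (segment,),
    ).fetchall()]

    # Everyone this user already swiped on...
    swiped = {row[0] for row in db.execute(
        "SELECT swipee_id FROM swipes WHERE swiper_id = %s", (user.id,)
    ).fetchall()}
    # ...and everyone they already matched with.
    matched = {b if a == user.id else a for a, b in db.execute(
        "SELECT user_a, user_b FROM matches WHERE user_a = %s OR user_b = %s",
        (user.id, user.id),
    ).fetchall()}

    excluded = swiped | matched | {user.id}
    return [cid for cid in candidate_ids if cid not in excluded][:limit]
```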
Deep Dive: Scaling Swipe Writes
A dating app with 20M daily active users generates roughly 2 billion swipes per day—23,000 writes per second on average, spiking to 100,000+ during evening peaks. To scale to this volume, we must shard our database.
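As a quick sanity check on those numbers (the per-user rate is an assumption for illustration): 20M daily actives at roughly 100 swipes each is 2 billion swipes per day, and 2B / 86,400 seconds ≈ 23,000 writes per second, with evening peaks several times that average.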
The sharding strategy depends on which data model you chose. Each model has a natural shard key that aligns with its structure.
The Core Tension
Every swipe involves two parties: the swiper (the person acting) and the swipee (the person being seen). Our two critical operations query different dimensions of this relationship.
Feed filtering asks "Who has Alice already seen?" This is a per-user query. To show Alice fresh candidates, we must exclude everyone she's already swiped on. This query runs on every app open—the most frequent read operation in the system.
Match detection asks "Did Bob already like Alice back?" This is a per-pair query. When Alice swipes on Bob, we check if Bob previously swiped on Alice. This query runs once per swipe—far less frequent than feed generation, but still critical for the instant match reveal.
These two operations pull in opposite directions. One groups data by user; the other groups by pair. We can't optimize both with a single shard key.
Deducing the Shard Key
We must choose one way to split the data.
The Verdict
Let's think through the trade-offs. We have two operations with conflicting needs:
- Feed queries: Every app open. Potentially 10-100x more frequent than swipes.
- Match checks: Once per swipe. Much less frequent.
When we can only optimize one access pattern with our shard key, we optimize for frequency. I'm going to shard by swiper_id. Here's why:
The 5-10ms cross-shard penalty on match checks is acceptable—users won't notice the difference between 15ms and 25ms for match detection. But making every feed query scatter-gather across all shards would destroy performance. Feed queries would go from 10ms to 200ms+ due to fan-out and tail latency. Users would notice that immediately.
This means:
- Feed exclusion: single-shard query (10ms, fast)
- Match detection: cross-shard read (+5-10ms added latency, acceptable)
- Write distribution: swipes scatter by swiper, avoiding hot spots on popular users
We're trading a small penalty on the less frequent operation for keeping the most frequent operation fast.
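Here is what that routing looks like at the data-access layer. The hash function, shard count, and the shards mapping are placeholders; the point is which side of the swipe picks the shard:

```python
import hashlib

NUM_SHARDS = 64  # placeholder

def shard_for(user_id: int) -> int:
    """Stable hash of a user ID to a shard index."""
    digest = hashlib.sha1(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

def record_swipe(shards, swiper: int, swipee: int, liked: bool) -> None:
    # Writes land on the *swiper's* shard, so "who has this user already seen?"
    # (the feed exclusion) stays a single-shard query.
    shards[shard_for(swiper)].execute(
        "INSERT INTO swipes (swiper_id, swipee_id, liked) VALUES (%s, %s, %s)",
        (swiper, swipee, liked),
    )

def reverse_swipe_exists(shards, swiper: int, swipee: int) -> bool:
    # The match check reads the *swipee's* swipes, which live on their shard:
    # this is the accepted cross-shard hop.
    row = shards[shard_for(swipee)].execute(
        "SELECT 1 FROM swipes WHERE swiper_id = %s AND swipee_id = %s AND liked",
        (swipee, swiper),
    ).fetchone()
    return row is not None
```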
Match Consistency
Simultaneous swipes can cause race conditions—both users swipe right at the same moment, but neither transaction sees the other's write. Result: two swipes recorded, zero matches created.
The solution depends on the data model:
- Directed model: Use a background reconciliation job to catch missed matches (sketched at the end of this section), or split into two transactions so the second check always sees the first swipe.
- Pair model: No race possible—both swipes target the same row, so database locking ensures one waits for the other.
Both approaches need a PRIMARY KEY constraint with canonical ordering (user_a < user_b) to prevent duplicate matches.
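For the directed model, the reconciliation option is a periodic job that looks for mutual right-swipes with no corresponding match row. A sketch, assuming it runs against a replica or warehouse copy where all shards are visible (a production job would also bound the scan to a recent time window):

```python
RECONCILE_MISSED_MATCHES = """
INSERT INTO matches (user_a, user_b)
SELECT s1.swiper_id, s1.swipee_id          -- already in canonical order (see WHERE)
FROM swipes s1
JOIN swipes s2
  ON s2.swiper_id = s1.swipee_id
 AND s2.swipee_id = s1.swiper_id
WHERE s1.liked
  AND s2.liked
  AND s1.swiper_id < s1.swipee_id          -- consider each pair exactly once
ON CONFLICT DO NOTHING;                    -- existing matches are left alone
"""

def reconcile_missed_matches(db) -> None:
    """Backstop for the simultaneous-swipe race: create any match the live path missed."""
    with db.transaction():
        db.execute(RECONCILE_MISSED_MATCHES)
```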
Deep Dive: Feature Expansion
These extensions test whether your initial design accommodates new requirements. Each reveals different trade-offs.
Summary
The Tinder matching backend solves a deceptively simple problem: detect when two users like each other. The design choices ripple through every layer of the system.
Data Model: Choose between Directed Swipes (two rows per mutual swipe, optimizes per-user queries) and Pair Table (one row per pair, optimizes per-pair queries). The choice affects feed queries, match detection, sharding strategy, and race condition handling.
Feed Generation: Separate slow filtering (location, preferences) from fast filtering (already swiped). Precompute candidate pools offline, apply exclusions live. Lazy hydration fetches IDs before profiles. Bloom Filters compress exclusion checks. Hot users require fairness intervention to prevent feedback loops.
Scaling Swipe Writes: Shard by swiper_id to keep feed queries local. Accept cross-shard match checks as the less frequent operation. Hot swipees create read hot spots—handle with async match detection via Kafka.
Race Conditions: Simultaneous swipes create races in Directed Model. Fix with background reconciliation jobs or two-transaction approach. Pair Model eliminates races via row-level locking on UPSERT.
Feature Expansion: Premium features ("who liked me") reveal directional query requirements. Blocking affects multiple layers. Multi-region deployment uses home region for swipes with async cross-region match detection.
The architecture keeps services focused. Swipe service owns decisions. Match service owns matches. Candidate service owns feed generation. Each scales independently based on its access patterns.