Real-time Gaming Leaderboard System Design Interview Question
Problem: Design a real-time global leaderboard for a mobile game: top-100 worldwide, top-K by region, and a given player's rank with surrounding neighbors, all updated in near real-time.
Overview
A real-time gaming leaderboard looks cheap on the whiteboard and expensive in production. The reason is rank: you cannot just store a score, you have to know where that score falls among 100 million other scores, and you have to know it in milliseconds while tournaments run and scores stream in at tens of thousands per second. A plain SQL ORDER BY works for a leaderboard of a hundred friends and falls over at global scale, because each read is O(N log N) on the backing index. The standard move, and the one this design commits to, is a Redis Sorted Set (ZSET) that keeps all players in a skiplist keyed by score. ZADD is O(log N), ZREVRANGE is O(log N + M), and ZREVRANK is O(log N), which is exactly the shape we need for top-K, neighbors, and "where am I" queries. Durability lives in a separate relational store; Redis is the fast source of truth for rank, not the permanent record of scores.
Summary
A read-heavy ranking system built around a Redis Sorted Set (ZSET) as the hot in-memory source of truth for rank, backed by a durable SQL/KV store for history, snapshots, and cold reads. The dominant design choice is using Redis ZSET's skiplist — which gives O(log N) ZADD and O(log N + M) ZREVRANGE — instead of computing ranks from a SQL ORDER BY (which would be O(N log N) per read at 100M rows). The main tradeoff is that Redis is not authoritative storage: scores are persisted asynchronously to the durable DB, and a periodic snapshot job rebuilds the ZSET from the DB to guard against Redis loss. Realtime push via WebSocket is optional but essential for tournament UX — only fire on events that actually change top-K or the player's displayed neighborhood. Sized for ~100M monthly players, ~1M concurrent, ~50K score submissions/sec, ~500K leaderboard reads/sec.
Requirements
Functional
- Return the global top-100 leaderboard in ranked order
- Return top-K by region, country, or tournament bracket
- Return a given player's rank
- Return the N players immediately above and below a given player
- Accept score submissions and reflect them in the leaderboard within 1 second
- Support per-tournament leaderboards with independent lifecycles
Non-functional
- p99 read latency under 50ms for top-K and rank queries
- p99 write latency under 100ms for score submissions
- At least 99.95% availability on the read path
- Durable history of every submitted score for audit and replay
- Horizontal scalability by sharded ZSETs (per region or per tournament)
- Graceful degradation: reads continue even if the durable DB is unavailable
Capacity Assumptions
- 100M monthly active players, 10M daily, 1M concurrent at peak
- Each player submits ~5 scores per session, average session 10 min
- Leaderboard is viewed roughly 3x per session (entry screen + post-match + social tab)
- Top-100 global + 10 regional boards + 1 per-friend-group board
- Score freshness target: < 1 second from submit to visible in leaderboard
Back-of-Envelope Estimates
- Score writes: 10M * 5 / 86400 ≈ 580 QPS avg, peak ~50K/sec during tournaments
- Leaderboard reads: 10M * 3 / 86400 ≈ 350 QPS avg, peak ~500K/sec
- Redis ZSET size: 100M entries * ~80 bytes/entry ≈ 8 GB (fits easily on one node, sharded by region for HA)
- Durable DB storage: 10M daily * 5 submissions * 200 bytes * 365 ≈ 3.6 TB/year history
- WebSocket fan-out: 1M concurrent * 0.1 actively-viewing ≈ 100K live subscribers during a tournament
High-level architecture
The write path takes a signed score from the game client, validates it server-side, and performs two operations in parallel: a ZADD into the Redis ZSET for the relevant leaderboard and an INSERT into the durable scores table for history and audit. The two writes are intentionally not wrapped in a distributed transaction. The ZSET update is the one that users see; the durable write is how we reconstruct state if Redis is lost. Redis uses GT semantics so that a lower replayed score never overwrites a higher one, which makes the pipeline safely idempotent. The read path is simpler and is backed by three ZSET commands. ZREVRANGE with WITHSCORES for top-K, ZREVRANK for a single player's rank, and a paired ZREVRANK plus ZRANGE for the neighbors window around that player. A thin Redis-fronted JSON cache absorbs the hot top-100 reads, since the top of the board changes much slower than individual ranks below it. A separate snapshot job rebuilds the ZSET from the durable DB on a schedule and after Redis failover, so no player permanently loses rank just because a cache node died. Real-time push via WebSocket is optional but worth the complexity for live tournaments: we only publish a rank_change event when the ZADD moves a player into or out of the user's currently displayed window, so fan-out is bounded by subscriber count, not score throughput.
Architecture Components (9)
- Game Client (client) — Mobile / console / web game that submits scores after a match and renders leaderboards on the social screens.
- Load Balancer (lb) — L7 HTTPS load balancer distributing ingest and read traffic to separate API fleets.
- Score Ingest API (api) — Stateless service validating and persisting score submissions; writes to Redis ZSET + durable DB.
- Leaderboard Read API (api) — Stateless service returning top-K, region boards, and a player's rank + neighbors.
- Redis Sorted Set (ZSET) (cache) — In-memory skiplist-backed sorted set holding (user_id, score) — the hot, authoritative ranking structure.
- Scores DB (durable) (sql) — Source-of-truth SQL store holding every score submission for durability, audit, and snapshot rebuild.
- Redis Cache (read-side) (cache) — Short-TTL cache of pre-rendered leaderboard responses, fronting the ZSET to absorb read storms.
- Snapshot / Batch Job (worker) — Periodic reconciler that rebuilds Redis ZSETs from the durable DB and writes hourly snapshots for disaster recovery.
- WebSocket Push (worker) — Long-lived WebSocket server fanning out rank-change events to subscribed clients during live tournaments.
Operations Walked Through (3)
- submit — Player finishes a match. Client signs and POSTs the score. API verifies, writes to Redis ZSET and durable DB, and emits a WebSocket event if the rank actually changed inside a watched window.
- read-top-100 — User opens the leaderboard tab. Read API serves from a 1s-TTL cache; on miss it issues a ZREVRANGE and hydrates display names.
- read-user-rank — User taps 'Where am I?'. API fetches the user's rank, then a ±5 window around it. Two Redis ops, both O(log N).
Implementation
public class LeaderboardService {
private final RedisClient redis; // Jedis/Lettuce-style sync client
private final String key; // e.g. "leaderboard:global"
public LeaderboardService(RedisClient redis, String key) {
this.redis = redis; this.key = key;
}
/** Submit a score; GT ensures lower replays cannot overwrite a higher score. */
public void submitScore(String userId, long score) {
redis.zadd(key, score, userId, ZAddArgs.Builder.gt().ch());
}
/** Top-K players, descending by score, with their scores. */
public List<RankedEntry> getTopK(int k) {
List<ScoredValue<String>> rows = redis.zrevrangeWithScores(key, 0, k - 1);
List<RankedEntry> out = new ArrayList<>(rows.size());
long rank = 1;
for (ScoredValue<String> sv : rows) {
out.add(new RankedEntry(rank++, sv.getValue(), (long) sv.getScore()));
}
return out;
}
/** Zero-indexed ZSET rank translated to human 1-indexed rank. */
public OptionalLong getUserRank(String userId) {
Long idx = redis.zrevrank(key, userId);
return idx == null ? OptionalLong.empty() : OptionalLong.of(idx + 1);
}
public record RankedEntry(long rank, String userId, long score) {}
}
public class NeighborsQuery {
private final RedisClient redis;
private final String key;
public NeighborsQuery(RedisClient redis, String key) {
this.redis = redis; this.key = key;
}
/** Return `radius` players above and below userId, descending by score. */
public List<LeaderboardService.RankedEntry> getNeighbors(String userId, int radius) {
Long zeroBasedRank = redis.zrevrank(key, userId);
if (zeroBasedRank == null) return List.of();
long start = Math.max(0, zeroBasedRank - radius);
long end = zeroBasedRank + radius;
List<ScoredValue<String>> rows = redis.zrevrangeWithScores(key, start, end);
List<LeaderboardService.RankedEntry> out = new ArrayList<>(rows.size());
long rank = start + 1;
for (ScoredValue<String> sv : rows) {
out.add(new LeaderboardService.RankedEntry(rank++, sv.getValue(), (long) sv.getScore()));
}
return out;
}
}
public class ScoreIngest {
private final LeaderboardService lb;
private final ScoresRepository durable;
private final WebSocketBroadcaster ws;
private final ScoreValidator validator;
public ScoreIngest(LeaderboardService lb, ScoresRepository durable,
WebSocketBroadcaster ws, ScoreValidator validator) {
this.lb = lb; this.durable = durable; this.ws = ws; this.validator = validator;
}
public void ingest(ScoreSubmission s) {
validator.verifyOrThrow(s); // HMAC + server-side sanity bounds
// Durable record first: if Redis dies later, snapshot job can rebuild from this.
durable.insertScore(s.userId(), s.matchId(), s.score(), s.submittedAt());
// Hot path: updates the ZSET that the read API reads from.
lb.submitScore(s.userId(), s.score());
// Only notify if the new score actually moves the player into top-100.
OptionalLong newRank = lb.getUserRank(s.userId());
if (newRank.isPresent() && newRank.getAsLong() <= 100) {
ws.broadcast(new RankChangeEvent(s.userId(), newRank.getAsLong(), s.score()));
}
}
}
Key design decisions & trade-offs
- Primary rank store — Chosen: Redis ZSET (skiplist) as hot source of truth. O(log N) ranked operations match the read pattern exactly; a SQL ORDER BY would force a full index scan per top-K read at 100M rows.
- Durability — Chosen: Async dual-write to a relational scores table. Keeps Redis as the fast path while ensuring no score is permanently lost if a Redis node fails. Costs a periodic snapshot job to repair drift.
- Update semantics — Chosen: ZADD with GT (greater-than) and CH flags. Makes retries idempotent and prevents an out-of-order replay from lowering a player's visible score. Means we rely on GT semantics being available in the client.
- Sharding — Chosen: One ZSET per logical leaderboard (global, region, tournament). Natural partition boundary and keeps each ZSET small enough to fit comfortably in one Redis instance. Global board is the only one that grows unbounded and may need further partitioning.
- Live updates — Chosen: WebSocket push only on top-K-impacting writes. Keeps fan-out bounded by active viewers rather than by global write rate; the other 99% of score writes go silently to the ZSET.
Interview follow-ups
- Shard the global ZSET across multiple Redis instances for boards beyond a single node's memory
- Add a seasonal / tournament lifecycle so old ZSETs can be archived and retired
- Introduce anti-cheat: server-authoritative match validation before score acceptance
- Add a delta-encoded WebSocket protocol so reconnecting clients can catch up without a full replay
- Expose friends-only leaderboards backed by ZINTERSTORE against a social graph key