Twitter Trending Topics (Streaming Analytics) System Design Interview Question
Problem: Design a real-time trending topics / hashtag system over a 500M-tweets/day firehose.
Overview
Twitter-scale trending is a streaming analytics problem disguised as a list of ten hashtags. The firehose is half a billion tweets a day, peaking roughly three times higher during globally correlated events like a championship final or an election call, and the product contract is still just "show me what is popular right now". Exact counting at that volume is not free: a naive aggregator would need to track a counter for every distinct term across a rolling five-minute window, which on the long tail of a social network means hundreds of millions of keys. Worse, keeping those counts accurate under parallelism forces a shuffle on every single update. The canonical answer is to trade exact counts for bounded error. A Count-Min Sketch, combined with a MinHeap of the current heavy hitters, holds the entire window in a few megabytes per task, updates in constant time, and is correct enough for a human-visible trending panel. Freshness comes from Kafka plus a stream processor, not from a database query.
Summary
A streaming analytics pipeline that ingests tweets into Kafka, runs Flink jobs that maintain Count-Min Sketch + heavy-hitter Top-K per sliding window, and surfaces results via a small hot-topic cache behind a read API. The dominant design choice is approximate sketches (CMS + Top-K heap) over exact counting — exact counts would require O(unique-terms) memory per window and cross-node shuffles on every update, while CMS gives ε-bounded counts in kilobytes with single-pass updates. The main tradeoff is accuracy at the long tail: heavy hitters are correct within a few percent, but low-frequency terms have noisy counts that aren't trustworthy.
Requirements
Functional
- Ingest tweets from a Kafka firehose at 5.8K QPS sustained and 18K peak
- Extract hashtags and tokens per tweet and feed them into the sketch
- Maintain a 5-minute sliding window, refreshed every 10 seconds
- Expose a global top-K (K=50) and a per-country top-K (K=20)
- Serve trending reads from a hot cache with p99 under 100ms
- Allow on-demand inspection of the underlying term counts for debugging
Non-functional
- Approximate counts with ε ~ 0.1% error at the 99% confidence level
- At least 99.9% availability on the read path
- Bounded memory per stream task (CMS fits in a few MB per window)
- Horizontal throughput scalability by Kafka partition count
- Tolerate stream processor restarts with at most ~1 minute of re-read from Kafka
- Back-pressure on Kafka producer cannot block the user-visible tweet write path
Capacity Assumptions
- 500M tweets/day, avg 2 hashtags per tweet → 1B hashtag tokens/day
- Peak burst 3x sustained (Super Bowl, elections)
- Trending window: 5 min sliding, updated every 10s
- Top-K output: K=50 globally, K=20 per country
- Target trending-read latency < 100ms p99
Back-of-Envelope Estimates
- Ingest QPS: 500M / 86400 ≈ 5.8K tweets/sec sustained, ~18K peak
- Token QPS through Flink: 2x tweet rate ≈ 12K–36K tokens/sec
- CMS size per window: 4 hashes * 2^20 counters * 4B ≈ 16 MB — trivial per Flink task
- Hot topic cache: 50 topics * few KB = KB-scale; 10M reads/day → ~120 RPS
High-level architecture
Tweets land on a tweet-ingest API that publishes to a partitioned Kafka topic; this is the one boundary where we accept asynchronous durability in exchange for never blocking the tweet write path. A fleet of stream processors each own some subset of Kafka partitions and maintain two pieces of state per sliding window: a Count-Min Sketch keyed by (term, window) and a MinHeap of size K that remembers the current top-K terms and their estimated counts. For every tweet, the processor extracts hashtags and tokens, increments the CMS once per token, and probes the heap to decide whether the updated estimate displaces the current min. Every 10 seconds the processor snapshots the heap and writes it to a hot topic cache keyed by window_end_ts. The read API is almost trivial: GET /trending reads the latest snapshot from the cache and returns it with a short TTL and a generated_at timestamp so clients can reason about staleness. Sliding windows are managed by keeping several sketches in flight (one per recent window) and dropping the oldest as time advances; expiry is cheap because a sketch is small and flat. Global top-K is computed by merging per-partition heaps at the snapshot boundary. The separation between a tiny stateless read API and a stateful but bounded stream job is what keeps this design operable at Twitter scale without a database in the critical path.
Architecture Components (7)
- Client UI (client) — Web/mobile app that renders the trending panel.
- Tweet Ingest API (api) — Receives tweet events (from the core Twitter write path) and emits them to Kafka.
- Tweet Stream (Kafka) (stream) — Durable, partitioned tweet firehose feeding downstream analytics.
- Stream Processor (Flink) (stream-processor) — Consumes tweets, extracts tokens, maintains CMS + Top-K per 5-min sliding window.
- CMS + Top-K State (stream-processor) — Approximate counting structures kept as Flink operator state.
- Hot Topic Cache (cache) — Redis holding precomputed trending lists per country.
- Trending API (api) — Thin read API returning the trending list for a country.
Operations Walked Through (3)
- ingest — Tweet write path publishes to Kafka; does not block user write.
- process — Flink consumes, extracts hashtags, updates CMS + Top-K, periodically publishes to cache.
- read-trending — User opens trending panel; API serves precomputed list from Redis (falls back to reading the latest CMS snapshot from Flink state on cache miss).
Implementation
public class CountMinSketch {
private final int depth; // number of hash functions
private final int width; // counters per row (power of 2)
private final long[][] counters; // depth x width
private final long[] seeds;
public CountMinSketch(double epsilon, double delta) {
this.width = Integer.highestOneBit((int) Math.ceil(Math.E / epsilon)) << 1;
this.depth = (int) Math.ceil(Math.log(1.0 / delta));
this.counters = new long[depth][width];
this.seeds = new long[depth];
Random r = new Random(0xC0FFEEL);
for (int i = 0; i < depth; i++) seeds[i] = r.nextLong();
}
public void add(String term, long count) {
for (int i = 0; i < depth; i++) {
int idx = bucket(term, i);
counters[i][idx] += count;
}
}
/** Estimate is the min across rows; never under-counts, may over-count. */
public long estimate(String term) {
long min = Long.MAX_VALUE;
for (int i = 0; i < depth; i++) {
min = Math.min(min, counters[i][bucket(term, i)]);
}
return min;
}
private int bucket(String term, int row) {
long h = term.hashCode() * 0x9E3779B97F4A7C15L ^ seeds[row];
h ^= (h >>> 33);
return (int) (h & (width - 1));
}
}
public class TopKTracker {
private final int k;
private final PriorityQueue<Entry> heap; // min-heap by count
private final Map<String, Entry> byTerm; // O(1) lookup for updates
public TopKTracker(int k) {
this.k = k;
this.heap = new PriorityQueue<>(Comparator.comparingLong(e -> e.count));
this.byTerm = new HashMap<>();
}
/** Offer a term with its current CMS estimate; cheaper than scanning the heap. */
public void offer(String term, long estimate) {
Entry existing = byTerm.get(term);
if (existing != null) {
heap.remove(existing);
existing.count = estimate;
heap.offer(existing);
return;
}
if (heap.size() < k) {
Entry e = new Entry(term, estimate);
heap.offer(e); byTerm.put(term, e);
} else if (heap.peek().count < estimate) {
Entry evicted = heap.poll();
byTerm.remove(evicted.term);
Entry e = new Entry(term, estimate);
heap.offer(e); byTerm.put(term, e);
}
}
public List<Entry> snapshot() {
return heap.stream()
.sorted(Comparator.comparingLong((Entry e) -> e.count).reversed())
.toList();
}
public static final class Entry {
final String term; long count;
Entry(String t, long c) { this.term = t; this.count = c; }
}
}
public class StreamProcessor implements Runnable {
private final Consumer<String, Tweet> consumer; // Kafka-style consumer
private final CountMinSketch cms;
private final TopKTracker topK;
private final HotTopicCache cache;
private final Clock clock;
private Instant nextSnapshot;
public StreamProcessor(Consumer<String, Tweet> consumer, CountMinSketch cms,
TopKTracker topK, HotTopicCache cache, Clock clock) {
this.consumer = consumer; this.cms = cms; this.topK = topK;
this.cache = cache; this.clock = clock;
this.nextSnapshot = clock.instant().plusSeconds(10);
}
@Override public void run() {
while (!Thread.currentThread().isInterrupted()) {
ConsumerRecords<String, Tweet> batch = consumer.poll(Duration.ofMillis(200));
for (ConsumerRecord<String, Tweet> r : batch) {
Tweet t = r.value();
for (String tag : t.hashtags()) {
cms.add(tag, 1);
topK.offer(tag, cms.estimate(tag));
}
}
Instant now = clock.instant();
if (!now.isBefore(nextSnapshot)) {
cache.publish(now, topK.snapshot());
consumer.commitAsync();
nextSnapshot = now.plusSeconds(10);
}
}
}
}
Key design decisions & trade-offs
- Counting strategy — Chosen: Count-Min Sketch over exact counts. Fixed memory per window and single-pass updates with bounded over-counting. Exact counts would require a global counter per distinct term and cross-node shuffling, which does not scale to a firehose.
- Top-K maintenance — Chosen: MinHeap of size K keyed by estimated count. Amortized O(log K) per update and a trivial snapshot operation. Cost is a map plus heap per task, but K is small (tens) so the overhead is negligible.
- Freshness cadence — Chosen: 5-minute sliding window, 10-second refresh. Matches human perception of "trending" while keeping stream state small. Tighter windows increase snapshot pressure on the cache and the CMS error rate for short-lived spikes.
- Read-side delivery — Chosen: Hot topic cache keyed by window_end_ts. Keeps the read API stateless and lets the CDN cache the JSON for tens of seconds. The tradeoff is that any correction to a snapshot requires publishing a new key rather than mutating an existing one.
- Ingest durability — Chosen: Async producer with local disk spool on Kafka slowdown. Analytics must never block user-visible tweeting. Accepts the risk of dropping analytics events on extended Kafka outage in exchange for zero back-pressure on the write path.
Interview follow-ups
- Add per-language and per-country heavy-hitter streams with a merge step at read time
- Detect spam / bot amplification and exclude flagged accounts before they feed the sketch
- Introduce a personalized trending overlay by joining heavy hitters with a user's follow graph
- Replace the fixed 5-minute window with an adaptive window that shortens during bursty events
- Add a cold analytics path that writes exact counts to a warehouse for backfill and correctness audits