Pub/Sub vs Queue
Fanout vs point-to-point, consumer groups, delivery semantics.
This interactive explanation is built for system design interview prep: step through Pub/Sub vs Queue, watch the internal state change, and connect the concept to real distributed-system trade-offs.
Overview
Pub/Sub and work queues look similar from ten feet away — both accept messages from producers and hand them to consumers — but they solve different problems and get the design wrong if you confuse them. Pub/Sub is fan-out: one published message is delivered to every subscriber of the topic. Use it when multiple independent services need the same event (analytics, audit, cache invalidation, notifications). Work queues are fan-in to a competing-consumer pool: each message is handed to exactly one worker from a group. Use it when you have a unit of work that needs to be done once (send an email, resize an image, charge a card). The two models have different delivery semantics, different backpressure behaviors, and different failure recovery. Kafka gives you both on one substrate (topic + consumer group), but many systems still run RabbitMQ for queues and Kafka for streams because the operational shape is different. Getting the choice right is the difference between delivering a receipt twice and losing it entirely.
How it works
In Pub/Sub, the broker maintains a topic — a named log or channel — and a subscription per consumer. When a producer publishes, the broker appends the message to the topic and delivers a copy to every subscription. Subscribers track their position independently; one slow subscriber does not block the others. Delivery is typically at-least-once: the broker retries until the subscriber acks. Order is preserved per partition, not globally. Classic examples are Kafka topics with multiple consumer groups, or Google Cloud Pub/Sub with push subscriptions. In a work queue, the broker enqueues messages and workers pull them. When a worker picks a message, the broker hides it from other workers using a visibility timeout (SQS) or a consumer-group lock (Kafka). If the worker acks within the timeout, the message is deleted; if it crashes, the timeout expires and another worker retries. Multiple workers compete for messages, so throughput scales linearly with worker count until the broker becomes the bottleneck. Dead-letter queues catch poison messages after N retries. The distinguishing operational concerns are ordering (usually single-partition for strict order), idempotency (at-least-once means duplicates; consumers must handle them), and backpressure (slow consumers grow the queue until disk or retention limits kick in).
Implementation
public class Topic {
private final String name;
private final List<Subscriber> subscribers = new CopyOnWriteArrayList<>();
public Topic(String name) { this.name = name; }
public void subscribe(Subscriber s) { subscribers.add(s); }
public void publish(Message m) {
// Fan-out: each subscriber gets its own copy.
for (Subscriber s : subscribers) {
try { s.onMessage(m); }
catch (RuntimeException e) {
// At-least-once: broker retries via internal redelivery logic
s.enqueueRetry(m);
}
}
}
}
public interface Subscriber {
void onMessage(Message m);
default void enqueueRetry(Message m) { onMessage(m); }
}
class AnalyticsSubscriber implements Subscriber {
public void onMessage(Message m) { /* write to warehouse */ }
}
class AuditSubscriber implements Subscriber {
public void onMessage(Message m) { /* write to audit log */ }
}
public class Queue {
private final Deque<Envelope> inflight = new ArrayDeque<>();
private final Map<String, Envelope> leased = new ConcurrentHashMap<>();
private final long visibilityMs = 30_000;
public synchronized void send(Message m) {
inflight.add(new Envelope(UUID.randomUUID().toString(), m, 0));
notifyAll();
}
/** Poll one message; competing consumers each take a different one. */
public synchronized Envelope poll(long waitMs) throws InterruptedException {
long deadline = System.currentTimeMillis() + waitMs;
while (inflight.isEmpty()) {
long left = deadline - System.currentTimeMillis();
if (left <= 0) return null;
wait(left);
}
Envelope e = inflight.poll();
e.leaseExpiresAt = System.currentTimeMillis() + visibilityMs;
leased.put(e.id, e);
return e;
}
public void ack(String id) { leased.remove(id); }
/** Called by a timer: redeliver any lease that expired. */
public synchronized void reclaimExpiredLeases() {
long now = System.currentTimeMillis();
leased.values().removeIf(e -> {
if (e.leaseExpiresAt < now) { inflight.add(e); return true; }
return false;
});
notifyAll();
}
static class Envelope {
final String id; final Message msg; long leaseExpiresAt;
Envelope(String id, Message m, long t) { this.id = id; this.msg = m; this.leaseExpiresAt = t; }
}
}
Complexity
- publish:
O(subscribers) - queue poll:
O(1) amortized - visibility timeout redelivery:
O(leased) periodic - per-subscriber memory:
message backlog size
Key design decisions & trade-offs
- Pub/Sub vs Queue — Chosen: Pub/Sub for multi-consumer events, Queue for work units. Fan-out ensures every service reacts; competing consumers ensures one unit of work happens once.
- Delivery guarantee — Chosen: At-least-once with idempotent consumers. Exactly-once is expensive and fragile; idempotency keys at the consumer are cheaper and more robust.
- Ordering — Chosen: Per-key (partitioned) ordering. Global ordering forces a single partition and caps throughput; per-key ordering keeps causality where it matters.
Common pitfalls
- Treating a pub/sub topic as a queue and losing messages to consumer groups rebalancing
- Setting visibility timeout shorter than the p99 processing time (messages reappear)
- No dead-letter queue: poison messages loop forever and block the partition
- Ignoring backpressure: slow subscribers grow the broker's disk until retention drops data
Interview follow-ups
- Add consumer lag monitoring and autoscaling based on it
- Implement exactly-once semantics via transactional outbox + idempotency keys
- Add DLQ with replay tooling for poison messages
- Use sticky partitioning to preserve per-user ordering at high throughput
Recommended reading
- Alex Petrov, Database Internals — storage engines and distributed systems internals.
- Martin Kleppmann, Designing Data-Intensive Applications (DDIA) — data models, replication, partitioning, consistency.
- The System Design Primer — high-level design building blocks.
- Foundational networking + web-security references (TCP/IP, TLS 1.3, OWASP Top 10).