HTTP/1.1 · 2 · 3
Head-of-line blocking across 6 resources under packet loss.
This interactive explanation is built for system design interview prep: step through HTTP/1.1 · 2 · 3, watch the internal state change, and connect the concept to real distributed-system trade-offs.
Overview
HTTP has been revised twice in roughly a decade, and each version addresses a specific weakness of its predecessor. HTTP/1.1 (1997) is a text protocol that handles one request at a time per connection: keep-alive lets a connection be reused, but the next response cannot start until the current one finishes. Browsers worked around this by opening up to 6 parallel connections per host, and each of those connections costs its own TCP (and TLS) handshake. HTTP/2 (2015) is a binary, multiplexed protocol on a single TCP connection: many requests and responses interleave as streams, with per-stream flow control and header compression (HPACK). HTTP/3 (2022) carries HTTP/2-style semantics over QUIC, a UDP-based transport with built-in TLS 1.3 and 0-RTT resumption. HTTP/3 removes TCP head-of-line blocking because each request runs on its own QUIC stream, and QUIC recovers loss per stream instead of stalling the whole connection. Knowing which version is negotiated (via ALPN during the TLS handshake, or advertised through Alt-Svc headers) and what that version's limits are is essential to sizing connection pools, setting timeouts, and debugging tail latency.
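To make the Alt-Svc part of version discovery concrete, here is a minimal sketch of reading an Alt-Svc header value to see whether a server advertises HTTP/3. The class name AltSvcDemo, the record Alternative, and the sample header are illustrative assumptions, and the parser only handles the common `protocol="authority"; ma=seconds` shape, not the full grammar.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy Alt-Svc parser (hypothetical helper): covers only the common entry shape. */
class AltSvcDemo {
    /** One advertised alternative: ALPN protocol id, authority, and freshness in seconds. */
    record Alternative(String protocol, String authority, long maxAgeSeconds) {}

    static List<Alternative> parse(String headerValue) {
        List<Alternative> result = new ArrayList<>();
        if (headerValue == null || headerValue.trim().equalsIgnoreCase("clear")) {
            return result; // "clear" withdraws previously advertised alternatives
        }
        // Simplification: assumes no commas or semicolons inside quoted values.
        for (String entry : headerValue.split(",")) {
            String[] parts = entry.trim().split(";");
            String[] kv = parts[0].trim().split("=", 2);
            if (kv.length != 2) {
                continue; // skip malformed entries
            }
            long maxAge = 86_400; // RFC 7838: default freshness is 24 hours when "ma" is absent
            for (int i = 1; i < parts.length; i++) {
                String[] param = parts[i].trim().split("=", 2);
                if (param.length == 2 && param[0].trim().equalsIgnoreCase("ma")) {
                    maxAge = Long.parseLong(unquote(param[1].trim()));
                }
            }
            result.add(new Alternative(kv[0].trim(), unquote(kv[1].trim()), maxAge));
        }
        return result;
    }

    /** True if any entry advertises the "h3" ALPN identifier. */
    static boolean offersHttp3(String headerValue) {
        return parse(headerValue).stream().anyMatch(a -> a.protocol().equals("h3"));
    }

    private static String unquote(String s) {
        boolean quoted = s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"");
        return quoted ? s.substring(1, s.length() - 1) : s;
    }

    public static void main(String[] args) {
        // Sample header value invented for illustration.
        String header = "h3=\":443\"; ma=3600, h2=\"alt.example.com:8443\"";
        parse(header).forEach(System.out::println);
        System.out.println("offers HTTP/3: " + offersHttp3(header));
    }
}
```

A client would typically cache the advertised alternative for the stated freshness period and try QUIC on a later request, keeping the existing TCP connection as a fallback.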
How it works
HTTP/1.1 request lifecycle: open a TCP connection (1 RTT), complete the TLS handshake if HTTPS (1 more RTT with TLS 1.3), send a text request, receive a text response, then keep the connection open for the next request to the same host. Headers are verbose and repeated on every request. Pipelining exists in the spec but is effectively dead: responses must come back in request order, so one slow response blocks every response queued behind it, and many intermediaries handled it poorly.

HTTP/2 runs on the same TCP + TLS substrate, but the wire format is framed: each HTTP message is one or more frames (HEADERS, DATA, etc.) tagged with a stream ID. Many streams coexist on one connection, and each is independently flow-controlled. Headers are compressed with HPACK, a stateful encoder that replaces repeated header fields with indexes into a dynamic table, so a 700-byte cookie header can shrink to a byte or two on the second request. The catch: TCP still delivers one ordered byte stream, so a single lost packet stalls every HTTP/2 stream on that connection until it is retransmitted (head-of-line blocking).

HTTP/3 fixes this by replacing TCP with QUIC, which runs over UDP and carries its own reliability, congestion control, and encryption. Each HTTP/3 request maps to a QUIC stream with independent ordering and loss recovery, so a lost packet delays only the streams whose data it carried. QUIC also supports 0-RTT resumption and connection migration (a phone switching from WiFi to cellular keeps the same connection because QUIC identifies it by connection ID, not by IP address and port).

Java's built-in HttpClient negotiates HTTP/2 via ALPN automatically when you request HTTP_2 and falls back to HTTP/1.1 if the server does not support it; HTTP/3 support has historically required a third-party library. Multiplexing changes how you tune servers: fewer connections, much higher stream concurrency per connection.
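The head-of-line blocking difference can be illustrated with a toy model, not a real network stack. The sketch assumes 6 streams of 3 packets each, interleaved round-robin, one packet arriving per millisecond, and a single lost packet whose retransmission arrives at 100 ms. It also assumes each packet carries data for exactly one stream, which real QUIC packets need not do. All names and numbers here are illustrative.

```java
import java.util.Arrays;

/** Toy model of stream completion times under one packet loss (illustrative only). */
class HolBlockingDemo {
    /** Packet i carries data for stream i % streams. */
    static int[] roundRobin(int packets, int streams) {
        int[] owner = new int[packets];
        for (int i = 0; i < packets; i++) {
            owner[i] = i % streams;
        }
        return owner;
    }

    /** Packet i arrives at i ms, except the lost packet, which arrives when retransmitted. */
    static long[] arrivals(int packets, int lostIndex, long retransmitArrivalMs) {
        long[] t = new long[packets];
        for (int i = 0; i < packets; i++) {
            t[i] = i;
        }
        t[lostIndex] = retransmitArrivalMs;
        return t;
    }

    /** One ordered byte stream: a packet is usable only after every earlier packet has arrived. */
    static long[] completionOverTcp(int[] streamOfPacket, long[] arrivalMs, int streams) {
        long[] done = new long[streams];
        long prefixReady = 0; // time at which all packets up to the current one have arrived
        for (int i = 0; i < streamOfPacket.length; i++) {
            prefixReady = Math.max(prefixReady, arrivalMs[i]);
            done[streamOfPacket[i]] = prefixReady;
        }
        return done;
    }

    /** Independent streams: a packet waits only for earlier packets of its own stream. */
    static long[] completionOverQuic(int[] streamOfPacket, long[] arrivalMs, int streams) {
        long[] done = new long[streams];
        for (int i = 0; i < streamOfPacket.length; i++) {
            int s = streamOfPacket[i];
            done[s] = Math.max(done[s], arrivalMs[i]);
        }
        return done;
    }

    public static void main(String[] args) {
        int streams = 6, packets = 18;
        int[] owner = roundRobin(packets, streams);
        long[] arrival = arrivals(packets, 2, 100); // packet 2 (stream 2) is lost once
        // prints: TCP  [100, 100, 100, 100, 100, 100]
        System.out.println("TCP  " + Arrays.toString(completionOverTcp(owner, arrival, streams)));
        // prints: QUIC [12, 13, 100, 15, 16, 17]
        System.out.println("QUIC " + Arrays.toString(completionOverQuic(owner, arrival, streams)));
    }
}
```

In this model every stream finishes at 100 ms over TCP, while over QUIC only the stream that lost a packet waits for the retransmission. Congestion control, which slows both transports after loss, is deliberately left out.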
Implementation
// Http2Client.java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class Http2Client {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2) // prefer HTTP/2 via ALPN, fall back to 1.1
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        // No Accept-Encoding header: java.net.http does not decompress bodies automatically,
        // so asking for gzip would leave compressed bytes in the String body.
        HttpRequest req = HttpRequest.newBuilder(URI.create("https://www.example.com/"))
                .header("User-Agent", "http2-demo/1.0")
                .GET().build();
        HttpResponse<String> resp = client.send(req, HttpResponse.BodyHandlers.ofString());
        System.out.println("negotiated: " + resp.version()); // HTTP_2 if both sides support it
        System.out.println("status: " + resp.statusCode());
        resp.headers().map().forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
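// Http2Parallel.java (Stream.toList() requires Java 16 or later)
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class Http2Parallel {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();
        List<URI> urls = List.of(
                URI.create("https://api.example.com/a"),
                URI.create("https://api.example.com/b"),
                URI.create("https://api.example.com/c"),
                URI.create("https://api.example.com/d")
        );
        // If the server negotiates HTTP/2, these requests can multiplex as streams
        // on a single TCP + TLS connection instead of opening one connection each.
        List<CompletableFuture<HttpResponse<String>>> futures = urls.stream()
                .map(u -> HttpRequest.newBuilder(u).GET().build())
                .map(r -> client.sendAsync(r, HttpResponse.BodyHandlers.ofString()))
                .toList();
        // Wait for all responses; join() throws CompletionException if any request failed.
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        futures.forEach(f -> System.out.println(f.join().statusCode()));
    }
}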
Complexity
- HTTP/1.1 first request: TCP + TLS = 2 RTT
- HTTP/2 first request: TCP + TLS = 2 RTT, then many streams
- HTTP/3 first request: QUIC handshake = 1 RTT (or 0-RTT resume)
- Header overhead, HTTP/1.1: full text every request
- Header overhead, HTTP/2 and HTTP/3: HPACK/QPACK, often 2-10 bytes
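As a worked example of the handshake costs above, the sketch below turns round-trip counts into time to first response byte. It assumes a 50 ms RTT and adds one round trip for the request and response themselves; server processing and transfer time are ignored. The class and method names are illustrative.

```java
/** Back-of-the-envelope latency from handshake round trips (illustrative arithmetic only). */
class FirstResponseLatency {
    /** Handshake round trips plus one round trip for the request and its first response byte. */
    static long firstByteMs(long rttMs, int handshakeRoundTrips) {
        return rttMs * (handshakeRoundTrips + 1);
    }

    public static void main(String[] args) {
        long rtt = 50; // assumed round-trip time in milliseconds
        // TCP (1 RTT) + TLS 1.3 (1 RTT), then the request: 150 ms
        System.out.println("HTTP/1.1 or HTTP/2: " + firstByteMs(rtt, 2) + " ms");
        // QUIC combines transport and TLS setup in 1 RTT, then the request: 100 ms
        System.out.println("HTTP/3, new connection: " + firstByteMs(rtt, 1) + " ms");
        // 0-RTT resumption sends the request in the first flight: 50 ms
        System.out.println("HTTP/3, 0-RTT resume: " + firstByteMs(rtt, 0) + " ms");
    }
}
```

The gap grows linearly with RTT, which is why the saving matters most on high-latency mobile links.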
Key design decisions & trade-offs
- HTTP/1.1 vs 2 vs 3 — Chosen: HTTP/2 as default, HTTP/3 for mobile and lossy links. HTTP/2 wins on typical fixed-line traffic; HTTP/3 wins where TCP HoL blocking hurts.
- Connection pooling — Chosen: One HTTP/2 connection per origin, many per 1.1. Multiplexing makes extra connections wasteful on HTTP/2; 1.1 needs parallelism from multiple connections.
- Server push — Chosen: Avoid; use 103 Early Hints instead. Server push was hard to get right and is deprecated in major browsers; Early Hints gives the preload benefit without the complexity.
Common pitfalls
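- Assuming HTTP/2 fixes all performance problems: TCP head-of-line blocking still hurts under packet loss
- Leaving keep-alive idle timeouts too long and exhausting server file descriptors
- Uppercasing header names in HTTP/2: the spec requires lowercase field names, and a message with uppercase names is treated as malformed
- Not tuning SETTINGS_MAX_CONCURRENT_STREAMS: the protocol sets no limit by default, but many servers advertise a value around 100, which can bottleneck a single-connection client
- Using cleartext HTTP for anything non-trivial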
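One way to stay under a server's concurrent-stream limit is to cap in-flight requests on the client side. The following is a minimal sketch using a semaphore; StreamLimiter is a hypothetical helper, not part of java.net.http, and the right limit depends on what the server actually advertises.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Caps the number of asynchronous calls in flight at once (hypothetical helper). */
class StreamLimiter {
    private final Semaphore permits;

    StreamLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks until a permit is free, starts the call, and frees the permit when it completes. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> call) {
        permits.acquireUninterruptibly();
        try {
            return call.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the call never started, so return the permit
            throw e;
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        StreamLimiter limiter = new StreamLimiter(2);
        CompletableFuture<String> pending = new CompletableFuture<>(); // stands in for a request
        limiter.submit(() -> pending);
        System.out.println("free while one call is in flight: " + limiter.availablePermits()); // 1
        pending.complete("done");
        System.out.println("free after completion: " + limiter.availablePermits()); // 2
    }
}
```

With the JDK client, a call could be wrapped as `limiter.submit(() -> client.sendAsync(request, HttpResponse.BodyHandlers.ofString()))`, where `client` and `request` are the objects from the Implementation section.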
Interview follow-ups
- Enable HTTP/3 via Alt-Svc on the server
- Tune connection pool size per origin
- Add HSTS and HTTPS-only rollout
- Use Brotli compression for text bodies