21-04-2026

How HTTP/2 Actually Works


HTTP/2 is often explained with one sentence: it multiplexes many requests over one TCP connection. That sentence is true, but it is not enough to explain why the protocol exists, why browsers and servers had to change so much to support it, or why HTTP/3 began to replace it on performance-critical paths within a decade.

HTTP/1.1 already supported persistent connections. It could reuse one TCP connection for many requests. The real problem was that the protocol still treated the connection as a serial byte stream with no native message interleaving. Browsers worked around that by opening several parallel TCP connections per origin, sharding static assets across hostnames, concatenating JavaScript, inlining CSS, and using image sprites to reduce request count. Those were not elegant optimisations. They were coping strategies for a protocol that made concurrency awkward.

HTTP/2 changed the shape of the problem. Instead of sending textual requests and responses directly on the TCP stream, it turned the connection into a framed transport with independent logical streams. Each stream carried one request-response exchange, and the bytes for different streams could be interleaved safely because every frame included a stream identifier. That let the browser fetch HTML, CSS, JavaScript, fonts, and images together without opening six parallel TCP connections to the same origin.

This article looks at the mechanics beneath that change: the binary framing layer, stream lifecycle, HPACK header compression, flow control, stream prioritisation, server push, ALPN negotiation, and the awkward fact that all of this still sits on top of one ordered TCP byte stream. We will also connect those internals to real operator behaviour, browser tuning, CDN deployments in London and Frankfurt, and the specific reasons HTTP/2 improved the web while still keeping one major performance pathology alive.

HTTP/1.1 Was Limited by Message Ordering, Not by a Lack of Reuse

HTTP/1.1 introduced persistent connections and request pipelining, so on paper it already knew how to reuse a TCP session. In practice, pipelining was rarely enabled because responses still had to come back in order. If the browser queued five requests and the first one generated a slow response, the following four were trapped behind it even if the server could have produced them immediately. This was application-layer head-of-line blocking.

A simple example shows the problem. Imagine a browser in Athens requesting:

/index.html
/app.css
/app.js
/logo.svg

On a single HTTP/1.1 connection, the server could not send the bytes for /app.css ahead of /index.html if the browser had pipelined requests and expected ordered responses. Browsers therefore defaulted to multiple TCP connections, typically six per origin, so slow resources did not block unrelated ones.

That workaround had costs:

more TCP handshakes
more TLS handshakes, which cost an extra round trip each before TLS 1.3
more kernel socket state on both sides
worse congestion behaviour because each connection had its own congestion window
more pressure on ephemeral ports and middleboxes

The browser community spent years building performance guidance around these limitations. "Domain sharding" spread assets across img.example.eu, static.example.eu, and cdn.example.eu so the browser could legally open more parallel sockets. Bundling reduced request count but made cache invalidation worse. Inlining reduced round trips but bloated the HTML. None of this was ideal. It was just rational behaviour under HTTP/1.1's constraints.

HTTP/2's framing layer was designed to remove the need for these workarounds while preserving the basic HTTP semantics of methods, headers, status codes, URIs, and cache behaviour.

HTTP/2 Starts with a Binary Framing Layer

The most important design shift in HTTP/2 is that HTTP messages are no longer transmitted as plain textual request and response blocks on the wire. Instead, the connection carries a sequence of binary frames. Each frame has a fixed 9-byte header followed by a payload.

Conceptually, a frame looks like this:

+-----------------------------------------------+
| Length (24) | Type (8) | Flags (8)            |
+-----------------------------------------------+
| R | Stream Identifier (31)                    |
+-----------------------------------------------+
| Frame Payload (variable)                      |
+-----------------------------------------------+

Those fields give the protocol the machinery HTTP/1.1 lacked:

Length says how many payload bytes follow
Type says what kind of frame this is
Flags refine behaviour for that frame type
Stream Identifier says which logical stream this frame belongs to
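
The field layout above can be sketched as a small parser. This is a minimal illustration, not a full HTTP/2 implementation: the names FrameHeader and parse_frame_header are made up for this article, and the type table lists only the frame types discussed here.

```python
from typing import NamedTuple

# Frame type codes as assigned by the HTTP/2 specification.
FRAME_TYPES = {
    0x0: "DATA",
    0x1: "HEADERS",
    0x3: "RST_STREAM",
    0x4: "SETTINGS",
    0x5: "PUSH_PROMISE",
    0x6: "PING",
    0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE",
}


class FrameHeader(NamedTuple):
    """Hypothetical container for the four header fields."""
    length: int
    frame_type: str
    flags: int
    stream_id: int


def parse_frame_header(data: bytes) -> FrameHeader:
    """Decode the fixed 9-byte header that precedes every frame payload."""
    if len(data) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    length = int.from_bytes(data[0:3], "big")        # 24-bit payload length
    type_code, flags = data[3], data[4]              # 8 bits each
    # Mask off the reserved top bit to get the 31-bit stream identifier.
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    name = FRAME_TYPES.get(type_code, f"UNKNOWN(0x{type_code:02x})")
    return FrameHeader(length, name, flags, stream_id)


# A HEADERS frame on stream 1 with END_STREAM (0x1) and END_HEADERS (0x4) set.
example = bytes.fromhex("00000d010500000001")
header = parse_frame_header(example)
```

Here header.length is 13, header.frame_type is "HEADERS", and header.stream_id is 1. Everything after those nine bytes is payload whose meaning depends on the type.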

The connection is therefore one byte stream at the TCP layer, but many logical conversations at the HTTP layer. It can carry request headers on stream 1, request headers on stream 3, response body data on stream 1, and response headers on stream 5, all interleaved safely because every frame states its stream ID explicitly.

This is why HTTP/2 is sometimes described as a framing protocol plus familiar HTTP semantics. The method is still GET, the authority is still the host, the status code is still 200, but those fields now travel in structured frames rather than raw ASCII blocks.

The most common frame types are:

HEADERS for request or response header blocks
DATA for body bytes
SETTINGS for connection-level parameters
WINDOW_UPDATE for flow control credit
PING for liveness and RTT measurement
RST_STREAM to abort one stream
GOAWAY to close the connection gracefully

That frame vocabulary is the real foundation of HTTP/2. Multiplexing is a consequence of the framing model.

Streams Are Independent Logical Channels Inside One Connection

A stream is a bidirectional logical channel inside the shared connection. Every HTTP request-response exchange gets its own stream ID. Clients use odd-numbered stream IDs; servers use even-numbered ones when they initiate streams, which mostly mattered for server push.

The lifecycle is not complicated, but it is important:

the client opens a stream by sending HEADERS
either side may send DATA frames if the message has a body
END_STREAM marks one direction as finished
the stream transitions through open, half-closed, and closed states
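
The lifecycle steps above can be sketched as a tiny state function seen from one endpoint. This is a deliberate simplification: it ignores the reserved states used by server push and all error handling, and the names StreamState and next_state are hypothetical.

```python
from enum import Enum


class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


def next_state(state, direction, frame, end_stream=False):
    """Advance one stream's state from the local endpoint's point of view.

    direction is "send" or "recv"; frame is a frame type name.
    """
    if frame == "RST_STREAM":
        return StreamState.CLOSED            # either side can abort at once
    if state is StreamState.IDLE and frame == "HEADERS":
        state = StreamState.OPEN             # HEADERS opens the stream
    if end_stream:
        if state is StreamState.OPEN:
            return (StreamState.HALF_CLOSED_LOCAL if direction == "send"
                    else StreamState.HALF_CLOSED_REMOTE)
        if state is StreamState.HALF_CLOSED_LOCAL and direction == "recv":
            return StreamState.CLOSED
        if state is StreamState.HALF_CLOSED_REMOTE and direction == "send":
            return StreamState.CLOSED
    return state


# A client issuing a body-less GET and then reading the response.
s = StreamState.IDLE
s = next_state(s, "send", "HEADERS", end_stream=True)   # request is complete
after_request = s
s = next_state(s, "recv", "HEADERS")
s = next_state(s, "recv", "DATA")
s = next_state(s, "recv", "DATA", end_stream=True)      # response is complete
after_response = s
```

For the client, the stream is half-closed (local) as soon as the request has been sent with END_STREAM, and closed once the server's final DATA frame arrives.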

For a simple GET, the browser might send:

HEADERS stream=1 END_HEADERS END_STREAM

The server might then reply:

HEADERS stream=1 END_HEADERS
DATA    stream=1
DATA    stream=1 END_STREAM

At the same time, the browser can already have streams 3, 5, and 7 open for other resources. The server can send bytes from all of them in an interleaved order such as:

HEADERS stream=1
HEADERS stream=3
DATA    stream=1
HEADERS stream=5
DATA    stream=3
DATA    stream=5
DATA    stream=1 END_STREAM

That interleaving is the practical performance win. If one response is large, the others do not have to wait for it to finish before the server emits their first bytes.
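
To make the interleaving concrete, here is a minimal sketch of what a receiver does with such a sequence: it sorts DATA payloads into per-stream buffers using nothing but the stream ID. The tuple layout and the demultiplex function are assumptions made for illustration; a real stack works on parsed binary frames.

```python
from collections import defaultdict


def demultiplex(frames):
    """Reassemble per-stream bodies from an interleaved frame sequence.

    frames is an iterable of (frame_type, stream_id, payload, end_stream).
    Returns (bodies, finished), where finished lists stream IDs in the
    order in which their END_STREAM flag was seen.
    """
    bodies = defaultdict(bytearray)
    finished = []
    for frame_type, stream_id, payload, end_stream in frames:
        if frame_type == "DATA":
            bodies[stream_id] += payload
        if end_stream:
            finished.append(stream_id)
    return {sid: bytes(buf) for sid, buf in bodies.items()}, finished


interleaved = [
    ("HEADERS", 1, b"", False),
    ("HEADERS", 3, b"", False),
    ("DATA", 1, b"<html>", False),
    ("HEADERS", 5, b"", False),
    ("DATA", 3, b"body{}", False),
    ("DATA", 5, b"js();", True),
    ("DATA", 1, b"</html>", True),
]
bodies, finished = demultiplex(interleaved)
```

Stream 5 finishes before stream 1 even though stream 1 started first, which is exactly the ordering HTTP/1.1 pipelining could not express.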

This model also changes server design. Under HTTP/1.1, one connection often mapped to one in-flight response. Under HTTP/2, one TLS session may contain dozens of concurrent streams, which means the HTTP stack, TLS stack, event loop, and prioritisation logic all need to coordinate correctly. Reverse proxies such as nginx, Envoy, HAProxy, and CDN edge servers had to learn stream scheduling instead of only socket scheduling.

The Connection Begins with a Preface and SETTINGS Exchange

HTTP/2 does not simply start speaking frames without agreement. The client sends a connection preface followed by a SETTINGS frame. The preface begins with a fixed 24-octet string, which is sent on every HTTP/2 connection and is easiest to see when testing cleartext h2c directly:

PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n

In ordinary HTTPS use, browsers negotiate HTTP/2 during the TLS handshake using ALPN, Application-Layer Protocol Negotiation. If ALPN agrees on h2, both sides know the encrypted connection will carry HTTP/2 frames.

Immediately after connection setup, each side exchanges SETTINGS. These advertise local preferences and limits such as:

maximum frame size
initial flow control window
maximum concurrent streams
whether server push is enabled
header table size for HPACK

That exchange matters because HTTP/2 is full of bounded state. A client does not want an origin to open unlimited streams. A server does not want a client to send arbitrarily large header blocks. Both sides need to know the initial credit for flow control before they start sending much data.
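
As a rough sketch of what that opening exchange looks like in bytes, the following builds the client preface string and a SETTINGS frame. The setting identifiers are the ones assigned by the HTTP/2 specification; the function name encode_settings_frame and the example values are illustrative assumptions.

```python
import struct

# The fixed 24-octet string that starts every client connection.
CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

# Setting identifiers from the HTTP/2 specification.
SETTINGS_IDS = {
    "HEADER_TABLE_SIZE": 0x1,
    "ENABLE_PUSH": 0x2,
    "MAX_CONCURRENT_STREAMS": 0x3,
    "INITIAL_WINDOW_SIZE": 0x4,
    "MAX_FRAME_SIZE": 0x5,
}


def encode_settings_frame(settings: dict) -> bytes:
    """Serialise a SETTINGS frame: 9-byte header plus 6 bytes per setting."""
    payload = b"".join(
        struct.pack(">HI", SETTINGS_IDS[name], value)   # 16-bit id, 32-bit value
        for name, value in settings.items()
    )
    header = (
        len(payload).to_bytes(3, "big")   # payload length
        + bytes([0x4, 0x0])               # type SETTINGS, no flags
        + (0).to_bytes(4, "big")          # SETTINGS always uses stream 0
    )
    return header + payload


# Example values chosen for illustration only.
frame = encode_settings_frame(
    {"MAX_CONCURRENT_STREAMS": 100, "INITIAL_WINDOW_SIZE": 65_535}
)
opening_bytes = CLIENT_PREFACE + frame
```

SETTINGS is a connection-level frame, which is why it travels on stream 0 rather than on any request stream.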

In practice, you can see the negotiation with tools like:

curl -I --http2 https://example.eu
openssl s_client -alpn h2 -connect example.eu:443

The ALPN result is one of the easiest ways to confirm whether a site is really serving HTTP/2 at the edge, especially when a CDN terminates TLS in Paris or Frankfurt and speaks some different protocol toward the origin behind it.
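
The same check can be scripted. The sketch below uses Python's standard ssl module to offer h2 and http/1.1 and report which one the server selected. The helper names make_h2_context and negotiated_protocol are made up here, and the call needs network access to a real host, so treat it as a starting point rather than a finished probe.

```python
import socket
import ssl


def make_h2_context() -> ssl.SSLContext:
    """Build a client TLS context that offers h2 first, then http/1.1."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    return ctx


def negotiated_protocol(host: str, port: int = 443, timeout: float = 5.0):
    """Return the ALPN protocol the server chose, or None if it chose none."""
    ctx = make_h2_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


# Usage (requires network access):
#   negotiated_protocol("example.eu")
```

A return value of "h2" only describes the hop this client terminated on. It says nothing about what the edge speaks to the origin.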

HPACK Solved Repetitive Headers Without Reintroducing Compression Side Channels Blindly

Header compression was one of the less visible but highly practical parts of HTTP/2. Web requests carry a lot of repetitive metadata:

:method
:scheme
:authority
user-agent
accept
cookie
cache-control

Under HTTP/1.1, every request sent these as plain text again and again. That wasted bandwidth, especially on high-latency or mobile links where every byte still matters during the first round trips.

HTTP/2 uses HPACK for header compression. HPACK combines:

a static table of common header names and values
a dynamic table built during the connection
Huffman encoding for literal strings

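Pairs such as :method: GET and :scheme: https are in the static table, so even the first request can encode each of them as a single indexed byte. Values that are specific to the connection, such as :authority or user-agent, can be added to the dynamic table when first sent, and later requests can then refer to indexed entries instead of retransmitting the full strings. Repeated cookie values and common response headers also compress well once both sides have shared table state.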

A simplified example:

Request 1 headers:
  :method: GET
  :scheme: https
  :authority: assets.example.eu
  :path: /app.css
 
Request 2 headers:
  :method: GET
  :scheme: https
  :authority: assets.example.eu
  :path: /app.js

The second request can reference entries that are already in the static and dynamic tables and needs to transmit only the changed :path value as a literal.
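
A small piece of HPACK is easy to show precisely: the prefix-coded integer and the one-byte indexed header field built on top of it. The sketch below covers only that part. It does not implement Huffman coding, literal fields, or dynamic table management, and the function names are assumptions for this article.

```python
def encode_integer(value: int, prefix_bits: int, flags: int = 0) -> bytes:
    """Encode an integer with an N-bit prefix, as HPACK does.

    flags holds the pattern bits that sit above the prefix in the first byte.
    """
    max_prefix = (1 << prefix_bits) - 1
    if value < max_prefix:
        return bytes([flags | value])           # fits in the prefix
    out = bytearray([flags | max_prefix])       # prefix saturated
    value -= max_prefix
    while value >= 128:
        out.append((value % 128) + 128)         # 7 bits plus continuation bit
        value //= 128
    out.append(value)
    return bytes(out)


def encode_indexed(index: int) -> bytes:
    """Indexed header field: a leading 1 bit followed by a 7-bit-prefix index."""
    return encode_integer(index, 7, flags=0x80)


# A few entries from the HPACK static table.
STATIC_TABLE = {(":method", "GET"): 2, (":path", "/"): 4, (":scheme", "https"): 7}

method_get = encode_indexed(STATIC_TABLE[(":method", "GET")])      # b"\x82"
scheme_https = encode_indexed(STATIC_TABLE[(":scheme", "https")])  # b"\x87"
```

The dynamic table starts at index 62, so a header added during the first request can be sent as the single byte 0xbe in the second request, assuming it is still the most recently inserted entry.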

This reduced overhead is why HTTP/2 helped pages with many small requests. The benefit was not only fewer TCP connections. It was also less repeated header text on each request.

HPACK was also designed in the shadow of earlier compression side-channel concerns such as CRIME. The protocol separates header compression state from message bodies and gives implementations control over dynamic table size so they can limit memory and risk. Even so, operators still had to think carefully about cross-request compression behaviour, especially where attacker-controlled input and secrets could coexist in compressed header contexts.

Flow Control Exists at Both the Stream and Connection Level

Multiplexing introduces a new risk: one sender could overwhelm the receiver with data on one stream and starve memory or buffering for the rest. HTTP/2 therefore includes explicit flow control for DATA frames.

Two independent windows exist:

a per-stream window
a connection-wide window

Each side advertises how much DATA it is willing to receive. Sending consumes window credit. Receiving and processing data lets the receiver grant more credit with WINDOW_UPDATE.

Suppose the client advertises the default initial window of 65,535 bytes for each stream and for the overall connection. If the server sends 16 KB (16,384 bytes) of response body on stream 3, both the stream-level and connection-level windows shrink to 49,151 bytes. Once the client consumes those bytes, it may send WINDOW_UPDATE frames to enlarge the windows again.
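
The bookkeeping in that example can be sketched from the sender's side. The class below is a simplified model with hypothetical names. It tracks only window arithmetic and leaves out SETTINGS changes mid-connection, overflow checks, and frame size limits.

```python
DEFAULT_WINDOW = 65_535  # default initial window in bytes


class WindowExhausted(Exception):
    """Raised when a send would exceed the peer's advertised credit."""


class SendWindows:
    """Sender-side view of the stream and connection flow-control windows."""

    def __init__(self, initial_stream_window: int = DEFAULT_WINDOW):
        self.initial_stream_window = initial_stream_window
        self.connection = DEFAULT_WINDOW     # connection window starts at 65,535
        self.streams = {}

    def available(self, stream_id: int) -> int:
        """Bytes of DATA that may be sent on this stream right now."""
        stream = self.streams.setdefault(stream_id, self.initial_stream_window)
        return max(0, min(stream, self.connection))

    def send_data(self, stream_id: int, nbytes: int) -> None:
        """Sending DATA consumes credit from both windows."""
        if nbytes > self.available(stream_id):
            raise WindowExhausted(f"stream {stream_id} must wait for WINDOW_UPDATE")
        self.streams[stream_id] -= nbytes
        self.connection -= nbytes

    def window_update(self, stream_id: int, increment: int) -> None:
        """Stream 0 replenishes the connection window; other IDs, one stream."""
        if stream_id == 0:
            self.connection += increment
        else:
            current = self.streams.get(stream_id, self.initial_stream_window)
            self.streams[stream_id] = current + increment


windows = SendWindows()
windows.send_data(3, 16_384)          # 16 KB of response body on stream 3
stream3_left = windows.streams[3]     # 49_151
shared_left = windows.connection      # 49_151
stream5_budget = windows.available(5) # capped by the connection window
```

Stream 5 has sent nothing yet, but it can only use 49,151 bytes because the connection window is shared. That coupling is how one busy stream can slow its neighbours.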

This matters for real workloads:

a large video segment should not starve small CSS and JavaScript responses
a slow client should not force unbounded buffering in the server
one stalled stream should not permanently block the rest if scheduling is sane

Flow control is not congestion control. Congestion control belongs to TCP underneath. HTTP/2 flow control is an application-layer backpressure mechanism. It tells the peer how much data the HTTP stack is ready to accept, not what the network path can sustain.

That distinction is operationally useful. If a transfer is slow, the bottleneck might be:

TCP congestion window growth
receiver flow control window exhaustion
server prioritisation choices
CDN edge buffering
origin fetch latency

All of those can produce "slow HTTP/2" symptoms, but they live in different layers.

Prioritisation Looked Powerful on Paper and Messy in Production

HTTP/2 includes a stream prioritisation system. A client can express that one stream depends on another and can assign weights so the server knows which responses should receive bandwidth first. In theory, this lets the browser tell the server that CSS and blocking JavaScript matter more than below-the-fold images.

The model was a dependency tree with weights from 1 to 256. A stream could depend on another stream, optionally exclusively, and the server could schedule bytes accordingly.

A conceptual example:

stream 1: HTML
stream 3: CSS depends on 1, weight 220
stream 5: JS  depends on 1, weight 180
stream 7: IMG depends on 1, weight 20

In theory, the server should make fast progress on streams 3 and 5 before spending much effort on stream 7.
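
One way to read those weights: streams that depend on the same parent share the available capacity in proportion to their weights. The sketch below computes those proportions for the example. It is an idealised model under that assumption, with a made-up function name, and it is not how any particular server schedules.

```python
def bandwidth_shares(weights: dict) -> dict:
    """Split capacity among sibling streams in proportion to their weights."""
    total = sum(weights.values())
    return {stream_id: weight / total for stream_id, weight in weights.items()}


# Streams 3, 5, and 7 all depend on stream 1 in the example above.
shares = bandwidth_shares({3: 220, 5: 180, 7: 20})
css_share = round(shares[3], 4)
js_share = round(shares[5], 4)
img_share = round(shares[7], 4)
```

Under this model the CSS and JavaScript streams would receive roughly 52% and 43% of the capacity, and the image under 5%, once stream 1 has finished or cannot make progress.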

In practice, prioritisation was one of the least successful parts of HTTP/2. Reasons included:

browsers changed their prioritisation strategies over time
many servers and proxies implemented the tree only partially
some CDNs flattened or ignored priorities under load
origin fetch delays often dominated whatever clever scheduling the edge wanted to do

Operators discovered that "supports HTTP/2 prioritisation" did not guarantee that the whole delivery chain honoured it meaningfully. A browser in Berlin might send one dependency tree, the CDN edge might simplify it, the reverse proxy might buffer responses differently, and the application server might produce bytes in some unrelated order.

This is one reason modern performance work often focuses more on resource hints, critical CSS, caching, and transport-level latency reduction than on carefully tuning HTTP/2 priority trees.

Server Push Tried to Beat the Browser to the Next Request

Server push was one of HTTP/2's most ambitious features. The idea was simple: if the server knows the HTML response will immediately cause the browser to request /app.css, it can push that resource proactively without waiting for the browser to ask.

Mechanically, the server sends a PUSH_PROMISE on an existing stream, reserving a new stream that represents the pushed request, then follows with the pushed response headers and data.

The model seemed attractive, but it struggled in production:

the server often guessed wrong about what the browser actually needed
the browser cache might already hold the asset
push bandwidth could crowd out more important responses
CDN and proxy support varied
debugging was harder than explicit preload hints

A push that saves one round trip in theory can waste bandwidth in practice if the browser already has the object or if the user never renders the route that would have needed it. In an age of strong caching and increasingly sophisticated preload mechanisms, push often did more harm than good.

Browsers and server vendors gradually backed away from it for that reason. HTTP/2 server push is now effectively dead in mainstream web performance engineering. The lesson is useful: not every protocol feature that looks latency-friendly survives contact with real caches, CDNs, and user agents.

HTTP/2 Still Suffers from TCP Head-of-Line Blocking

HTTP/2 solved application-layer head-of-line blocking. It did not solve transport-layer head-of-line blocking.

All streams on one HTTP/2 connection still share one TCP session. TCP delivers bytes in order. If one TCP segment is lost, the receiver cannot pass later bytes up to the application until the missing bytes are retransmitted and delivered in sequence.

This matters because HTTP/2 interleaves frames for many streams on the same TCP byte stream. A lost segment may therefore delay progress for every active stream, not just the one whose data was conceptually "first".

Imagine stream 3 carries CSS and stream 7 carries an image. Their frames are interleaved:

segment A: DATA stream 3
segment B: DATA stream 7
segment C: DATA stream 3

If segment B is lost, TCP cannot present segment C to the HTTP/2 layer until B is retransmitted, even though stream 3 is logically independent. This is TCP head-of-line blocking.
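
The effect is easy to model. TCP hands the application only the contiguous prefix of what has arrived, regardless of which HTTP/2 stream the bytes belong to. The function below is a toy model with hypothetical names, not a TCP implementation.

```python
def deliverable(segments, arrived):
    """Return the segments TCP can pass up to the HTTP/2 layer.

    segments is the ordered list of (name, stream_id) as sent;
    arrived is the set of segment names that have reached the receiver.
    Delivery stops at the first gap.
    """
    delivered = []
    for name, stream_id in segments:
        if name not in arrived:
            break                      # everything after the gap must wait
        delivered.append((name, stream_id))
    return delivered


sent = [("A", 3), ("B", 7), ("C", 3)]
while_b_missing = deliverable(sent, {"A", "C"})        # [("A", 3)]
after_retransmit = deliverable(sent, {"A", "B", "C"})  # all three
```

Segment C has physically arrived in the first case, yet stream 3 cannot see it until the stream 7 data in segment B has been retransmitted.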

That is the central reason HTTP/3 moved HTTP onto QUIC over UDP. QUIC keeps reliability, congestion control, and encryption, but it delivers each stream's data independently, so a lost packet delays only the streams whose data it carried rather than forcing unrelated streams to wait behind it.

HTTP/2 was still a major improvement over HTTP/1.1 because it removed the need for many parallel TCP sessions. But on lossy mobile or Wi-Fi paths, one lost TCP segment can still stall the entire multiplexed connection. That was acceptable for a while, not ideal forever.

The Best Performance Gains Usually Came from Fewer Connections and Better Header Efficiency

When HTTP/2 launched, some commentary implied it would make websites automatically fast. That was never realistic. The practical wins came from specific mechanisms:

fewer TCP and TLS handshakes
better use of one warm congestion window
lower header overhead via HPACK
interleaving of small critical resources
removal of the need for many HTTP/1.1 workarounds

The biggest improvement often appeared on pages with many small assets. A site serving thirty objects from one origin over HTTPS could eliminate a lot of redundant setup and queuing. Large single-object downloads often changed less because they were already dominated by transfer size rather than request concurrency.

This is also why some old optimisation advice became harmful under HTTP/2:

domain sharding could reduce efficiency by forcing extra connections
aggressive bundling could hurt cache granularity
image sprites became less valuable

Performance teams had to unlearn some habits. A site tuned for HTTP/1.1 might need fewer hostnames and more natural asset separation once HTTP/2 arrived.

In practice, CDNs helped accelerate that transition. An edge platform in Amsterdam or Frankfurt could terminate HTTP/2 for browsers, reuse fewer but hotter origin connections, and hide some complexity from application teams. Many organisations "adopted HTTP/2" first at the CDN edge rather than by redesigning their entire origin stack.

Intermediaries Changed the Meaning of One End-to-End Connection

The web is full of intermediaries:

browser to CDN edge
CDN edge to regional shield
shield to origin proxy
proxy to application server

HTTP/2 is negotiated hop by hop, not magically end to end across every intermediary. A browser may speak HTTP/2 to a CDN edge in London. That edge may speak HTTP/1.1 to the origin, or HTTP/2, or gRPC over HTTP/2, depending on configuration. The user only sees the first hop.

This matters when people say "our site supports HTTP/2". The statement is true only for a specific segment unless you inspect the whole chain. The browser's experience depends heavily on the client-facing edge hop, so the claim is still useful, but it does not mean the entire backend stack is fully multiplexed.

Operators therefore need to think about buffering and protocol translation carefully. If the edge accepts many browser streams concurrently but serialises origin fetches poorly, the theoretical gain shrinks. If the CDN coalesces requests and caches effectively, the origin may never notice. If the edge proxies gRPC, long-lived streams behave differently again.

HTTP/2 improved the interface between browsers and edge infrastructure first. Backend adoption was more selective and workload-dependent.

HTTP/2 Has Its Own Operational Failure Modes

The protocol is mature, but not trivial. Common operational issues include:

Too Many Concurrent Streams

If clients or load testers open many streams at once, the server's advertised SETTINGS_MAX_CONCURRENT_STREAMS limit becomes important. Too low and the browser queues unnecessarily. Too high and memory pressure rises.

Large Header Blocks

Modern cookies and tracing headers can become enormous. HPACK helps, but implementations still need limits for decompression state and total header size to avoid abuse and memory exhaustion.

Mis-tuned Proxies

Reverse proxies that buffer too aggressively, ignore priorities, or translate inefficiently can erase much of HTTP/2's benefit.

Long-Lived Streams and Fairness

Streaming responses, gRPC workloads, and large downloads can interact badly with smaller latency-sensitive requests if flow control and scheduling are not handled well.

Debugging Complexity

Raw packet captures are harder to eyeball than HTTP/1.1 because the wire format is binary and usually encrypted. Tools such as browser developer panels, nghttp, Envoy stats, and TLS key log files become more important.

Those tools show frame-level behaviour more clearly than ordinary access logs.

A Real Page Load Looks Different Under HTTP/1.1 and HTTP/2

The easiest way to understand HTTP/2's practical benefit is to walk through the same page load under both models.

Imagine a site with:

one HTML document
one CSS file
one JavaScript bundle
two font files
six images above and below the fold

Assume the browser is talking to an edge in Frankfurt from Athens and the RTT to that edge is 45 ms. Under HTTP/1.1, the browser might open six TCP connections to the same origin because that is how it avoids application-layer queuing. Each one needs:

a TCP handshake
a TLS handshake
request transmission
response delivery

Even if the browser pipelines little or nothing, the socket pool has to warm up and the congestion windows have to grow separately. The browser can spread objects across those sockets, but the distribution is approximate. One socket may get the large JavaScript bundle and spend much of its congestion window there, while another carries small font or CSS responses. The result is workable, not elegant.

Under HTTP/2, the browser usually wants one hot connection, sometimes two in corner cases, and then opens many streams on top of it. The setup cost is paid once. Every additional resource mostly adds:

a HEADERS frame
some response scheduling on the server
a stream-level lifecycle

That means the bottleneck shifts from "how many sockets are legal and warmed up" to "how well do both sides schedule frames on one shared connection". The HTML may still dominate discovery because the browser cannot ask for resources it has not parsed yet, but once it knows what is needed, it can burst many dependent requests quickly.

A rough comparison:

HTTP/1.1
  conn 1 -> HTML
  conn 2 -> CSS
  conn 3 -> JS
  conn 4 -> font 1
  conn 5 -> font 2
  conn 6 -> image 1
  later reuse -> image 2, 3, 4, 5, 6
 
HTTP/2
  conn 1
    stream 1  -> HTML
    stream 3  -> CSS
    stream 5  -> JS
    stream 7  -> font 1
    stream 9  -> font 2
    stream 11 -> image 1
    stream 13 -> image 2
    ...

This matters even more under TLS because handshakes are expensive relative to small objects. HTTP/2 reduces the proportion of page-load time spent on connection setup and leaves more of the budget for actual content transfer. It does not make bad origin latency disappear, but it does stop the browser from wasting so much effort reopening equivalent transport state.

Connection Coalescing Quietly Reduced the Need for Domain Sharding

One of the more subtle browser behaviours that HTTP/2 made possible is connection coalescing. If several origins resolve to the same IP address and the certificate covers them, the browser may reuse one HTTP/2 connection for more than one hostname.

This sounds minor. In practice it helped undo years of HTTP/1.1-era asset sharding habits.

Suppose a site serves:

www.example.eu
static.example.eu
img.example.eu

If all three point at the same CDN edge address and the certificate's SAN list covers them, the browser may treat one TLS connection as suitable for all of them. That means one warm socket, one congestion window, and one set of HTTP/2 streams rather than three mostly separate pools.

The constraints matter:

the certificate must authorise all relevant names
the server must be authoritative for those names
the browser must judge the connection reusable under its security rules
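
The first two checks can be sketched as a simple predicate. This is a simplification with hypothetical names: real browsers apply further rules, and a server that is not authoritative for a name can still reject a coalesced request with a 421 Misdirected Request response.

```python
def san_matches(hostname: str, pattern: str) -> bool:
    """Match a hostname against one certificate SAN entry."""
    hostname, pattern = hostname.lower(), pattern.lower()
    if pattern.startswith("*."):
        # A wildcard covers exactly one left-most label.
        head, _, rest = hostname.partition(".")
        return bool(head) and rest == pattern[2:]
    return hostname == pattern


def can_coalesce(hostname, resolved_ips, connection_ip, cert_sans) -> bool:
    """True if an existing connection looks reusable for another hostname.

    Requires that the hostname resolves to the connection's IP address
    and that the connection's certificate covers the hostname.
    """
    same_address = connection_ip in resolved_ips
    covered = any(san_matches(hostname, san) for san in cert_sans)
    return same_address and covered


# Illustrative values: a documentation-range IP and an assumed SAN list.
sans = ["www.example.eu", "*.example.eu"]
reusable = can_coalesce("img.example.eu", {"203.0.113.10"}, "203.0.113.10", sans)
other_ip = can_coalesce("img.example.eu", {"203.0.113.99"}, "203.0.113.10", sans)
```

In this example reusable is True and other_ip is False: the certificate covers the name in both cases, but only the first resolves to the address the connection already uses.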

When that works, the old reason for deliberate domain sharding weakens sharply. Under HTTP/1.1, sharding was a way to trick the browser into opening more connections. Under HTTP/2, sharding can backfire by splitting work across several congestion windows and reducing the benefits of multiplexing and header compression.

This is one reason performance teams had to revisit long-lived folklore. Advice that was correct in 2013 could become counterproductive later. A stack optimised for HTTP/2 often prefers fewer origins, cleaner certificates, and simpler asset layout rather than a maze of parallel hostnames.

CDNs were especially important here. Edge platforms could serve many customer hostnames from the same anycast fleet and often from the same edge processes. That made coalescing practical and pushed more sites toward simpler delivery topology.

HTTP/2 Changed Backend RPC Design, Not Just Browsers

Public web pages drove the headlines, but HTTP/2 also changed internal service design. Once engineers had:

multiplexed streams
binary framing
bidirectional flow control
one long-lived TLS connection

it became natural to use the protocol for RPC systems. gRPC is the most obvious example. A client and server can keep one HTTP/2 connection open and run many RPC calls across separate streams, including server streaming and bidirectional streaming patterns.

This works well, but it also exposes parts of HTTP/2 that many website teams never notice. For example:

Long-Lived Streams

A streaming RPC can stay open for minutes or hours. That changes fairness and buffer management compared with short-lived asset requests.

Backpressure

HTTP/2 flow control becomes central. If one consumer is slow, the sender must honour the stream and connection windows rather than buffering endlessly.

Cancellation

RST_STREAM becomes part of normal application behaviour rather than an error oddity. Clients cancel RPC calls, deadlines expire, and the transport needs to tear down only the relevant stream without harming the rest of the connection.

Observability

Socket-level metrics are no longer enough because one connection may hide dozens or hundreds of independent calls. The operator needs stream-level insight.

This is one reason HTTP/2 grew beyond "the browser web protocol". It provided a structured transport that application designers could build on. The lessons learned there also explain why many backend teams care deeply about maximum concurrent streams, per-connection memory limits, and fairness under streaming workloads. Those are not theoretical concerns once the protocol becomes the substrate for service meshes and internal APIs.

HTTP/2 Resource Exhaustion Attacks Taught Operators That Streams Are Cheap, Not Free

When HTTP/2 shipped, some teams assumed that one TCP connection was automatically easier to defend than many. The reality was more nuanced. A multiplexed connection reduces handshake overhead, but it also concentrates a lot of logical work into one transport session. That means per-connection state matters more than before.

The clearest reminder came from the Rapid Reset attack pattern disclosed in 2023. The short version is that a client could open streams and cancel them very quickly with RST_STREAM, forcing servers or proxies to do significant stream setup and teardown work at high rates. The attack did not require extraordinary bandwidth. It exploited the fact that stream lifecycle handling still consumes CPU and memory.

The important lesson is not the exploit detail. The important lesson is architectural:

stream creation is not free
header decompression is not free
scheduler bookkeeping is not free
cancelled requests can still leave real work behind on the server side

A robust HTTP/2 stack therefore needs:

sane stream concurrency limits
fast cancellation paths
defensive header size limits
fair scheduler behaviour under abuse
edge filtering that recognises pathological stream churn
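
As one illustration of the last point, a proxy could track how quickly a single connection resets streams and close connections that exceed a budget. The sketch below is a generic sliding-window counter. The class name and the thresholds are hypothetical, and production mitigations are considerably more nuanced.

```python
from collections import deque


class ResetRateGuard:
    """Per-connection counter for client-sent RST_STREAM frames."""

    def __init__(self, max_resets: int = 100, window_seconds: float = 1.0):
        # Both limits are illustrative, not recommended values.
        self.max_resets = max_resets
        self.window_seconds = window_seconds
        self._events = deque()

    def record_reset(self, now: float) -> bool:
        """Record one reset at time now (seconds).

        Returns True when the connection has exceeded its budget and
        should be closed, for example with GOAWAY.
        """
        self._events.append(now)
        # Drop resets that have aged out of the sliding window.
        while self._events and now - self._events[0] > self.window_seconds:
            self._events.popleft()
        return len(self._events) > self.max_resets


guard = ResetRateGuard(max_resets=3, window_seconds=1.0)
verdicts = [guard.record_reset(t) for t in (0.0, 0.1, 0.2, 0.3)]
later = guard.record_reset(5.0)
```

The fourth reset inside one second trips the guard, while an isolated reset several seconds later does not. Ordinary browser cancellations should stay well under any sensible budget.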

This is another place where intermediaries matter. A CDN or reverse proxy can absorb or reject malicious behaviour before it reaches the origin, but only if the intermediary's own HTTP/2 implementation is hardened as well.

The broader point is that HTTP/2 removed some costs and introduced new ones. It made application concurrency cleaner. It also created new places where cheap logical actions can trigger non-trivial state transitions. Mature deployments account for both facts.

Browser Scheduling Still Matters Because the Wire Is Not the Whole Story

A common mistake is to treat HTTP/2 performance as purely a server concern. The browser still decides:

which requests to issue first
when discovered resources are high priority
whether a preconnect or preload happens
whether a resource is render-blocking
which origin socket pool to reuse

HTTP/2 gave browsers more flexibility. It did not remove the need to spend that flexibility wisely.

Consider fonts. If the browser discovers a font late because the CSS arrived late, multiplexing alone will not save first render. Consider images. If the browser decides below-the-fold images are low priority, the server may never need to spend bytes on them early even though multiplexing would allow it. Consider service workers. Cached or intercepted resources may not hit the network at all, which changes the apparent value of transport-level optimisation.

This is why browser network panels remain so useful. They show that HTTP/2 is one layer in a wider scheduling system. If the waterfall still looks bad after enabling h2, the problem may be:

discovery order
origin compute delay
cache misses
render-blocking CSS
oversized JavaScript

HTTP/2 helps the transport path. It does not redesign the page architecture for you.

Browsers, CDNs, and Origins All Adapted Their Strategies Around HTTP/2

Browsers became more willing to keep one hot connection per origin and open streams opportunistically. CDNs tuned edge stacks to multiplex efficiently and terminate many browser sessions at scale. Origin operators revisited old asset strategies and sometimes discovered that their HTTP/1.1-era sharding and bundling rules now made performance worse.

The most successful HTTP/2 deployments usually shared a few characteristics:

ALPN and TLS configuration were correct and modern
CDN or reverse proxy support was mature
domain sharding was reduced
cache policy was clean so repeat requests stayed at the edge
large headers and cookies were kept under control

Put differently, HTTP/2 worked best when the organisation treated it as part of an end-to-end delivery design, not as a box to tick in a TLS configuration template.

Debugging HTTP/2 in Production Usually Means Correlating Several Views at Once

HTTP/1.1 problems were often easier to spot from plain text request logs and packet captures. HTTP/2 is more layered. The useful debugging workflow usually combines:

browser waterfall or netlog
edge proxy stream counters
origin latency metrics
TLS negotiation data
packet captures only when necessary

Typical questions include:

did the client negotiate h2 at all
how many streams were open concurrently
did one stream monopolise the connection
were flow-control windows exhausted
did a proxy downgrade or buffer oddly between edge and origin

For example, a user may report that CSS arrives late only on one network path. The browser panel may show the HTTP/2 connection is healthy, but the origin metrics may reveal that the CDN shield is serialising origin fetches under pressure. Another case may show good origin timings but a bad client experience because one lossy LTE path triggers TCP-level stalls on the shared connection. The HTTP layer and transport layer both have to be inspected.

This is the other reason HTTP/2 never became invisible infrastructure. It improved things enough that most users stopped thinking about it, but the engineers running large services still need to reason about frame scheduling, header pressure, and loss on one shared transport. That is a better problem set than HTTP/1.1 gave them, but it is not a trivial one.

Not All Frame Types Matter Equally, but the Less Common Ones Still Shape Behaviour

Most explanations of HTTP/2 focus on HEADERS and DATA because those carry the visible request and response. In practice, several less glamorous frame types shape connection health and performance:

SETTINGS

This frame defines the local operating envelope. If one side allows only a small number of concurrent streams or advertises a small initial flow-control window, the entire connection behaves differently from the beginning.

WINDOW_UPDATE

Without these frames, high-throughput transfers would stall quickly. They are the credit-replenishment mechanism that lets the sender continue once the receiver has processed enough data.

RST_STREAM

Cancellation is normal in browsers. Users navigate away, speculative requests become unnecessary, and preloaded resources lose value. RST_STREAM lets the peer stop one logical exchange without tearing down the entire connection.

GOAWAY

This is the graceful connection shutdown signal. It tells the peer the highest stream ID that might have been processed. Streams with a higher ID were never processed, so the client can safely retry them on a new connection, while streams at or below that ID are allowed to finish.

PING

Operators use this for liveness and RTT measurement at the HTTP/2 layer. It does not replace TCP keepalives or network telemetry, but it is often useful in long-lived application sessions.

Understanding these frames helps explain production behaviour. For example, graceful deploys at a reverse proxy often involve sending GOAWAY so existing streams can finish while new streams migrate elsewhere. Similarly, a high rate of RST_STREAM events in logs may not mean failure. It may mean a browser is reprioritising aggressively or a gRPC client is timing out calls quickly.

The frame model also explains why HTTP/2 implementations need careful state accounting. Even if the payload volume is low, the control-plane churn of streams opening, updating windows, being cancelled, and shutting down can still be significant.
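To see how much of a connection is control traffic, it helps to walk the 9-byte frame headers directly. The sketch below parses a hand-built byte buffer and counts frames by type. The frame type codes and header layout follow the HTTP/2 specification; the helper names and the sample capture are illustrative assumptions.

```python
from collections import Counter

FRAME_TYPES = {
    0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
    0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
    0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}


def iter_frames(buf: bytes):
    """Yield (type_name, flags, stream_id, payload) for each complete frame."""
    offset = 0
    while offset + 9 <= len(buf):
        length = int.from_bytes(buf[offset:offset + 3], "big")   # 24-bit length
        ftype, flags = buf[offset + 3], buf[offset + 4]
        # The top bit of the last four bytes is reserved; mask it off.
        stream_id = int.from_bytes(buf[offset + 5:offset + 9], "big") & 0x7FFFFFFF
        payload = buf[offset + 9:offset + 9 + length]
        if len(payload) < length:
            break  # incomplete frame at the end of the buffer
        yield FRAME_TYPES.get(ftype, f"UNKNOWN_{ftype:#x}"), flags, stream_id, payload
        offset += 9 + length


def frame(ftype: int, flags: int, stream_id: int, payload: bytes = b"") -> bytes:
    """Build one frame for the sample capture below."""
    return (len(payload).to_bytes(3, "big") + bytes([ftype, flags])
            + stream_id.to_bytes(4, "big") + payload)


capture = (
    frame(0x4, 0x0, 0)                                   # empty SETTINGS
    + frame(0x8, 0x0, 0, (1_000_000).to_bytes(4, "big")) # WINDOW_UPDATE on the connection
    + frame(0x3, 0x0, 5, (0x8).to_bytes(4, "big"))       # RST_STREAM with CANCEL
    + frame(0x0, 0x1, 3, b"body{}")                      # DATA with END_STREAM
)
counts = Counter(name for name, _, _, _ in iter_frames(capture))
print(counts)
```

In this tiny capture three of the four frames carry no response bytes at all, which is the kind of control-plane churn the paragraph above describes.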

Flow Control Becomes Easier to See with a Concrete Example

Flow control feels abstract until you put numbers on it. Suppose a server is sending:

a 1.8 MB JavaScript bundle on stream 5
a 24 KB CSS file on stream 3
a 12 KB font manifest on stream 7

Assume the client's initial stream window is 65,535 bytes and the connection window is also 65,535 bytes.

If the server starts by pouring data into stream 5 only, that one stream can consume the entire 65,535-byte connection window. Until the client processes some bytes and sends WINDOW_UPDATE, the server cannot send DATA on any stream, including the small CSS and font responses, because every DATA frame draws on the shared connection-level window as well as on its own stream window.

That creates a subtle but important scheduling consequence. A sender that cares about latency should not dump all available credit into the largest object first. It should send enough to keep the pipe busy while still leaving room for smaller critical streams.
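The difference between the two sending strategies can be simulated in a few lines. This sketch assumes decimal units (1.8 MB = 1,800,000 bytes), the default 16,384-byte maximum frame size, and that no WINDOW_UPDATE arrives during the burst. The function name and the round-robin policy are illustrative, not a description of any particular server.

```python
INITIAL_WINDOW = 65_535   # default initial window for each stream and the connection
MAX_FRAME = 16_384        # default SETTINGS_MAX_FRAME_SIZE

# Stream ID -> response size in bytes (decimal units are an assumption).
SIZES = {5: 1_800_000, 3: 24_000, 7: 12_000}


def send_until_blocked(order, sizes, greedy):
    """Return bytes sent per stream before the connection window reaches zero."""
    conn_window = INITIAL_WINDOW
    stream_window = {sid: INITIAL_WINDOW for sid in sizes}
    remaining = dict(sizes)
    sent = {sid: 0 for sid in sizes}
    progress = True
    while conn_window > 0 and progress:
        progress = False
        for sid in order:
            allowed = min(remaining[sid], stream_window[sid], conn_window)
            # Greedy: drain everything allowed, as back-to-back frames.
            # Fair: send at most one frame, then move to the next stream.
            chunk = allowed if greedy else min(allowed, MAX_FRAME)
            if chunk == 0:
                continue
            remaining[sid] -= chunk
            stream_window[sid] -= chunk
            conn_window -= chunk
            sent[sid] += chunk
            progress = True
    return sent


greedy = send_until_blocked([5, 3, 7], SIZES, greedy=True)
fair = send_until_blocked([3, 7, 5], SIZES, greedy=False)
print("greedy:", greedy)
print("fair:  ", fair)
```

Under the greedy policy the bundle takes all 65,535 bytes of credit and the CSS and font manifest send nothing. Under the interleaved policy both small responses complete inside the first window and the bundle still receives the remaining 29,535 bytes.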

For that reason, flow control and prioritisation were always linked in practice, even though they are distinct mechanisms in the spec. The sender has to decide not only what it is allowed to send, but what it should send with the currently available credit.

This is also why a badly implemented HTTP/2 stack can feel worse than HTTP/1.1 in edge cases. If the stack fills the connection window with one bulky stream and is slow to send or act on WINDOW_UPDATE frames, small important responses can feel strangely sluggish even though multiplexing exists on paper. The protocol permits good behaviour. The implementation has to deliver it.

HTTP/2 Changed How People Think About "One Connection per Origin"

Before HTTP/2, one connection per origin sounded conservative. During HTTP/2 adoption, it started to sound efficient. In the HTTP/3 era, teams often run HTTP/2 and HTTP/3 side by side and have to think carefully about protocol mix.

This matters because one origin may now expose:

HTTP/1.1 for very old clients
HTTP/2 for most HTTPS traffic over TCP
HTTP/3 for clients that support QUIC and prefer it

That means "connection strategy" is no longer one static rule. Browsers and edge systems decide dynamically based on:

ALPN results
previous protocol success
network conditions
certificate and origin coalescing rules
whether QUIC is blocked or degraded

HTTP/2 therefore became part of a broader lesson for transport engineers: the web works best when the client can keep a small number of hot, high-quality paths rather than constantly reopening new ones. That lesson outlived HTTP/2 itself and carried directly into QUIC design.

Many HTTP/1.1-Era Frontend Tricks Became Technical Debt Under HTTP/2

When a site migrates to HTTP/2, the old performance hacks do not disappear automatically. They often linger:

giant JavaScript bundles justified by "fewer requests"
CSS merged into monoliths
image sprites that are awkward to maintain
many sharded asset hostnames
brittle preload strategies designed around limited socket counts

These were rational optimisations once. Under HTTP/2 they can become liabilities.

A 2 MB JavaScript bundle that once saved ten request setups may now delay parsing, caching, and execution far more than it helps transport efficiency. A sprite sheet that once reduced request overhead may now hurt rendering and cache granularity. A thicket of static hostnames may defeat connection reuse and complicate certificate management.

This is one reason real HTTP/2 migrations sometimes disappointed teams at first. They enabled h2, saw only modest improvement, and concluded the protocol was overhyped. In reality, the application was still shaped by assumptions from an earlier transport world.

The transport can remove one class of bottleneck. The asset and page design still need to meet it halfway.

Why HTTP/2 Never Used Cleartext h2c on the Public Web

The spec allows a cleartext variant, usually called h2c, where HTTP/2 runs without TLS. In principle a client and server can upgrade from HTTP/1.1 or start with prior knowledge. In practice the public web almost never adopted that path.

There were several reasons.

First, HTTPS became the default expectation for almost every serious website. Browsers, search engines, and platform vendors steadily pushed the web toward encryption not only for privacy, but also for integrity. Once HTTPS was the baseline, the practical deployment path for HTTP/2 naturally ran through ALPN inside TLS.

Second, intermediaries were already difficult enough. Adding another public upgrade path with mixed cleartext behaviour created little benefit compared with the simplicity of "if it is modern and public, it is almost certainly HTTPS with ALPN".

Third, the biggest visible gains from HTTP/2 on the web were closely tied to encrypted transport anyway. Reducing repeated TLS setup, coalescing hot secure connections, and using one well-managed browser socket all fit naturally into the HTTPS model that the web was already converging on.

For that reason, h2c survives mostly in specialised internal environments, test setups, and some proxy-to-proxy use cases rather than on ordinary sites. The important lesson is that HTTP/2 did not succeed as a generic framing layer in the abstract. It succeeded as the secure default transport for modern HTTPS delivery. The protocol and the web's wider security posture moved together.

A Concrete Frame Walkthrough Makes the Wire Format Easier to Trust

The binary framing layer sounds abstract until you watch one simple request-response exchange at frame level.

Suppose a browser requests:

GET /app.css HTTP/2
Host: static.example.eu

On the wire, the browser does not send those textual lines as an HTTP/1.1 block. It sends a HEADERS frame on a new stream. The request line and Host header are expressed as the pseudo-headers :method, :scheme, :authority, and :path, and together with any ordinary headers they are HPACK-encoded into a header block fragment. The frame header declares:

type = HEADERS
stream ID = 3
flags = END_HEADERS | END_STREAM

The browser can set END_STREAM because a simple GET has no request body.
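The frame header for that request is small enough to build by hand. The sketch below packs the 9-byte header for a HEADERS frame on stream 3 with both flags set. The 20-byte payload length is a hypothetical size for the HPACK-encoded header block, and the function name is illustrative.

```python
HEADERS = 0x1       # frame type code for HEADERS
END_STREAM = 0x1    # flag: no request body follows
END_HEADERS = 0x4   # flag: the header block is complete in this frame


def frame_header(length: int, ftype: int, flags: int, stream_id: int) -> bytes:
    """Pack the fixed 9-byte HTTP/2 frame header."""
    if not 0 <= length < 2**24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream ID must fit in 31 bits")
    return (length.to_bytes(3, "big") + bytes([ftype, flags])
            + stream_id.to_bytes(4, "big"))


# Hypothetical 20-byte header block fragment for GET /app.css.
header = frame_header(20, HEADERS, END_HEADERS | END_STREAM, stream_id=3)
print(header.hex())   # 000014 01 05 00000003 without the spaces
```

The flags byte is 0x05 because END_HEADERS (0x4) and END_STREAM (0x1) are combined with a bitwise OR.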

The server may answer with:

HEADERS stream 3, status 200
DATA stream 3, first bytes of CSS
DATA stream 3, remaining bytes with END_STREAM

If another object is needed at the same time, the server can insert its frames between those DATA frames. So a connection transcript might look like:

C -> S  HEADERS stream=1  GET /index.html
C -> S  HEADERS stream=3  GET /app.css
C -> S  HEADERS stream=5  GET /app.js
S -> C  HEADERS stream=1  :status 200
S -> C  DATA    stream=1  "<!doctype html>..."
S -> C  HEADERS stream=3  :status 200
S -> C  DATA    stream=3  "body{...}"
S -> C  HEADERS stream=5  :status 200
S -> C  DATA    stream=5  "(()=>{...})"

That one example explains most of the protocol's practical value. The browser can issue related requests immediately, and the server can begin each response as soon as it has useful bytes rather than waiting for the earlier response to complete fully.
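The receiving side of that transcript is just as simple in principle: group frames by stream ID and append DATA payloads in arrival order. The sketch below uses tuples rather than wire bytes, and the HEADERS payloads are placeholders rather than real HPACK output. The arrival order is an assumed interleaving for illustration.

```python
from collections import defaultdict

# (stream_id, frame_type, payload, end_stream) in assumed arrival order.
arrivals = [
    (1, "HEADERS", b":status 200", False),
    (1, "DATA", b"<!doctype html>", False),
    (3, "HEADERS", b":status 200", False),
    (3, "DATA", b"body{", False),
    (1, "DATA", b"<html>...</html>", True),
    (5, "HEADERS", b":status 200", False),
    (5, "DATA", b"(()=>{", False),
    (3, "DATA", b"}", True),
    (5, "DATA", b"})()", True),
]


def demultiplex(frames):
    """Reassemble response bodies per stream and record completion order."""
    bodies = defaultdict(bytearray)
    finished = []
    for stream_id, ftype, payload, end_stream in frames:
        if ftype == "DATA":
            bodies[stream_id] += payload   # bytes stay ordered within a stream
        if end_stream:
            finished.append(stream_id)
    return {sid: bytes(body) for sid, body in bodies.items()}, finished


bodies, finished = demultiplex(arrivals)
print(bodies)
print(finished)
```

Each body comes out intact even though the frames were interleaved, because ordering only has to hold within a stream, not across the connection.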

This is also why observability tools that can decode frames are so valuable. Ordinary access logs tell you that the requests existed. Frame-aware tools tell you how the connection actually scheduled them.

SETTINGS Values Quietly Change the Feel of a Connection

The SETTINGS exchange rarely appears in casual discussions of HTTP/2, but small changes there can reshape performance dramatically.

Consider three especially important settings:

`SETTINGS_MAX_CONCURRENT_STREAMS`

This caps how many streams the peer should have open at once. If the server advertises a very low value, the browser is forced back into a more serial pattern and loses much of the protocol's concurrency benefit.

`SETTINGS_INITIAL_WINDOW_SIZE`

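This controls how much DATA the sender may transmit on each stream before flow-control credit runs out. Too small and transfers become chatty and stop-start. Too large and buffering pressure rises. It sets the initial per-stream window only; the connection-level window starts at 65,535 bytes and grows only through WINDOW_UPDATE.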

`SETTINGS_HEADER_TABLE_SIZE`

This controls HPACK dynamic table memory. A larger table can improve compression for repetitive headers, but it also costs more memory per connection, which matters when a server holds many connections or faces abusive peers.

These settings are one reason two sites can both "support HTTP/2" and still behave quite differently. The protocol version is the same. The local operating envelope is not.
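The operating envelope is visible on the wire as a list of 6-byte entries, each a 16-bit identifier followed by a 32-bit value. The sketch below encodes and decodes a SETTINGS payload. The identifier codes come from the HTTP/2 specification, while the function names and the example values are assumptions for illustration.

```python
import struct

SETTING_NAMES = {
    0x1: "SETTINGS_HEADER_TABLE_SIZE",
    0x2: "SETTINGS_ENABLE_PUSH",
    0x3: "SETTINGS_MAX_CONCURRENT_STREAMS",
    0x4: "SETTINGS_INITIAL_WINDOW_SIZE",
    0x5: "SETTINGS_MAX_FRAME_SIZE",
    0x6: "SETTINGS_MAX_HEADER_LIST_SIZE",
}


def encode_settings(settings: dict) -> bytes:
    """Pack {identifier: value} into a SETTINGS frame payload."""
    return b"".join(struct.pack("!HI", ident, value)
                    for ident, value in settings.items())


def decode_settings(payload: bytes) -> dict:
    """Unpack a SETTINGS payload into {name: value}."""
    if len(payload) % 6:
        raise ValueError("SETTINGS payload must be a multiple of 6 bytes")
    return {SETTING_NAMES.get(ident, f"UNKNOWN_{ident:#x}"): value
            for ident, value in struct.iter_unpack("!HI", payload)}


# Hypothetical conservative proxy: 16 concurrent streams, default stream window.
payload = encode_settings({0x3: 16, 0x4: 65_535})
decoded = decode_settings(payload)
print(decoded)
```

Decoding the SETTINGS frame a peer actually sends is often the quickest way to confirm which limits a given deployment is running with.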

Operators often discover this while load-testing APIs or media delivery. A reverse proxy with conservative defaults may be perfectly safe, but it can also force avoidable queuing or stop-and-go flow-control patterns. The fix is not "enable HTTP/2 harder". The fix is tuning the actual frame-level limits the peer experiences.
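A rough ceiling shows why a small window produces stop-and-go behaviour: a sender can never have more than one window of unacknowledged data in flight, and credit takes about one round trip to come back, so per-stream throughput is bounded by window size divided by RTT. The sketch below applies that bound. The 50 ms RTT and the 1 MiB comparison window are assumed values, and real throughput also depends on TCP and on how promptly the receiver sends WINDOW_UPDATE.

```python
def max_stream_bytes_per_second(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on per-stream throughput under flow control."""
    return window_bytes / rtt_seconds


def as_mbit(bytes_per_second: float) -> float:
    """Convert bytes per second to megabits per second."""
    return bytes_per_second * 8 / 1_000_000


for window in (65_535, 1_048_576):   # default window vs an assumed 1 MiB window
    rate = as_mbit(max_stream_bytes_per_second(window, 0.050))
    print(f"{window:>9} byte window at 50 ms RTT: about {rate:.1f} Mbit/s ceiling")
```

With the default window the ceiling at 50 ms is roughly 10.5 Mbit/s per stream, which is why media and bulk API transfers are usually where conservative defaults first become visible.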

This is one more reason HTTP/2 is not merely a browser feature. It is a transport behaviour surface with parameters that infrastructure teams have to understand intentionally.

Error Handling Became More Precise Because One Bad Stream Should Not Kill the Whole Session

One overlooked improvement in HTTP/2 is that it separates stream errors from connection errors much more cleanly.

Under HTTP/1.1, a malformed or failed response often had ugly consequences because there was not much structure beyond the connection itself. Under HTTP/2, the protocol can say:

this stream is bad, reset it
this header block is too large, reject it
this flow-control rule was violated, treat it as a connection problem

That precision matters operationally. A cancelled image fetch should not kill the CSS response and the analytics request. A malformed stream should usually die alone unless it indicates the peer is seriously misbehaving.

Some common error scenarios:

Stream Reset

If the client no longer needs a resource, it can send RST_STREAM. The rest of the connection stays alive.

Graceful Shutdown

If the server is draining for deploy, it can send GOAWAY to stop new streams while finishing existing ones.

Protocol Violation

If the peer breaks framing or flow-control rules badly enough, the connection may have to close because the shared state can no longer be trusted.

This is part of why modern proxies and RPC stacks liked HTTP/2. They gained a more expressive failure model than "socket alive or socket dead". That makes retries, cancellation, graceful deploys, and long-lived sessions more manageable.
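The graceful shutdown case shows how that expressiveness turns into retry logic. A GOAWAY payload carries a 31-bit last stream ID, a 32-bit error code, and optional debug data. The sketch below parses one and splits a client's open streams into those that may still finish and those that are safe to retry elsewhere. The function names, the debug text, and the set of open streams are illustrative assumptions.

```python
NO_ERROR = 0x0   # error code used for graceful shutdown


def parse_goaway(payload: bytes):
    """Return (last_stream_id, error_code, debug_data) from a GOAWAY payload."""
    if len(payload) < 8:
        raise ValueError("GOAWAY payload must be at least 8 bytes")
    last_stream_id = int.from_bytes(payload[0:4], "big") & 0x7FFFFFFF
    error_code = int.from_bytes(payload[4:8], "big")
    return last_stream_id, error_code, payload[8:]


def split_after_goaway(open_streams, last_stream_id):
    """Streams above last_stream_id were never processed and can be retried."""
    may_finish = sorted(s for s in open_streams if s <= last_stream_id)
    safe_to_retry = sorted(s for s in open_streams if s > last_stream_id)
    return may_finish, safe_to_retry


# Hypothetical drain: the server processed streams up to 7.
payload = (7).to_bytes(4, "big") + NO_ERROR.to_bytes(4, "big") + b"draining"
last_id, code, debug = parse_goaway(payload)
may_finish, safe_to_retry = split_after_goaway({5, 7, 9, 11}, last_id)
print(may_finish, safe_to_retry)
```

A client that applies this split can move streams 9 and 11 to a fresh connection immediately, without guessing whether the old server acted on them.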

The Browser Waterfall Still Reflects Discovery Order, Dependency Chains, and Main-Thread Work

It is easy to give HTTP/2 too much credit for page-load outcomes. Even after the transport improves, the browser still has to:

parse HTML to discover subresources
parse CSS to discover fonts and images
execute JavaScript that may trigger additional requests
schedule rendering on the main thread

This means a page can negotiate HTTP/2 perfectly and still feel slow if:

CSS arrives late and blocks first paint
JavaScript is huge and monopolises the main thread
lazy loading is configured badly
the HTML itself is delayed by backend compute

The useful mental model is that HTTP/2 shrinks one important set of transport bottlenecks. It does not erase the rest of the browser's dependency graph.
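One way to see the limit is to model discovery as a dependency chain and assume the transport is perfect, with every resource fetched the moment it is discovered. The sketch below does that with entirely hypothetical resources and timings, so the numbers illustrate the shape of the problem rather than any measured page.

```python
# name -> (discovered_by, fetch-plus-processing time in ms); all values hypothetical.
RESOURCES = {
    "index.html": (None, 300),
    "app.css": ("index.html", 120),
    "app.js": ("index.html", 400),
    "font.woff2": ("app.css", 90),
    "data.json": ("app.js", 150),
}


def finish_time(name: str, resources: dict) -> int:
    """Earliest completion time, assuming unlimited parallel streams."""
    parent, cost = resources[name]
    return cost + (finish_time(parent, resources) if parent else 0)


def critical_path(resources: dict) -> list:
    """Chain of discoveries that determines when the last resource is ready."""
    current = max(resources, key=lambda name: finish_time(name, resources))
    path = []
    while current is not None:
        path.append(current)
        current = resources[current][0]
    return path[::-1]


path = critical_path(RESOURCES)
print(" -> ".join(path), finish_time(path[-1], RESOURCES), "ms")
```

Even with ideal multiplexing, this model page cannot finish before the HTML, script, and data chain completes. Shortening that chain is a page-design problem, not a transport one.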

For example, a site may improve its First Contentful Paint from 1.8 seconds to 1.2 seconds after moving from HTTP/1.1 sharded delivery to clean HTTP/2, but still remain slower than it should be because:

the CSS is render-blocking and oversized
the font strategy causes layout instability
the origin HTML response is slow

Transport and application structure therefore have to be evaluated together. A faster wire format is helpful. It is not a substitute for good page architecture.

HTTP/2 Matters Today Mostly as the Baseline That Replaced HTTP/1.1 on the Public Web

HTTP/3 gets attention now, but HTTP/2 is still the operational baseline for much of the encrypted web. It cleaned up years of awkward client behaviour, made one-connection delivery practical, and taught the ecosystem how to build framed, multiplexed HTTP stacks.

Its limitations are also instructive. HTTP/2 showed that fixing message ordering above TCP helped a lot, but not enough to eliminate transport-level coupling. That experience directly informed QUIC and HTTP/3.

If you keep one mental model, use this one: HTTP/2 turns one TCP connection into many framed logical streams, compresses repetitive headers, adds explicit flow control, and lets the client and server interleave work far more efficiently than HTTP/1.1. It solves application-layer head-of-line blocking. It does not remove the stalls that TCP's in-order delivery imposes on every stream when a packet is lost.

That is how HTTP/2 actually works.