21-04-2026

How CDNs Actually Work

Most people learn that a content delivery network is "a cache near the user". That is true in the same way that calling an airport "a car park for planes" is technically true. It points in roughly the right direction and leaves out almost everything that matters.

A modern CDN is not one cache. It is a globally distributed traffic system that combines anycast routing, DNS steering, TLS termination, request normalisation, hierarchical caching, origin shielding, DDoS filtering, and increasingly edge compute. When someone in Athens loads a site accelerated by a major CDN, the path usually touches a nearby point of presence, not the origin server sitting in Dublin, Frankfurt, or a cloud region elsewhere. If the object is cached locally, the edge responds immediately. If it is not, the edge may fetch from a regional shield cache rather than from the origin directly. If the origin is under stress, the CDN may collapse duplicate requests, serve stale content briefly, or block abusive traffic at the edge before the origin sees it.

That stack exists because web performance and origin survivability are tightly linked. A user does not care whether a page was slow because of RTT, TCP slow start, packet loss on a transcontinental path, a congested origin, or a thundering herd of cache misses. The CDN exists to remove as many of those penalties as possible before they become user-visible.

This article looks at the mechanics under the hood: how anycast gets the user to a nearby edge, how cache keys and freshness rules decide whether the edge can answer, how tiered cache hierarchies protect the origin, how purges propagate globally, how TLS and HTTP versions terminate at the edge, and why modern CDNs have become programmable network platforms rather than static asset boxes.

The First Job of a CDN Is to Move the First Hop Closer to the User

The simplest win a CDN can deliver is geographic and topological proximity. Without a CDN, if the origin is in Frankfurt and the user is in Athens, the browser pays the RTT to Frankfurt for the first handshake and for every object fetch. Even on a good European backbone, that is slower than talking to an edge in Athens or Sofia.

The speedup matters because modern web latency compounds quickly:

DNS lookup
TCP handshake
TLS handshake
request transmission
response first byte

If the browser talks to a nearby edge, the early round trips are shorter. TCP slow start also becomes less painful: the congestion window grows roughly once per round trip, so a shorter RTT lets the connection ramp up sooner. CDNs therefore improve not only throughput, but also time to first byte and page interactivity.
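
The compounding can be made concrete with some rough arithmetic. The sketch below assumes a cold HTTPS fetch where TCP, TLS 1.3, and the request each cost one round trip, and it leaves out the DNS lookup and server processing time. The RTT figures are hypothetical and chosen only for illustration.

```python
# Assumed round-trip counts for a cold HTTPS fetch (TLS 1.3, no resumption).
ROUND_TRIPS = {
    "TCP handshake": 1,
    "TLS handshake": 1,
    "request and first byte": 1,
}


def setup_latency_ms(rtt_ms: float) -> float:
    """Network time until the first response byte, ignoring DNS and server time."""
    return rtt_ms * sum(ROUND_TRIPS.values())


# Hypothetical RTTs, not measurements.
origin_rtt_ms = 45.0  # Athens to an origin in Frankfurt
edge_rtt_ms = 5.0     # Athens to a nearby edge

print(setup_latency_ms(origin_rtt_ms))  # 135.0
print(setup_latency_ms(edge_rtt_ms))    # 15.0
```

Under these assumptions the same three round trips cost 135 ms against the origin and 15 ms against the edge, before a single byte of content has been compared.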

The improvement is often easiest to see with static assets. If a stylesheet and hero image are cached in an edge PoP close to the user, those objects no longer depend on origin distance at all for cache hits. The origin becomes a background refill source rather than the live responder for every request.

This is the operational difference between a site hosted "in Europe" and a site delivered "from the edge". Distance still exists. The CDN simply moves the expensive part of the interaction closer to the user.

Anycast Is How One IP Address Exists in Many Places at Once

Most large CDNs use anycast for their front door. The same IP prefix is advertised from many PoPs, and global BGP routing sends each client to the topologically closest or otherwise preferred location according to the network path.

This is the reason a single service IP can land one user in London, another in Frankfurt, and another in Madrid without any application-level redirect. The routing system does the steering.

A simplified model:

CDN advertises 203.0.113.0/24 from:
  Athens
  Frankfurt
  London
  Paris
 
ISP in Greece sees Athens as best path
ISP in Germany sees Frankfurt as best path
ISP in the UK sees London as best path

From the user's perspective, the hostname resolves to an IP and the TCP handshake just happens. From the CDN's perspective, that one IP is simultaneously present in many cities.

Anycast is not perfect. BGP follows policy and topology, not geography alone. A user in Thessaloniki might land in Athens, Sofia, or even Frankfurt depending on peering. But the general effect is strong: the first network hop into the CDN is much closer than the origin usually is.

This also helps resilience. If one PoP withdraws the route, traffic shifts to other PoPs without changing the client-visible address. The failure domain becomes smaller and easier to mask.
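
A toy model of the selection and failover behaviour might look like the following. It assumes each ISP simply prefers the shortest path among the PoPs still announcing the prefix; real BGP decisions also involve local preference, peering policy, and other attributes. The ISP names and path lengths are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical path lengths from each ISP to each PoP announcing 203.0.113.0/24.
PATH_LENGTHS: Dict[str, Dict[str, int]] = {
    "isp-greece": {"Athens": 1, "Frankfurt": 3, "London": 4, "Paris": 4},
    "isp-germany": {"Athens": 3, "Frankfurt": 1, "London": 2, "Paris": 2},
    "isp-uk": {"Athens": 4, "Frankfurt": 2, "London": 1, "Paris": 2},
}


def best_pop(isp: str, announcing: Set[str]) -> Optional[str]:
    """Pick the PoP with the shortest path among those still announcing the prefix."""
    candidates = {
        pop: hops for pop, hops in PATH_LENGTHS[isp].items() if pop in announcing
    }
    if not candidates:
        return None
    # Ties are broken by name only to keep the sketch deterministic.
    return min(candidates, key=lambda pop: (candidates[pop], pop))


all_pops = {"Athens", "Frankfurt", "London", "Paris"}
print(best_pop("isp-greece", all_pops))               # Athens
print(best_pop("isp-greece", all_pops - {"Athens"}))  # Frankfurt
```

The second call models Athens withdrawing its route: the Greek ISP lands in Frankfurt, and the client-visible address never changed.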

DNS and Anycast Work Together, Not as Rivals

People sometimes talk as if CDNs either use DNS steering or anycast. Large deployments often use both, but at different layers.

DNS decides which CDN hostname or service address the browser receives. Anycast decides which physical PoP handles traffic for that address. A typical flow looks like this:

www.example.eu is a CNAME to a CDN-managed hostname
the CDN returns an address for the edge service
that address is anycast from many PoPs
BGP delivers the user to the nearest or best-reachable edge

DNS still matters because:

different products may resolve to different anycast services
the CDN may steer by geography, load, or customer policy
TTLs influence how quickly steering changes propagate
some vendors use DNS more heavily for regional segmentation

Anycast handles the fast path for packets once the browser has the address. DNS handles naming, product selection, and some policy before the packets ever leave the client.

In practical operations, this means two control planes are in play:

the DNS control plane, which maps the hostname to an edge service
the BGP control plane, which maps the edge service to a nearby PoP

Understanding both is essential when troubleshooting. A request can be "on the CDN" and still land in an unexpected city because the anycast path differs from what someone expected from the DNS answer alone.

A Cache Hit Depends on the Cache Key, Not Just the URL

At the heart of a CDN is a cache, but a cache is only as good as its key. The key decides whether two requests are considered equivalent.

The naïve cache key is:

scheme + host + path

Real caches often need more nuance. Depending on the application, the key may also include:

selected query parameters
Accept-Encoding
device or image format variations
language selection
specific cookies, or no cookies at all

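If the cache key is too coarse and ignores something that changes the response, the CDN serves the wrong object to some users. If it is too fine-grained and includes things that do not change the response, the hit ratio collapses and the origin sees unnecessary traffic.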

Example:

GET /avatar?id=42&size=small
GET /avatar?id=42&size=large

If the CDN ignores size, the responses collide incorrectly. If it includes every random tracking query string, the cache fragments and becomes ineffective.

This is one reason CDN tuning is rarely "set and forget". The edge is only as useful as the cache key design, and the key must match the application's variation model exactly.
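
As a minimal sketch of that idea, the function below builds a key from the scheme, host, path, an allowlist of query parameters, and a coarse encoding bucket. The allowlist, the hostname, and the key layout are assumptions made for illustration, not any vendor's format, and the Accept-Encoding handling is deliberately simplistic.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical policy: only these query parameters change the response.
KEY_PARAMS = {"id", "size"}


def cache_key(url: str, accept_encoding: str = "") -> str:
    parts = urlsplit(url)
    # Keep only allowlisted parameters and sort them so ordering cannot fragment the cache.
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in KEY_PARAMS
    )
    # Collapse the many possible Accept-Encoding strings into two buckets.
    encoding = "gzip" if "gzip" in accept_encoding.lower() else "identity"
    return "|".join(
        [parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), encoding]
    )


print(cache_key("https://cdn.example.eu/avatar?id=42&size=small&utm_source=mail"))
# https|cdn.example.eu|/avatar|id=42&size=small|identity
```

With this policy, the small and large avatars get different keys, while a tracking parameter or a different parameter order maps to the same key.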

Freshness Rules Are Policy, Not Guesswork

Once the cache key identifies an object, the next question is whether the cached copy is still fresh. CDNs rely on explicit cache policy, mainly from HTTP headers such as:

Cache-Control: public, max-age=600, stale-while-revalidate=30
ETag: "build-4821"
Last-Modified: Tue, 21 Apr 2026 10:00:00 GMT

Common patterns include:

max-age for browser and edge freshness
s-maxage for shared caches specifically
stale-while-revalidate for serving slightly old content during background refresh
stale-if-error for origin failure resilience
validators such as ETag and Last-Modified for conditional revalidation

Freshness control is where performance and correctness meet. A long TTL boosts hit ratio and protects the origin, but makes updates harder. A short TTL improves freshness but increases origin traffic and RTT exposure.

The usual answer is not one universal TTL. Static versioned assets like /app.4f92c1.js can be cached for months because their names change on deployment. HTML usually needs much shorter lifetimes because it controls which assets the browser discovers next.

Strong deployment pipelines therefore pair CDN policy with asset naming strategy. Versioned immutable files and short-lived HTML are a far more stable combination than trying to purge everything on every deploy.
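
The freshness decision itself is mechanical once the policy is parsed. The sketch below is a simplified reading of Cache-Control for a shared cache: it prefers s-maxage over max-age and treats stale-while-revalidate as an extra window after expiry. It ignores many directives (private, no-cache, must-revalidate and others), so it is an illustration rather than a compliant implementation.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header into a {directive: value} mapping."""
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"')
    return directives


def freshness_state(header: str, age_seconds: int) -> str:
    cc = parse_cache_control(header)
    if "no-store" in cc:
        return "uncacheable"
    # A shared cache prefers s-maxage when it is present.
    ttl = int(cc.get("s-maxage") or cc.get("max-age") or 0)
    swr = int(cc.get("stale-while-revalidate") or 0)
    if age_seconds < ttl:
        return "fresh"
    if age_seconds < ttl + swr:
        return "stale-while-revalidate"
    return "stale"


policy = "public, max-age=600, stale-while-revalidate=30"
print(freshness_state(policy, 300))  # fresh
print(freshness_state(policy, 610))  # stale-while-revalidate
print(freshness_state(policy, 700))  # stale
```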

The Edge Cache Is Often Only the First Layer

Large CDNs rarely let every edge miss hit the origin directly. They use cache hierarchies.

The basic model is:

edge cache close to the user
regional or shield cache behind the edge
origin behind the shield

Suppose an object is requested from PoPs in Athens, Vienna, and Prague almost simultaneously. If every miss went straight to origin, the origin would receive three fetches immediately. With shielding, the edge PoPs ask a regional parent cache first. The parent may collapse those requests so only one origin fetch occurs.

This protects the origin from two common problems:

request amplification during global cache cold start
thundering herd on popular objects after expiry

A simplified path for a miss:

Client in Athens
  -> Athens edge
  -> Frankfurt shield
  -> origin in Dublin

The second request from another nearby edge may stop at the Frankfurt shield and never touch Dublin at all.

Shielding is especially useful for dynamic but cacheable content, large media objects, and origins with limited concurrency. It turns the CDN from a flat cache fleet into a layered request-absorption system.
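
The effect on origin load can be shown with a very small model. The classes below are hypothetical and ignore TTLs, concurrency, and eviction; they only count how many fetches reach the origin when three edges miss on the same object, with and without a shared parent.

```python
class Origin:
    def __init__(self):
        self.fetches = 0

    def get(self, key: str) -> str:
        self.fetches += 1
        return f"body of {key}"


class CacheTier:
    def __init__(self, name: str, upstream):
        self.name = name
        self.upstream = upstream  # another CacheTier or the Origin
        self.store = {}

    def get(self, key: str) -> str:
        if key not in self.store:
            # Miss: ask the next layer up and keep the result.
            self.store[key] = self.upstream.get(key)
        return self.store[key]


def origin_fetches(use_shield: bool) -> int:
    origin = Origin()
    parent = CacheTier("Frankfurt shield", origin) if use_shield else origin
    edges = [CacheTier(city, parent) for city in ("Athens", "Vienna", "Prague")]
    for edge in edges:
        edge.get("/app.4f92c1.js")
    return origin.fetches


print(origin_fetches(use_shield=False))  # 3
print(origin_fetches(use_shield=True))   # 1
```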

Request Collapsing Prevents Many Identical Misses from Smashing the Origin

Imagine one HTML page references a new JavaScript bundle. At 09:00, thousands of users request that bundle, but it has just expired or been purged. If every edge thread fetched it independently, the CDN would behave like a fan-out amplifier against the origin.

Request collapsing, also called collapsed forwarding, avoids that. The first miss for a cache key becomes the active origin fetch. Later identical misses wait on that fetch instead of starting their own. Once the object arrives, all waiters are satisfied from the same response.

This matters under load spikes:

product launches
breaking news
software update releases
major sports events

The feature is easy to describe and incredibly valuable in practice. Many origin outages are not caused by steady traffic. They are caused by miss amplification after expiry or purge. A good CDN should absorb that pattern gracefully.

CDN architecture is therefore partly about latency and partly about origin economics. The better the edge fleet is at deduplicating work, the smaller and calmer the origin can be.
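
One way to sketch collapsed forwarding is a "single flight" cache: the first miss for a key becomes the leader and fetches, and later misses wait on an event instead of fetching again. The class and function names below are hypothetical, the slow origin is simulated with a sleep, and error handling is minimal (if the leader fails, waiters simply retry).

```python
import threading
import time


class CollapsingCache:
    def __init__(self, fetch):
        self._fetch = fetch
        self._lock = threading.Lock()
        self._store = {}
        self._inflight = {}  # key -> Event that is set when the leader finishes

    def get(self, key):
        with self._lock:
            if key in self._store:
                return self._store[key]
            done = self._inflight.get(key)
            leader = done is None
            if leader:
                done = threading.Event()
                self._inflight[key] = done
        if leader:
            try:
                body = self._fetch(key)  # the only upstream fetch for this key
                with self._lock:
                    self._store[key] = body
                return body
            finally:
                with self._lock:
                    del self._inflight[key]
                done.set()
        done.wait()  # follower: wait for the leader's result
        with self._lock:
            if key in self._store:
                return self._store[key]
        return self.get(key)  # the leader failed, so try again


def run_demo(clients: int = 20):
    calls = []

    def slow_origin(key):
        calls.append(key)
        time.sleep(0.05)  # simulated origin latency
        return f"body of {key}"

    cache = CollapsingCache(slow_origin)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(cache.get("/bundle.js")))
        for _ in range(clients)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, calls
```

Calling run_demo() starts twenty concurrent requests for the same key; all of them receive the body, and the simulated origin is called once.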

Purging Is a Distributed Invalidation Problem

Caching is easy compared with invalidation. The hard part is making stale content disappear everywhere when it must.

CDNs generally support two broad strategies:

Versioned Objects

Static assets get new filenames on each deploy. Old versions can remain cached until they age out naturally. This is the safest and cheapest invalidation strategy.

Active Purge

The operator explicitly tells the CDN to remove an object, a path prefix, a tag group, or an entire hostname's cache contents.

Global purging is hard because the CDN is distributed. The invalidation must propagate to many PoPs, many cache nodes, and often several cache layers. The control plane needs to be fast enough that users stop seeing stale content quickly, but robust enough that one slow PoP does not leave inconsistent state for too long.

Serious CDN platforms have therefore invested heavily in internal pub/sub systems for invalidation. Purge latency is not a minor feature. It decides how quickly broken HTML, outdated JavaScript, or incorrect API responses disappear from user view.

Operationally, fast purge matters most when:

a bad deploy must be rolled back
confidential content was cached accidentally
pricing or legal text changed and must update quickly
a CMS publishes frequently changing pages

A CDN with excellent hit ratio but slow invalidation will still create painful incidents. Freshness control and purge propagation are as important as cache fill.

TLS Usually Terminates at the Edge, Which Changes Everything Behind It

When a browser connects to a CDN, the TLS handshake usually terminates at the edge PoP. The edge presents the certificate, negotiates ALPN, and handles HTTP/2 or HTTP/3 for the client. Behind the edge, the CDN may:

fetch from origin over TLS
fetch from origin over plain HTTP on a private network
reuse a smaller pool of long-lived origin connections
speak a different HTTP version to the origin

This has several consequences:

Performance

The expensive handshake happens close to the user. That cuts setup latency.

Security

The CDN sits in the trust path. It can inspect headers, apply WAF rules, normalise requests, and cache responses because it terminates encryption.

Operational Simplification

Origins often handle fewer direct client connections. The CDN becomes the public transport layer and security perimeter.

Protocol Translation

The browser might speak HTTP/3 to the edge while the edge speaks HTTP/2 or HTTP/1.1 to the origin.

"Supports HTTP/3" often means "supports HTTP/3 at the edge". The origin does not need to speak QUIC for the user to get the benefit. The CDN absorbs the complexity.

Dynamic Acceleration Is About Connection Reuse and Path Optimisation, Not Magical Caching

Modern CDNs do more than cache static files. They also accelerate dynamic requests that cannot be cached safely.

They do this through mechanisms such as:

keeping hot origin connections open
reducing handshake repetition
optimising congestion and retransmission on long-haul paths
choosing good backbone routes between edge and origin
normalising headers and buffering uploads sensibly

For a user in Lisbon hitting an origin in Warsaw, the edge may accept the request locally, then forward it over the CDN's own backbone or well-engineered transit path to the origin. Even if the response is fully dynamic and not cached, the browser still avoids a long direct handshake to the origin.

This is one reason CDNs evolved from "static content networks" into general application delivery platforms. Once the edge is already the user's first stop, it can improve more than cache hits.

Edge Compute Turned the CDN into a Programmable Runtime

The latest phase of CDN evolution is not just better caching. It is programmable edge execution. Instead of only serving or fetching objects, the edge can now run logic such as:

URL rewrites
bot checks
header manipulation
A/B experiment assignment
image transformation
lightweight personalisation
access control decisions

Platforms expose this through JavaScript isolates, WebAssembly, or proprietary worker models. The reason this became attractive is architectural. The CDN already sees every request very early. If a cheap decision can be made at the edge, the origin avoids that work entirely.

Examples:

reject abusive requests before they touch origin
resize images at the edge rather than storing many variants
choose the nearest API region from request metadata
inject security headers consistently for all responses

This changes the CDN from "cache in front of the origin" to "programmable control plane for request handling".

The tradeoff is complexity. Once logic exists at the edge, debugging and consistency become harder. The operator now has to reason about code running in hundreds of PoPs, distributed rollouts, and interactions between cache state and runtime behaviour.
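
To make the shape of such logic concrete, here is a small handler that rejects a request before the origin sees it, assigns an experiment bucket, and injects security headers. It is written in Python only for consistency with the other sketches; real platforms typically expose JavaScript or WebAssembly runtimes. The request and response dictionaries, the /admin rule, and the experiment name are all hypothetical.

```python
import hashlib

SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000",
    "X-Content-Type-Options": "nosniff",
}


def ab_bucket(client_id: str, experiment: str) -> str:
    """Deterministic bucket: the same client always gets the same variant."""
    digest = hashlib.sha256(f"{experiment}:{client_id}".encode()).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def edge_handler(request: dict, fetch_origin) -> dict:
    # Access control decision made at the edge: the origin never sees this request.
    if request["path"].startswith("/admin") and not request.get("authenticated"):
        return {"status": 403, "headers": dict(SECURITY_HEADERS), "body": ""}

    # Experiment assignment is passed to the origin as a request header.
    headers = dict(request.get("headers", {}))
    headers["X-Experiment"] = ab_bucket(request["client_id"], "homepage-hero")
    response = fetch_origin({**request, "headers": headers})

    # Security headers are injected consistently on the way back to the client.
    response["headers"] = {**response.get("headers", {}), **SECURITY_HEADERS}
    return response
```

Even in this tiny example, behaviour now depends on code deployed at the edge as well as on the origin, which is the debugging cost the paragraph above describes.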

CDNs Protect the Origin by Filtering, Rate Limiting, and Absorbing Traffic Spikes

Performance is the public story. Origin protection is just as important.

The edge can:

terminate floods close to ingress
challenge suspicious clients
rate-limit abusive paths
cache popular responses to reduce backend load
hide origin IP addresses from casual scanners

CDNs are often the first line of defence for DDoS mitigation. Large anycast fleets spread attack traffic across many PoPs, and edge filtering reduces the volume that reaches the customer's infrastructure.

A CDN does not make the origin invincible. It moves the fight outward. Instead of one application cluster absorbing every request directly, a global edge network absorbs and filters a large share first.

From an infrastructure perspective, that is often the difference between "origin overloaded instantly" and "incident contained at the edge".
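
Rate limiting at the edge is commonly explained with a token bucket, and the sketch below uses that technique as an illustration; the article does not say which algorithm any particular CDN uses. Time is passed in explicitly so the behaviour is easy to follow, and the rate and burst values are hypothetical.

```python
class TokenBucket:
    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full so a short burst is allowed
        self.updated = 0.0

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_second=1.0, burst=3)
print([bucket.allow(0.0) for _ in range(4)])  # [True, True, True, False]
print(bucket.allow(1.0))                      # True
```

In practice an edge would keep one bucket per client, path, or other key, and a rejected request would be answered locally instead of being forwarded.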

Bad Cache Policy Can Make a CDN Worse Than Having None

CDNs fail badly when operators assume they are simple.

Common mistakes include:

caching HTML too aggressively and serving stale deployments
including volatile query strings in the cache key unnecessarily
varying on cookies that do not actually affect content
failing to version static assets
purging too often instead of using immutable asset names
exposing private origin behaviour through inconsistent edge rules

A misconfigured CDN can cause:

low hit ratio and high cost
stale content after deploys
origin overload from cache fragmentation
broken localisation
accidental caching of user-specific responses

The presence of an edge fleet does not guarantee good results. The application's cache semantics still need to be designed carefully. A CDN amplifies both good and bad policy.

A Cache Miss Is Still a Structured Workflow, Not a Blind Origin Fetch

When an edge does not have a fresh object, several decisions still happen before the origin is touched.

The edge may check:

whether the object is stale but still temporarily serveable
whether revalidation with If-None-Match or If-Modified-Since is possible
whether another edge request for the same object is already in flight
whether the request should go to a shield cache first
whether the origin is currently marked degraded

That means a miss path often looks like this:

client asks Athens edge for /app.4f92c1.js
edge checks local cache and freshness metadata
edge sees object expired but has an ETag
edge asks Frankfurt shield, which may already have a fresher copy
if needed, shield revalidates with origin in Dublin
origin returns 304 Not Modified or a new object body
shield updates its metadata
edge updates its local cache and answers the client

If the origin returns 304, the data path is cheap because the old body remains valid. If the origin returns a new object, the CDN must replace stored state and may need to stream the new bytes to many waiting clients.

Validators matter a lot. A CDN without strong revalidation metadata has to choose between:

shorter TTLs and more expensive full refetches
longer TTLs and more stale-risk

With ETag or Last-Modified, the edge can ask a much cheaper question: "has this changed?" rather than "please send the whole object again." For frequently accessed HTML or API responses that are cacheable for short periods, this can be the difference between a healthy origin and one buried under avoidable refill traffic.
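
The cost difference is easy to see in a small model of conditional revalidation. The classes below are hypothetical stand-ins for an origin and a cached object; the revalidate function sends the stored ETag and reports how many body bytes had to be transferred.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Response:
    status: int
    etag: str
    body: bytes = b""


@dataclass
class CachedObject:
    etag: str
    body: bytes


class Origin:
    def __init__(self, etag: str, body: bytes):
        self.etag = etag
        self.body = body

    def handle(self, if_none_match: Optional[str] = None) -> Response:
        # Matching validator: answer 304 with no body.
        if if_none_match == self.etag:
            return Response(304, self.etag)
        return Response(200, self.etag, self.body)


def revalidate(cached: CachedObject, origin: Origin) -> Tuple[CachedObject, int]:
    """Return the object to keep and the number of body bytes transferred."""
    response = origin.handle(if_none_match=cached.etag)
    if response.status == 304:
        return cached, 0
    return CachedObject(response.etag, response.body), len(response.body)


origin = Origin('"build-4821"', b"x" * 1000)
cached = CachedObject('"build-4821"', b"x" * 1000)
cached, transferred = revalidate(cached, origin)
print(transferred)  # 0
```

If the origin later publishes a new build with a different ETag, the same call returns the new object and the full body size.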

Large Objects and Range Requests Change Cache Behaviour Substantially

Static websites are the simple case. Large media objects, software installers, and video segments create a different class of CDN problem.

If a client asks for part of a large object using a byte range, the CDN needs policy for whether it:

caches the whole object
caches only requested ranges
forwards the range upstream each time
coalesces ranges into larger cached chunks

Video delivery is the most common example. Adaptive bitrate streaming usually splits media into short segments, which is cache-friendly. Large downloads and scrubbing inside media players are less tidy.

Suppose a user requests:

Range: bytes=1048576-2097151

The edge then has to decide whether fetching only that range from origin is better than pulling more of the file and storing it for later use. The answer depends on:

object size
request popularity
storage cost
expected reuse pattern
origin bandwidth constraints

This is one reason CDNs are not generic transparent proxies. They embody policy about what kinds of traffic are worth storing and how aggressively to do it. The ideal policy for a 20 KB CSS file is not the ideal policy for a 4 GB software image or a long video archive.

Operators also have to think about egress economics. A range-heavy workload can produce surprising origin amplification if the CDN is configured poorly. Good large-object handling is therefore part performance engineering and part cost control.
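
One of the policies listed above, coalescing ranges into larger cached chunks, can be sketched as simple arithmetic. The code assumes a fixed 1 MiB chunk size, which is a hypothetical choice, and it only handles a single closed range of the form first-last; open-ended and multi-part ranges raise an error.

```python
from typing import List, Tuple

CHUNK_SIZE = 1024 * 1024  # hypothetical 1 MiB cache chunk


def parse_range(value: str) -> Tuple[int, int]:
    """Parse a Range header value such as 'bytes=1048576-2097151'."""
    unit, _, spec = value.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is supported in this sketch")
    first, _, last = spec.partition("-")
    return int(first), int(last)


def chunks_for_range(first: int, last: int, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Indices of the fixed-size chunks that cover the requested bytes."""
    return list(range(first // chunk_size, last // chunk_size + 1))


def upstream_range(chunk_index: int, chunk_size: int = CHUNK_SIZE) -> str:
    """The aligned range the edge would request upstream for one chunk."""
    start = chunk_index * chunk_size
    return f"bytes={start}-{start + chunk_size - 1}"


first, last = parse_range("bytes=1048576-2097151")
print(chunks_for_range(first, last))  # [1]
print(upstream_range(1))              # bytes=1048576-2097151
```

The example range happens to align with exactly one chunk. An unaligned request such as bytes=1500000-2500000 would touch chunks 1 and 2, so the edge would fetch slightly more than was asked for in exchange for cacheable, reusable pieces.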

Origin Shielding Is Also About Failure Containment

Shield caches are often described as an origin-protection feature for cache misses, but their role in incidents is just as important.

Imagine the origin begins returning elevated latency because a database pool is saturated. Without a shield, dozens of edge PoPs may all continue probing or refetching independently. With a shield, many of those decisions collapse into one regional layer. That reduces the number of actors touching the failing origin and makes backoff logic more effective.

Shielding helps in at least three ways:

fewer duplicate miss fetches
easier origin rate limiting and retry control
smaller blast radius when origin health degrades

In mature CDN designs, the shield becomes the main consumer of origin capacity and the edges become consumers of shield capacity. That hierarchy gives operators more control over failure behaviour. They can tune:

timeouts between edge and shield
timeouts between shield and origin
retry budgets
stale-on-error behaviour
origin circuit breaking

At large scale, these controls matter more than the simplistic hit-ratio story. A CDN earns its place not only when everything is healthy, but when one dependency becomes slow and the edge network still protects users from seeing the full impact immediately.
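
Origin circuit breaking, the last item in that list, can be sketched as a small state machine at the shield. The thresholds are hypothetical, time is passed in explicitly, and the half-open behaviour is simplified: once the cooldown has passed, requests are allowed again as probes, and another failure reopens the breaker.

```python
from typing import Optional


class OriginCircuitBreaker:
    def __init__(self, failure_threshold: int = 3, cooldown_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.consecutive_failures = 0
        self.opened_at: Optional[float] = None

    def allow_request(self, now: float) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        # Open: block until the cooldown has passed, then let probes through.
        return now - self.opened_at >= self.cooldown_seconds

    def record_success(self) -> None:
        self.consecutive_failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self.opened_at = now  # open, or reopen after a failed probe


breaker = OriginCircuitBreaker(failure_threshold=3, cooldown_seconds=30.0)
for t in (0.0, 1.0, 2.0):
    breaker.record_failure(t)
print(breaker.allow_request(3.0))   # False
print(breaker.allow_request(32.0))  # True
```

While the breaker is open, the shield can answer from stale content or fail fast instead of adding load to an origin that is already struggling.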

Multi-CDN Strategies Exist Because One Edge Fleet Is Not the Whole Internet

Large organisations sometimes use more than one CDN. The reason is not fashion. It is risk management and performance diversity.

A multi-CDN design can help with:

resilience against one provider outage
regional performance optimisation
pricing leverage
product specialisation, for example one CDN for video and another for API acceleration

But it also introduces complexity:

cache state is no longer shared
purges must hit more than one control plane
logs and request IDs split across vendors
DNS steering becomes more complicated
certificate deployment and origin access control must stay consistent

In a failover event, the second CDN may have colder caches and a different anycast footprint. That means "failover works" is not enough. The operator also needs to know what user performance looks like during failover, whether purges remain coherent, and whether the origin can survive the changed miss pattern.

This is another reminder that CDN architecture is traffic engineering, not merely caching. The operator is choosing how the public edge of the application should exist across the internet's real topology, commercial relationships, and failure modes.

Logs and Request IDs Matter Because the Edge Adds Another Whole System Layer

Once a CDN sits in front of the origin, one user request often generates multiple internal events:

browser to edge request
edge cache decision
possible shield request
possible origin request
possible revalidation
possible WAF decision

If those layers cannot be correlated, debugging becomes painful.

Good CDN operations rely on:

per-request identifiers
cache status fields such as hit, miss, stale, revalidated
PoP or colo identifiers
origin timing metrics
purge event logs

A useful access log line often needs to say more than "status 200". It should also say:

where the request landed
whether it was a hit
whether the response came from shield or origin
how long origin fetch took if used

Without that, teams end up arguing from incomplete evidence. The browser says the site was slow. The origin says its own CPU was fine. The missing truth may be that one regional edge cluster had poor cache key design and kept missing unnecessarily.

The better the request correlation, the easier it is to separate:

edge latency
routing issues
shield pressure
origin slowness
cache-policy mistakes

Observability is part of CDN architecture, not an optional dashboard layer.
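
As an illustration of what such a log line might carry, the sketch below defines a record with the fields discussed above and serialises it as one JSON line. The field names and values are hypothetical; each CDN has its own log schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class EdgeLogRecord:
    request_id: str       # shared by edge, shield, and origin log entries
    pop: str              # where the request landed
    status: int
    cache_status: str     # "hit", "miss", "stale", or "revalidated"
    served_from: str      # "edge", "shield", or "origin"
    origin_fetch_ms: Optional[float] = None  # only set when the origin was used


def to_log_line(record: EdgeLogRecord) -> str:
    return json.dumps(asdict(record), sort_keys=True)


record = EdgeLogRecord(
    request_id="req-0001",
    pop="ATH",
    status=200,
    cache_status="miss",
    served_from="origin",
    origin_fetch_ms=182.5,
)
print(to_log_line(record))
```

With the request identifier present at every layer, a slow response can be traced from the browser to a specific PoP, cache decision, and origin fetch.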

CDNs Also Change the Economics of Application Design

One under-discussed reason CDNs became so dominant is that they reshape what the origin needs to be good at.

If the edge handles:

most static delivery
much of TLS termination
some rate limiting
some request normalisation
some image transformation

then the origin can focus more narrowly on:

correctness
business logic
private data access
cacheable response generation where appropriate

That changes cost structure. Instead of scaling the origin for every single request from every geography, teams can scale the origin for:

cache misses
personalised traffic
writes
low-hit or uncacheable reads

This is one reason application teams tolerate CDN complexity. The alternative is often to push more global delivery, DDoS resilience, and TLS edge engineering back into the application stack itself. For most organisations, that is slower and more expensive.

A team therefore rarely chooses "a CDN" in the abstract. It chooses a mix of:

reach
purge model
cache tooling
security controls
edge programmability

That mix then shapes how comfortably the rest of the platform can evolve.

Request Normalisation and Cache Poisoning Defence Are Part of the Edge Job

Caching only works safely if semantically equivalent requests are treated consistently and semantically dangerous ambiguity is reduced. That means a CDN often normalises parts of the request before caching or forwarding it.

Examples include:

lowercasing header names and hostnames, and canonicalising other fields where it is safe
deciding whether query parameter order matters
choosing whether duplicate headers are legal or suspicious
collapsing repeated slashes or dot segments in URLs if policy allows
stripping tracking parameters from the cache key

This matters for performance, but it also matters for security. Ambiguous request interpretation between:

browser and edge
edge and origin
cache key logic and origin routing

can create cache poisoning opportunities. If one layer thinks two requests are equivalent and another does not, the wrong response may be stored or reused.

Mature CDNs invest heavily in canonicalisation and parser consistency for that reason. The edge is not only trying to answer quickly. It is trying to ensure that what it caches is the object the origin really intended for that key. A cache that can be poisoned is worse than a cache miss.

In production, this often means performance and security teams end up working on the same controls. Header normalisation, query string policy, and method restrictions are not isolated concerns. They shape both hit ratio and correctness.
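
A simplified canonicalisation step could look like this. It lowercases the host, collapses repeated slashes, resolves dot segments, and sorts query parameters. It is a sketch and not a complete RFC 3986 implementation, and whether each of these rewrites is safe depends on how the origin interprets URLs.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalise_path(path: str) -> str:
    path = re.sub(r"/{2,}", "/", path)  # collapse repeated slashes
    out = []
    for segment in path.split("/"):
        if segment == "..":
            if len(out) > 1:  # never climb above the root
                out.pop()
        elif segment != ".":
            out.append(segment)
    return "/".join(out) or "/"


def normalise_url(url: str) -> str:
    parts = urlsplit(url)
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), normalise_path(parts.path), query, "")
    )


print(normalise_url("https://WWW.Example.eu/a//b/../c?size=small&id=42"))
# https://www.example.eu/a/c?id=42&size=small
```

The important property is less the exact rules than their consistency: if the cache key is computed from the normalised form, the request forwarded to the origin should be interpreted the same way.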

Serving Stale on Error Is One of the Quietest and Most Valuable Features

One of the most useful CDN behaviours is the ability to serve stale content temporarily when the origin is failing. This is often controlled by policy such as stale-if-error, platform-specific edge rules, or emergency operator overrides.

Why is it so valuable?

Because a slightly outdated page is often much better than a hard outage.

Consider a news front page cached for 60 seconds. The origin starts timing out because a backend search cluster is overloaded. If the edge has a recently expired copy and policy allows stale-on-error, the CDN can keep serving the old page briefly while:

the origin recovers
the operator rolls back
cache fill pressure is reduced

This smooths incidents in a way users rarely notice explicitly. They do not know the page was 45 seconds older than ideal. They only know the site did not disappear.

Of course, stale serving is not universally safe. It is poor for:

account balances
checkout state
one-time tokens
highly personalised responses

But for public content, product pages, docs, or many catalog views, it is often exactly the right tradeoff under failure.

This is another reason CDN policy is never just about static speed. It is about how the system behaves under stress when freshness, availability, and backend health pull in different directions.
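
The decision can be written down as a few comparisons. The function below is a sketch of stale-on-error logic for the news front page example: a 60 second freshness lifetime plus a hypothetical 300 second stale-if-error window. Real platforms add more inputs, such as per-path policy and operator overrides.

```python
def choose_response(
    age_seconds: int, max_age: int, stale_if_error: int, origin_ok: bool
) -> str:
    if age_seconds < max_age:
        return "serve cached (fresh)"
    if origin_ok:
        return "refetch from origin"
    # Origin is failing: fall back to the expired copy if policy still allows it.
    if age_seconds < max_age + stale_if_error:
        return "serve stale"
    return "return error"


# A copy that expired 45 seconds ago while the origin is timing out.
print(choose_response(105, max_age=60, stale_if_error=300, origin_ok=False))
# serve stale
```

Setting stale_if_error to zero for sensitive paths turns the same function into a strict policy, which is the behaviour wanted for balances, checkout state, and tokens.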

Real-Time Logging and Edge Analytics Changed How Operators Debug the Web

Before CDNs became dominant, many operators debugged web performance mainly from origin logs and some packet captures. With a CDN in front, the most useful first-party evidence often lives at the edge:

cache status
PoP identifier
edge response time
shield response time
origin response time
WAF action
bot score or challenge outcome

That changes incident response. If users in Greece report slowness while users in Germany do not, the operator can look for:

one troubled PoP
one regional transit issue
one shield cluster with poor hit ratio
one specific cache key fragmenting unexpectedly

This is a radically different debugging model from the older "check the web server" instinct. The web server may be completely healthy while one edge region is having trouble. Or the edge may be healthy while the origin fetch path is unstable. Or everything may be healthy except one purge event that invalidated too much content at once.

The better the edge analytics, the faster teams can separate:

routing problems
cache-policy problems
security filtering mistakes
true origin slowness

That separation is what turns a CDN from a black box into infrastructure the team can actually operate confidently.

Purge Strategy and Content Tagging Become Editorial Infrastructure

As soon as a site publishes frequently, cache invalidation stops being an infrastructure detail and becomes part of the content workflow.

Teams often need to purge by:

exact URL
prefix
hostname
surrogate key or content tag
deployment version

The more mature the site, the more important surrogate tags become. Suppose one article appears:

on its own page
in the homepage feed
in a topic archive
in a "latest posts" widget
in a JSON feed consumed elsewhere

If an editor updates the article, invalidating only the article URL may not be enough. Purging by content tag lets all related cached representations be expired together. This is one of the reasons modern CDN integration often reaches deeply into the CMS or deployment pipeline.

Without good tagging, teams fall back to broad purges that:

erase too much warm cache state
spike origin load
make deploys feel slower and riskier

With good tagging, they can invalidate precisely and preserve most of the fleet's useful cached content. Again, the CDN stops being "just a network thing" and becomes part of how publishing actually works.
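
A minimal sketch of tag-based purging keeps an index from each tag to the cached URLs that carry it. The class, URLs, and tag names below are hypothetical, and the model covers a single cache node; a real CDN has to propagate the same purge across many nodes and layers.

```python
from collections import defaultdict
from typing import Iterable


class TaggedCache:
    def __init__(self):
        self.objects = {}                     # url -> body
        self.urls_by_tag = defaultdict(set)   # tag -> urls carrying that tag

    def put(self, url: str, body: str, tags: Iterable[str]) -> None:
        self.objects[url] = body
        for tag in tags:
            self.urls_by_tag[tag].add(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every cached object carrying the tag; return how many were removed."""
        urls = self.urls_by_tag.pop(tag, set())
        return sum(1 for url in urls if self.objects.pop(url, None) is not None)


cache = TaggedCache()
cache.put("/articles/17", "article 17", {"article-17"})
cache.put("/", "homepage feed", {"home", "article-17", "article-18"})
cache.put("/topics/cdn", "topic archive", {"topic-cdn", "article-17"})
cache.put("/articles/18", "article 18", {"article-18"})

print(cache.purge_tag("article-17"))  # 3
print(sorted(cache.objects))          # ['/articles/18']
```

Editing article 17 invalidates its own page, the homepage, and the topic archive in one call, while the unrelated article stays warm.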

Multi-Layer Edge Logic Means You Need a Clear Mental Model of Where Decisions Happen

A request can be transformed at several points:

DNS steering before the TCP or QUIC handshake
edge request normalisation before cache lookup
WAF and rate limiting before forwarding
shield cache logic before origin fetch
edge compute after origin response but before client delivery

If a team does not know where each decision happens, debugging becomes guesswork. Strong CDN operating practice usually documents:

the cache key definition
where headers are added or removed
where authentication is checked
where redirects are generated
where image or HTML transformations happen

Otherwise, the edge fleet becomes a pile of partly overlapping logic spread across products and control planes. The result may still work, but it will be fragile to change.

Edge Storage Is Finite, So Eviction Policy Shapes Real Performance

A CDN edge cache is not infinite. Every PoP has finite SSD or memory budget, and the edge has to decide which objects remain warm.

That means eviction policy matters just as much as fill policy. Popular small objects may stay resident for a long time. Large low-reuse objects may be evicted quickly even if they were expensive to fetch. The edge is constantly deciding which stored bytes are likely to pay for themselves in future latency reduction.

This has direct consequences for content design:

a handful of heavily reused versioned assets usually cache beautifully
one-off personalised HTML is a poor fit for long retention
giant infrequently used media files may churn storage and displace hotter content

Cache hit ratio therefore depends on more than TTLs. It also depends on object popularity distribution and object size distribution. A site with excellent cache headers can still perform poorly at the edge if its traffic mix keeps thrashing storage with large, low-reuse objects.

This is one reason video and software download products often receive specialised CDN handling. Their object-size economics differ sharply from ordinary website assets. Good CDN design therefore starts from workload shape, not from generic assumptions about "one cache policy for everything".
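
The displacement effect is easy to reproduce with a toy cache. The sketch below assumes plain LRU eviction against a byte budget, which is a deliberate simplification: production CDNs typically use more elaborate admission and eviction policies, and the object names and sizes here are made up.

```python
from collections import OrderedDict


class ByteBudgetLRU:
    """LRU cache that is limited by total bytes rather than object count."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._sizes = OrderedDict()  # key -> size, oldest first

    def touch(self, key: str) -> bool:
        """Record a hit and return True, or return False on a miss."""
        if key not in self._sizes:
            return False
        self._sizes.move_to_end(key)
        return True

    def admit(self, key: str, size: int) -> list:
        """Store an object and return the keys evicted to make room."""
        evicted = []
        if size > self.capacity_bytes:
            return evicted  # larger than the whole budget, so never cached
        if key in self._sizes:
            self.used_bytes -= self._sizes.pop(key)
        while self.used_bytes + size > self.capacity_bytes:
            old_key, old_size = self._sizes.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_key)
        self._sizes[key] = size
        self.used_bytes += size
        return evicted


cache = ByteBudgetLRU(capacity_bytes=1_000)
for name in ("app.css", "app.js", "logo.svg", "font.woff2"):
    cache.admit(name, 200)  # four small, heavily reused assets

# One large, rarely requested object pushes out three hot ones.
evicted_by_video = cache.admit("promo.mp4", 700)
```

A single 700-byte object costs three hot 200-byte objects here, which is the same trade an edge faces at much larger scale.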

Origin Access Control Matters Because the CDN Should Be the Public Front Door

Once a CDN sits in front of the origin, the origin should usually stop behaving like a general public endpoint. If attackers or scrapers can bypass the CDN and hit the origin directly, several benefits weaken immediately:

WAF filtering at the edge is bypassed
anycast absorption is bypassed
origin hiding disappears
cache offload no longer protects the backend path

Good deployments therefore use origin access controls such as:

allowlists for CDN egress ranges
mutual authentication between edge and origin
signed origin pull requests
private networking where possible

This is not only a security concern. It is also architectural hygiene. The CDN is meant to be the controlled public interface. The origin should trust the edge deliberately rather than hoping clients always arrive through the intended path.

Teams that skip this step often discover later that a supposedly protected origin is still publicly reachable and much easier to overload than the edge. At that point the CDN is only part of the perimeter rather than the real perimeter.
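
As a rough illustration, an origin-side check might combine two of those controls: an egress allowlist and a signed request. Everything named below is an assumption made for the sketch. The network range is a documentation-only range, the shared secret is a placeholder, and a real scheme would normally also sign a timestamp to limit replay.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical values: real egress ranges are published by the CDN provider.
CDN_EGRESS_RANGES = [ipaddress.ip_network("203.0.113.0/24")]
SHARED_SECRET = b"example-secret-rotate-me"


def sign_path(path: str, secret: bytes = SHARED_SECRET) -> str:
    """Signature the edge would attach to an origin pull for this path."""
    return hmac.new(secret, path.encode("utf-8"), hashlib.sha256).hexdigest()


def origin_accepts(client_ip: str, path: str, signature: str) -> bool:
    """Accept only requests that come from the CDN and carry a valid signature."""
    address = ipaddress.ip_address(client_ip)
    from_cdn = any(address in network for network in CDN_EGRESS_RANGES)
    # compare_digest avoids leaking how many characters matched.
    signed = hmac.compare_digest(sign_path(path), signature)
    return from_cdn and signed


via_edge = origin_accepts("203.0.113.10", "/pricing", sign_path("/pricing"))
direct_hit = origin_accepts("198.51.100.7", "/pricing", sign_path("/pricing"))
forged = origin_accepts("203.0.113.10", "/pricing", "not-a-real-signature")
```

Requiring both checks means a leaked secret alone, or a spoofable network position alone, is not enough to reach the origin.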

A Single Dynamic Request Often Touches More Infrastructure Than People Expect

To see why CDNs became full delivery platforms, it helps to follow one dynamic request end to end.

Imagine a logged-out visitor in Athens loads /pricing for a SaaS site:

DNS resolves the hostname to a CDN-managed address
anycast lands the TCP or QUIC connection at an Athens or nearby edge
the edge terminates TLS and negotiates HTTP/2 or HTTP/3
the edge normalises the request and runs WAF checks
the edge checks whether /pricing is cacheable and currently fresh
if stale or absent, the edge asks the shield
the shield may revalidate with the origin in Frankfurt
the origin returns the current HTML
the shield stores or refreshes it
the edge sends the response to the client and may store it for a short period

Nothing in that path is conceptually difficult, but the operational implication is huge. What used to be "one request hit my web server" is now:

one request passed through several policy engines
one response may have been served from several possible layers
one cache decision may affect the next thousand users

That is why CDN operations demand good request IDs and cache telemetry. Without them, it is hard to tell which layer served a response, or why.
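
The layered path above can be modelled in a few lines. This sketch ignores freshness, TLS, and WAF checks entirely and only shows how a request ID plus a per-layer trace makes the serving layer visible. The layer names and the `handle` helper are hypothetical.

```python
import uuid
from typing import Callable, Optional


class CacheLayer:
    """One cache tier that falls back to its parent, or to the origin."""

    def __init__(self, name: str, parent: Optional["CacheLayer"],
                 origin_fetch: Callable[[str], str]) -> None:
        self.name = name
        self.parent = parent
        self.origin_fetch = origin_fetch
        self.store = {}

    def fetch(self, path: str, trace: list) -> str:
        if path in self.store:
            trace.append(f"{self.name}:HIT")
            return self.store[path]
        trace.append(f"{self.name}:MISS")
        if self.parent is not None:
            body = self.parent.fetch(path, trace)
        else:
            trace.append("origin:FETCH")
            body = self.origin_fetch(path)
        self.store[path] = body
        return body


origin_calls = []


def origin(path: str) -> str:
    origin_calls.append(path)
    return f"<html>{path}</html>"


shield = CacheLayer("shield-fra", parent=None, origin_fetch=origin)
athens = CacheLayer("edge-ath", parent=shield, origin_fetch=origin)
sofia = CacheLayer("edge-sof", parent=shield, origin_fetch=origin)


def handle(edge: CacheLayer, path: str) -> dict:
    """Serve one request and return the telemetry an operator would want."""
    trace = []
    body = edge.fetch(path, trace)
    return {"request_id": uuid.uuid4().hex, "trace": trace, "body": body}


first = handle(athens, "/pricing")   # cold everywhere
second = handle(sofia, "/pricing")   # cold edge, warm shield
third = handle(athens, "/pricing")   # warm edge
```

Three requests from two cities produce one origin fetch, and the trace shows exactly which tier answered each one.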

HTML Caching Is Harder Than Static Asset Caching Because the Stakes Are Higher

Versioned static assets are the easy case. HTML is harder because:

it controls asset discovery
it may contain user-specific or geo-specific elements
it changes more frequently
stale HTML can point at missing or old bundles

Many mature sites therefore separate cache strategy sharply:

Static Assets

long TTL
versioned names
immutable cache semantics

HTML

shorter TTL
strong validators
careful stale-on-error policy
explicit purge on deployment if necessary

If teams get this wrong, a common failure mode appears: new JavaScript or CSS is deployed, but some edges still serve old HTML pointing at no-longer-valid assets. The site then looks randomly broken by geography or by PoP.

The CDN is not misbehaving in that situation. The caching model is inconsistent. The edge cache is therefore part of deployment design, not only post-deployment acceleration.
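
One way to keep the two strategies from drifting apart is to derive the header from the path in a single place. The sketch below assumes a naming convention in which versioned assets carry a hex content hash before the extension. The exact lifetimes are illustrative choices, not recommendations.

```python
import re

# Assumed convention: at least 8 hex characters before the file extension,
# for example /static/app.3f9a1c2b.js
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(js|css|woff2|png|svg)$")


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value based on what kind of object the path is."""
    if HASHED_ASSET.search(path):
        # The name changes whenever the content changes, so cache for a year.
        return "public, max-age=31536000, immutable"
    last_segment = path.rsplit("/", 1)[-1]
    if path.endswith(".html") or "." not in last_segment:
        # HTML: browsers revalidate, the CDN keeps it briefly, and stale
        # copies may be served for a few minutes if the origin errors.
        return "public, max-age=0, s-maxage=60, stale-if-error=300"
    # Unversioned assets get a moderate lifetime.
    return "public, max-age=3600"


versioned = cache_control_for("/static/app.3f9a1c2b.js")
page = cache_control_for("/pricing")
unversioned = cache_control_for("/img/logo.png")
```

Because old hashed bundles keep their names, short-lived HTML that still references them continues to work as long as the old files remain published.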

Cache Revalidation Can Save an Origin More Than a Full Hit Ratio Number Suggests

Hit ratio is useful, but it hides an important middle category: revalidated responses.

If a cached object at the edge is stale, the CDN may avoid a full refetch by sending a conditional request that carries the validators it already holds:

If-None-Match: "build-4821"
If-Modified-Since: Tue, 21 Apr 2026 10:00:00 GMT

If the origin returns:

304 Not Modified

the edge can keep the body it already has. That saves:

origin egress bandwidth
origin CPU spent serialising the body again
shield bandwidth
edge fill time

For frequently checked but infrequently changed content, this is a major optimisation. A dashboard page that many users hit every minute may never stay fresh for long, but if it mostly revalidates cheaply rather than redownloading the body, the origin still benefits greatly.

Good validators are valuable because they create a middle ground between a pure cache hit and a full miss. Without them, the CDN has far fewer graceful options.
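
On the origin side, honouring the validator can be very small. The function below is a sketch of the `If-None-Match` branch only, using the weak comparison that HTTP specifies for that header. Splitting the header on commas is a simplification, and `origin_respond` is a hypothetical name rather than part of any framework.

```python
from typing import Optional


def _opaque(tag: str) -> str:
    """Strip the weak prefix so W/"x" and "x" compare as equal."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def origin_respond(current_etag: str, body: bytes,
                   if_none_match: Optional[str]) -> tuple:
    """Return (status, headers, payload) for a possibly conditional GET."""
    headers = {"ETag": current_etag}
    if if_none_match is not None:
        candidates = [_opaque(tag) for tag in if_none_match.split(",")]
        if "*" in candidates or _opaque(current_etag) in candidates:
            # The cached copy is still valid, so no body is sent.
            return 304, headers, b""
    return 200, headers, body


page = b"<html>dashboard</html>"
unchanged = origin_respond('"build-4821"', page, '"build-4821"')
changed = origin_respond('"build-4822"', page, '"build-4821"')
unconditional = origin_respond('"build-4822"', page, None)
```

The 304 path skips serialising and transferring the body, which is where the savings listed above come from.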

Edge Platforms Also Standardised Modern Image and Media Delivery

One reason CDNs became more than caches is that many assets can be transformed safely at the edge. Images are the clearest example.

An edge service may:

resize
crop
transcode
convert format based on Accept
tune quality for device or bandwidth class

For example, one source image might produce:

AVIF for browsers that support it
WebP for others
JPEG fallback where needed

That work is valuable because it combines:

lower origin storage complexity
lower edge-to-client transfer size
better device fit

But it also complicates cache keys. If the result varies by width, quality, or accepted format, the CDN must reflect that variation correctly or risk serving the wrong representation.

This is a recurring pattern in CDN design. New edge features usually create two simultaneous effects:

better performance or simpler origin logic
more cache-key and policy complexity

The platform becomes more capable and more stateful at the same time.
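
For the image case, that trade-off might look like the following sketch. It picks a format from the `Accept` header and folds only the normalised result into the cache key, so that browsers sending different but equivalent header strings share one entry. Quality values in `Accept` are ignored for brevity, and the key layout is an assumption.

```python
def pick_image_format(accept_header: str) -> str:
    """Choose the best format the client advertises, ignoring q-values."""
    offered = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    if "image/avif" in offered:
        return "avif"
    if "image/webp" in offered:
        return "webp"
    return "jpeg"


def image_cache_key(path: str, width: int, accept_header: str) -> str:
    """Build a key from the chosen format, not from the raw Accept header."""
    return f"{path}|w={width}|fmt={pick_image_format(accept_header)}"


modern = image_cache_key("/hero.jpg", 800, "image/avif,image/webp,*/*;q=0.8")
also_modern = image_cache_key("/hero.jpg", 800, "image/webp, image/avif")
older = image_cache_key("/hero.jpg", 800, "image/webp,*/*")
legacy = image_cache_key("/hero.jpg", 800, "*/*")
```

Keying on the raw header instead would fragment the cache into one entry per distinct browser string, which lowers the hit ratio without changing what is served.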

Traffic Steering During Incidents Is One of the Hardest Real CDN Problems

Under healthy conditions, steering traffic to a nearby PoP is relatively straightforward. During incidents, several harder questions appear:

should one overloaded PoP shed traffic to a farther PoP
should one region bypass a degraded shield
should one origin pool receive less traffic even if it remains technically reachable
should DNS steering move new sessions while anycast still handles existing ones

The answer depends on what is failing:

PoP Capacity Issue

Traffic may need to shift to a neighbouring city.

Shield Cluster Problem

Edges may need a different parent path.

Origin Problem

The CDN may need more aggressive stale serving, tighter request collapsing, or stricter WAF behaviour.

This is one reason very large CDNs invest so heavily in control planes and internal observability. At scale, the hard part is not serving static files from cache. The hard part is moving traffic gracefully when some part of the global system is degraded without causing a worse secondary problem elsewhere.

For users, a good incident looks invisible. For operators, that invisible experience often depends on a large amount of live traffic engineering at the edge.
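
The PoP capacity case can be reduced to a small selection rule: prefer the nearest healthy PoP with headroom, and fall back to the nearest healthy PoP if every candidate is hot. This is a toy model under stated assumptions. The `PopStatus` fields, the 0.85 threshold, and the PoP names are invented for illustration, and real steering also has to dampen oscillation between sites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopStatus:
    name: str
    rtt_ms: float
    utilisation: float  # fraction of capacity in use, 0.0 to 1.0
    healthy: bool = True


def choose_pop(candidates: list, max_utilisation: float = 0.85) -> PopStatus:
    """Pick the lowest-latency PoP that is healthy and has headroom."""
    healthy = [pop for pop in candidates if pop.healthy]
    if not healthy:
        raise RuntimeError("no healthy PoP available")
    with_headroom = [pop for pop in healthy if pop.utilisation < max_utilisation]
    # If everything is overloaded, distance is the only signal left.
    pool = with_headroom or healthy
    return min(pool, key=lambda pop: pop.rtt_ms)


normal = choose_pop([
    PopStatus("ath", rtt_ms=8, utilisation=0.40),
    PopStatus("sof", rtt_ms=14, utilisation=0.55),
])
shedding = choose_pop([
    PopStatus("ath", rtt_ms=8, utilisation=0.97),
    PopStatus("sof", rtt_ms=14, utilisation=0.55),
    PopStatus("fra", rtt_ms=38, utilisation=0.40),
])
```

In the second call Athens is nearest but saturated, so new traffic moves to Sofia rather than jumping straight to the least loaded site in Frankfurt.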

CDN Security Layers Work Best When They Are Close to Normal Request Semantics

One reason CDNs became attractive security platforms is that they see traffic after TLS termination but before origin execution. That position lets them reason about:

HTTP methods
headers
hostnames
paths
request rates
cookies and tokens, if policy allows

This is more useful than filtering only on raw packets because the edge can distinguish:

one legitimate GET /docs/http2
one flood of malformed requests
one credential stuffing burst
one scraper cycling through catalogue URLs too quickly

The security actions may include:

rate limiting
challenge pages
bot scoring
request body limits
header sanitisation
geographic restrictions

The important architectural point is that these protections live in the same place as caching and request steering. That creates leverage. If the edge already handles the first HTTP hop, it can often reject hostile or wasteful traffic before the origin spends any compute on it.
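
Rate limiting is the simplest of those actions to sketch. The token bucket below is one common technique, shown here as an assumption rather than a description of how any specific CDN implements it. The clock is passed in explicitly so the behaviour is deterministic, and a real edge would keep one bucket per client key.

```python
class TokenBucket:
    """Allow short bursts while capping the sustained request rate."""

    def __init__(self, rate_per_second: float, burst: int) -> None:
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.updated_at = 0.0

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate_per_second=2, burst=3)

# Five requests arrive in the same instant: the burst allowance covers three.
burst_decisions = [bucket.allow(now=0.0) for _ in range(5)]

# One second later the bucket has refilled by two tokens.
later_decisions = [bucket.allow(now=1.0) for _ in range(3)]
```

Rejected requests never leave the edge, so the origin spends nothing on the part of the burst that exceeds the allowance.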

This is also why origin bypass is such a serious mistake. The moment traffic can skip the edge, the operator loses not only caching, but also the normalisation and filtering layer that made the public interface manageable in the first place.

CDN Adoption Usually Changes Deployment Practice, Not Just Runtime Performance

A team that moves behind a CDN often finds that deployments start behaving differently:

cacheable assets can be pushed with long immutable lifetimes
HTML invalidation becomes a planned workflow
certificate rollout may move from the origin to the edge platform
traffic spikes after deploy no longer hit the origin in the same way
request logs become split between edge and backend views

This changes operational rituals. For example, a safe deploy checklist may now include:

publish versioned static assets
validate edge cache headers on a staging hostname
confirm purge tags or HTML invalidation rules
monitor edge hit ratio and origin request rate after release
roll back by restoring the previous HTML and purging the affected scope if necessary

Without the CDN, deployment may have been mostly about the origin serving the right files. With the CDN, deployment is partly about managing global cached state. Teams that understand this early usually have calmer release processes than teams that treat the edge as an opaque acceleration box.

CDN selection therefore ends up being partly a product decision. Different platforms vary in:

geographic reach
cache and purge tooling
security depth
edge programmability
observability quality

Because the edge becomes part of the application's public boundary, those differences shape how comfortably the rest of the system can evolve.

Another quiet consequence is that teams start designing with edge capability in mind from the beginning. A feature that is cheap when decided at the edge, for example a redirect, cache tag, image variant, or request challenge, may be much more expensive if pushed all the way back to the origin on every request. That architectural pull is one of the reasons CDNs kept expanding their product surface over time.

CDN architecture discussions therefore quickly become wider than "how fast is the cache". The edge influences rollout safety, security posture, observability, and even which application features are economical to implement cleanly.

That broader influence is one reason CDNs became central infrastructure rather than optional acceleration. Once the edge owns enough of the public request path, it inevitably shapes how the whole application is built and operated.

The Useful Mental Model Is "A Distributed Traffic and State System"

If you keep one mental model, use this one: a CDN is a distributed traffic and state system that moves the public edge of the application closer to the user, absorbs repeated work through hierarchical caching, and protects the origin by reducing distance, duplication, and exposure.

That explains why the stack includes:

anycast for nearby ingress
DNS for service steering
cache keys for object identity
TTLs and validators for freshness
shield layers for origin protection
purge propagation for correctness
TLS termination for protocol control
edge compute for early decisions

Calling all of that "a cache near the user" is not wrong, but it hides the real engineering. The CDN is where routing, HTTP semantics, TLS, storage policy, and traffic defence meet.

That is how CDNs actually work.