How CDNs Actually Work
Most people learn that a content delivery network is "a cache near the user". That is true in the same way that calling an airport "a car park for planes" is technically true. It points in roughly the right direction and leaves out almost everything that matters.
A modern CDN is not one cache. It is a globally distributed traffic system that combines anycast routing, DNS steering, TLS termination, request normalisation, hierarchical caching, origin shielding, DDoS filtering, and increasingly edge compute. When someone in Athens loads a site accelerated by a major CDN, the path usually touches a nearby point of presence, not the origin server sitting in Dublin, Frankfurt, or a cloud region elsewhere. If the object is cached locally, the edge responds immediately. If it is not, the edge may fetch from a regional shield cache rather than from the origin directly. If the origin is under stress, the CDN may collapse duplicate requests, serve stale content briefly, or block abusive traffic at the edge before the origin sees it.
That stack exists because web performance and origin survivability are tightly linked. A user does not care whether a page was slow because of RTT, TCP slow start, packet loss on a transcontinental path, a congested origin, or a thundering herd of cache misses. The CDN exists to remove as many of those penalties as possible before they become user-visible.
This article looks at the mechanics under the hood: how anycast gets the user to a nearby edge, how cache keys and freshness rules decide whether the edge can answer, how tiered cache hierarchies protect the origin, how purges propagate globally, how TLS and HTTP versions terminate at the edge, and why modern CDNs have become programmable network platforms rather than static asset boxes.
The First Job of a CDN Is to Move the First Hop Closer to the User
The simplest win a CDN can deliver is geographic and topological proximity. If the origin is in Frankfurt and the user is in Athens, then without a CDN the browser pays the round-trip time (RTT) to Frankfurt for the first handshake and for every object fetch. Even on a good European backbone, that is slower than talking to an edge in Athens or Sofia.
The speedup matters because modern web latency compounds quickly:
- DNS lookup
- TCP handshake
- TLS handshake
- request transmission
- response first byte
If the browser talks to a nearby edge, the early round trips are shorter. TCP slow start also becomes less painful: the congestion window grows once per round trip, so a connection over a shorter path ramps up sooner. This is why CDNs improve not only throughput, but also time to first byte and page interactivity.
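The compounding effect is easy to see with rough arithmetic. The sketch below is illustrative only: the RTT values, the fixed DNS cost, and the count of three RTT-dependent round trips (TCP handshake, a one-round-trip TLS handshake, request and first byte) are assumptions, not measurements.

```python
# Assumed numbers for illustration only, not measurements.
DNS_MS = 10.0                  # resolver lookup, assumed identical in both cases
RTT_DEPENDENT_ROUND_TRIPS = 3  # TCP handshake + TLS handshake + request/first byte


def time_to_first_byte_ms(rtt_ms, server_think_ms=0.0):
    """Estimate time to first byte on a cold connection."""
    return DNS_MS + RTT_DEPENDENT_ROUND_TRIPS * rtt_ms + server_think_ms


origin_ms = time_to_first_byte_ms(rtt_ms=40.0)  # assumed Athens -> Frankfurt RTT
edge_ms = time_to_first_byte_ms(rtt_ms=5.0)     # assumed Athens -> nearby edge RTT
print(f"direct to origin: {origin_ms:.0f} ms, via nearby edge: {edge_ms:.0f} ms")
```

Every extra round trip multiplies the distance penalty, which is why shortening the first hop pays off more than once per connection.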
The improvement is often easiest to see with static assets. If a stylesheet and hero image are cached in an edge PoP close to the user, those objects no longer depend on origin distance at all for cache hits. The origin becomes a background refill source rather than the live responder for every request.
This is the operational difference between a site hosted "in Europe" and a site delivered "from the edge". Distance still exists. The CDN simply moves the expensive part of the interaction closer to the user.
Anycast Is How One IP Address Exists in Many Places at Once
Most large CDNs use anycast for their front door. The same IP prefix is advertised from many PoPs, and global BGP routing sends each client to the topologically closest or otherwise preferred location according to the network path.
This is the reason a single service IP can land one user in London, another in Frankfurt, and another in Madrid without any application-level redirect. The routing system does the steering.
A simplified model:
CDN advertises 203.0.113.0/24 from:
Athens
Frankfurt
London
Paris
ISP in Greece sees Athens as best path
ISP in Germany sees Frankfurt as best path
ISP in the UK sees London as best path
From the user's perspective, the hostname resolves to an IP and the TCP handshake just happens. From the CDN's perspective, that one IP is simultaneously present in many cities.
Anycast is not perfect. BGP follows policy and topology, not geography alone. A user in Thessaloniki might land in Athens, Sofia, or even Frankfurt depending on peering. But the general effect is strong: the first network hop into the CDN is much closer than the origin usually is.
This also helps resilience. If one PoP withdraws the route, traffic shifts to other PoPs without changing the client-visible address. The failure domain becomes smaller and easier to mask.
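A toy model makes the "policy, not geography" point concrete. The sketch below uses only two of the real BGP tie-breakers (local preference, then AS path length), and the route values for the Thessaloniki ISP are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    pop: str          # PoP announcing the anycast prefix
    local_pref: int   # ISP policy: higher is preferred
    as_path_len: int  # number of AS hops: shorter is preferred


def best_route(routes):
    """Prefer highest local preference, then shortest AS path (simplified)."""
    return min(routes, key=lambda r: (-r.local_pref, r.as_path_len))


def best_after_withdrawal(routes, withdrawn_pop):
    """Re-run selection once one PoP stops announcing the prefix."""
    return best_route([r for r in routes if r.pop != withdrawn_pop])


# Hypothetical view from one ISP in Thessaloniki.
thessaloniki_routes = [
    Route("Athens", local_pref=100, as_path_len=3),
    Route("Sofia", local_pref=100, as_path_len=2),
    Route("Frankfurt", local_pref=200, as_path_len=4),  # preferred peering link
]

print(best_route(thessaloniki_routes).pop)
print(best_after_withdrawal(thessaloniki_routes, "Frankfurt").pop)
```

In this made-up table the ISP's policy sends traffic to Frankfurt even though Athens is geographically closer, and a withdrawal moves traffic to Sofia without the client-visible address changing.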
DNS and Anycast Work Together, Not as Rivals
People sometimes talk as if CDNs either use DNS steering or anycast. Large deployments often use both, but at different layers.
DNS decides which CDN hostname or service address the browser receives. Anycast decides which physical PoP handles traffic for that address. A typical flow looks like this:
- www.example.eu is a CNAME to a CDN-managed hostname
- the CDN returns an address for the edge service
- that address is anycast from many PoPs
- BGP delivers the user to the nearest or best-reachable edge
DNS still matters because:
- different products may resolve to different anycast services
- the CDN may steer by geography, load, or customer policy
- TTLs influence how quickly steering changes propagate
- some vendors use DNS more heavily for regional segmentation
Anycast handles the fast path for packets once the browser has the address. DNS handles naming, product selection, and some policy before the packets ever leave the client.
In practical operations, this means two control planes are in play:
- the DNS control plane, which maps the hostname to an edge service
- the BGP control plane, which maps the edge service to a nearby PoP
Understanding both is essential when troubleshooting. A request can be "on the CDN" and still land in an unexpected city because the anycast path differs from what someone expected from the DNS answer alone.
A Cache Hit Depends on the Cache Key, Not Just the URL
At the heart of a CDN is a cache, but a cache is only as good as its key. The key decides whether two requests are considered equivalent.
The naïve cache key is:
scheme + host + path
Real caches often need more nuance. Depending on the application, the key may also include:
- selected query parameters
- Accept-Encoding
- device or image format variations
- language selection
- specific cookies, or no cookies at all
If the cache key is too broad, the CDN serves the wrong object to some users. If it is too narrow, cache hit ratio collapses and the origin sees unnecessary traffic.
Example:
GET /avatar?id=42&size=small
GET /avatar?id=42&size=large
If the CDN ignores size, the responses collide incorrectly. If it includes every random tracking query string, the cache fragments and becomes ineffective.
This is one reason CDN tuning is rarely "set and forget". The edge is only as useful as the cache key design, and the key must match the application's variation model exactly.
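One way to picture a tuned key is as a function that keeps only the inputs that change the response. The sketch below is a minimal illustration, not any vendor's algorithm: the `allowed_params` allowlist and the two-bucket encoding rule are assumptions chosen for the avatar example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, accept_encoding="", allowed_params=("id", "size")):
    """Build a cache key from the parts of a request that affect the response."""
    parts = urlsplit(url)
    # Keep only allowlisted parameters and sort them so order does not matter.
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in allowed_params
    )
    # Simplified: bucket Accept-Encoding into two variants instead of using
    # the raw header, which differs between browsers.
    encoding = "gzip" if "gzip" in accept_encoding.lower() else "identity"
    host = parts.netloc.lower()
    return f"{parts.scheme}://{host}{parts.path}?{urlencode(kept)}|enc={encoding}"


small = cache_key("https://www.example.eu/avatar?id=42&size=small")
large = cache_key("https://www.example.eu/avatar?id=42&size=large")
tracked = cache_key("https://www.example.eu/avatar?size=small&utm_source=mail&id=42")
print(small != large, small == tracked)
```

The two sizes get distinct keys, while the tracking parameter and the parameter order do not fragment the cache.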
Freshness Rules Are Policy, Not Guesswork
Once the cache key identifies an object, the next question is whether the cached copy is still fresh. CDNs rely on explicit cache policy, mainly from HTTP headers such as:
Cache-Control: public, max-age=600, stale-while-revalidate=30
ETag: "build-4821"
Last-Modified: Tue, 21 Apr 2026 10:00:00 GMT
Common patterns include:
- max-age for browser and edge freshness
- s-maxage for shared caches specifically
- stale-while-revalidate for serving slightly old content during background refresh
- stale-if-error for origin failure resilience
- validators such as ETag and Last-Modified for conditional revalidation
Freshness control is where performance and correctness meet. A long TTL boosts hit ratio and protects the origin, but makes updates harder. A short TTL improves freshness but increases origin traffic and RTT exposure.
The usual answer is not one universal TTL. Static versioned assets like /app.4f92c1.js can be cached for months because their names change on deployment. HTML usually needs much shorter lifetimes because it controls which assets the browser discovers next.
This is why strong deployment pipelines pair CDN policy with asset naming strategy. Versioned immutable files and short-lived HTML are a far more stable combination than trying to purge everything on every deploy.
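The freshness decision itself is simple once the directives are parsed. The sketch below is a simplified reading of the header shown earlier in this section; it handles only integer directives and ignores details such as the Age header, `no-cache`, and `private`.

```python
def parse_cache_control(header):
    """Parse 'name=value' directives; valueless directives map to None."""
    directives = {}
    for part in header.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = int(value) if value.isdigit() else None
    return directives


def freshness_state(age_s, header):
    """Classify a cached object as seen by a shared cache such as a CDN edge."""
    cc = parse_cache_control(header)
    ttl = cc.get("s-maxage")  # shared caches prefer s-maxage when present
    if ttl is None:
        ttl = cc.get("max-age") or 0
    swr = cc.get("stale-while-revalidate") or 0
    if age_s < ttl:
        return "fresh"
    if age_s < ttl + swr:
        return "stale-while-revalidate"  # serve now, refresh in the background
    return "stale"


policy = "public, max-age=600, stale-while-revalidate=30"
for age in (100, 610, 700):
    print(age, freshness_state(age, policy))
```

With that policy, an object aged 610 seconds can still be served immediately while the edge refreshes it; at 700 seconds it has to be revalidated or refetched first.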
The Edge Cache Is Often Only the First Layer
Large CDNs rarely let every edge miss hit the origin directly. They use cache hierarchies.
The basic model is:
- edge cache close to the user
- regional or shield cache behind the edge
- origin behind the shield
Suppose an object is requested from PoPs in Athens, Vienna, and Prague almost simultaneously. If every miss went straight to origin, the origin would receive three fetches immediately. With shielding, the edge PoPs ask a regional parent cache first. The parent may collapse those requests so only one origin fetch occurs.
This protects the origin from two common problems:
- request amplification during global cache cold start
- thundering herd on popular objects after expiry
A simplified path for a miss:
Client in Athens
-> Athens edge
-> Frankfurt shield
-> origin in Dublin
The second request from another nearby edge may stop at the Frankfurt shield and never touch Dublin at all.
Shielding is especially useful for dynamic but cacheable content, large media objects, and origins with limited concurrency. It turns the CDN from a flat cache fleet into a layered request-absorption system.
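The layering can be sketched as caches that fall back to a parent. This is a deliberately tiny model: the class names are made up, and it ignores TTLs, eviction, and concurrency.

```python
class Origin:
    """Stand-in origin that counts how many fetches reach it."""

    def __init__(self):
        self.fetches = 0

    def get(self, key):
        self.fetches += 1
        return f"body-of-{key}"


class Cache:
    """A cache tier that asks its parent (shield or origin) on a miss."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent
        self.store = {}

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.parent.get(key)
        return self.store[key]


origin = Origin()
shield = Cache("frankfurt-shield", parent=origin)
edges = [Cache(city, parent=shield) for city in ("athens", "vienna", "prague")]

for edge in edges:
    edge.get("/app.4f92c1.js")

print("origin fetches:", origin.fetches)
```

Three edge misses produce a single origin fetch because the second and third edges are answered by the shield.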
Request Collapsing Prevents Many Identical Misses from Smashing the Origin
Imagine one HTML page references a new JavaScript bundle. At 09:00, thousands of users request that bundle, but it has just expired or been purged. If every edge thread fetched it independently, the CDN would behave like a fan-out amplifier against the origin.
Request collapsing, also called collapsed forwarding, avoids that. The first miss for a cache key becomes the active origin fetch. Later identical misses wait on that fetch instead of starting their own. Once the object arrives, all waiters are satisfied from the same response.
This matters under load spikes:
- product launches
- breaking news
- software update releases
- major sports events
The feature is easy to describe and incredibly valuable in practice. Many origin outages are not caused by steady traffic. They are caused by miss amplification after expiry or purge. A good CDN should absorb that pattern gracefully.
This is why CDN architecture is partly about latency and partly about origin economics. The better the edge fleet is at deduplicating work, the smaller and calmer the origin can be.
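Request collapsing can be sketched with a shared future per cache key. The class below is a hypothetical illustration, not a production design: it has no timeouts, and the `collapsed` counter exists only so the demo can hold the origin fetch open until the other clients have queued behind it.

```python
import threading
import time
from concurrent.futures import Future, ThreadPoolExecutor


class CollapsingFetcher:
    """Lets the first miss per key fetch from origin while later misses wait."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._lock = threading.Lock()
        self._in_flight = {}  # cache key -> Future shared by all waiters
        self.collapsed = 0    # requests that joined an existing fetch

    def get(self, key):
        with self._lock:
            future = self._in_flight.get(key)
            is_leader = future is None
            if is_leader:
                future = Future()
                self._in_flight[key] = future
            else:
                self.collapsed += 1
        if is_leader:
            try:
                future.set_result(self._fetch(key))
            except Exception as exc:  # waiters see the same failure
                future.set_exception(exc)
            finally:
                with self._lock:
                    del self._in_flight[key]
        return future.result()


def demo(clients=8):
    origin_calls = []

    def slow_origin(key):
        origin_calls.append(key)
        # Stand-in for a slow origin: wait until the other clients are queued.
        deadline = time.monotonic() + 5.0
        while fetcher.collapsed < clients - 1 and time.monotonic() < deadline:
            time.sleep(0.001)
        return f"body-of-{key}"

    fetcher = CollapsingFetcher(slow_origin)
    with ThreadPoolExecutor(max_workers=clients) as pool:
        bodies = list(pool.map(fetcher.get, ["/app.4f92c1.js"] * clients))
    return len(origin_calls), bodies


print(demo()[0], "origin fetch for 8 concurrent misses")
```

The first caller becomes the active fetch and everyone else waits on its result, so the origin sees one request rather than eight.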
Purging Is a Distributed Invalidation Problem
Caching is easy compared with invalidation. The hard part is making stale content disappear everywhere when it must.
CDNs generally support two broad strategies:
Versioned Objects
Static assets get new filenames on each deploy. Old versions can remain cached until they age out naturally. This is the safest and cheapest invalidation strategy.
Active Purge
The operator explicitly tells the CDN to remove an object, a path prefix, a tag group, or an entire hostname's cache contents.
Global purging is hard because the CDN is distributed. The invalidation must propagate to many PoPs, many cache nodes, and often several cache layers. The control plane needs to be fast enough that users stop seeing stale content quickly, but robust enough that one slow PoP does not leave inconsistent state for too long.
This is why serious CDN platforms invested heavily in internal pub/sub systems for invalidation. Purge latency is not a minor feature. It decides how quickly broken HTML, outdated JavaScript, or incorrect API responses disappear from user view.
Operationally, fast purge matters most when:
- a bad deploy must be rolled back
- confidential content was cached accidentally
- pricing or legal text changed and must update quickly
- a CMS publishes frequently changing pages
A CDN with excellent hit ratio but slow invalidation will still create painful incidents. Freshness control and purge propagation are as important as cache fill.
TLS Usually Terminates at the Edge, Which Changes Everything Behind It
When a browser connects to a CDN, the TLS handshake usually terminates at the edge PoP. The edge presents the certificate, negotiates ALPN, and handles HTTP/2 or HTTP/3 for the client. Behind the edge, the CDN may:
- fetch from origin over TLS
- fetch from origin over plain HTTP on a private network
- reuse a smaller pool of long-lived origin connections
- speak a different HTTP version to the origin
This has several consequences:
Performance
The expensive handshake happens close to the user. That cuts setup latency.
Security
The CDN sits in the trust path. It can inspect headers, apply WAF rules, normalise requests, and cache responses because it terminates encryption.
Operational Simplification
Origins often handle fewer direct client connections. The CDN becomes the public transport layer and security perimeter.
Protocol Translation
The browser might speak HTTP/3 to the edge while the edge speaks HTTP/2 or HTTP/1.1 to the origin.
This is why "supports HTTP/3" often means "supports HTTP/3 at the edge". The origin does not need to speak QUIC for the user to get the benefit. The CDN absorbs the complexity.
Dynamic Acceleration Is About Connection Reuse and Path Optimisation, Not Magical Caching
Modern CDNs do more than cache static files. They also accelerate dynamic requests that cannot be cached safely.
They do this through mechanisms such as:
- keeping hot origin connections open
- reducing handshake repetition
- optimising congestion and retransmission on long-haul paths
- choosing good backbone routes between edge and origin
- normalising headers and buffering uploads sensibly
For a user in Lisbon hitting an origin in Warsaw, the edge may accept the request locally, then forward it over the CDN's own backbone or well-engineered transit path to the origin. Even if the response is fully dynamic and not cached, the browser still avoids a long direct handshake to the origin.
This is one reason CDNs evolved from "static content networks" into general application delivery platforms. Once the edge is already the user's first stop, it can improve more than cache hits.
Edge Compute Turned the CDN into a Programmable Runtime
The latest phase of CDN evolution is not just better caching. It is programmable edge execution. Instead of only serving or fetching objects, the edge can now run logic such as:
- URL rewrites
- bot checks
- header manipulation
- A/B experiment assignment
- image transformation
- lightweight personalisation
- access control decisions
Platforms expose this through JavaScript isolates, WebAssembly, or proprietary workers models. The reason this became attractive is architectural. The CDN already sees every request very early. If a cheap decision can be made at the edge, the origin avoids that work entirely.
Examples:
- reject abusive requests before they touch origin
- resize images at the edge rather than storing many variants
- choose the nearest API region from request metadata
- inject security headers consistently for all responses
This changes the CDN from "cache in front of the origin" to "programmable control plane for request handling".
The tradeoff is complexity. Once logic exists at the edge, debugging and consistency become harder. The operator now has to reason about code running in hundreds of PoPs, distributed rollouts, and interactions between cache state and runtime behaviour.
CDNs Protect the Origin by Filtering, Rate Limiting, and Absorbing Traffic Spikes
Performance is the public story. Origin protection is just as important.
The edge can:
- terminate floods close to ingress
- challenge suspicious clients
- rate-limit abusive paths
- cache popular responses to reduce backend load
- hide origin IP addresses from casual scanners
This is why CDNs are often the first line of defence for DDoS mitigation. Large anycast fleets spread attack traffic across many PoPs, and edge filtering reduces the volume that reaches the customer's infrastructure.
The point is not that the CDN makes the origin invincible. The point is that it moves the fight outward. Instead of one application cluster absorbing every request directly, a global edge network absorbs and filters a large share first.
From an infrastructure perspective, that is often the difference between "origin overloaded instantly" and "incident contained at the edge".
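Rate limiting at the edge is often explained with a token bucket. The sketch below is one common way to express the idea, not a description of how any particular CDN implements it; the rate and burst values are arbitrary, and the caller passes the clock in so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allows short bursts while capping the sustained request rate."""

    def __init__(self, rate_per_s, burst, now=0.0):
        self.rate = rate_per_s
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.updated = now

    def allow(self, now):
        """Return True if a request arriving at time `now` may pass."""
        elapsed = now - self.updated
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical per-client limit: 1 request per second with a burst of 3.
bucket = TokenBucket(rate_per_s=1.0, burst=3)
decisions = [bucket.allow(now=0.0) for _ in range(4)]
decisions.append(bucket.allow(now=1.0))
print(decisions)
```

The fourth request in the same instant is rejected at the edge, and one more is admitted after a second of refill, so the origin never sees the excess.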
Bad Cache Policy Can Make a CDN Worse Than Having None
CDNs fail badly when operators assume they are simple.
Common mistakes include:
- caching HTML too aggressively and serving stale deployments
- including volatile query strings in the cache key unnecessarily
- varying on cookies that do not actually affect content
- failing to version static assets
- purging too often instead of using immutable asset names
- exposing private origin behaviour through inconsistent edge rules
A misconfigured CDN can cause:
- low hit ratio and high cost
- stale content after deploys
- origin overload from cache fragmentation
- broken localisation
- accidental caching of user-specific responses
The presence of an edge fleet does not guarantee good results. The application's cache semantics still need to be designed carefully. A CDN amplifies both good and bad policy.
A Cache Miss Is Still a Structured Workflow, Not a Blind Origin Fetch
When an edge does not have a fresh object, several decisions still happen before the origin is touched.
The edge may check:
- whether the object is stale but still temporarily serveable
- whether revalidation with If-None-Match or If-Modified-Since is possible
- whether another edge request for the same object is already in flight
- whether the request should go to a shield cache first
- whether the origin is currently marked degraded
That means a miss path often looks like this:
- client asks Athens edge for /app.4f92c1.js
- edge checks local cache and freshness metadata
- edge sees object expired but has an ETag
- edge asks Frankfurt shield, which may already have a fresher copy
- if needed, shield revalidates with origin in Dublin
- origin returns 304 Not Modified or a new object body
- shield updates its metadata
- edge updates its local cache and answers the client
If the origin returns 304, the data path is cheap because the old body remains valid. If the origin returns a new object, the CDN must replace stored state and may need to stream the new bytes to many waiting clients.
This is why validators matter so much. A CDN without strong revalidation metadata has to choose between:
- shorter TTLs and more expensive full refetches
- longer TTLs and more stale-risk
With ETag or Last-Modified, the edge can ask a much cheaper question: "has this changed?" rather than "please send the whole object again." For frequently accessed HTML or API responses that are cacheable for short periods, this can be the difference between a healthy origin and one buried under avoidable refill traffic.
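The cost difference between a 304 and a full refetch can be shown with a small model of conditional revalidation. The classes below are illustrative stand-ins with invented names; a real edge would send `If-None-Match` over HTTP and also refresh the stored freshness metadata.

```python
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    etag: str
    body: bytes = b""


@dataclass
class CachedObject:
    etag: str
    body: bytes


class Origin:
    """Stand-in origin that answers conditional requests by ETag."""

    def __init__(self, etag, body):
        self.etag, self.body = etag, body

    def deploy(self, etag, body):
        self.etag, self.body = etag, body

    def get(self, if_none_match=None):
        if if_none_match == self.etag:
            return Response(304, self.etag)  # no body on Not Modified
        return Response(200, self.etag, self.body)


def revalidate(entry, origin):
    """Return the entry to keep and the number of body bytes transferred."""
    response = origin.get(if_none_match=entry.etag)
    if response.status == 304:
        return entry, 0
    return CachedObject(response.etag, response.body), len(response.body)


origin = Origin('"build-4821"', b"x" * 1000)
entry = CachedObject('"build-4821"', b"x" * 1000)
print(revalidate(entry, origin)[1])  # unchanged object
origin.deploy('"build-4822"', b"y" * 1200)
print(revalidate(entry, origin)[1])  # changed object
```

When the validator still matches, the edge keeps its stored body and transfers nothing; only a real change costs a full body transfer.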
Large Objects and Range Requests Change Cache Behaviour Substantially
Static websites are the simple case. Large media objects, software installers, and video segments create a different class of CDN problem.
If a client asks for part of a large object using a byte range, the CDN needs policy for whether it:
- caches the whole object
- caches only requested ranges
- forwards the range upstream each time
- coalesces ranges into larger cached chunks
Video delivery is the most common example. Adaptive bitrate streaming usually splits media into short segments, which is cache-friendly. Large downloads and scrubbing inside media players are less tidy.
Suppose a user requests:
Range: bytes=1048576-2097151
The edge then has to decide whether fetching only that range from origin is better than pulling more of the file and storing it for later use. The answer depends on:
- object size
- request popularity
- storage cost
- expected reuse pattern
- origin bandwidth constraints
This is one reason CDNs are not generic transparent proxies. They embody policy about what kinds of traffic are worth storing and how aggressively to do it. The ideal policy for a 20 KB CSS file is not the ideal policy for a 4 GB software image or a long video archive.
Operators also have to think about egress economics. A range-heavy workload can produce surprising origin amplification if the CDN is configured poorly. Good large-object handling is therefore part performance engineering and part cost control.
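One of the options listed above, coalescing ranges into larger cached chunks, comes down to a little integer arithmetic. The sketch assumes fixed 1 MiB chunks, which is an arbitrary choice for illustration rather than a standard value.

```python
CHUNK_SIZE = 1024 * 1024  # assumed chunk size of 1 MiB


def chunks_for_range(first_byte, last_byte, chunk_size=CHUNK_SIZE):
    """Return the indexes of the chunks covering an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    return list(range(first_byte // chunk_size, last_byte // chunk_size + 1))


def upstream_range(chunk_index, chunk_size=CHUNK_SIZE):
    """Build the Range header value used to fetch one whole chunk upstream."""
    start = chunk_index * chunk_size
    return f"bytes={start}-{start + chunk_size - 1}"


# The request from the example maps exactly onto chunk 1.
print(chunks_for_range(1048576, 2097151))
# A range that straddles a boundary needs two chunks.
print(chunks_for_range(1000000, 1100000))
print(upstream_range(1))
```

Aligning upstream fetches to chunk boundaries means overlapping client ranges reuse the same cached pieces instead of each triggering its own origin request.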
Origin Shielding Is Also About Failure Containment
Shield caches are often described as an origin-protection feature for cache misses, but their role in incidents is just as important.
Imagine the origin begins returning elevated latency because a database pool is saturated. Without a shield, dozens of edge PoPs may all continue probing or refetching independently. With a shield, many of those decisions collapse into one regional layer. That reduces the number of actors touching the failing origin and makes backoff logic more effective.
This is why shielding helps in at least three ways:
- fewer duplicate miss fetches
- easier origin rate limiting and retry control
- smaller blast radius when origin health degrades
In mature CDN designs, the shield becomes the main consumer of origin capacity and the edges become consumers of shield capacity. That hierarchy gives operators more control over failure behaviour. They can tune:
- timeouts between edge and shield
- timeouts between shield and origin
- retry budgets
- stale-on-error behaviour
- origin circuit breaking
At large scale, these controls matter more than the simplistic hit-ratio story. A CDN earns its place not only when everything is healthy, but when one dependency becomes slow and the edge network still protects users from seeing the full impact immediately.
Multi-CDN Strategies Exist Because One Edge Fleet Is Not the Whole Internet
Large organisations sometimes use more than one CDN. The reason is not fashion. It is risk management and performance diversity.
A multi-CDN design can help with:
- resilience against one provider outage
- regional performance optimisation
- pricing leverage
- product specialisation, for example one CDN for video and another for API acceleration
But it also introduces complexity:
- cache state is no longer shared
- purges must hit more than one control plane
- logs and request IDs split across vendors
- DNS steering becomes more complicated
- certificate deployment and origin access control must stay consistent
In a failover event, the second CDN may have colder caches and a different anycast footprint. That means "failover works" is not enough. The operator also needs to know what user performance looks like during failover, whether purges remain coherent, and whether the origin can survive the changed miss pattern.
This is another reminder that CDN architecture is traffic engineering, not merely caching. The operator is choosing how the public edge of the application should exist across the internet's real topology, commercial relationships, and failure modes.
Logs and Request IDs Matter Because the Edge Adds Another Whole System Layer
Once a CDN sits in front of the origin, one user request often generates multiple internal events:
- browser to edge request
- edge cache decision
- possible shield request
- possible origin request
- possible revalidation
- possible WAF decision
If those layers cannot be correlated, debugging becomes painful.
Good CDN operations rely on:
- per-request identifiers
- cache status fields such as hit, miss, stale, revalidated
- PoP or colo identifiers
- origin timing metrics
- purge event logs
A useful access log line often needs to say more than "status 200". It should also say:
- where the request landed
- whether it was a hit
- whether the response came from shield or origin
- how long origin fetch took if used
Without that, teams end up arguing from incomplete evidence. The browser says the site was slow. The origin says its own CPU was fine. The missing truth may be that one regional edge cluster had poor cache key design and kept missing unnecessarily.
The better the request correlation, the easier it is to separate:
- edge latency
- routing issues
- shield pressure
- origin slowness
- cache-policy mistakes
This is why observability is part of CDN architecture, not an optional dashboard layer.
CDNs Also Change the Economics of Application Design
One under-discussed reason CDNs became so dominant is that they reshape what the origin needs to be good at.
If the edge handles:
- most static delivery
- much of TLS termination
- some rate limiting
- some request normalisation
- some image transformation
then the origin can focus more narrowly on:
- correctness
- business logic
- private data access
- cacheable response generation where appropriate
That changes cost structure. Instead of scaling the origin for every single request from every geography, teams can scale the origin for:
- cache misses
- personalised traffic
- writes
- low-hit or uncacheable reads
This is one reason application teams tolerate CDN complexity. The alternative is often to push more global delivery, DDoS resilience, and TLS edge engineering back into the application stack itself. For most organisations, that is slower and more expensive.
A team therefore rarely chooses "a CDN" in the abstract. It chooses a mix of:
- reach
- purge model
- cache tooling
- security controls
- edge programmability
That mix then shapes how comfortably the rest of the platform can evolve.
Request Normalisation and Cache Poisoning Defence Are Part of the Edge Job
Caching only works safely if semantically equivalent requests are treated consistently and semantically dangerous ambiguity is reduced. That means a CDN often normalises parts of the request before caching or forwarding it.
Examples include:
- lowercasing or otherwise canonicalising certain headers before they are used in cache decisions
- deciding whether query parameter order matters
- choosing whether duplicate headers are legal or suspicious
- collapsing repeated slashes or dot segments in URLs if policy allows
- stripping tracking parameters from the cache key
This matters for performance, but it also matters for security. Ambiguous request interpretation between:
- browser and edge
- edge and origin
- cache key logic and origin routing
can create cache poisoning opportunities. If one layer thinks two requests are equivalent and another does not, the wrong response may be stored or reused.
Mature CDNs invest heavily in canonicalisation and parser consistency for that reason. The edge is not only trying to answer quickly. It is trying to ensure that what it caches is the object the origin really intended for that key. A cache that can be poisoned is worse than a cache miss.
In production, this often means performance and security teams end up working on the same controls. Header normalisation, query string policy, and method restrictions are not isolated concerns. They shape both hit ratio and correctness.
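Path canonicalisation is one concrete case. The function below is a simplified sketch: it collapses repeated slashes and resolves dot segments, but it does not handle percent-encoding, which a real edge also has to treat consistently with the origin.

```python
def normalise_path(path):
    """Collapse repeated slashes and resolve '.' and '..' segments."""
    segments = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue
        if segment == "..":
            if segments:
                segments.pop()  # never climb above the root
            continue
        segments.append(segment)
    normalised = "/" + "/".join(segments)
    # Preserve a trailing slash, since it can select a different resource.
    if path.endswith("/") and normalised != "/":
        normalised += "/"
    return normalised


for raw in ("/a//b/./c/../d", "//images///logo.png", "/../../etc/"):
    print(raw, "->", normalise_path(raw))
```

If the edge keys its cache on the normalised form, the origin must route on the same form; otherwise the two layers disagree about which requests are equivalent, which is exactly the ambiguity that poisoning attacks exploit.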
Serving Stale on Error Is One of the Quietest and Most Valuable Features
One of the most useful CDN behaviours is the ability to serve stale content temporarily when the origin is failing. This is often controlled by policy such as stale-if-error, platform-specific edge rules, or emergency operator overrides.
Why is it so valuable?
Because a slightly outdated page is often much better than a hard outage.
Consider a news front page cached for 60 seconds. The origin starts timing out because a backend search cluster is overloaded. If the edge has a recently expired copy and policy allows stale-on-error, the CDN can keep serving the old page briefly while:
- the origin recovers
- the operator rolls back
- cache fill pressure is reduced
This smooths incidents in a way users rarely notice explicitly. They do not know the page was 45 seconds older than ideal. They only know the site did not disappear.
Of course, stale serving is not universally safe. It is poor for:
- account balances
- checkout state
- one-time tokens
- highly personalised responses
But for public content, product pages, docs, or many catalog views, it is often exactly the right tradeoff under failure.
This is another reason CDN policy is never just about static speed. It is about how the system behaves under stress when freshness, availability, and backend health pull in different directions.
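The decision can be written down in a few lines. This is a sketch under stated assumptions: the 300-second stale-if-error window is invented for the news-page example, and a real edge would infer origin health from timeouts and error responses rather than receive a flag.

```python
def choose_response(age_s, max_age_s, stale_if_error_s, origin_healthy):
    """Decide what an edge does with a cached object of a given age."""
    if age_s < max_age_s:
        return "serve cached (fresh)"
    if origin_healthy:
        return "refetch from origin"
    if age_s < max_age_s + stale_if_error_s:
        return "serve stale"
    return "return error"


# News front page cached for 60 s, assumed stale-if-error window of 300 s.
# The cached copy is 105 s old, so it is 45 s past its freshness lifetime.
print(choose_response(105, 60, 300, origin_healthy=False))
print(choose_response(105, 60, 300, origin_healthy=True))
print(choose_response(400, 60, 300, origin_healthy=False))
```

The window bounds how old a page may get: once it is exceeded, the edge stops masking the failure and reports the error.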
Real-Time Logging and Edge Analytics Changed How Operators Debug the Web
Before CDNs became dominant, many operators debugged web performance mainly from origin logs and some packet captures. With a CDN in front, the most useful first-party evidence often lives at the edge:
- cache status
- PoP identifier
- edge response time
- shield response time
- origin response time
- WAF action
- bot score or challenge outcome
That changes incident response. If users in Greece report slowness while users in Germany do not, the operator can look for:
- one troubled PoP
- one regional transit issue
- one shield cluster with poor hit ratio
- one specific cache key fragmenting unexpectedly
This is a radically different debugging model from the older "check the web server" instinct. The web server may be completely healthy while one edge region is having trouble. Or the edge may be healthy while the origin fetch path is unstable. Or everything may be healthy except one purge event that invalidated too much content at once.
The better the edge analytics, the faster teams can separate:
- routing problems
- cache-policy problems
- security filtering mistakes
- true origin slowness
That separation is what turns a CDN from a black box into infrastructure the team can actually operate confidently.
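A first pass at that separation is often just grouping edge log records by PoP. The field names `pop` and `cache_status` below are hypothetical, as is the sample data; real log schemas vary by vendor.

```python
from collections import defaultdict


def hit_ratio_by_pop(records):
    """Return the cache hit ratio per PoP from edge log records."""
    totals = defaultdict(lambda: [0, 0])  # pop -> [hits, requests]
    for record in records:
        counts = totals[record["pop"]]
        counts[1] += 1
        if record["cache_status"] == "hit":
            counts[0] += 1
    return {pop: hits / requests for pop, (hits, requests) in totals.items()}


# Invented sample: Athens misses far more often than Frankfurt.
sample_logs = [
    {"pop": "ATH", "cache_status": "hit"},
    {"pop": "ATH", "cache_status": "miss"},
    {"pop": "ATH", "cache_status": "miss"},
    {"pop": "ATH", "cache_status": "miss"},
    {"pop": "FRA", "cache_status": "hit"},
    {"pop": "FRA", "cache_status": "hit"},
    {"pop": "FRA", "cache_status": "hit"},
    {"pop": "FRA", "cache_status": "miss"},
]

print(hit_ratio_by_pop(sample_logs))
```

A gap like this between two PoPs points at a regional cache problem rather than at the origin, which is the kind of distinction origin logs alone cannot make.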
Purge Strategy and Content Tagging Become Editorial Infrastructure
As soon as a site publishes frequently, cache invalidation stops being an infrastructure detail and becomes part of the content workflow.
Teams often need to purge by:
- exact URL
- prefix
- hostname
- surrogate key or content tag
- deployment version
The more mature the site, the more important surrogate tags become. Suppose one article appears:
- on its own page
- in the homepage feed
- in a topic archive
- in a "latest posts" widget
- in a JSON feed consumed elsewhere
If an editor updates the article, invalidating only the article URL may not be enough. Purging by content tag lets all related cached representations be expired together. This is one of the reasons modern CDN integration often reaches deeply into the CMS or deployment pipeline.
Without good tagging, teams fall back to broad purges that:
- erase too much warm cache state
- spike origin load
- make deploys feel slower and riskier
With good tagging, they can invalidate precisely and preserve most of the fleet's useful cached content. Again, the CDN stops being "just a network thing" and becomes part of how publishing actually works.
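Tag-based purging amounts to keeping a reverse index from tag to cached URLs. The sketch below models a single cache node with invented URLs and tag names; a real CDN has to propagate the same purge to every PoP and cache layer.

```python
from collections import defaultdict


class TaggedCache:
    """Single-node cache that can purge every object carrying a tag."""

    def __init__(self):
        self._objects = {}                   # url -> body
        self._urls_by_tag = defaultdict(set)  # tag -> urls

    def put(self, url, body, tags):
        self._objects[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url):
        return self._objects.get(url)

    def purge_tag(self, tag):
        """Remove all objects with this tag and return how many were dropped."""
        urls = self._urls_by_tag.pop(tag, set())
        return sum(1 for url in urls if self._objects.pop(url, None) is not None)


cache = TaggedCache()
cache.put("/articles/42", "article page", tags=["article-42"])
cache.put("/", "homepage feed", tags=["article-42", "article-41"])
cache.put("/topics/cdn", "topic archive", tags=["article-42"])
cache.put("/feed.json", "json feed", tags=["article-42"])
cache.put("/about", "about page", tags=["static"])

print(cache.purge_tag("article-42"), "objects purged")
print(cache.get("/about"))
```

One purge call expires every representation of the article while unrelated pages stay warm.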
Multi-Layer Edge Logic Means You Need a Clear Mental Model of Where Decisions Happen
A request can be transformed at several points:
- DNS steering before the TCP handshake
- edge request normalisation before cache lookup
- WAF and rate limiting before forwarding
- shield cache logic before origin fetch
- edge compute after origin response but before client delivery
If a team does not know where each decision happens, debugging becomes guesswork. This is why strong CDN operating practice usually documents:
- the cache key definition
- where headers are added or removed
- where authentication is checked
- where redirects are generated
- where image or HTML transformations happen
Otherwise, the edge fleet becomes a pile of partly overlapping logic spread across products and control planes. The result may still work, but it will be fragile to change.
Edge Storage Is Finite, So Eviction Policy Shapes Real Performance
A CDN edge cache is not infinite. Every PoP has finite SSD or memory budget, and the edge has to decide which objects remain warm.
That means eviction policy matters just as much as fill policy. Popular small objects may stay resident for a long time. Large low-reuse objects may be evicted quickly even if they were expensive to fetch. The edge is constantly deciding which stored bytes are likely to pay for themselves in future latency reduction.
This has direct consequences for content design:
- a handful of heavily reused versioned assets usually cache beautifully
- one-off personalised HTML is a poor fit for long retention
- giant infrequently used media files may churn storage and displace hotter content
The result is that cache hit ratio is not only about TTLs. It is also about object popularity distribution and object size distribution. A site with excellent cache headers can still perform poorly at the edge if its traffic mix keeps thrashing storage with large, low-reuse objects.
This is one reason video and software download products often receive specialised CDN handling. Their object-size economics differ sharply from ordinary website assets. Good CDN design therefore starts from workload shape, not from generic assumptions about "one cache policy for everything".
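The displacement effect can be illustrated with a byte-budgeted LRU cache. This is a simplified sketch: production edges use more sophisticated admission and eviction policies, and the class name, object names, and sizes below are made up for illustration.

```python
from collections import OrderedDict


class ByteBudgetLRU:
    """LRU cache bounded by total bytes rather than by object count."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self.entries = OrderedDict()  # key -> size in bytes, oldest first

    def get(self, key):
        if key not in self.entries:
            return False               # miss
        self.entries.move_to_end(key)  # mark as most recently used
        return True                    # hit

    def put(self, key, size):
        if size > self.capacity_bytes:
            return                     # never admit objects larger than the cache
        if key in self.entries:
            self.used_bytes -= self.entries.pop(key)
        # Evict least recently used entries until the new object fits.
        while self.used_bytes + size > self.capacity_bytes:
            _, evicted_size = self.entries.popitem(last=False)
            self.used_bytes -= evicted_size
        self.entries[key] = size
        self.used_bytes += size


cache = ByteBudgetLRU(capacity_bytes=1000)
for name in ("app.css", "app.js", "logo.svg"):
    cache.put(name, 100)
cache.put("installer.bin", 900)  # one large object arrives

survivors = list(cache.entries)
```

Admitting the single 900-byte object evicts two of the three small assets, even though those small assets are the ones likely to be requested again.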
Origin Access Control Matters Because the CDN Should Be the Public Front Door
Once a CDN sits in front of the origin, the origin should usually stop behaving like a general public endpoint. If attackers or scrapers can bypass the CDN and hit the origin directly, several benefits weaken immediately:
- WAF filtering at the edge is bypassed
- anycast absorption is bypassed
- origin hiding disappears
- cache offload no longer protects the backend path
Good deployments therefore use origin access controls such as:
- allowlists for CDN egress ranges
- mutual authentication between edge and origin
- signed origin pull requests
- private networking where possible
This is not only a security concern. It is also architectural hygiene. The CDN is meant to be the controlled public interface. The origin should trust the edge deliberately rather than hoping clients always arrive through the intended path.
Teams that skip this step often discover later that a supposedly protected origin is still publicly reachable and much easier to overload than the edge. At that point the CDN is only part of the perimeter rather than the real perimeter.
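As one illustration of signed origin pulls, the sketch below has the edge attach an HMAC over the request line and a timestamp, and the origin reject anything unsigned, altered, or too old. The shared secret, the message layout, and the function names are assumptions made for this example; real CDNs each define their own signing scheme.

```python
import hashlib
import hmac

SHARED_SECRET = b"example-secret-rotate-me"  # hypothetical; provisioned out of band


def sign_request(method, path, timestamp, secret=SHARED_SECRET):
    """Edge side: HMAC-SHA256 over the method, path, and timestamp."""
    message = f"{method}\n{path}\n{timestamp}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def origin_accepts(method, path, timestamp, signature, now, max_skew=300,
                   secret=SHARED_SECRET):
    """Origin side: reject requests that are unsigned, tampered with, or too old."""
    if abs(now - timestamp) > max_skew:
        return False
    expected = sign_request(method, path, timestamp, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, signature)


signature = sign_request("GET", "/pricing", 1000)
```

A client that reaches the origin directly cannot produce a valid signature without the secret, so the origin can refuse the request before doing any application work.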
A Single Dynamic Request Often Touches More Infrastructure Than People Expect
To see why CDNs became full delivery platforms, it helps to follow one dynamic request end to end.
Imagine a logged-out visitor in Athens loading /pricing for a SaaS site:
- DNS resolves the hostname to a CDN-managed address
- anycast lands the TCP or QUIC connection at an Athens or nearby edge
- the edge terminates TLS and negotiates HTTP/2 or HTTP/3
- the edge normalises the request and runs WAF checks
- the edge checks whether /pricing is cacheable and currently fresh
- if stale or absent, the edge asks the shield
- the shield may revalidate with the origin in Frankfurt
- the origin returns the current HTML
- the shield stores or refreshes it
- the edge sends the response to the client and may store it for a short period
Nothing in that path is conceptually difficult, but the operational implication is huge. What used to be "one request hit my web server" is now:
- one request passed through several policy engines
- one response may have been served from several possible layers
- one cache decision may affect the next thousand users
CDN operations demand good request IDs and cache telemetry for that reason. Without them, the path becomes hard to reason about.
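The layering can be made concrete with a toy model of edge, shield, and origin that records which layer answered each request. It is a sketch only: it ignores freshness, revalidation, TLS, and WAF steps, and the `Tier` class and PoP names are hypothetical.

```python
class Tier:
    """One cache layer (edge or shield) that falls back to a parent on a miss."""

    def __init__(self, name, parent):
        self.name = name
        self.parent = parent  # another Tier, or a callable standing in for the origin
        self.store = {}

    def fetch(self, path, trace):
        if path in self.store:
            trace.append(f"{self.name}:HIT")
            return self.store[path]
        trace.append(f"{self.name}:MISS")
        if isinstance(self.parent, Tier):
            body = self.parent.fetch(path, trace)
        else:
            trace.append("origin:FETCH")
            body = self.parent(path)
        self.store[path] = body  # fill this layer on the way back
        return body


def origin(path):
    return f"<html>{path}</html>"  # stand-in for the Frankfurt origin


shield = Tier("shield", origin)
edge_athens = Tier("edge-athens", shield)
edge_sofia = Tier("edge-sofia", shield)

first, second, third = [], [], []
edge_athens.fetch("/pricing", first)   # cold everywhere
edge_athens.fetch("/pricing", second)  # warm at the edge
edge_sofia.fetch("/pricing", third)    # cold edge, warm shield
```

The `trace` list plays the role of cache telemetry: the same URL produces three different paths depending on which layers were already warm.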
HTML Caching Is Harder Than Static Asset Caching Because the Stakes Are Higher
Versioned static assets are the easy case. HTML is harder because:
- it controls asset discovery
- it may contain user-specific or geo-specific elements
- it changes more frequently
- stale HTML can point at missing or old bundles
This is why many mature sites separate cache strategy sharply:
Static Assets
- long TTL
- versioned names
- immutable cache semantics
HTML
- shorter TTL
- strong validators
- careful stale-on-error policy
- explicit purge on deployment if necessary
If teams get this wrong, a common failure mode appears: new JavaScript or CSS is deployed, but some edges still serve old HTML pointing at no-longer-valid assets. The site then looks randomly broken by geography or by PoP.
The CDN is not misbehaving in that situation. The caching model is inconsistent. The edge cache is therefore part of deployment design, not only post-deployment acceleration.
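The split strategy can be expressed as a small origin-side policy function. This is a sketch under assumptions: the hashed-filename convention, the extension list, and the specific TTL values are illustrative choices, not recommendations from any particular platform.

```python
import re

# Hypothetical convention: build output embeds a content hash, e.g. app.3f9a1c2b.js
VERSIONED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(js|css|woff2|png|svg)$")


def cache_control_for(path):
    """Return a Cache-Control value for a path under the split strategy."""
    if VERSIONED_ASSET.search(path):
        # The name changes whenever the content changes, so cache for a year.
        return "public, max-age=31536000, immutable"
    # HTML and anything unversioned: short TTL, tolerate brief origin errors.
    return "public, max-age=60, stale-if-error=300"
```

Because versioned assets never change under a given name, old HTML that still references them keeps working as long as the old files are not deleted immediately after a deploy.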
Cache Revalidation Can Save an Origin More Than a Full Hit Ratio Number Suggests
Hit ratio is useful, but it hides an important middle category: revalidated responses.
If an edge cache object is stale, the CDN may avoid a full refetch by sending:
If-None-Match: "build-4821"
If-Modified-Since: Tue, 21 Apr 2026 10:00:00 GMT
If the origin returns:
304 Not Modified
the edge can keep the body it already has. That saves:
- origin egress bandwidth
- origin CPU spent serialising the body again
- shield bandwidth
- edge fill time
For frequently checked but infrequently changed content, this is a major optimisation. A dashboard page that many users hit every minute may not be a full cache hit forever, but if it mostly revalidates cheaply rather than redownloading the body, the origin still benefits greatly.
This is why good validators are so valuable. They create a spectrum between pure cache hit and full miss. Without them, the CDN has far fewer graceful options.
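On the origin side, honouring a validator can be as small as the following sketch. It handles only `If-None-Match` with exact tag comparison, ignoring weak validators (`W/` prefixes) and `If-Modified-Since`; the function name and return shape are assumptions for illustration.

```python
def origin_respond(request_headers, current_etag, body):
    """Return (status, headers, body) honouring If-None-Match."""
    candidates = [
        tag.strip()
        for tag in request_headers.get("If-None-Match", "").split(",")
        if tag.strip()
    ]
    if "*" in candidates or current_etag in candidates:
        # Validator matches: confirm freshness without resending the body.
        return 304, {"ETag": current_etag}, b""
    return 200, {"ETag": current_etag}, body


page = b"<html>dashboard</html>"
conditional = {"If-None-Match": '"build-4821"'}

status_same, _, sent_same = origin_respond(conditional, '"build-4821"', page)
status_new, _, sent_new = origin_respond(conditional, '"build-4822"', page)
```

While the build is unchanged, the origin answers with an empty 304; only after a new build does the full body travel again.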
Edge Platforms Also Standardised Modern Image and Media Delivery
One reason CDNs became more than caches is that many assets can be transformed safely at the edge. Images are the clearest example.
An edge service may:
- resize
- crop
- transcode
- convert format based on Accept
- tune quality for device or bandwidth class
For example, one source image might produce:
- AVIF for browsers that support it
- WebP for others
- JPEG fallback where needed
That work is valuable because it combines:
- lower origin storage complexity
- lower edge-to-client transfer size
- better device fit
But it also complicates cache keys. If the result varies by width, quality, or accepted format, the CDN must reflect that variation correctly or risk serving the wrong representation.
This is a recurring pattern in CDN design. New edge features usually create two simultaneous effects:
- better performance or simpler origin logic
- more cache-key and policy complexity
The platform becomes more capable and more stateful at the same time.
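A sketch of how format negotiation and the cache key interact, assuming the edge keys on the chosen output format rather than on the raw `Accept` header. The function names and key layout are hypothetical, and the parsing ignores q-values for brevity.

```python
def pick_format(accept_header):
    """Choose the best image format the client advertises, else JPEG."""
    advertised = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    for mime, fmt in (("image/avif", "avif"), ("image/webp", "webp")):
        if mime in advertised:
            return fmt
    return "jpeg"


def variant_cache_key(path, width, accept_header):
    """Key on the chosen format, not the raw Accept header, to limit fragmentation."""
    return f"{path}|w={width}|fmt={pick_format(accept_header)}"


key_a = variant_cache_key("/img/hero", 800, "image/webp,image/apng,*/*;q=0.8")
key_b = variant_cache_key("/img/hero", 800, "image/webp,*/*")
key_c = variant_cache_key("/img/hero", 800, "image/avif,image/webp,*/*")
```

Two browsers with different `Accept` strings but the same best format share one cached variant, while a browser that supports AVIF gets its own entry.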
Traffic Steering During Incidents Is One of the Hardest Real CDN Problems
Under healthy conditions, steering traffic to a nearby PoP is relatively straightforward. During incidents, several harder questions appear:
- should one overloaded PoP shed traffic to a farther PoP
- should one region bypass a degraded shield
- should one origin pool receive less traffic even if it remains technically reachable
- should DNS steering move new sessions while anycast still handles existing ones
The answer depends on what is failing:
PoP Capacity Issue
Traffic may need to shift to a neighbouring city.
Shield Cluster Problem
Edges may need a different parent path.
Origin Problem
The CDN may need more aggressive stale serving, tighter request collapsing, or stricter WAF behaviour.
This is one reason very large CDNs invest so heavily in control planes and internal observability. At scale, the hard part is not serving static files from cache. The hard part is moving traffic gracefully when some part of the global system is degraded without causing a worse secondary problem elsewhere.
For users, a good incident looks invisible. For operators, that invisible experience often depends on a large amount of live traffic engineering at the edge.
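The PoP capacity case can be sketched as a steering decision that prefers the closest healthy PoP with headroom. This is a deliberately simplified illustration: the field names and the 0.85 utilisation threshold are assumptions, and real steering combines BGP, DNS, and much richer health signals.

```python
def choose_pop(pops, max_utilisation=0.85):
    """Pick the lowest-latency healthy PoP that still has headroom.

    `pops` is a list of dicts with hypothetical fields: name, rtt_ms,
    utilisation (0.0 to 1.0), and healthy.
    """
    candidates = [p for p in pops if p["healthy"]]
    with_headroom = [p for p in candidates if p["utilisation"] < max_utilisation]
    # If every healthy PoP is hot, still prefer the closest rather than failing.
    pool = with_headroom or candidates
    if not pool:
        return None
    return min(pool, key=lambda p: p["rtt_ms"])["name"]


pops = [
    {"name": "athens", "rtt_ms": 5, "utilisation": 0.95, "healthy": True},
    {"name": "sofia", "rtt_ms": 18, "utilisation": 0.50, "healthy": True},
    {"name": "frankfurt", "rtt_ms": 40, "utilisation": 0.40, "healthy": True},
]
chosen = choose_pop(pops)
```

With Athens overloaded, new traffic shifts to Sofia rather than jumping straight to Frankfurt, which mirrors the "neighbouring city" behaviour described above.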
CDN Security Layers Work Best When They Are Close to Normal Request Semantics
One reason CDNs became attractive security platforms is that they see traffic after TLS termination but before origin execution. That position lets them reason about:
- HTTP methods
- headers
- hostnames
- paths
- request rates
- cookies and tokens, if policy allows
This is more useful than filtering only on raw packets because the edge can distinguish:
- one legitimate GET /docs/http2
- one flood of malformed requests
- one credential stuffing burst
- one scraper cycling through catalogue URLs too quickly
The security actions may include:
- rate limiting
- challenge pages
- bot scoring
- request body limits
- header sanitisation
- geographic restrictions
The important architectural point is that these protections live in the same place as caching and request steering. That creates leverage. If the edge already handles the first HTTP hop, it can often reject hostile or wasteful traffic before the origin spends any compute on it.
This is also why origin bypass is such a serious mistake. The moment traffic can skip the edge, the operator loses not only caching, but also the normalisation and filtering layer that made the public interface manageable in the first place.
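Rate limiting at the edge is commonly explained with a token bucket. The sketch below keeps one bucket per client identifier; the class name, rate, and burst values are illustrative assumptions, and a real edge would share state across many servers.

```python
class TokenBucket:
    """Per-client token bucket: `rate` tokens per second, bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.updated_at = None

    def allow(self, now):
        if self.updated_at is not None:
            # Refill in proportion to elapsed time, capped at the burst size.
            elapsed = now - self.updated_at
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


buckets = {}


def allow_request(client_id, now, rate=5, burst=10):
    """Look up (or create) the client's bucket and spend one token if available."""
    bucket = buckets.setdefault(client_id, TokenBucket(rate, burst))
    return bucket.allow(now)


# A scraper sends 12 requests in the same instant; a normal reader sends one.
scraper_results = [allow_request("scraper", now=0.0) for _ in range(12)]
reader_result = allow_request("reader", now=0.0)
```

The scraper's burst is cut off after ten requests without affecting the other client, and the rejected requests never reach the origin.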
CDN Adoption Usually Changes Deployment Practice, Not Just Runtime Performance
A team that moves behind a CDN often finds that deployments start behaving differently:
- cacheable assets can be pushed with long immutable lifetimes
- HTML invalidation becomes a planned workflow
- certificate rollout may move from the origin to the edge platform
- traffic spikes after deploy no longer hit the origin in the same way
- request logs become split between edge and backend views
This changes operational rituals. For example, a safe deploy checklist may now include:
- publish versioned static assets
- validate edge cache headers on a staging hostname
- confirm purge tags or HTML invalidation rules
- monitor edge hit ratio and origin request rate after release
- roll back by switching HTML and purge scope if necessary
Without the CDN, deployment may have been mostly about the origin serving the right files. With the CDN, deployment is partly about managing global cached state. Teams that understand this early usually have calmer release processes than teams that treat the edge as an opaque acceleration box.
CDN selection therefore ends up being partly a product decision. Different platforms vary in:
- geographic reach
- cache and purge tooling
- security depth
- edge programmability
- observability quality
Because the edge becomes part of the application's public boundary, those differences shape how comfortably the rest of the system can evolve.
Another quiet consequence is that teams start designing with edge capability in mind from the beginning. A feature that is cheap when decided at the edge, for example a redirect, cache tag, image variant, or request challenge, may be much more expensive if pushed all the way back to the origin on every request. That architectural pull is one of the reasons CDNs kept expanding their product surface over time.
That is also why CDN architecture discussions quickly become wider than "how fast is the cache". The edge influences rollout safety, security posture, observability, and even which application features are economical to implement cleanly.
That broader influence is one reason CDNs became central infrastructure rather than optional acceleration. Once the edge owns enough of the public request path, it inevitably shapes how the whole application is built and operated.
The Useful Mental Model Is "A Distributed Traffic and State System"
If you keep one mental model, use this one: a CDN is a distributed traffic and state system that moves the public edge of the application closer to the user, absorbs repeated work through hierarchical caching, and protects the origin by reducing distance, duplication, and exposure.
That explains why the stack includes:
- anycast for nearby ingress
- DNS for service steering
- cache keys for object identity
- TTLs and validators for freshness
- shield layers for origin protection
- purge propagation for correctness
- TLS termination for protocol control
- edge compute for early decisions
Calling all of that "a cache near the user" is not wrong, but it hides the real engineering. The CDN is where routing, HTTP semantics, TLS, storage policy, and traffic defence meet.
That is how CDNs actually work.