MEDIUM·2 min read·Mar 2026·medium

What Actually Happens when Cache TTL Expires?

Caching is one of the most important performance mechanisms in modern systems.
But many assume that when a cache TTL expires, the cached object is simply deleted and fetched again.

When the next client request arrives, the cache has two choices.

Option 1 — Fetch the full object again

The cache sends a normal request to the origin:

GET /index.html

The origin responds:

200 OK
(full response body)

The cache stores the new copy.

This works but wastes bandwidth and increases latency.

Large systems try to avoid this whenever possible.

Option 2 — Conditional GET (What Production Systems Prefer)

Instead of downloading the entire object again, the cache asks:

Has this resource actually changed?

The request looks like this:

GET /index.html
If-None-Match: "abc123"

or

If-Modified-Since: Tue, 10 Mar 2026 10:00:00 GMT

The origin checks the resource.

If it hasn’t changed, the server responds with:

304 Not Modified

No response body is sent.

The cache simply refreshes the TTL and serves the existing object.

This saves:

bandwidth
origin load
response time

Option 3 — Advanced Production Caching Strategies

These are stale caching strategies that allow systems to serve slightly outdated content under controlled conditions.

Two commonly used mechanisms are stale-while-revalidate and stale-if-error.

Stale-while-revalidate

This directive allows the cache to serve stale content immediately while refreshing the cache in the background.

Example:

Cache-Control: max-age=60, stale-while-revalidate=300

Behavior:

The object is considered fresh for 60 seconds
After 60 seconds, the object becomes stale
If a request arrives during the next 300 seconds, the cache may:
immediately serve the stale content to the client
start a background request to the origin server to refresh the cache
next client request receives the refreshed content.

This approach prevents users from waiting for the origin server while the cache updates itself.

Result:

faster response times
reduced latency spikes
smoother traffic patterns during cache refreshes

Stale-if-error

This directive improves system resilience by allowing caches to serve stale content if the origin server fails.

Example:

Cache-Control: stale-if-error=600

Behavior:

If the cache tries to revalidate the resource and the origin server:
returns a 5xx error
times out
becomes temporarily unavailable
the cache can serve the stale response for up to 600 seconds

This prevents users from seeing errors during temporary outages.

Result:

higher availability
better user experience during backend failures
protection against sudden origin downtime

Subscriber only

Continue reading this essay

This essay is for subscribers of Deploying Clarity. Enter your subscriber email to unlock, or subscribe free below.

or

Subscribe free — get access to all essays

Originally published on Medium

View original →