Googlebot caches aggressively in order to reduce network requests and resource usage. This includes ignoring caching headers such as max-age.
This may lead WRS to use outdated JavaScript or CSS resources (source).
The business impact is that an incomplete or inaccurate rendering affects what is indexed.
The business solution is Conditional Caching, the practice of cache busting only when the resource has changed.
SME Details
Erik Hendriks, Software Engineer at Google, provided additional context in the panel Rendering at WMConf MTV '19. He stated:
“When we crawl your page with Googlebot, we go fetch the content and then we give it to chrome. Then Chrome runs all the scripts. It loads additional content.
Once everything's loaded we take a snapshot of the page and that's the content that actually gets indexed.”
Rendering is costly. For Google, it increases fetching 20X.
Google sees 50-60 resource fetches when rendering a page (while obeying robots.txt). In these fetches, there is a 60-70% cache hit rate.
To get around this business problem, Google cuts corners by:
- Not obeying caching rules
  - They advise: Make content cacheable. It's better for users, too.
- Not fetching all resources
  - They advise: Reduce how many fetches are required to build your page.
  - Make as much content available as possible, as there is always a risk of fetch failure.
A business solution to the challenge of Google's caching is Conditional Caching which busts the cache if the resource has changed.
ETag (or Entity Tag) allows the server to identify whether the cached contents of the resource differ from the most recent version.
Alternatively, the server can send a Last-Modified header, which the client echoes back in an If-Modified-Since request header.
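The revalidation decision described above can be sketched as a small function. This is a minimal illustration, not any particular server's implementation: the names `make_etag` and `conditional_status` are hypothetical, and the hash-based ETag scheme is an assumption (any value that changes when the resource changes would do). It assumes `last_modified` is a timezone-aware datetime.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the resource bytes (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_status(request_headers: dict, body: bytes,
                       last_modified: datetime) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    etag = make_etag(body)

    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When present, If-None-Match takes precedence over If-Modified-Since.
        tags = [t.strip() for t in if_none_match.split(",")]
        # Ignore the weak-validator prefix when comparing.
        tags = [t[2:] if t.startswith("W/") else t for t in tags]
        return 304 if (etag in tags or "*" in tags) else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable date: serve the full resource
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304

    return 200
```

A 304 response carries no body, so an unchanged resource costs only a header exchange, while a changed resource produces a new ETag and is served in full with a 200.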
User Story
As a crawl-effective site, I want to use ETag or If-Modified-Since headers, in addition to setting an expiry date or a maximum age of 7 days or greater in the HTTP headers for resources, in order to instruct search engines to load previously downloaded resources from local disk rather than over the network, as this will improve crawl budget utilization.
Acceptance Criteria
- Extended max-age is used on resources. Recommended value for resources is max-age=2592000 (30 days).
- Each resource should specify an explicit caching policy that answers the following questions: whether the resource can be cached and by whom, for how long, and if applicable, how it can be efficiently revalidated when the caching policy expires.
- When the server returns a response it must provide the Cache-Control and ETag (or Last-Modified, which enables If-Modified-Since revalidation) headers:
  - Cache-Control defines how, and for how long, the individual response can be cached by the browser and other intermediate caches.
  - ETag provides a revalidation token that is automatically sent by the browser to check if the resource has changed since the last time it was requested.
  - If-Modified-Since makes the request conditional: the server will send back the requested resource, with a 200 status, only if it has been modified after the given date. If the resource has not been modified since, the response will be a 304 without any body.
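As a rough sketch of the acceptance criteria above, the function below builds the response headers for one resource. The function name `caching_headers`, the `public` directive, and the truncated SHA-256 ETag are illustrative assumptions rather than requirements from the source. The 30-day max-age is the value recommended above. The function expects a timezone-aware `last_modified`.

```python
import hashlib
from datetime import datetime, timezone
from email.utils import format_datetime

THIRTY_DAYS = 2592000  # seconds; the max-age value recommended above


def caching_headers(body: bytes, last_modified: datetime,
                    max_age: int = THIRTY_DAYS) -> dict:
    """Build caching headers for a static resource (illustrative sketch)."""
    # format_datetime(usegmt=True) requires a UTC datetime, so normalise first.
    last_modified_utc = last_modified.astimezone(timezone.utc)
    return {
        # "public" (cacheable by shared caches) is an assumption; choose
        # "private" for user-specific responses.
        "Cache-Control": f"public, max-age={max_age}",
        # Hypothetical ETag scheme: first 16 hex chars of the content hash.
        "ETag": '"' + hashlib.sha256(body).hexdigest()[:16] + '"',
        "Last-Modified": format_datetime(last_modified_utc, usegmt=True),
    }
```

Between them, the three headers answer the policy questions in the criteria: who may cache the resource and for how long (Cache-Control), and how it can be revalidated once that period ends (ETag and Last-Modified).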
Testing Strategy
- Open Chrome developer tools to the Network tab
- Enter the URL
- Select the resource
- Click ‘Headers’ (screenshot)
- If the max-age and ETag (or If-Modified-Since) headers contain the age and hash values denoted by dev team, QA pass.
  - Else, fail.
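The manual check above could also be scripted once the headers have been captured. The sketch below is a hypothetical helper (`qa_check` is not part of any tool), and it assumes the headers are available as a plain dictionary, for example copied from the DevTools Headers pane. It covers only the max-age and ETag comparison.

```python
import re


def qa_check(headers: dict, expected_max_age: int, expected_etag: str) -> bool:
    """Return True (QA pass) if max-age and ETag match the expected values."""
    # Header names are case-insensitive, so normalise them before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Extract the max-age directive from Cache-Control, if present.
    match = re.search(r"max-age=(\d+)", lowered.get("cache-control", ""))
    if match is None or int(match.group(1)) != expected_max_age:
        return False

    return lowered.get("etag") == expected_etag
```

For example, `qa_check({"Cache-Control": "public, max-age=2592000", "ETag": '"abc"'}, 2592000, '"abc"')` passes, while a response with no ETag or a different max-age fails.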
Published on 1/2/2026 by Jamie Indigo