Introducing Static Asset Cache Buster
Everyone who has ever built a Drupal site has run into the issue where if you delete a file and upload a replacement with the same name, you wind up with my_file_0.txt instead of my_file.txt. As annoying as this can be, it does sidestep another issue: static asset caching.
Static Asset Caching
Static assets, such as images and PDFs, are served by the web server and not by Drupal. The .htaccess file that ships with Drupal instructs Apache to send headers telling clients to cache these responses for two weeks (unless they're being served by PHP).
What happens if I use a module like Media Entity File Replace to prevent the _0 issue? Static asset caching comes into play. When Apache serves that asset, it sends a Cache-Control header with a max-age value of two weeks (1209600 seconds). When you replace the file, nothing changes from the visitor's perspective, because the external caches (browser, Varnish, CDN) honor that max-age. First, the visitor's browser serves the old version from its own cache. If the visitor clears the browser cache, any CDN in front of the site still serves its old cached copy. You can clear the CDN cache, but you have no way to clear a visitor's browser cache.
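To make the two-week window concrete, here is a minimal Python sketch of the freshness check a cache performs. The helper names (max_age_seconds, is_fresh) are hypothetical, and the logic is simplified: real caches also account for directives such as s-maxage and no-cache, and for the Age and Expires headers.

```python
import re
from typing import Optional

TWO_WEEKS = 14 * 24 * 60 * 60  # 1209600 seconds, the value from Drupal's .htaccess


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value from a Cache-Control header, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age=(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """True if a cached copy of this age may be reused without asking the server."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


header = f"max-age={TWO_WEEKS}"
day = 24 * 60 * 60

# A copy cached 3 days ago is still considered fresh, so a replaced file goes unnoticed.
print(is_fresh(header, age_seconds=3 * day))
# Only after the two weeks have passed does the cache go back to the server.
print(is_fresh(header, age_seconds=15 * day))
```

Until is_fresh returns False, the cache never contacts the server, which is why replacing the file on disk has no visible effect.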
Solution A: Cache-Control Headers + External Cache Invalidation
One solution is to use Cache-Control headers to set max-age and s-maxage values.
For example, a developer could set max-age to 0 and s-maxage to 86400 (1 day). (Note: max-age takes a number of seconds rather than no-cache. Once the TTL has expired, a max-age response behaves like no-cache, so max-age=0 makes the browser revalidate on every request.) Browsers would always check upstream before serving from cache. If the asset has changed, the server responds with a 200 HTTP status code and the asset payload, which the browser serves. If the asset has not changed, the server responds with a 304 HTTP status code, and the browser serves from cache. (Note: Your web server must be configured to send Last-Modified and/or ETag headers for this approach to work.)
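The 200-versus-304 exchange can be sketched in a few lines of Python. This is an illustration only, assuming ETag-based validation with a single exact-match tag. The revalidate function is hypothetical, and real servers also handle lists of tags, weak validators, and If-Modified-Since.

```python
from typing import Optional, Tuple


def revalidate(current_etag: str, body: bytes,
               if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Return (status, payload) for a GET that may carry an If-None-Match header."""
    if if_none_match is not None and if_none_match == current_etag:
        # Asset unchanged: no payload, the browser reuses its cached copy.
        return 304, b""
    # First request, or the asset changed: send the full payload.
    return 200, body


# First visit: the browser has nothing cached, so it sends no validator.
first = revalidate('"v1"', b"old report", None)
# Repeat visit, file untouched: the browser's validator still matches.
unchanged = revalidate('"v1"', b"old report", '"v1"')
# Repeat visit after the file was replaced: the validator no longer matches.
replaced = revalidate('"v2"', b"new report", '"v1"')

print(first[0], unchanged[0], replaced[0])
```

The browser still makes a request every time, but an unchanged asset costs only a small 304 response rather than a full download.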
For this strategy to work, however, any non-browser external cache (Cloudflare, Akamai, Varnish, etc.) must be invalidated whenever a file changes. There are Drupal modules for each of these.
Solution B: Cache Busting
Another method for dealing with static asset caching is to use cache busting query strings. This is what Drupal core does with aggregated CSS and JS files.
Please allow me to introduce you to the Static Asset Cache Buster module.
This module seeks to address the issue by appending a query string to asset URLs, which acts as a cache buster. The value of the 'cb' query parameter is a hashed/truncated value from the file entity's 'changed' timestamp.
For example, 2022-annual-report.pdf becomes 2022-annual-report.pdf?cb=64227fd1. When the file is updated, the value of the 'cb' query string parameter will change.
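The idea can be sketched in Python as follows. The module's actual hashing scheme is not described here, so the choice of SHA-256 truncated to eight hex characters is an assumption made for illustration, as are the function names cache_buster and bust.

```python
import hashlib


def cache_buster(changed_timestamp: int, length: int = 8) -> str:
    """Derive a short token from a file's 'changed' timestamp (assumed scheme)."""
    digest = hashlib.sha256(str(changed_timestamp).encode("ascii")).hexdigest()
    return digest[:length]


def bust(url: str, changed_timestamp: int) -> str:
    """Append the 'cb' query parameter, respecting any existing query string."""
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}cb={cache_buster(changed_timestamp)}"


# Hypothetical timestamps for the original upload and a later replacement.
before = bust("2022-annual-report.pdf", 1680000000)
after = bust("2022-annual-report.pdf", 1680086400)

print(before)
print(after)
```

Because caches treat the full URL, query string included, as the cache key, the new token makes the replaced file look like a brand-new resource to the browser and to any CDN.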
This approach isn’t 100% foolproof, however. By default, URLs with old cache busters remain cached. If Ted in Accounting copies the URL to 2022-annual-report.pdf?cb=64227fd1 and sends it out in an email, recipients may still get the old version after the file is updated.
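A toy cache keyed by full URL shows why the emailed link keeps returning the old file. The UrlKeyedCache class is a hypothetical stand-in for a browser or CDN cache, and the second cb value is made up for the example.

```python
from typing import Callable, Dict


class UrlKeyedCache:
    """Stand-in for a browser or CDN cache that keys entries by the full URL."""

    def __init__(self, origin: Callable[[str], bytes]) -> None:
        self._origin = origin              # fetches a file from the web server by path
        self._store: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url not in self._store:
            path = url.split("?", 1)[0]    # the server ignores the cb parameter
            self._store[url] = self._origin(path)
        return self._store[url]


files = {"2022-annual-report.pdf": b"old report"}
cache = UrlKeyedCache(lambda path: files[path])

old_url = "2022-annual-report.pdf?cb=64227fd1"
new_url = "2022-annual-report.pdf?cb=0a1b2c3d"  # hypothetical value after the update

cache.get(old_url)                                # the old URL gets cached
files["2022-annual-report.pdf"] = b"new report"   # the file is replaced

print(cache.get(new_url))  # pages that use the new URL get the new file
print(cache.get(old_url))  # Ted's emailed link still returns the cached old file
```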
Additional Considerations
The best method for addressing static asset caching depends on your situation.
If you use the Media: Acquia DAM module, which updates files with the same file name, the Static Asset Cache Buster module may fit the bill just fine.
If you work for a highly regulated financial services company, immediate cache invalidation for PDFs may be a priority. The Cache-Control header approach may work better in this situation.
At present, the Static Asset Cache Buster module is a minimum viable product. It only explicitly supports Drupal core and isn’t configurable. In an ideal world, it would be configurable on a per-field formatter basis. If that’s something that interests you, good news! There’s already an issue for that: Make configurable at field formatter.