Introducing Static Asset Cache Buster

Everyone who has ever built a Drupal site has run into the issue where if you delete a file and upload a replacement with the same name, you wind up with my_file_0.txt instead of my_file.txt. As annoying as this can be, it does sidestep another issue: static asset caching.

Static Asset Caching

Static assets, such as images and PDFs, are served directly by the web server and not by Drupal. The .htaccess file that ships with Drupal instructs Apache to send headers telling browsers and other caches to keep responses for two weeks (unless they're being served by PHP).

# Requires mod_expires to be enabled.
<IfModule mod_expires.c>
  # Enable expirations.
  ExpiresActive On

  # Cache all files and redirects for 2 weeks after access (A).
  ExpiresDefault A1209600

  <FilesMatch \.php$>
    # Do not allow PHP scripts to be cached unless they explicitly send cache
    # headers themselves. Otherwise all scripts would have to overwrite the
    # headers set by mod_expires if they want another caching behavior. This may
    # fail if an error occurs early in the bootstrap process, and it may cause
    # problems if a non-Drupal PHP file is installed in a subdirectory.
    ExpiresActive Off
  </FilesMatch>
</IfModule>

What happens if I use a module like Media Entity File Replace to prevent the _0 issue? Static asset caching comes into play. When that asset is served by Apache, the Cache-Control header is sent with a max-age value of two weeks (1209600 seconds). When you replace the file, nothing changes from the visitor's perspective, because the external caches (browser, Varnish, CDN) will honor that max-age. First, the visitor's browser will serve the old, cached version. If the visitor clears the browser cache, any CDN in front of the site will still serve the old, cached version. You can clear the CDN cache, but you have no way to clear your visitors' browser caches.
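To make the two-week window concrete, here is a minimal Python sketch of the freshness check a cache performs before deciding whether it can serve a stored copy. The function name is_fresh and its parameters are hypothetical and chosen for illustration; real caches follow the fuller age calculation in the HTTP caching specification.

```python
# 14 days in seconds, matching "ExpiresDefault A1209600" in Drupal's .htaccess.
TWO_WEEKS = 14 * 24 * 60 * 60


def is_fresh(stored_at: int, now: int, max_age: int = TWO_WEEKS) -> bool:
    """Return True while a cached copy may be served without asking the origin.

    stored_at and now are Unix timestamps in seconds (an assumption of this
    sketch; real caches also account for the Age and Date headers).
    """
    return (now - stored_at) < max_age


# A file cached on day 0 and replaced on the server on day 1 is still
# considered fresh by the cache, so the visitor keeps seeing the old copy.
one_day_later = 24 * 60 * 60
print(is_fresh(stored_at=0, now=one_day_later))
```

Until the stored copy's age reaches max-age, the cache has no reason to contact the server at all, which is why replacing the file goes unnoticed.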

Solution A: Cache-Control Headers + External Cache Invalidation

One solution is to use Cache-Control headers to set max-age and s-maxage values.

For example, a developer could set max-age to 0, which makes browsers behave much as they would under no-cache, and s-maxage to 86400 (1 day). (Note: max-age behaves like no-cache once the TTL has expired.) Browsers would always check upstream before serving from cache. If the asset has changed, the server will respond with a 200 HTTP status code and the asset payload, which the browser will serve. If the asset has not changed, the server will respond with a 304 HTTP status code, and the browser will serve from cache. (Note: Your web server must be configured to send Last-Modified and/or ETag headers for this approach to work.)
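The 200-versus-304 exchange described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Apache implements it: the make_etag and respond helpers are hypothetical, the ETag is derived from a hash of the file contents (one common choice among several), and only If-None-Match is modelled, not If-Modified-Since.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the asset bytes (an assumption of this sketch)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a conditional GET."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        # Browsers revalidate every time; shared caches may keep it for a day.
        "Cache-Control": "max-age=0, s-maxage=86400",
    }
    if if_none_match == etag:
        # Unchanged: no payload is sent, the browser reuses its cached copy.
        return 304, headers, b""
    # First request, or the asset changed: send the full payload.
    return 200, headers, body


status, headers, _ = respond(b"old report", None)
print(status)                                            # first visit
print(respond(b"old report", headers["ETag"])[0])        # file unchanged
print(respond(b"new report", headers["ETag"])[0])        # file replaced
```

The browser still makes a request on every visit, but an unchanged asset costs only a small header exchange rather than a full download.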

But for this strategy to work, any non-browser external cache (Cloudflare, Akamai, Varnish, etc.) will need to be invalidated whenever a file is replaced. There are Drupal modules for each of these.

Solution B: Cache Busting

Another method for dealing with static asset caching is to use cache busting query strings. This is what Drupal core does with aggregated CSS and JS files.

Please allow me to introduce you to the Static Asset Cache Buster module.

This module seeks to address the issue by appending a query string to asset URLs, which acts as a cache buster. The value of the 'cb' query parameter is a truncated hash derived from the file entity's 'changed' timestamp.

For example, 2022-annual-report.pdf becomes 2022-annual-report.pdf?cb=64227fd1. When the file is updated, the value of the 'cb' query string parameter will change.
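The idea can be sketched in Python as follows. This is not the module's actual code: the function names, the choice of SHA-256, and the 8-character length are all assumptions made for illustration, and the timestamps are made up.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_buster(changed: int, length: int = 8) -> str:
    """Hash the 'changed' timestamp and keep the first few hex characters."""
    digest = hashlib.sha256(str(changed).encode("ascii")).hexdigest()
    return digest[:length]


def add_cache_buster(url: str, changed: int) -> str:
    """Append (or replace) the 'cb' query parameter on an asset URL."""
    parts = urlsplit(url)
    # Keep any existing query parameters except a previous 'cb' value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True) if k != "cb"]
    query.append(("cb", cache_buster(changed)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical timestamps: the second is the file's 'changed' time after an update.
print(add_cache_buster("/sites/default/files/2022-annual-report.pdf", 1680000000))
print(add_cache_buster("/sites/default/files/2022-annual-report.pdf", 1680000500))
```

Because caches key on the full URL including the query string, a new 'cb' value looks like a brand new asset to browsers and CDNs, so no invalidation is needed for pages that link to the updated URL.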

This approach isn’t 100% foolproof, however. By default, URLs with old cache busters will still be cached. If Ted in Accounting copies the URL 2022-annual-report.pdf?cb=64227fd1 and sends it out in an email, people who follow that link may still get the old version after the file is updated.

Additional Considerations

The best method for addressing static asset caching depends on your situation.

If you use the Media: Acquia DAM module, which updates files with the same file name, the Static Asset Cache Buster module may fit the bill just fine.

If you work for a highly regulated financial services company, immediate cache invalidation for PDFs may be a priority. The Cache-Control header approach may work better in this situation.

At present, the Static Asset Cache Buster module is a minimum viable product. It only explicitly supports Drupal core and isn’t configurable. In an ideal world, it would be configurable on a per-field formatter basis. If that’s something that interests you, good news! There’s already an issue for that: Make configurable at field formatter.
