Apache Performance Tuning: Use a far future expires header

Using a far future expires header

 

By using a far future expires header you can efficiently control how assets are cached on the client, which results in improved performance. Here’s how to set it up correctly with Apache, and some pointers on how to refresh users’ caches when you modify files.

YSlow

Most of you probably already know YSlow, the plugin created by Yahoo’s exceptional performance team. These guys have done some awesome work on performance, and the rules behind YSlow (summed up in Steve Souders’ book) are a gold mine for people who want to improve the performance of their web applications. If you don’t already have it, install it now!

In their research, the exceptional performance team found that two of the most effective steps you can take to improve performance are 1) reducing the number of HTTP requests, and 2) using a far future expires header, which is what we’re talking about today.

Benefits of a far future expires header

A far future expires header is used for content that changes infrequently. Usually, our applications’ static assets (i.e. graphics, stylesheets and scripts) are a good fit. Most of today’s sites use a JavaScript library, and together with their own code they easily have 100 KB of JavaScript. In addition to this, there are a few tens of kilobytes of stylesheets and probably at least 10 KB of (design) graphics.

These assets only change when we modify our site’s design and/or behaviour, so it should not be necessary for our users to download these files more than once per deployment. Unfortunately, on most sites the browser actually requests all these files on every page view. With an expires header far into the future, the browser will not download these files again before the given date. This means you can save a few hundred KB of assets on every page view!

Problems with far future expires headers

The problem with the far future expires header is exactly the same as the benefit: the browser will not download files again as long as the header says the cached copies are still fresh, even if you roll out new versions that contain important bug fixes and improvements.

The way to solve this problem is to have filenames change when their contents change. Unfortunately this means more work, so you should look for a way to automate it. I’ll get back to this towards the end of this post.

Apache and expires headers

There are two ways of getting there with Apache. mod_expires yields the most flexible solution, but mod_headers will work too if you cannot use mod_expires for some reason.

mod_expires

First, enable the mod_expires module. On systems using apt (Debian, Ubuntu and others) you do this as follows:

    # Check if module is already enabled
    $ ls /etc/apache2/mods-enabled | grep expires
    # If it wasn't
    $ sudo ln -s /etc/apache2/mods-available/expires.load /etc/apache2/mods-enabled/
    # Restart Apache
    $ sudo /etc/init.d/apache2 restart

Now you can use mod_expires. There are several ways to configure a future expires header (as per the documentation); I’ll use ExpiresDefault inside a FilesMatch directive:

    <FilesMatch "\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
        ExpiresActive On
        ExpiresDefault "access plus 1 year"
    </FilesMatch>

This configuration will cause all graphics, PDFs, CSS, JavaScript and Flash movies to be sent out with an expires header one year from the date they were requested. This means clients always receive a date in the future, and use their cached copies instead of requesting the files again until that date has passed.

If you’d like more fine-grained control, you can use the ExpiresByType directive, which sets the expiry per MIME type instead of per file name.

The FilesMatch directive matches file names against a regular expression and can be tweaked to your liking. Note that it only looks at the file name itself, not the directory it lives in. If you’d like these settings to apply only to your design assets (and not content images), you can match on the URL path with LocationMatch instead:

    <LocationMatch "/design/.*\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
        ExpiresActive On
        ExpiresDefault "access plus 1 year"
    </LocationMatch>

The FilesMatch directive can go in your VirtualHost configuration.

mod_headers

If you cannot use mod_expires for some reason, you can achieve a similar effect using mod_headers. This module allows you to set arbitrary headers. Using it, we can manually set an Expires header like this:

    <FilesMatch "\.(ico|pdf|flv|jpe?g|png|gif|js|css|swf)$">
        Header set Expires "Thu, 15 Apr 2010 20:00:00 GMT"
    </FilesMatch>

This needs occasional updating, since the date is fixed. Update: As Samuli points out in the comments, you should not set an expires header more than one year into the future (according to the HTTP 1.1 RFC). This means you’ll need to tend to this solution at least once a year, should you choose to do things this way.
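If you do go the mod_headers route, the yearly date bump can be scripted. Here’s a minimal Python sketch that formats an HTTP date at most one year ahead. The helper name expires_header_value and the 365-day cap are my own assumptions for illustration; nothing here is provided by Apache.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Assumption: "one year" is treated as 365 days, the upper bound
# recommended for the Expires header.
MAX_DAYS = 365


def expires_header_value(now: datetime, days: int = MAX_DAYS) -> str:
    """Return an HTTP date `days` days after `now`, capped at MAX_DAYS.

    Hypothetical helper; `now` must be timezone-aware.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    days = min(days, MAX_DAYS)
    future = now.astimezone(timezone.utc) + timedelta(days=days)
    # usegmt=True emits the "GMT" suffix used in HTTP dates
    return format_datetime(future, usegmt=True)


if __name__ == "__main__":
    value = expires_header_value(datetime.now(timezone.utc))
    # Prints a line you could paste into the FilesMatch block
    print(f'Header set Expires "{value}"')
```

Run from a deploy script or a cron job, this would give you a fresh Header line to drop into the configuration before reloading Apache.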

Busting caches

As I mentioned, you need a mechanism for updating users’ caches once you start using the far future expires header. When you roll out changes to your static assets, their filenames (or at least their URLs) need to change in order to be sure everyone gets the latest and greatest.

The simplest way to update file names is to attach a query parameter to your asset URLs. This is automated by a few server side frameworks, like Rails. The timestamp of the file’s last modification is appended to its URL, resulting in something like this: /stylesheets/mystyles.css?1234567890. When something is added to a URL to avoid client side caching we sometimes refer to it as a “cache buster”.
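If your framework doesn’t do this for you, the idea is easy to sketch. The Python snippet below is only an illustration of the technique; the function name soft_cache_buster and its two parameters are hypothetical, and it assumes you can map an asset URL to a file on disk.

```python
import os


def soft_cache_buster(url_path: str, file_path: str) -> str:
    """Append the file's modification time (whole seconds) as a query string.

    Hypothetical helper: `url_path` is the public URL of the asset and
    `file_path` is where the same asset lives on disk.
    """
    mtime = int(os.path.getmtime(file_path))
    return f"{url_path}?{mtime}"
```

The URL only changes when the file on disk does, so unchanged assets keep being served from the client’s cache.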

There is a problem with the above cache buster, though. In its default configuration, the popular caching proxy Squid used to not treat a known URL with changed query parameters as a new URL. Effectively, users behind a Squid proxy would not have their caches refreshed by the above cache busters.

“Hard” cache busters

I like to refer to our first shot at fixing the update problem as “soft” cache busters. I consider them soft since they require no extra configuration or work on your part, other than generating the URLs.

The hard cache buster approach requires more work, but is more robust in that it will always cause clients to regard the URLs as unique and new. Instead of appending the mtime timestamp to the URL, we include it right inside the URL, i.e. /stylesheets/mysite-cb1234567890.css.
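Generating such URLs is a small variation on the soft version. Here’s a hedged Python sketch; hard_cache_buster is a made-up name, and the mtime is passed in as a plain integer so the example stays independent of the file system.

```python
import posixpath


def hard_cache_buster(url_path: str, mtime: int) -> str:
    """Insert "-cb<mtime>" just before the file extension of a URL path.

    Hypothetical helper for illustration only.
    """
    # splitext splits at the last dot: "/a/b.min.js" -> ("/a/b.min", ".js")
    root, ext = posixpath.splitext(url_path)
    return f"{root}-cb{mtime}{ext}"
```

Calling hard_cache_buster("/stylesheets/mysite.css", 1234567890) gives the URL shown above.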

In order for this to work we need to either change our filenames on each deploy, or be a little clever with mod_rewrite. The following RewriteRule will cause the above URL, and all others like it (i.e. <path>-cb<mtime>.<suffix>), to be seamlessly rewritten to the original files (i.e. <path>.<suffix>):

RewriteRule (.*)-cb\d+\.(.*)$ $1.$2 [L]
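To convince yourself of what the rule does, you can try the same pattern outside Apache. The Python sketch below is only an approximation: it assumes Python’s re module treats this simple pattern the same way Apache’s regex engine does, and strip_cache_buster is a hypothetical name.

```python
import re

# Same pattern as in the RewriteRule above
CACHE_BUSTER = re.compile(r"(.*)-cb\d+\.(.*)$")


def strip_cache_buster(url_path: str) -> str:
    """Mimic the rewrite: drop the "-cb<digits>" part if present."""
    match = CACHE_BUSTER.search(url_path)
    if match is None:
        # Rule does not apply; the URL is left untouched
        return url_path
    # Equivalent of the "$1.$2" substitution
    return match.expand(r"\1.\2")


if __name__ == "__main__":
    for url in ("/stylesheets/mysite-cb1234567890.css", "/stylesheets/mysite.css"):
        print(url, "->", strip_cache_buster(url))
```

URLs without a cache buster fall through unchanged, which is why the rule is safe to leave enabled for all requests.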

So there you go, a “use as is” setup for improving performance by way of the Expires header:

    <VirtualHost *>
        # Usual config
        <FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
            ExpiresActive On
            ExpiresDefault "access plus 1 year"
        </FilesMatch>
        RewriteEngine On
        RewriteRule (.*)-cb\d+\.(.*)$ $1.$2 [L]
    </VirtualHost>

Thanks to Samuli for pointing out that the expires header should not exceed one year into the future.
