We’re always on the lookout for ways to speed up our web application, and one of my favorite tools for optimization is the YSlow Firefox extension. Based on rules that came out of research by Yahoo engineer Steve Souders (his book High Performance Web Sites is a must-read for anyone interested in front-end engineering), the tool hooks into Firebug and helps you diagnose issues that, once fixed, can shave seconds off your pages’ load times. While we were able to implement most of the suggestions fairly easily, Rule #3, which specifies adding a far future Expires header, required a bit of elbow grease that some of you might be interested in.
Rule #3 recommends that you set an Expires header on your static files (images, CSS and JavaScript) very far into the future (like 10 years) so that the browser’s cache is used to load those elements rather than making another HTTP request, which is costly when it comes to page load times. Implementing this is pretty easy. In your .htaccess file, you can use the following code:
#Far Future Expires Header
<FilesMatch "\.(gif|png|jpg|js|css|swf)$">
ExpiresActive On
ExpiresDefault "access plus 10 years"
</FilesMatch>
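For a rough sense of what that directive puts on the wire, here is a small PHP sketch that builds the same pair of headers by hand. mod_expires sends both an Expires date and a Cache-Control max-age value; the farFutureHeaders() helper is a hypothetical name used only for illustration, and the sketch assumes a year is counted as 365 days.

```php
<?php
// Hypothetical helper (not part of the original setup): builds the two
// response headers a far future expiry policy sends for a static file.
function farFutureHeaders(int $now, int $years = 10): array
{
    // Assumption: one year is counted as 365 days.
    $maxAge = $years * 365 * 24 * 60 * 60;

    return [
        'Expires'       => gmdate('D, d M Y H:i:s', $now + $maxAge) . ' GMT',
        'Cache-Control' => 'max-age=' . $maxAge,
    ];
}

// Print the headers as they would appear in a response sent right now.
foreach (farFutureHeaders(time()) as $name => $value) {
    echo $name, ': ', $value, "\n";
}
```

With the default of 10 years, the max-age works out to 10 × 365 × 86400 = 315360000 seconds.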
However, Steve makes a little note about using this technique:
Keep in mind, if you use a far future Expires header you have to change the component’s filename whenever the component changes. At Yahoo! we often make this step part of the build process: a version number is embedded in the component’s filename, for example, yahoo_2.0.6.js.
We, of course, didn’t have a build process that added the version number to our static files. We also weren’t interested in changing version numbers by hand or having tons of differently versioned files lying around in our SVN repository. And so, motivated by a goal (increasing our YSlow score) and sloth (not doing something manually), we figured out the following automated solution.
The first thing we did was set up some mod_rewrite rules to allow version numbers in our file names. In our .htaccess file, we added the following lines:
#Rules for Versioned Static Files
RewriteRule ^(scripts|css)/(.+)\.(.+)\.(js|css)$ $1/$2.$4 [L]
What this does is quietly rewrite any request for a file in our /scripts/ or /css/ folders that has a version number between the file name and the extension back to just the file name and extension. For example, I could now write the URL /css/structure.css as /css/structure.1234.css and Apache would serve the exact same file for both. We only do versioned files for our JavaScript and CSS, but you could easily adapt the rule for images as well, like so:
#Rules for Versioned Static Files
RewriteRule ^(scripts|css|images)/(.+)\.(.+)\.(js|css|jpg|gif|png)$ $1/$2.$4 [L]
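To check which URLs the first rule collapses, the same pattern can be exercised in PHP. This is only a sketch: stripVersion() is a hypothetical helper, and PHP’s regex engine is standing in for mod_rewrite here. The paths are written without a leading slash because that is how a rule in .htaccess sees them.

```php
<?php
// Hypothetical helper: applies the same pattern and substitution as the
// RewriteRule for scripts and css, so the mapping can be tried out.
function stripVersion(string $path): string
{
    $pattern = '#^(scripts|css)/(.+)\.(.+)\.(js|css)$#';

    // $1 = folder, $2 = base name, $3 = version (dropped), $4 = extension
    $result = preg_replace($pattern, '$1/$2.$4', $path);

    // preg_replace returns null on error; fall back to the input.
    return $result ?? $path;
}

echo stripVersion('css/structure.1234.css'), "\n";          // versioned CSS
echo stripVersion('scripts/prototype.1197993206.js'), "\n"; // versioned JS
echo stripVersion('css/structure.css'), "\n";               // no version segment
echo stripVersion('scripts/jquery.min.js'), "\n";           // two dots, no version
```

One thing this makes visible: the pattern can’t tell a version number from any other dotted segment, so an unversioned name with two dots such as scripts/jquery.min.js also matches and is mapped to scripts/jquery.js. If you have files named like that, tightening the version group to digits only (an adjustment to the rule as published) would avoid the collision.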
Once that was in place, we wrote a tiny PHP function that looks at the last modified date of the file and automatically rewrites the URL with that Unix timestamp as the version number. Here’s that PHP function:
<?php
// Echoes $url with the file's last-modified Unix timestamp inserted before the extension.
function autoVer($url){
    $path = pathinfo($url);
    $ver = '.'.filemtime($_SERVER['DOCUMENT_ROOT'].$url).'.';
    echo $path['dirname'].'/'.str_replace('.', $ver, $path['basename']);
}
?>
Then, in our PHP documents we would include the function and then call it like so in the HTML markup:
<?php include($_SERVER['DOCUMENT_ROOT'].'/path/to/autoVer.php'); ?>
<link rel="stylesheet" href="<?php autoVer('/css/structure.css'); ?>" type="text/css" />
<script type="text/javascript" src="<?php autoVer('/scripts/prototype.js'); ?>"></script>
When the pages load, the function looks up each file’s last-modified timestamp and inserts it like this:
<link rel="stylesheet" href="/css/structure.1194900443.css" type="text/css" />
<script type="text/javascript" src="/scripts/prototype.1197993206.js"></script>
It’s a great little system that required very little effort on our end and resulted in a noticeably faster browsing experience for clients who visit certain pages often, because their browsers take full advantage of their primed caches rather than calling our servers every time they load a page. The best part is that when we make a change to a CSS or JavaScript file, we don’t have to worry about tracking or managing version numbers or multiple files.
What a smart tip. Thanks very much!
or just load the js with:
/scripts/prototype.js?v=1234
Nice tip, especially the PHP code. But I would like to second Stan. Wouldn’t it be easier (as in: no need for .htaccess, which slows down the server a bit - yes I know, only a bit) if you just appended the version info in a query string? As long as the query string doesn’t change, the browser should be able to cache the files anyway.
We haven’t tested whether browsers would consider different query strings sufficient for proper caching, but I don’t see why using version numbers in a query string wouldn’t be a valid alternative approach. You could easily adjust the code so as to avoid the htaccess rules and even simplify the PHP function. I think we’re just neat freaks when it comes to code, and that is why we approached it the way we did.
UPDATE: As Anup pointed out in a comment below, URLs with query strings (i.e. script.js?v=1234) are NOT reliably cached by every browser, and so the method above should be used, because it inserts the version number into the filename.
This is a great tip. Thank you very much!
To Stan and Flo: Maybe you have to change the file name explicitly just to be sure the browser realizes that this is a new file. But I’m not sure right now…
Whooh! I posted „-1 years ago”. Beam me up, Scottie! :-)
Alright, just a temporary break of space-time continuum. Excuse me.
Wouldn’t the query string approach require you to change the URL in every file that it is used? If I am understanding it correctly, that is a bit of a drawback. With the approach Kevin took, you just edit a file as normal with your new code, and everything else takes care of itself.
— On second thought, you could use the same PHP to append the query string, and not need the htaccess part. My bad.
Kevin: Thank you very much! What a great way of getting a discussion going…and making us all think about page load optimization.
Stan: You have almost answered a question I had for some time now. How does this “prototype.js?v=1234” work? I’m sure I must have missed the memo, but I’ve noticed similar things in code from other developers. Would you be able to point me to a link that explains this a little bit more in-depth?
Query strings in the URL won’t guarantee caching. According to Cal Henderson, “According to the letter of the HTTP caching specification, user agents should never cache URLs with query strings. While Internet Explorer and Firefox ignore this, Opera and Safari don’t - to make sure all user agents can cache your resources, we need to keep query strings out of their URLs.” Source: http://www.thinkvitamin.com/features/webapps/serving-javascript-fast
Thanks. This is great!
Sorry, this is only mildly related, and I hate to come across as a blithering idiot, but does anyone know what this does?
<script src="/scripts/somefile.js?1634528" . . .
Any ideas? Links? Examples?
@Anup : Ah, thanks a lot for clearing that up. I must have sub-consciously remembered reading that article by Cal.
@WebGyver: I believe the querystring is used either for URL rewriting, or, when — in this instance — the js extension is mapped to be handled by something else (e.g. php) which outputs the right JavaScript, if needed.
I’ve worked with images and a query string to avoid caching and it works just fine. You should theoretically be able to do the same with scripts.
That doesn’t work with e.g. <?php autoVer('/scripts/scriptaculous.js?load=effects'); ?> because the file “/scripts/scriptaculous.js?load=effects” won’t exist. To fix it you could either move the query string outside the PHP chunk, or use parse_url() to parse the $url.
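The parse_url() route might look something like the sketch below. The function name autoVerUrl() and its $docRoot parameter are made up for illustration (the original autoVer() reads $_SERVER['DOCUMENT_ROOT'] and echoes its result); this version returns the URL instead so it is easier to test.

```php
<?php
// Sketch only: versions the path portion of a URL and re-attaches any
// query string afterwards. autoVerUrl() and $docRoot are hypothetical.
function autoVerUrl(string $url, string $docRoot): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['path'])) {
        return $url; // not something we can version
    }

    $path = $parts['path'];
    $file = $docRoot . $path;

    // Only version files that exist and have an extension in the file name.
    $slash = strrpos($path, '/');
    $dot   = strrpos($path, '.');
    if (!is_file($file) || $dot === false || ($slash !== false && $dot < $slash)) {
        return $url;
    }

    // Insert the timestamp before the last dot only.
    $versioned = substr($path, 0, $dot) . '.' . filemtime($file) . substr($path, $dot);
    $query     = isset($parts['query']) ? '?' . $parts['query'] : '';

    return $versioned . $query;
}

// Usage in a template (assuming the file exists under the document root):
// <script src="<?php echo autoVerUrl('/scripts/scriptaculous.js?load=effects', $_SERVER['DOCUMENT_ROOT']); ?>"></script>
```

Because the timestamp is inserted before the last dot only, file names with more than one period come out with a single version segment.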
Very cool. Good work guys.
Interesting idea.
Anup: Query strings won’t prevent caching (even in Safari and Opera) if freshness information is specified in the headers. See http://www.mnot.net/blog/2006/05/11/browser_caching
very good writeup. thanks a bunch
Fantastic article. I also love the way you display these comments!
Aren’t CSS, JS, and images cached by browsers anyway? Isn’t that the point of browser cache? Tell me I’m stupid, please.
@Andrew: I think that the point is to not cache these files, so that when you put up a new version, you’re sure that your browser fetches it instead of serving up the stale older version
Very nice. I have always done the file.css?49579384 thing, but I like the way yours looks better.
@Mark: you are wrong. The point is to cache these files UNTIL you put up a new version. And then have the software automatically invalidate the cache.
Rails automatically does this by appending the timestamp to the filepath when you use the helper functions, but I didn’t like the idea of checking a bunch of files for their timestamp on each request, so I coded my own site to use and then automatically substitute “?VERSION” by the file’s svn revision when I deploy to production. So in production it’s all static.
grumble html got stripped sigh …coded my own site to use <script src="/scripts/somefile.js?VERSION">
All the people who talk of the file.ext?93399393 fix, need to read all the comments. It’s been stated that this does not work as HTTP specifies that user agents should not cache resources with query strings (though some still do anyway). Kevin’s approach should work in all cases because it actually is a unique filename. Great tip Kevin!
As long as freshness (Expires/Cache-Control:max-age) is provided (as it is in this technique), URLs with query strings are indeed cacheable and this is specified more explicitly in the updated HTTP/1.1 drafts:
Another article states that:
From http://www.mnot.net/blog/2006/05/11/browser_caching
While the specifications do state that query strings can be cached, these are updated drafts—meaning they’re not necessarily followed by the browsers and so in practice, creating an optimization strategy on something that might work in the future (especially on browser feature development) seems odd.
Also, I’m not really sure why there’s such a big push to avoid the htaccess rules. We use them on Wufoo and the processing time is negligible on our servers.
WordPress uses the query string method for caching their js/css, at least in the admin pages (e.g. …/jquery.js?ver=1.1.4). Wonder if they have done any tests on the efficacy.
Thanks Kevin, I’m currently looking at implementing the solution, with a little tweaking : I’m thinking I can write a script that will search for JS and CSS files (for example) within the application folder tree, get the timestamp, and generate a configuration file for the templating engine with the “version number ” that came from the timestamp. The difference would be to request timestamps only at build time, and not during execution (though the fileaccess for timestamp may be minimal, I don’t know for sure).
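A build-time pass along those lines could be sketched in PHP roughly as follows. Everything here is an assumption for illustration: the buildVersionMap() name, the choice of extensions, and the idea of keying the map by URL path relative to the document root.

```php
<?php
// Sketch of a build-time step: walk a directory tree once, record the
// mtime of every .js and .css file, and return a map that a templating
// engine could load instead of calling filemtime() on each request.
// buildVersionMap() is a hypothetical name, not from the article.
function buildVersionMap(string $docRoot, array $extensions = ['js', 'css']): array
{
    $root = rtrim($docRoot, '/\\');
    $map  = [];

    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        if (!$file->isFile()) {
            continue;
        }
        if (!in_array(strtolower($file->getExtension()), $extensions, true)) {
            continue;
        }
        // Key by URL path relative to the document root, with forward slashes.
        $url = str_replace('\\', '/', substr($file->getPathname(), strlen($root)));
        $map[$url] = $file->getMTime();
    }

    ksort($map);
    return $map;
}

// At build time the map could be written to a file the templates include:
// file_put_contents('versions.php', '<?php return ' . var_export(buildVersionMap('/path/to/docroot'), true) . ';');
```

The tradeoff is the one discussed in this thread: no per-request file access, at the cost of having to remember to run the step before every push.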
Hey Fabrice, that’s an awesome way to do it. We use a three stage push method and svn, and we didn’t want to have someone responsible for pushing a “crawl” code before every push. As I mentioned, we do this on Wufoo and one of the things we do is limit it to a maximum of two JavaScript files and two CSS files on any HTML page. The CPU processing for filemtime is very minor, and the tradeoff in terms of not having to worry about the process at all and the apparent speed increase is definitely worth it for us.
Umm, there are a couple of performance problems in your solution.
.htaccess - the moment you turn on htaccess in apache, you’re slowing down your webserver. you’re basically forcing apache to do a stat on every directory from your resource up to your docroot just to figure out if a .htaccess file exists or not. stat is slow
include($_SERVER['DOCUMENT_ROOT'].'/path/to/autoVer.php'); even though DOCUMENT_ROOT might be constant, php doesn’t know that, and this makes your include filename variable. What that means is that apc can’t cache it well, and that causes further slowdowns at run time.
Both of these affect the response time of your server, so they won’t actually affect your YSlow grade at all, but they will slow you down.
Regarding always using the same filename for different versions… you could run into problems if you have multiple pages/apps that depend on different versions of the file. With YUI, for example, since the same URLs are used by many properties, and many products outside of Yahoo!, the URLs cannot be changed without breaking a lot of apps.
Regarding using the file’s mtime as a version, you’ll run into problems when you hit one of the following two scenarios: 1. You have more than one server for load balancing (get around this by using the svn version number, or the mtime of the file in svn) 2. The file changes more frequently than once a second (though in this case, you probably want it cached only if you’re getting a few hundred thousand requests a second).
Nice idea. I will have to check your solution. I’m impressed :)
Hi again. After implementing something similar on my site I noticed a big drop in bandwidth, which is great. However, the “Hits” reported is the same. And when I observe with Firebug, I still get >0 ms fetch time on the css ,js etc. which tells me the browser is still fetching, despite the files now having “Expires” around January 2018, and the “Cache-control max-age=315360000”. Did you find something similar ? Why isn’t the browser caching the files? hmm..
Fabrice, check the HTTP request and response headers (using Firebug or LiveHTTPHeaders). Is there an ETag in the response headers? Is the Last-Modified header getting sent? Are the Expires and Cache-Control headers really getting sent? Is it HTTP/1.1 or 1.0?
Just wanted to say this is a great article. Oh, and thanks for the YSlow tip, I didn’t know about that plugin.
Hey, nice job! You might also check with jake@ - he came up with a system (recently) to automatically version our far future YCS files at video.yahoo based on cvs versions.
Cheers and nice job.
great article, good info.
Everyone needs a hug.
Thanks this may certainly come in handy!
Will certainly be using some of this info!
Interesting…
Thanks, this is useful info
This comes in handy
Thank you Kevin.
Everyone needs a hug.
Thanks for the tip! Gonna use it from now.
Everyone needs a hug.
Man, you are the best! I will try it.
How come you do not filter out comments that only say “Everyone needs a hug.”
Can anyone think of a way of making this automatically apply to all images, and maybe other files without having to stick the PHP into their paths? I’d like to use this in WordPress, but I can’t really embed the PHP into each image’s path as the editor won’t allow it, etc.
What about versioning images inside css files? When I have something like this: background-image: url(../../image/admin/list_toolbar_bg.gif);
Great article, you are the best! I will try it.
I don’t know why everyone says it’s a great article. I think you just want to say it’s great because you can put your backlink here for free. So I am honest. Thanks for the good website and thanks for the PR juice.
He has nofollow on the links so you aren’t going to get any PR at all.
hi.your desing is perfect.
I am speechless !!
@ voetbalnieuws
Your juice? These are no follows… So you’d better do some real linkbuilding ;-)
Ask some other expert at http://www.hypotheekindex.nl and remember, just like at particletree; sharing information is a great way to achieve your final goals!
Thank you for this perfect information. I will try it in WordPress
I revised autoVer to squash a couple of bugs. First, if the filename had more than one period in it, the number would be inserted multiple times. Second, if the file were in the root directory, it wouldn’t be handled properly.
<?php
function autoVer($url){
    $name = explode('.', $url);
    $lastext = array_pop($name);
    array_push($name, filemtime($_SERVER['DOCUMENT_ROOT'].$url), $lastext);
    $fullname = implode('.', $name);
    echo $fullname;
}
?>
How do you deploy? If you use a tool like Ant, you may include a tag into HTML files, and replace it with version value when deploying app. For instance,
And your ant file may include the following code in deploy task: