Creating a Cache-aware HTTP/2 Server Push Mechanism
If you’ve been reading at all about HTTP/2, then you’ve likely heard about server push. If not, here’s the gist of it: Server push lets you preemptively send an asset when the client requests another. To use it, you need an HTTP/2-capable web server that turns preload hints into pushes (Apache’s mod_http2 and H2O both do), and then you just set a Link header for the asset you want to push like so:
Link: </css/styles.css>; rel=preload
If this rule is set as a response header for an HTML resource, say index.html, the server will not only transmit index.html, but also styles.css in reply. This saves the round trip the browser would otherwise spend requesting the stylesheet, so the document can render sooner, at least in this scenario where CSS is pushed. You can push whatever your little heart desires.
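As a rough sketch of how such a header might be sent from PHP, the snippet below builds the header line for a single asset. The helper name preloadHeader is hypothetical, and the optional as attribute (which tells the browser what kind of resource to expect) is an addition that the example above does not use.

```php
<?php
// Hypothetical helper: builds a preload Link header line for a single asset.
// The $as argument is optional and is not part of the article's example.
function preloadHeader(string $path, string $as = ""): string
{
    $value = "<" . $path . ">; rel=preload";

    if ($as !== "") {
        $value .= "; as=" . $as;
    }

    return "Link: " . $value;
}

// Headers must go out before any page output, so only call header()
// while that is still possible.
if (!headers_sent()) {
    header(preloadHeader("/css/styles.css", "style"));
}
```

Calling preloadHeader("/css/styles.css") without the second argument produces exactly the header shown above.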
One concern some developers have raised about server push is that it isn’t always cache-aware: depending on the browser and the server, an asset may be pushed even though the client already has it cached. Browsers do have the capability to reject pushes, and some servers have their own mitigation mechanisms. For example, Apache’s mod_http2 module has the H2PushDiarySize directive, which attempts to address this problem. H2O has a feature called “Cache-aware Server Push” that stores a fingerprint of the pushed assets in a cookie. This is great news, but only if you can actually use H2O, which may not be an option for you, depending on your application requirements.
If you’re using an HTTP/2 server that hasn’t solved this problem yet, don’t sweat it. You can easily solve this problem on your own with a little back end code.
A super basic cache-aware server push solution
Let’s say you have a website running on HTTP/2 and you’re pushing a couple of assets, like a CSS file and a JavaScript file. Let’s also say that this content rarely changes, and these assets have a long max-age in their Cache-Control header. If this describes your situation, then there’s this quick and dirty back end solution you can use:
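if (!isset($_COOKIE["h2pushes"])) {
    // No cookie yet: push both assets with a single Link header.
    $pushString = "Link: </css/styles.css>; rel=preload,";
    $pushString .= "</js/scripts.js>; rel=preload";
    header($pushString);
    // Expires 30 days (2,592,000 seconds) from now; the last argument limits the cookie to HTTPS.
    setcookie("h2pushes", "h2pushes", time() + 2592000, "/", ".myradwebsite.com", true);
}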
This PHP-centric example checks for the existence of a cookie named h2pushes. If the visitor is new, the cookie won’t exist and the check will predictably fail. When that happens, the appropriate Link header is built and sent with the response using the header function. After the header has been set, setcookie is used to create a cookie that will prevent redundant pushes should the user return. In this example, the cookie expires 30 days (2,592,000 seconds) after it is set. When the cookie expires (or is deleted), the process repeats.
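For readers on PHP 7.3 or later, setcookie also accepts an options array, which makes the 30-day arithmetic and each setting explicit. This is only a sketch: the helper name pushCookieOptions is hypothetical, and the httponly and samesite values are assumptions that are not part of the example above.

```php
<?php
// Hypothetical helper: options for the h2pushes cookie, valid for 30 days.
function pushCookieOptions(int $now): array
{
    $thirtyDays = 30 * 24 * 60 * 60; // 2,592,000 seconds

    return [
        "expires"  => $now + $thirtyDays, // absolute Unix timestamp
        "path"     => "/",
        "domain"   => ".myradwebsite.com",
        "secure"   => true,  // only sent over HTTPS
        "httponly" => true,  // assumption: no script needs to read this cookie
        "samesite" => "Lax", // assumption: not in the original example
    ];
}

// Cookies are sent as headers, so this must run before any page output.
if (!headers_sent()) {
    setcookie("h2pushes", "h2pushes", pushCookieOptions(time()));
}
```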
This isn’t strictly “cache-aware” in the sense that the server knows for sure if the asset is cached on the client side, but the logic holds. The cookie is only set if the user has visited the page. By the time it’s set, assets have been pushed, and caching policies set by the Cache-Control header are in effect. This works great. Great, that is, until you have to change an asset.
A more flexible cache-aware server push solution
What if you run a website that uses server push, but assets change frequently? You want to ensure that redundant pushes don’t occur, but you also want to push assets if they have changed, or maybe you want to push additional assets later on. This requires a bit more code than our previous solution:
function pushAssets() {
    // Map each asset to a short fingerprint of its current contents.
    $pushes = array(
        "/css/styles.css" => substr(md5_file("/var/www/css/styles.css"), 0, 8),
        "/js/scripts.js" => substr(md5_file("/var/www/js/scripts.js"), 0, 8)
    );

    if (!isset($_COOKIE["h2pushes"])) {
        // New visitor: push everything and record what was pushed.
        $pushString = buildPushString($pushes);
        header($pushString);
        setcookie("h2pushes", json_encode($pushes), time() + 2592000, "/", ".myradwebsite.com", true);
    } else {
        $serializedPushes = json_encode($pushes);

        if ($serializedPushes !== $_COOKIE["h2pushes"]) {
            $oldPushes = json_decode($_COOKIE["h2pushes"], true);

            // A malformed cookie does not decode to an array; treat it as "nothing pushed yet".
            if (!is_array($oldPushes)) {
                $oldPushes = array();
            }

            // Only assets that are new or whose fingerprint changed.
            $diff = array_diff_assoc($pushes, $oldPushes);

            // The diff is empty when an asset was merely removed from the list.
            if (!empty($diff)) {
                $pushString = buildPushString($diff);
                header($pushString);
            }

            setcookie("h2pushes", json_encode($pushes), time() + 2592000, "/", ".myradwebsite.com", true);
        }
    }
}

function buildPushString($pushes) {
    $links = array();

    foreach ($pushes as $asset => $version) {
        $links[] = "<" . $asset . ">; rel=preload";
    }

    // implode() places commas between entries only, so there is no trailing comma.
    return "Link: " . implode(",", $links);
}

// Push those assets!
pushAssets();
Okay, so maybe it’s more than just a bit of code, but it’s still grokkable. We start by defining a function named pushAssets that will drive the cache-aware push behavior. This function begins by defining an array of assets we want to push. Because we want to re-push assets if they change, we need to fingerprint them for comparison later on. For example, if you’re serving a file named styles.css, but you change it, you’ll version the asset with a query string (e.g., /css/styles.css?v=1) to ensure that the browser won’t serve a stale version of it. In this case, we’re using the md5_file function to create a checksum of the asset based on its contents. Because md5_file returns the checksum as 32 hexadecimal characters, we use substr to shorten it to 8. Whenever these assets change, the checksum will change, which means that assets will automatically be versioned.
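To make the fingerprinting step concrete, here is a minimal sketch of how the checksum could double as the query-string version. The helper names assetFingerprint and versionedUrl are hypothetical, and the file paths you pass in are whatever your own document root uses.

```php
<?php
// Hypothetical helper: first 8 hex characters of the file's MD5 checksum.
function assetFingerprint(string $filePath): string
{
    $checksum = md5_file($filePath);

    // md5_file() returns false when the file cannot be read.
    if ($checksum === false) {
        throw new RuntimeException("Cannot read " . $filePath);
    }

    return substr($checksum, 0, 8);
}

// Hypothetical helper: appends the fingerprint as a ?v= query string.
function versionedUrl(string $urlPath, string $filePath): string
{
    return $urlPath . "?v=" . assetFingerprint($filePath);
}
```

A call such as versionedUrl("/css/styles.css", "/var/www/css/styles.css") then yields a URL whose v value changes whenever the file’s contents do.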
Now for the main event: Like before, we’ll check for the presence of the h2pushes cookie. If it doesn’t exist, we’ll use the buildPushString helper function to build the Link header string from the assets we’ve specified in the $pushes array, and set the header with the header function. Then we’ll create the cookie, but this time we’ll create a storable representation of the $pushes array with the json_encode function, and store that value in the cookie. We could use serialize on this value instead, but calling unserialize on a cookie, which the visitor fully controls, is a potentially serious security risk, so we should stick with something safer like json_encode.
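Even with JSON, the cookie is still visitor-supplied input, so it may be worth validating what comes back. The sketch below is one possible approach, not part of the solution above: the helper name decodePushCookie is hypothetical, and it assumes fingerprints are always 8 lowercase hexadecimal characters, as produced by the substr and md5_file combination.

```php
<?php
// Hypothetical helper: turns the raw cookie value into a path => fingerprint
// array, discarding anything that does not look like data we wrote ourselves.
function decodePushCookie(string $raw): array
{
    $decoded = json_decode($raw, true);

    // Invalid JSON or a JSON scalar does not decode to an array.
    if (!is_array($decoded)) {
        return [];
    }

    $clean = [];
    foreach ($decoded as $asset => $version) {
        // Keep only string paths mapped to 8-character lowercase hex strings.
        if (is_string($asset) && is_string($version)
            && preg_match('/\A[0-9a-f]{8}\z/', $version) === 1) {
            $clean[$asset] = $version;
        }
    }

    return $clean;
}
```

Anything that fails validation is simply treated as never having been pushed, so the worst outcome of a tampered cookie is a redundant push.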
Now comes the interesting part: What to do with returning visitors. If it turns out that the visitor is returning and has an h2pushes cookie, we json_encode the $pushes array and compare the value of this JSON-encoded array to the one stored in the h2pushes cookie. If there’s no difference, we do nothing else and merrily go on our way. If there’s a discrepancy, though, we need to find out what has changed. To do this, we’ll use the json_decode function to convert the h2pushes cookie value back into an array, and use array_diff_assoc to find the differences between the $pushes array and the JSON-decoded $oldPushes array.
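A small standalone example may help show what array_diff_assoc returns here. The fingerprints and the extra.js asset below are made up purely for illustration.

```php
<?php
// Hypothetical fingerprints, chosen only to illustrate the comparison.
$oldPushes = [
    "/css/styles.css" => "aaaa1111",
    "/js/scripts.js"  => "bbbb2222",
];
$pushes = [
    "/css/styles.css" => "aaaa1111", // unchanged
    "/js/scripts.js"  => "cccc3333", // contents changed
    "/js/extra.js"    => "dddd4444", // newly added to the push list
];

// Entries of $pushes whose key/value pair does not appear in $oldPushes.
$diff = array_diff_assoc($pushes, $oldPushes);

// $diff now holds only /js/scripts.js and /js/extra.js; the unchanged
// stylesheet is left out, so it would not be pushed again.
```

Argument order matters: array_diff_assoc($oldPushes, $pushes) would instead return the stale /js/scripts.js entry from the cookie.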
With the differences returned from array_diff_assoc, we use the buildPushString helper function once again to build a string of the resources that need to be pushed. The headers are sent, and the cookie value is updated with the JSON-encoded contents of the $pushes array. Congratulations. You just learned how to create your own cache-aware server push mechanism!
Conclusion
With a bit of ingenuity, it’s not too difficult to push assets in a way that minimizes redundant pushes for repeat visitors. If you don’t have the luxury of being able to use a web server like H2O, this solution may work well enough for your purposes. It’s currently in use on my own website, and it seems to work pretty well. It’s very low maintenance, too. I can change assets on my site, and with the fingerprinting mechanism used, asset references update themselves, and pushes adapt to changes in assets without me having to do any extra work.
One thing to remember is that as browsers mature, they will likely become better at recognizing when they should reject pushes, and serve from the cache. If browsers fail at perfecting this behavior, HTTP/2 servers will likely implement some cache-aware pushing mechanism for the user much like H2O does. Until that day comes, however, this may be something for you to consider. While written in PHP, porting this code to another back-end language should be trivial.
Happy pushing!
Jeremy Wagner is the author of Web Performance in Action, an upcoming title from Manning Publications. Use coupon code csstripc to save 38% on it, or any other Manning book.
Check him out on Twitter: @malchata
Creating a Cache-aware HTTP/2 Server Push Mechanism is a post from CSS-Tricks