Just in Time On-Device Caching with ETag

In my previous article about on-device data caching with HTTP request header values, I alluded to a more complex, but more powerful way to cache data from your REST calls on the client. This, too, provides a free, zero-code caching solution, and again requires your REST API to provide specific HTTP response headers.

This technique involves using the HTTP response header named "ETag". While the format of the value isn't specified in the standard, it's generally thought of as a large-ish integer that represents a hash of the REST response payload (or something similar).

The Server Rules (again)

As with the Cache-Control header, the server is responsible for sending the ETag value to the client. The basic idea of the ETag is for the client to ask the API server "do you have newer data than I do?" In order to communicate the idea of "newer", the server sends a value that is unique to this particular payload, such as a hash (or something else); that value is the ETag header. In short, think of ETag as a "version number" for the rest of this article, because that is how it functions.

When using ETag values instead of Cache-Control, there's a slightly different negotiation between the client and the API server:

For the first request of the data, the client requests the data from the API.
The API server calculates the ETag (think: version!) value in some way and sends it as a header along with the response.
The client stores the response much as it does when a Cache-Control header is received: along with all of the header data from the server's response.
For any subsequent request of the same API data, the client checks the local cache of the previous HTTP request, and uses the last value it received for ETag as a request (outgoing) header named "If-None-Match".
If the API server is about to send the same response as before ("If-None-Match" value is the same as the ETag of the response), instead of actually responding with the payload, it does the following:

Responds with HTTP status code 304, meaning "Not Modified"
Does not send any payload body!

If the client sees a 304 response code, it serves the response from cache

Data is returned from the networking layers as if there had been a successful 2xx family HTTP response - there is no indication that this was retrieved from cache or of the 304 response status code
Even though the server returned a 304 status code, you will not be able to step through this response in your code - only the successful 200 response is returned

If the API server has new data rather than the same payload as before, it does the following:

Responds with HTTP status code 200
Sends a complete new payload body
Sends the new, different value for the ETag header, as if to say "the version you are receiving is now this new version"

Wow, this seems way more complicated than the good old, simple Cache-Control header! Luckily, I have some code that you can use for iOS and Android alike to handle this complex interaction:

  
  /**
   * Ok, maybe you forgot the first paragraph of this article already,
   * but as I said, it takes no code to do this in your app!
  */

Yep, in both iOS and in Android (when caching is enabled), the ETag interaction is completely taken care of by the out-of-the-box networking APIs. Instead, the complexity is actually on the API server side (I have some advice for backend devs later in the article). For now, just know that the front-end complexity when using ETag is just about zero.

Here is how a typical multiple-request scenario looks, using curl to retrieve an image from Wikipedia:

  
tom@toms-mac-mini% echo "Initial Request" && \
  curl -v https://en.wikipedia.org/static/images/icons/wikipedia.png 2>&1 \
  | grep -e "etag" -e "< HTTP/2" -e "length"
Initial Request
< HTTP/2 200 
< etag: "3484-617c82943b221"
< content-length: 13444
tom@toms-mac-mini% echo "Request with If-None-Match" && \
 curl -v https://en.wikipedia.org/static/images/icons/wikipedia.png \
  -H "If-None-Match: \"3484-617c82943b221\"" 2>&1 \
  | grep -e "etag" -e "< HTTP/2" -e "length"
Request with If-None-Match
< HTTP/2 304 
< etag: "3484-617c82943b221"

Note that no "Content-Length" header is returned for the response that returns an HTTP status code of 304 - no content was returned.
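To make the exchange concrete without touching the network, here is a minimal Python sketch that simulates both sides in memory. It is only an illustration of the protocol, not something you need to write in your app: FakeApiServer, CachingClient and Response are hypothetical names invented for this sketch, and the choice of a truncated SHA-256 as the ETag is an assumption.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    headers: dict
    body: bytes = b""


class FakeApiServer:
    """Hypothetical stand-in for a REST API that supports ETag / If-None-Match."""

    def __init__(self, payload: bytes):
        self.payload = payload

    def get(self, request_headers: dict) -> Response:
        # The server decides the "version"; here (an assumption) it is a
        # truncated hash of the payload, wrapped in double quotes.
        etag = '"' + hashlib.sha256(self.payload).hexdigest()[:16] + '"'
        if request_headers.get("If-None-Match") == etag:
            # The client already holds this version: 304 and no body.
            return Response(304, {"ETag": etag})
        return Response(200, {"ETag": etag}, self.payload)


class CachingClient:
    """Mimics what the platform networking layer does on your behalf."""

    def __init__(self, server: FakeApiServer):
        self.server = server
        self.cached = None  # last 200 response, headers included

    def fetch(self) -> bytes:
        headers = {}
        if self.cached is not None:
            # Send back the last ETag we received.
            headers["If-None-Match"] = self.cached.headers["ETag"]
        response = self.server.get(headers)
        if response.status == 304:
            # Serve from cache; the caller never sees the 304.
            return self.cached.body
        self.cached = response
        return response.body
```

Calling fetch() twice against an unchanged payload returns the same bytes both times, but the second call only costs a body-less 304 on the wire. Changing server.payload makes the next fetch() return the new bytes along with a new ETag.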

With Cache-Control, data is cached on the client in terms of time; with ETag, data is cached in terms of version. With ETag, there is no amount of time after which the client will receive new data if the server still has the same version available, whether that is days, weeks, months or years later!

So, Why Bother?

Yes, using ETag seems more complex, so why bother using it? If you remember from the previous article, I included a lot of advice about talking to your business users, who might be uncomfortable with having potentially stale data showing up in their app's screens. This was due to how the Cache-Control and Date headers worked: the client never contacts the server to see if new data is available before the expiration date of the response - even if updated data is available on the server. By contrast, when using ETag, the client always contacts the server to see if fresh data is available. This greatly shrinks the possibility of showing out of date cached data to the app users.

Why not always use ETag? While the front end devs have almost no work to do when using ETag, we have to remember your poor backend devs, who will have to do much more work to implement an ETag flow. They will also receive many, many more hits to the backend API server(s) so the client can ask "is it changed yet? is it changed yet? is it changed yet?" Here's a quick table that can help with deciding whether to use ETag or Cache-Control for your API response:

Using ETag	Using Cache-Control
Payload "freshness" based on payload itself.	Payload "freshness" based on time
Content is cached only as long as it is fresh	Data is cached for a fixed time
Data source must be version-able*	Nearly any data
Business tolerance for stale data is low	Business can tolerate stale data for some period of time
API server incurs a hit every time the client wants data (more hits)	API server incurs a hit only after fixed time is reached (fewer hits)
Serving even cached data is slower, since it requires a network request	If data is cached, nearly instant response
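To put rough numbers on the "more hits" versus "fewer hits" row, here is a small Python sketch that counts how often the server is contacted under each scheme. The function names are made up for this example, and the freshness rule is deliberately simplified: a response is treated as fresh until max_age seconds after it was fetched.

```python
def server_hits_with_max_age(request_times, max_age):
    """Count server hits when responses stay fresh for max_age seconds."""
    hits = 0
    fresh_until = None
    for t in request_times:
        if fresh_until is None or t >= fresh_until:
            # Nothing cached yet, or the cached copy has expired.
            hits += 1
            fresh_until = t + max_age
    return hits


def server_hits_with_etag(request_times):
    """With ETag, every request reaches the server (200 or 304)."""
    return len(request_times)


# Hypothetical schedule: one request every 10 seconds for 10 minutes.
times = list(range(0, 600, 10))
etag_hits = server_hits_with_etag(times)
max_age_hits = server_hits_with_max_age(times, 60)
```

For this made-up schedule, etag_hits is 60 and max_age_hits is 10. Most of those ETag hits would be cheap 304 responses, but they are still requests your backend has to answer.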

What Data is Version-able? (advice for backend devs)

The entire point of using ETag is to know when some piece of data has been changed and to be able to calculate some version number to represent that change. Out in the wild, it's very common to see resources that are managed in a Web Content Management System (CMS) being published using the ETag technique. This makes sense, since CMSs have a native concept of versioning when users make changes to pieces of content: when the content is published, there is a discrete publishing event and a version number is published as metadata of the piece of content. We can also see that a lot of images and packaged JS scripts are served using an ETag header, as well. Once again, these have discrete publishing (or packaging and publishing) events, so it is not difficult to tell when these have been updated and a hash can be calculated.

Versioned Entities?

By comparison, let's think about something like a Product catalog service that returns Product data for a single product, given its ID. Although you don't think about there being a "version number" associated with a Product, it may be possible to calculate one with little risk. For the data in a Product catalog, there may be a single system that controls and publishes Product data, but other times this data is assembled from different sources. If changes to your Product data are centralized, and the updates are controlled, then you may be able to know the "version" of the product data for a particular Product. Examples of "uncontrolled" data updates are things like batch database updates, upload tools that do not use a service interface or similar ad-hoc updates that are applied directly to the data store. If this sounds like your organization, then your "architecture" deserves those air quotes and could use some work.

This would be a good working definition of version-able entities for your backend:

There is a reliable, centralized update process that all data updates run through
The rate of change is relatively low, say a handful of changes per day

If the rate of change is any higher, you risk never sending any cached responses, but you've done all the work to implement the Etag scheme: a lose-lose proposition.

The data is composed of simple string, integer, date or boolean data types

These fields will be used to compute a version number/ Etag value!

The system of record publishes change events to this entity's data

Why this last criterion? If your service code must contact the true source of entity data each time your client app makes a request, then there is not much value in using ETag, since you would be required to calculate a version number with each request - another lose-lose. However, if the system of record publishes "data has changed" events, your backend could locally cache the latest version number for a Product ID, say, and compare against the incoming "If-None-Match" header value - without contacting the system of record (see the demo backend code for a very simplified example).

In your backend, you should only cache the identifier for the entity (like "Product"), and the Etag value you calculated when you last returned that entity from your API. There is no need to cache the entire entity's data, since you will not be returning the entity in your response, only an HTTP status code of 304. This means you can cache tons of entities, because the identifier and hash are fairly small, simple string or integer data points.
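As a rough illustration of that idea (and not the demo project's actual code), here is a minimal Python sketch of a backend that caches only an ID-to-ETag mapping and refreshes it when a change event arrives. VersionCache, on_entity_changed, handle_get and load_entity are all hypothetical names, and using a random UUID as the version is an assumption made to keep the sketch short.

```python
import uuid


class VersionCache:
    """Maps entity ID -> current ETag; never stores the entity itself."""

    def __init__(self):
        self._etags = {}

    def on_entity_changed(self, entity_id: str) -> None:
        # Called when the system of record publishes a change event.
        # Assumption: any fresh unique value works as the new version.
        self._etags[entity_id] = '"' + uuid.uuid4().hex + '"'

    def current_etag(self, entity_id: str):
        return self._etags.get(entity_id)


def handle_get(cache, entity_id, if_none_match, load_entity):
    """Return (status, headers, body) for a GET of a single entity."""
    etag = cache.current_etag(entity_id)
    if etag is not None and if_none_match == etag:
        # Client is up to date: answer without calling the system of record.
        return 304, {"ETag": etag}, None
    if etag is None:
        # First time this ID is requested: assign it a version.
        cache.on_entity_changed(entity_id)
        etag = cache.current_etag(entity_id)
    # Only now do we pay for loading the full entity.
    return 200, {"ETag": etag}, load_entity(entity_id)
```

The point of the design is that the 304 path touches nothing but a small dictionary lookup. The expensive load_entity call only happens when the client really needs a new payload.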

Calculating Etag Values

For backend devs looking to calculate an Etag value for an entity, there are a few techniques:

Concatenate all properties (or all "important" properties) and hash the resulting string

Pros: easy to perform
Cons: easy to screw up if your entity definition grows by a few properties every once in a while

(Java) use Lombok to generate a toString() value and hash the resulting string

Pros: easy

(JS) serialize to JSON and calculate a hash based on that value

Pros: easy
Cons: may require you to use an external library to calculate the hash, depending on your runtime

Use a UUID or timestamp. While it's customary to generate an Etag value based on hashing the payload, it's not required. If your entity's system of record is sending you events telling you that the entity has changed, it doesn't really matter what the hash is: generating a UUID definitely makes it so that each new update causes a new Etag value to be created - do we need more than that?

Pros: easy peasy
Cons: not a hash, so not in the spirit of what an Etag represents (does anyone care?)
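Here is how three of these techniques might look in Python, as a sketch rather than a recommendation. The function names and the sample product are invented for illustration, SHA-256 is just one possible hash, and wrapping the value in double quotes follows the format seen in the curl output earlier.

```python
import hashlib
import json
import uuid


def etag_from_fields(entity: dict, fields: list) -> str:
    # Concatenate the chosen properties and hash the resulting string.
    # Remember to extend `fields` whenever the entity gains a property.
    joined = "|".join(str(entity[name]) for name in fields)
    return '"' + hashlib.sha256(joined.encode("utf-8")).hexdigest() + '"'


def etag_from_json(entity: dict) -> str:
    # Serialize to JSON and hash that. sort_keys keeps the output stable
    # even if the properties were inserted in a different order.
    serialized = json.dumps(entity, sort_keys=True, separators=(",", ":"))
    return '"' + hashlib.sha256(serialized.encode("utf-8")).hexdigest() + '"'


def etag_from_uuid() -> str:
    # No hash at all: call this once per change event and store the result.
    return '"' + uuid.uuid4().hex + '"'


# Hypothetical product used only for this example.
product = {"id": 7, "name": "Widget", "price": 999}
fields_etag = etag_from_fields(product, ["id", "name", "price"])
json_etag = etag_from_json(product)
```

The first two functions return the same value for the same data every time, so they can be recomputed on demand. The UUID version changes on every call, which is why it only makes sense when you generate it once per change event and cache it.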

Once your backend API code contains some kind of method call to a hashing function, the very stupid and time-wasting security scan software your company uses will light up with a high priority "security-related" finding 🚨. This is because this very stupid software assumes you are generating a hash for some security related use case. So beware that you might have to update from a simple hash generation scheme to using SHA-512 or something equally unnecessary so your very stupid and time-wasting security scan software will be happy.

Demo

I have updated the sample projects for frontend and backend on GitHub. These are available in the branch named "etag". Note that the UI has changed to allow you to use the "Etag" version of the product API. Here are the links:

DiveInto Mobile
