How to choose the right cache expiry lifetime

Caching, and cache invalidation in particular, is famously one of the two hard things in computing. In its most basic form, caching starts with deciding how long a cached item should live. This is commonly known as the time-to-live, or TTL.

Setting a longer TTL improves computing efficiency: the longer an item stays cached, the more often the cached result can be reused and the less often the uncached computation must be re-executed.

However, when the underlying inputs to a cached item change, the item becomes out of sync (stale) and should be refreshed. Advanced cache invalidation systems, such as Drupal's cache tag invalidation feature, can invalidate out-of-date cache entries when the input data changes. In the absence of such systems, shorter TTLs increase the frequency with which cached items are recomputed and resynced with their source inputs.
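To make the idea of a TTL concrete, here is a minimal sketch of an in-memory cache with a fixed expiry time. The `TTLCache` class, its `get_or_compute` method and the injectable `clock` parameter are hypothetical names used for illustration only; they are not part of Drupal or any other real caching system.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-memory cache where every entry expires after a fixed TTL.

    Illustrative only: a real cache would also need size limits,
    eviction and thread safety.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], Any]) -> Any:
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None:
            expires_at, value = entry
            if now < expires_at:
                return value  # cache hit: the entry is still within its TTL
        # Cache miss or expired entry: recompute and store with a fresh expiry.
        value = compute()
        self._store[key] = (now + self.ttl_seconds, value)
        return value
```

With a TTL of 60 seconds, any call to `get_or_compute` within 60 seconds of the last recomputation returns the stored value, even if the source data has changed in the meantime. That is the staleness trade-off described above.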

So how long should you cache an item for?

There is no single TTL that suits all situations. Choosing the right TTL means striking a balance between performance, compute efficiency and cache freshness. Let's look at these in more detail:

Performance

Retrieving a cached item should always be faster than computing the item uncached. The more often you can use the cached item instead of the uncached item, the better your application will perform, as it completes tasks faster and can allocate computing resources to un-cacheable areas. This is often expressed as the hit rate: the percentage of requests whose output was retrieved from cache.

Compute efficiency

The way to make performance gains without increasing caching (or making other application optimizations) is to increase resources. Increasing CPU cores, for example, would help scale parallel tasks (such as web requests). The time a task takes is a rough indication of its CPU consumption, so summing the total time taken to perform that task over a period of time tells you how much of a CPU core that task would consume in that period. E.g. 60 requests in a day to a task that takes 10 seconds to complete would result in 10 minutes of task time over a 1 day period, which is roughly 0.69% utilisation of a single CPU core.
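The arithmetic in that example can be written out as a short calculation. The function name `core_utilisation` is a hypothetical helper for illustration, and the sketch makes the same rough assumption as the text: that the wall-clock time of a task approximates its CPU time on one core.

```python
def core_utilisation(requests: int, task_seconds: float,
                     period_seconds: float) -> float:
    """Approximate share of a single CPU core consumed by a task.

    Assumes task wall-clock time is a rough stand-in for CPU time.
    """
    busy_seconds = requests * task_seconds
    return busy_seconds / period_seconds


ONE_DAY = 24 * 60 * 60  # 86,400 seconds

# 60 requests per day, each taking 10 seconds = 600 seconds (10 minutes).
utilisation = core_utilisation(requests=60, task_seconds=10,
                               period_seconds=ONE_DAY)
print(f"{utilisation:.2%}")  # prints 0.69%
```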

Cache freshness

How long your cache can be considered "fresh" is subjective, but ultimately business requirements will dictate cache lifetime tolerances. For example, a brochureware website that is not updated often could tolerate cache lifetimes of a day or possibly even more, while a news and media website may only tolerate 1-5 minutes of cache before it must be revalidated or computed again. Cache freshness requirements may dictate whether you gain performance by increasing the hit rate or by adding resources.

Which TTL is best for you?

Below is a graph tool that shows the effectiveness of different cache expiry TTLs for a single URL over a given period of time. These models assume a linear and uniform traffic pattern with the URL being requested at a set frequency (Request Interval).

In real-world scenarios, URL request frequency is subject to user demand patterns, such as popularity and time of day, which cause the cache hit rate to oscillate. However, over longer time periods (Request Period) these oscillations average out, so a consistent request interval can be representative of more realistic traffic patterns.

You can play with the inputs below to see how varying lengths of TTL would impact your hit rate, performance and resource consumption.
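The same uniform-traffic model can be approximated offline with a short simulation. This is a sketch under stated assumptions, not the implementation behind the graph tool: the function name `simulate_hit_rate` is hypothetical, requests arrive exactly every `request_interval` seconds starting at time zero, the cache starts empty, and an item counts as expired at the moment its TTL elapses.

```python
def simulate_hit_rate(ttl: float, request_interval: float,
                      request_period: float) -> float:
    """Fraction of evenly spaced requests for one URL served from cache."""
    n_requests = int(request_period // request_interval)
    hits = 0
    expires_at = None  # nothing is cached before the first request

    for i in range(n_requests):
        t = i * request_interval  # arrival time of the i-th request
        if expires_at is not None and t < expires_at:
            hits += 1  # served from cache
        else:
            expires_at = t + ttl  # miss: recompute and cache for one TTL

    return hits / n_requests if n_requests else 0.0


ONE_DAY = 24 * 60 * 60

# One request per minute over a day, compared across three TTLs.
for ttl in (60, 300, 3600):
    rate = simulate_hit_rate(ttl, request_interval=60, request_period=ONE_DAY)
    print(f"TTL {ttl:>5}s -> hit rate {rate:.2%}")
```

Note that under these assumptions a TTL equal to or shorter than the request interval never produces a hit, because every item has already expired by the time the next request arrives.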

Trade-offs with content freshness

The longer content is cached, the greater the probability that the content source has changed, making the cached content out of date. Lower TTLs therefore increase the content update frequency and improve the accuracy of the cached content. This is called content freshness.

The optimal TTL should balance performance with content freshness. As this graph shows, performance improvements from increasing TTL eventually diminish. For example, if the performance improvements between a TTL of a day and a month are marginal, then it would be better to have a shorter TTL to improve content freshness.
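The diminishing returns can be illustrated with a closed-form version of the uniform-traffic model. As a sketch, assume requests arrive every `request_interval` seconds and an item expires exactly when its TTL elapses: each cache lifetime then serves `ceil(ttl / request_interval)` requests, of which one is a miss. The function name `steady_state_hit_rate` and the one-request-per-minute input are assumptions chosen for illustration.

```python
import math


def steady_state_hit_rate(ttl: float, request_interval: float) -> float:
    """Long-run hit rate for evenly spaced requests to a single URL."""
    requests_per_cycle = math.ceil(ttl / request_interval)
    if requests_per_cycle <= 1:
        return 0.0  # the item expires before the next request arrives
    return 1 - 1 / requests_per_cycle  # one miss per cache lifetime


ONE_DAY = 24 * 60 * 60
ttls = {
    "5 minutes": 300,
    "1 hour": 3600,
    "1 day": ONE_DAY,
    "30 days": 30 * ONE_DAY,
}

previous = None
for label, ttl in ttls.items():
    rate = steady_state_hit_rate(ttl, request_interval=60)
    gain = "" if previous is None else f" (gain {rate - previous:+.4%})"
    print(f"{label:>9}: {rate:.4%}{gain}")
    previous = rate
```

With these assumed inputs, moving from a TTL of a day to 30 days adds less than a tenth of a percentage point to the hit rate, while making content up to 30 times staler in the worst case.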

Conversely, business requirements may demand more accurate content and force a degradation of performance. This graph can be used to quantify that cost, so that the business is informed of the increased hosting cost of the content freshness level it requires.

Taking compute efficiency out of the equation

On Acquia Cloud, resources scale on-demand as they're needed. Our customers pay for page views and visits rather than allocated compute power. This means your caching strategy is not about resource utilisation but about the content freshness and performance you want to obtain from it. This way, our customers can focus on metrics that are oriented towards user experience rather than IT management or consumption.