Introduction to Caching

Vishnu G
10 min readOct 5, 2022

Caching, Cache Pattern, Cache Invalidation & Eviction

What is Cache?

Cache is a hardware or software component which helps in serving the data which is either frequently requested or is expensive to compute on, so caching allows you to efficiently reuse previously retrieved or computed data and helps in saving the expensive operations.

Cache is based on the principle of locality. Two Different Types of Locality:

  • Temporal Locality (Locality in Time): If an item is referenced, it will tend to be referenced again soon.
  • Spatial Locality (Locality in Space): If an item is referenced, items whose addresses are close by tend to be referenced soon.

Usually cache is designed with limited amount of space but is typically faster than the original data source and contains the most recently accessed items.

Caches can be applied and leveraged throughout various layers of technology including Operating Systems, Networking layers including Content Delivery Networks (CDN) and DNS, web applications, and Databases. You can use caching to significantly reduce latency and improve IOPS for many read-heavy application workloads, such as search engines, social networking, gaming, media sharing, etc.

Working of Cache

Before we dive deep into the cache pattern and policies lets make a general understanding of how caching works.

Working of Cache

The above diagram illustrate how cache works in general. When a request comes to a system, there can be two scenarios. If a copy of the data exists in cache it’s called a cache hit, and when the data is absent and has to be fetched from the primary data store it’s called a cache miss. The performance of a cache is measured by the number of cache hits out of the total number of requests.

Cache Invalidation & Eviction

Cache eviction and cache invalidation are two key properties to be considered while designing a caching system or choosing a caching system for your application.

There are only two hard things in Computer Science: cache invalidation and naming things.

— Phil Karlton

Cache Invalidation

The data that is stored in cache is not intended to be kept there for ever, it is volatile, that means in order to maintain the relevance of cached data, based on time, user or other variables, the data will have to be invalidated or removed from the cache.

Ok! now let us try to understand what is cache invalidation and why it is needed. Suppose you have a client which shows the daily active users. Client will send a request to fetch daily active users from the application, the backend service will fetch or compute daily active users from the database as N at time t. This data will be send back to the client and updated in the cache as well. Now, suppose after some time if the request comes again the cached result will be used instead of re-computing. This is perfectly fine and this approach will increase the overall performance. But what if the same request come to the application after a long time say, by next day. Here if you use the cached data, the overall data integrity will be lost so at some point of time you have to replace or remove the cached data. This process is termed as cache invalidation.

Cache expiration can get really complex. Unfortunately, there is no silver bullet for this problem. But there are a few simple strategies that you can use:

  • Expiration time, always apply a time to live (TTL) to all of your cache keys, except those you are updating by write-through caching. Here the application must give a optimum TTL, after this time, the data should be removed from the cache causing a “cache miss” in a subsequent request. You can even use a long time, say hours or even days; however this may result in data inconsistency or bug since application relay on old data unless it get removed/replaced.
  • Freshness caching verification, where the application executes a lightweight procedure to determine if the data is still valid every time the data is retrieved. The downside of this alternative is that it produces some execution overhead.
  • Active application invalidation, where the application actively invalidates the data in the cache, normally when some state change is identified.

Cache Eviction

Suppose even after setting an expiry there are a lot of keys that are available in your cache and as you know every cache solution has a limit to the number of keys it can hold. In such a scenario, what happen when a new key has to be added? Here any of the existing keys need to be evicted to make room for the new key. This is called cache eviction. That is Eviction refers to the process by which old, relatively unused, or excessively voluminous data can be dropped from the cache, allowing the cache to remain within a memory budget.

A cache eviction algorithm is a way of deciding which element to evict when the cache is full. Some of the popular cache eviction algorithms are:

  • Random Replacement (RR): Randomly selects a candidate item and discards it to make space when necessary.
  • First In First Out (FIFO): This caching system evicts the first entry in the cache regardless of how many times it was called.
  • Least Recently Used (LRU): Evicts the least recently used items first.
  • Most Recently Used (MRU): Evicts the most recently used items first.
  • Least Frequently Used (LFU): if the cache size has reached the maximum allocated capacity, the least recently accessed objects in the cache will be evicted. For this cache service will counts how often an entry was read from cache. Those that are used least often are discarded first.

Cache Patterns

Depending on the type of application, the type of data, and the expectation of failure, there are several caching strategies or patterns; choosing the right one can make a big difference. In other words, the caching strategy depends on how the data is written and read. For example:

  • is the system write heavy and reads less frequently? (e.g. time based logs)
  • is data written once and read multiple times? (e.g. User profile or settings)
  • is data returned always unique? (e.g. search queries)

A caching strategy for leaderboard system for mobile games will be very different than a service which aggregates and returns user profiles. Choosing the right caching strategy is the key to improving performance.

The most common cache patterns are as follows:


A cache-aside or lazy loading cache is the most commonly used caching strategy. With this approach, cache sits on the side and your code handles the responsibility of orchestrating the flow between the cache and the source of truth.


Here’s how it works:

  1. When your application needs to read data from the database, it checks the cache first to determine whether the data is available.
  2. If the data is found in cache, we’ve cache hit. The data is read and returned to the client.
  3. If the data is not found in cache, we’ve cache miss. The application has to do some extra work. It queries the database to read the data, returns it to the client and stores the data in cache so the subsequent reads for the same data results in a cache hit.

Cache-aside caches are usually general purpose and work best for read-heavy workloads. Systems using cache-aside are resilient to cache failures. If the cache cluster goes down, the system can still operate by going directly to the database. (Although, it doesn’t help much if cache goes down during peak load. Response times can become terrible and in worst case, the database can stop working.)

When cache-aside is used, the most common write strategy is to write data to the database directly. When this happens, cache may become inconsistent with the database. To deal with this, developers generally use time to live (TTL) and continue serving stale data until TTL expires. If data freshness must be guaranteed, developers either invalidate the cache entry or use an appropriate write strategy.

A disadvantage when using cache-aside as the only caching pattern is, because the data is loaded into the cache only after a cache miss, some overhead is added to the initial response time because additional roundtrips to the cache and database are needed.

Read-Through Cache

Compared to Cache-Aside, Read-Through moves the responsibility of getting the value from the datastore to the cache provider.

Read-Through Cache

While read-through and cache-aside are very similar, they have two key differences:

  • In cache-aside, the application is responsible for fetching data from the database and populating the cache. In read-through, this logic is usually supported by the library or stand-alone cache provider.
  • Unlike cache-aside, the data model in read-through cache cannot be different than that of the database.
Sequence diagram to read value from DB

Read-Through implements the Separation of Concerns principle. Now, the code interacts with the cache only. It’s up to the cache to manage the synchronization between itself and the datastore.

Read-through caches work best for read-heavy workloads i.e. when the same data is requested many times. For example, news feeds. The disadvantage is that when the data is requested the first time, it always results in cache miss and incurs the extra penalty of loading data to the cache. Developers deal with this by ‘warming’ or ‘pre-heating’ the cache loading data which is very likely to be requested. Just like cache-aside, it is also possible for data to become inconsistent between cache and the database, and solution lies in the write strategy. i.e. a proper write strategy combined with Read-through can be useful.


A write-through cache reverses the order of how the cache is populated compared to lazy loading. Under this scheme, data is written into the cache and the corresponding database at the same time. The cache ensures fast retrieval, and since data is written simultaneously in the cache and primary data store, there is complete consistency.

Write-Through Cache

In this case, when the application updates a piece of data in the cache (that is, calls put(...) to change a cache entry,) the operation will not complete (that is, the put will not return) until CacheStore successfully stored the data to the underlying data source. This does not improve write performance at all, since you are still dealing with the latency of the write to the data source. Therefor this is mostly used when there are no frequent writes to the cache.

The write-through approach has a couple of advantages:

  • Since the cache is up-to-date with the primary database, there is a much greater likelihood that the data will be found in the cache. This in turn, results in better overall application performance and user experience.
  • The performance of your database is optimal because fewer database reads are performed.

A disadvantage of the write-through approach is that infrequently-requested data is also written to the cache, resulting in a larger and more expensive cache operations.

On its own, write-through caches don’t seem to do much, in fact, they introduce extra write latency because data is written to the cache first and then to the main database. Read-Through/Write-Through are used where the application treats cache as the main data store and reads data from it and writes data to it.

Write-Back Cache

Write Back cache is some time called as write-behind or Write Deferred as well. In Write-Back Cache data is updated only in the cache and updated into the memory at a later time. Also in some case data is updated in the memory only when the cache line is ready to be replaced (cache line replacement is done using any of the above mentioned cache eviction algorithms like Least Recently Used Algorithm, FIFO, LIFO, and others depending on the application).

Write-Back Cache Strategy

Up to this point, all messages exchanged between actors were synchronous: the caller needs to wait until the service has finished processing and returned before continuing its flow. With Write-Behind, the cache sets the value to the datastore and doesn’t wait for confirmation.

Write back caches improve the write performance and are good for write-heavy workloads. When combined with read-through, it works good for mixed workloads, where the most recently updated and accessed data is always available in cache. It’s resilient to database failures and can tolerate some database downtime.

Write-Around Cache

Here, data is written directly to the database and only the data that is read makes its way into the cache.

Write-around can be combine with read-through and provides good performance in situations where data is written once and read less frequently or never. For example, real-time logs or chatroom messages. Likewise, this pattern can be combined with cache-aside as well.

Refresh-Ahead Cache

After cache invalidation or when the cache is still empty, you need to fetch the item from the datastore using one of the patterns like Cache-Aside or Read-Through. You can configure the cache to automatically refresh any recently accessed cache entry prior to its expiration. The idea is to make data available before request, that’s exactly what Refresh-Ahead is.

One disadvantage for this is poor accuracy in prediction of items which are likely to be needed in near future can result in reduced performance than with out refresh-ahead.


In summary choose the right cache pattern or a combination by carefully understanding the data access pattern and overall system design of the underlying application. On the other hand choosing a wrong caching strategy, in the worst case, may even introduce additional latency to the application.

For example, if you choose write-through/read-through when you actually should be using write-around/read-through (written data is accessed less frequently), you’ll have useless junk in your cache. This will result in more costly operation and may not get the expected outcome.