Caching: System Design Concepts for Beginners



Almost everyone loves watching the IPL and tries to tune in to every single match. Matches that could only be watched on TV previously can now be seen from any place with the help of the internet. Now suppose there’s a nail-biting moment in the match…


And the video starts buffering!

You wouldn’t be too happy about it, would you?

If the problem persists, chances are you may even cancel your subscription. Solving this problem requires understanding a system design concept called caching (pronounced “cashing”).

Let us learn what caching concepts are in this article.

What is Caching?

To understand what caching is, let us consider the simple example of eating food. 

All of us have at least three meals a day. 

Now, suppose someone fasts for a day. Will they double their food intake on the day before or after the fast because they won’t be eating or didn’t eat for a day? 

Assume the contrary situation as well. If we have a hearty meal one day, will we skip all our meals the next day?


The answer to those questions is no because our body burns the locally available calories to give us energy. In other words, we don’t eat when we want to work, but we eat at regular times every single day. 

A similar situation exists in computer systems, where accessing the primary memory (like RAM) is faster than accessing the secondary memory (such as hard drives and USB memory sticks). 

Just as our daily food gives us the necessary energy quickly, caching in computers acts as local or temporary storage that provides easier and faster access to data than a database. This isn’t everything caching is about, though. 

YouTube is a site frequently visited by almost all of us. Have you noticed that when you start typing “y” in the address bar to open YouTube, the entire link shows up on its own? 


This is also because of caching. 

Thus, to formally define caching,

Caching is the process of storing data in a short-term memory (called a cache) so that recently or frequently accessed data, such as recently viewed sites and sites with high traffic, can be retrieved faster.

Now that we know what caching is, let us see where it is used.

Where is Caching used?

As the heading suggests, caching is used only in specific places. 

But, previously, we learnt that caching is used for faster access. Then why don’t we store all our data in the cache memory?

The reason is that the hardware used for cache memory is much more expensive than database storage. Also, if a lot of information is stored in the cache, finding a particular item in all that data becomes time-consuming, which defeats the purpose of using a cache. 

Thus we know that caching cannot be used to store all our data. So, let us now see the specific uses of the cache memory.


The main component in any device which processes information is the Central Processing Unit (CPU), and it has a cache memory of its own. It stores frequently used data for easy access. 

You can view and clear the cache on your system too. Search “disk cleanup” and open the app. You should see a screen like this:

Here the files containing cache are:

  • DirectX Shader Cache
  • Temporary files
  • Thumbnails

Web browsers

Earlier, we discussed an example where our browser suggests the link to a frequently visited web page by typing just the first letter. That is an example of cache memory in web browsers.

A browser cache also stores the files (like HTML files, CSS style sheets, JavaScript, cookies, and images) needed to display the websites we visit. 

For example, when we open our CodingNinjas profile, the images and script files needed to display the page correctly are stored by the browser, along with the cookies that keep us logged in. If we clear this stored data, we will be asked to log in again. 

This is practically shown in this video.


Applications have a cache memory similar to web browsers. They save files and data like images, video thumbnails, search history, and other user preferences to quickly load the information.

Now that we know where caching is used, let us understand how it works.

Working of Caching

As we learned in the definition of caching, its primary function is to reduce the access time by locally storing data. But how does that work?


The data associated with any website or application is stored in its database. So, when we open them, the data is fetched from the database using network calls and I/O operations. These are time-consuming and slow down the process, so to speed things up, we must avoid them where possible. This is where caching is used.

When data is requested from a database, a copy of the result is saved in the cache before it is returned to the user. When the same request is made again, the result is fetched from the cache instead of the database, which speeds up the process. 

For example, when you first opened this page, all the data was requested from the database. Now suppose you go back to the home page and open this blog again. It should open faster than before because the images, files and necessary data are already stored in the cache memory and fetched from there.
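
The flow described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a plain dictionary as the cache and a made-up `fetch_from_database` function standing in for the slow database call; none of these names come from a real library.

```python
import time

# Hypothetical stand-in for a slow database; the key and content are illustrative.
DATABASE = {"/blog/caching": "<html>Caching article</html>"}

cache = {}  # in-memory cache: request key -> stored response


def fetch_from_database(key):
    """Simulate a slow network call plus disk I/O."""
    time.sleep(0.01)
    return DATABASE[key]


def get_page(key):
    """Return the page and where it came from, checking the cache first."""
    if key in cache:
        return cache[key], "cache"
    value = fetch_from_database(key)
    cache[key] = value  # save a copy before returning the result
    return value, "database"


first = get_page("/blog/caching")   # nothing cached yet
second = get_page("/blog/caching")  # served from the cache
print(first[1], second[1])  # database cache
```

The second call skips the simulated database delay entirely, which is the speed-up we notice when we reopen a page.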

Now that we have a basic knowledge of caching, let us learn about its different types. 

Types of Caching

There are four types of caching. Let us see what they are.

Application Server Caching

We already know about application server caching from the example we just discussed. Let’s see it once again, though, for a better understanding. 

In the example, caching can be added by attaching a cache, such as Redis, to the application server. This will act as our cache memory. Now we’ll learn some related terms.

Cache Miss 

When we open the web page for the first time, the relevant files and data are not yet in the cache, so a request is made to the database for them. This is called a cache miss. 

Cache Hit

When we opened the web page again, all the files and data were already stored in the cache memory and retrieved faster. This is called a cache hit. 
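
To see hits and misses being counted, here is a small sketch using Python's built-in `functools.lru_cache` decorator, which records both numbers. The `load_page` function is a hypothetical placeholder for a real database query.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def load_page(url):
    # Hypothetical placeholder for a real database query.
    return f"contents of {url}"


load_page("/profile")  # first request: cache miss
load_page("/profile")  # repeated request: cache hit
load_page("/profile")  # another cache hit

info = load_page.cache_info()
print(info.hits, info.misses)  # 2 1
```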

In an application server cache, we are considering one server, i.e., the application server. Now suppose we require more than one server for our application. Then each server will have its own cache, and these caches are not linked to each other. So, if the files and data of a particular web page are already stored in the cache of one server, the other servers will not know about it and will fetch the files from the database again. This defeats the purpose of using cache memory. 

There are two other kinds of cache to avoid this problem, about which we’ll learn next.

Distributed Caching

For the problem associated with an application server cache, what is the simplest solution that comes to your mind?

If you guessed combining all the separate cache memories into one, you are correct. 

With this, we come to the concept of distributed cache system design.

A distributed cache links together all the separate cache memories associated with the different servers into one. This is then connected with the application and the database as any cache memory would be. 
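
The difference between separate caches and one shared cache can be sketched as follows. This is a simplified illustration: the `AppServer` class is hypothetical, and a plain dictionary stands in for the shared cache, which in a real system would be a separate service reached over the network.

```python
class AppServer:
    """A hypothetical application server that checks a cache before the database."""

    def __init__(self, cache, database):
        self.cache = cache
        self.database = database
        self.database_reads = 0  # how often this server had to go to the database

    def get(self, key):
        if key not in self.cache:
            self.database_reads += 1
            self.cache[key] = self.database[key]
        return self.cache[key]


database = {"page": "<html>...</html>"}

# Separate caches: each server has to go to the database on its own.
a = AppServer({}, database)
b = AppServer({}, database)
a.get("page")
b.get("page")
separate_reads = a.database_reads + b.database_reads

# One shared cache: the second server reuses what the first one stored.
shared = {}
c = AppServer(shared, database)
d = AppServer(shared, database)
c.get("page")
d.get("page")
shared_reads = c.database_reads + d.database_reads

print(separate_reads, shared_reads)  # 2 1
```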

Global Caching

To understand this, let us consider we need a glass of water. We have a tank in our house where water is stored, and we have a nearby river or sea. If I take a glass of water from the tank, it will affect the amount of water I have stored. But if I take a glass of water from the nearby water body, that won’t affect its volume too much because of its vastness. 

Global caching is similar to that. It consists of massive servers provided by organisations like Amazon, at a cost fixed by them. Because of the scale of these servers, a global cache can serve data faster than a database. It also reduces the risk of data loss that comes with keeping data only in the local storage on a device. 

HTTP Caching

Some web applications contain rich media such as images, audio, and videos. In large networks, all this content has to pass through many servers before it reaches the user who requested it. This takes a lot of time, and this is where HTTP caching comes into play. 

In HTTP caching, the web content, including media, is stored on a CDN (content distribution network or content delivery network) server. CDNs can very easily manage such data and hence reduce both bandwidth consumption and time. 
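
As a rough sketch of the idea, the class below imitates a CDN server that keeps a copy of each file for a fixed number of seconds before asking the original server again. The `EdgeCache` class, the `origin` function and the 60-second lifetime are all assumptions made for illustration; real CDNs decide freshness from HTTP headers such as `Cache-Control`.

```python
import time


class EdgeCache:
    """Hypothetical CDN server that keeps responses for a fixed time."""

    def __init__(self, origin, max_age_seconds, clock=time.monotonic):
        self.origin = origin            # callable: url -> content
        self.max_age = max_age_seconds
        self.clock = clock
        self.store = {}                 # url -> (content, time stored)

    def get(self, url):
        entry = self.store.get(url)
        if entry is not None:
            content, stored_at = entry
            if self.clock() - stored_at < self.max_age:
                return content, "edge"  # copy is still fresh
        content = self.origin(url)      # missing or expired: ask the origin
        self.store[url] = (content, self.clock())
        return content, "origin"


now = [0.0]          # fake clock so the example does not have to wait
origin_calls = []


def origin(url):
    origin_calls.append(url)
    return f"bytes of {url}"


edge = EdgeCache(origin, max_age_seconds=60, clock=lambda: now[0])
r1 = edge.get("/logo.png")  # fetched from the origin server
r2 = edge.get("/logo.png")  # served by the CDN copy
now[0] = 61.0               # the copy has expired
r3 = edge.get("/logo.png")  # fetched from the origin again
print(r1[1], r2[1], r3[1])  # origin edge origin
```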

Cache Invalidation

In caching, we request some data from a database and store it locally for our convenience, but what happens when the data on the database is updated? 

The answer to this question lies in the concept of cache invalidation. In simple terms, it means keeping the data in the cache consistent with the data in the database, so that the cache does not serve outdated values. 

There are three methods of keeping the two in sync when data is written. Let’s see what they are.

Write Through Cache

Suppose you have a grocery list and a xerox of it, which you’ll give to the shopkeeper. In the original list, you notice that instead of typing milk, you have typed silk. In this case, how will you correct the errors?

The simplest and most probable solution to this problem is fixing both lists separately. 

A write-through cache follows the same process: every change is made in both the cache memory and the database. 

This method is easy to implement and minimises the risk of losing data, but every update has to be performed twice, once on the cache and once on the database. This won’t be a problem for small changes, but it slows things down when there are many or large writes. 
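
Here is a minimal write-through sketch, assuming dictionaries for both the cache and the database. The `WriteThroughCache` class name is made up for this example.

```python
class WriteThroughCache:
    """Illustrative write-through cache: every write goes to both places."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def write(self, key, value):
        self.cache[key] = value     # update the cached copy
        self.database[key] = value  # and the original, in the same operation

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.database[key]
        return self.cache[key]


database = {"item": "silk"}
store = WriteThroughCache(database)
store.read("item")           # the typo is now cached as well
store.write("item", "milk")  # fix both copies
print(store.cache["item"], database["item"])  # milk milk
```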

Write Around Cache

Considering the same example, suppose you want to keep track of when you bought your groceries, so you only add the date to your list. Thus, the shopkeeper’s list does not get updated unless you xerox the original list again. Also, if the shopkeeper doesn’t have the additional data, it won’t affect his work either. 

Write around cache has a similar concept where only the database is updated. The cache will not be updated unless the files are requested from the database again. This method is advantageous since it does not fill the cache with unnecessary details, but it is only helpful for applications that don’t frequently re-read recently written data. 
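
A write-around cache could look like the sketch below. The class name is hypothetical, and dropping any older cached copy on a write is an assumption added here so that later reads do not return outdated data.

```python
class WriteAroundCache:
    """Illustrative write-around cache: writes skip the cache entirely."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def write(self, key, value):
        self.database[key] = value
        self.cache.pop(key, None)  # assumption: discard any outdated cached copy

    def read(self, key):
        if key not in self.cache:
            self.cache[key] = self.database[key]  # cached only when requested
        return self.cache[key]


database = {}
store = WriteAroundCache(database)
store.write("bought_on", "2 May")
cached_after_write = "bought_on" in store.cache
value = store.read("bought_on")
cached_after_read = "bought_on" in store.cache
print(cached_after_write, cached_after_read)  # False True
```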

Write Back Cache

Suppose the shopkeeper adds the total amount due on his list copy and tells you about it. You then update the original copy of the list. 

Write back cache works similarly: changes are made in the cache memory first and are written to the database later. This method works only if the cached changes are written to the database regularly; otherwise, any changes still waiting in the cache are lost if the cache fails.
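
A write-back cache can be sketched like this. The `WriteBackCache` class and its `flush` method are illustrative names; when and how often to flush is left open here and depends on the application.

```python
class WriteBackCache:
    """Illustrative write-back cache: the database is updated later, in a batch."""

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.dirty = set()  # keys changed in the cache but not yet in the database

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        """Copy every pending change to the database."""
        for key in self.dirty:
            self.database[key] = self.cache[key]
        self.dirty.clear()


database = {"total_due": 0}
store = WriteBackCache(database)
store.write("total_due", 450)
before_flush = database["total_due"]  # the database has not seen the change yet
store.flush()
after_flush = database["total_due"]
print(before_flush, after_flush)  # 0 450
```

If the cache were lost between `write` and `flush`, the value 450 would never reach the database, which is the data loss mentioned above.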

Eviction Policy

Now that we know how caching works on its own, a question may arise in your mind:

Can we add or remove data from the cache memory according to our needs?


Let us see how we can do it.

Least Recently Used (LRU)

To understand this, let us consider the dalgona coffee trend, which is neither too old nor too recent. When the trend was at its peak, almost everyone was making themselves a nice refreshing cup of it. But as soon as the craze died down, hardly anyone bothered about it anymore.

Similarly, the data at the bottom of the cache memory (the least recently used data) is removed to add the new data when the cache memory is full. 

Doesn’t this remind you of a data structure we already know?

You’re probably right! I was talking about a queue.

Like a queue, this policy can be implemented using a doubly-linked list, usually paired with a hash map so that entries can be found quickly.
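
One simple way to try LRU out in Python is with `collections.OrderedDict`, which remembers the order of its keys. This is a sketch rather than a production cache, and the `LRUCache` class name is our own.

```python
from collections import OrderedDict


class LRUCache:
    """Illustrative LRU cache built on an ordered dictionary."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # least recently used first, most recent last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # the cache is full, so "b" is evicted
print(list(cache.items))  # ['a', 'c']
```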

Least Frequently Used (LFU)

Have you ever noticed that when you want to forward something on WhatsApp, it suggests some people as “Frequently Contacted”?

This happens because we chat a lot with that particular person, so WhatsApp thinks we may want to send them a text again. 

The LFU policy is used here to keep a count of the number of times a particular piece of data was accessed (or a person was contacted). When the cache memory becomes full, the data with the lowest access count is deleted to accommodate the new data.
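
A small LFU sketch is shown below. It assumes a capacity of at least one and breaks ties by evicting whichever low-count entry was added first; the class and the contact names are made up for illustration.

```python
class LFUCache:
    """Illustrative LFU cache; assumes capacity is at least 1."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.counts = {}  # key -> number of times it was stored or read

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)  # lowest count
            del self.values[victim]
            del self.counts[victim]
        self.values[key] = value
        self.counts[key] = self.counts.get(key, 0) + 1


contacts = LFUCache(capacity=2)
contacts.put("asha", "chat history")
contacts.put("ravi", "chat history")
contacts.get("asha")
contacts.get("asha")                   # "asha" is contacted often
contacts.put("meera", "chat history")  # cache full: "ravi" has the lowest count
print(sorted(contacts.values))  # ['asha', 'meera']
```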

Most Recently Used (MRU)

To understand this, let us take the example of Facebook. Suppose someone sends you a friend request on Facebook, but you don’t know that person, so you will decline the request. Now Facebook also gives suggestions of people you may know, so here they are not going to suggest that same person because you claimed not to know them.

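In this situation, the most recently used entry is removed from the cache to avoid suggesting the same person. MRU suits cases where the item we just used is the one we are least likely to need again.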
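
An MRU cache can be sketched by remembering which key was touched last. As before, this is a simplified illustration with a made-up class name, and it assumes a capacity of at least one.

```python
class MRUCache:
    """Illustrative MRU cache; assumes capacity is at least 1."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = {}
        self.last_used = None  # key that was read or written most recently

    def get(self, key):
        if key not in self.items:
            return None
        self.last_used = key
        return self.items[key]

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            del self.items[self.last_used]  # evict the most recently used entry
        self.items[key] = value
        self.last_used = key


cache = MRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used
cache.put("c", 3)  # the cache is full, so "a" is evicted
print(sorted(cache.items))  # ['b', 'c']
```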

Random Replacement

Suppose you have a packet of M&M’s. You pick a random one and eat it, irrespective of the colour. 

This describes random replacement in caching, where random data will be picked from the cache memory and deleted when it becomes full. 
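
Random replacement is the easiest policy to sketch, because it needs no bookkeeping at all. The `RandomCache` class below is illustrative and assumes a capacity of at least one.

```python
import random


class RandomCache:
    """Illustrative cache that evicts a random entry when full."""

    def __init__(self, capacity, rng=None):
        self.capacity = capacity
        self.items = {}
        self.rng = rng or random.Random()

    def get(self, key):
        return self.items.get(key)

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim = self.rng.choice(list(self.items))  # pick any entry
            del self.items[victim]
        self.items[key] = value


cache = RandomCache(capacity=3)
for colour in ["red", "green", "blue", "yellow"]:
    cache.put(colour, True)

# One of the first three colours was evicted at random to make room for "yellow".
print(len(cache.items))  # 3
```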

If this blog has sparked a desire to learn more about system design, check out this Best System Design Course Online from Coding Ninjas.

This course will prepare you to answer questions about System Design in software engineering interviews.

Frequently Asked Questions

What is a caching system?

A caching system provides cache memory for the temporary storage of frequently accessed data.

What are system design concepts?

System design concepts include developing the modules, architecture components, interfaces and data that provide a solution to meet the requirements of an organisation.

How do you implement caching?

Caching can be implemented using a hash map, often combined with a doubly-linked list (for LRU eviction).

What are the different types of caching?

The different types of caching are:

  • Application server caching
  • Distributed caching
  • Global caching
  • HTTP caching

How do you design a cache system?

Designing a cache system involves choosing how to use the cache (globally or locally) and where to use it. These choices are discussed throughout this article.

Key Takeaways

This article taught us what caching is, where it is used, how it works, and its different types. 

Caching is an important concept required in system design, and questions about it are common in interviews. You can learn more about interview experiences here. But this is neither the only topic under system design nor the only topic we need to know to ace our interviews. Then what more is needed?

The answer to that is practising. CodeStudio is a platform where you can get the necessary practice of coding questions and interview experiences of other people who were once hard workers like us and are presently working in reputed product-based companies. 

Happy learning!

By: Neelakshi Lahiri