The Cache Aside pattern for caching


Introduction

This article describes the Cache Aside pattern for caching.

Cache Aside

There are two main points:

  • Reads: the application first tries to fetch the data from the cache. On a miss, it fetches the data from the database and, once that succeeds, puts it into the cache.

  • Updates: the application updates the database first and, once that succeeds, invalidates the cache entry. Why not update the cache after writing the database? Mainly because two concurrent writes could leave dirty data in the cache.

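// Read path: try the cache first, fall back to the database on a miss.
public V read(K key) {
  V result = cache.getIfPresent(key);
  if (result == null) {
    result = readFromDatabase(key);
    if (result != null) {
      // Populate the cache only after the database read succeeded.
      cache.put(key, result);
    }
  }

  return result;
}

// Write path: update the database first, then invalidate the cached entry.
public void write(K key, V value) {
  writeToDatabase(key, value);
  cache.invalidate(key);
}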
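
To make the concurrent-write concern concrete, here is a minimal sketch that replays one unlucky interleaving of two writers, A and B, as sequential steps so the outcome is deterministic. The class WriteRaceSketch, its methods, and the plain maps standing in for the database and the cache are all hypothetical names introduced for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, dependency-free replay of two concurrent writers.
// The interleaving is written out step by step, so no threads are needed.
class WriteRaceSketch {
    // Plain maps stand in for the database and the cache (illustrative only).
    final Map<String, Integer> db = new HashMap<>();
    final Map<String, Integer> cache = new HashMap<>();

    // Strategy the article advises against: write the database, then update the cache.
    // Writers A and B interleave so that A's cache update lands last.
    static WriteRaceSketch updateCacheAfterWrite() {
        WriteRaceSketch s = new WriteRaceSketch();
        s.db.put("k", 1);     // A writes 1 to the database
        s.db.put("k", 2);     // B writes 2 to the database (the final value)
        s.cache.put("k", 2);  // B updates the cache
        s.cache.put("k", 1);  // A updates the cache late: the cache is now stale
        return s;
    }

    // Cache Aside: write the database, then invalidate. Same interleaving.
    static WriteRaceSketch invalidateAfterWrite() {
        WriteRaceSketch s = new WriteRaceSketch();
        s.db.put("k", 1);     // A writes 1 to the database
        s.db.put("k", 2);     // B writes 2 to the database (the final value)
        s.cache.remove("k");  // B invalidates
        s.cache.remove("k");  // A invalidates late: harmless
        return s;
    }

    // Cache Aside read path: cache first, then database, then populate the cache.
    Integer read(String key) {
        Integer v = cache.get(key);
        if (v == null) {
            v = db.get(key);
            if (v != null) {
                cache.put(key, v);
            }
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println("update strategy reads:     " + updateCacheAfterWrite().read("k"));
        System.out.println("invalidate strategy reads: " + invalidateAfterWrite().read("k"));
    }
}
```

With the update strategy the read returns 1 although the database holds 2. With invalidation the order of the two invalidations does not matter, and the next read loads 2 from the database.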

Dirty data

Consider a read operation that misses the cache and goes to the database to fetch the data. Meanwhile a write operation arrives: it writes the database and then invalidates the cache. The earlier read then puts the old value it fetched into the cache, leaving dirty data behind.

This case is possible in theory, but in practice its probability is likely very low. It requires a cache miss on the read path at the same moment as a concurrent write. In addition, the read must reach the database before the write does, yet put its result into the cache after the write has invalidated it. Since database writes are usually much slower than reads, and may lock the table, the chance of all these conditions holding at once is small.

Maven

        <!-- https://mvnrepository.com/artifact/com.github.ben-manes.caffeine/caffeine -->
        <dependency>
            <groupId>com.github.ben-manes.caffeine</groupId>
            <artifactId>caffeine</artifactId>
            <version>2.5.5</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
        <dependency>
            <groupId>com.google.guava</groupId>
            <artifactId>guava</artifactId>
            <version>22.0</version>
        </dependency>

Code reproduction

The following code reproduces this dirty data scenario:

  • A read comes in, finds no cached entry, and triggers a load. The load has fetched the data but has not yet returned.

  • A write comes in, updates the data source, and invalidates the cache.

  • The load returns the old data it fetched, and the dirty data is stored in the cache.

@Test
    public void testCacheDirty() throws InterruptedException, ExecutionException {
        AtomicReference<Integer> db = new AtomicReference<>(1);

        LoadingCache<String, Integer> cache = CacheBuilder.newBuilder()
                .build(
                new CacheLoader<String, Integer>() {
                    public Integer load(String key) throws InterruptedException {
                        LOGGER.info("loading reading from db ...");
                        Integer v = db.get();
                        LOGGER.info("loading read from db get:{}",v);
                        Thread.sleep(1000L); // return only after 1 second, to simulate the conditions for a dirty cache
                        LOGGER.info("loading Read from db return : {}",v);
                        return v;
                    }
                }
        );

        Thread t2 = new Thread(() -> {
            try {
                Thread.sleep(500L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            LOGGER.info("Writing to db ...");
            db.set(2);
            LOGGER.info("Wrote to db");
            cache.invalidate("k");
            LOGGER.info("Invalidated cached");
        });

        t2.start();

        // Trigger the cache load here, before t2 invalidates the entry.
        // The sleep in the loader makes sure the load returns only after the invalidate,
        // so the value that ends up in the cache is dirty data.
        LOGGER.info("fire loading cache");
        LOGGER.info("get from cache: {}",cache.get("k"));

        t2.join();

        for(int i=0;i<3;i++){
            LOGGER.info("get from cache: {}",cache.get("k"));
        }
    }

Output

15:54:05.751 [main] INFO com.example.demo.CacheTest - fire loading cache
15:54:05.772 [main] INFO com.example.demo.CacheTest - loading reading from db ...
15:54:05.772 [main] INFO com.example.demo.CacheTest - loading read from db get:1
15:54:06.253 [Thread-1] INFO com.example.demo.CacheTest - Writing to db ...
15:54:06.253 [Thread-1] INFO com.example.demo.CacheTest - Wrote to db
15:54:06.253 [Thread-1] INFO com.example.demo.CacheTest - Invalidated cached
15:54:06.778 [main] INFO com.example.demo.CacheTest - loading Read from db return : 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1
15:54:06.782 [main] INFO com.example.demo.CacheTest - get from cache: 1

With Guava, the invalidate completes while the load is still in flight, so the old value 1 returned by the load stays in the cache and every later read returns the dirty data.

Using Caffeine

@Test
    public void testCacheDirty() throws InterruptedException, ExecutionException {
        AtomicReference<Integer> db = new AtomicReference<>(1);

        com.github.benmanes.caffeine.cache.LoadingCache<String, Integer> cache = Caffeine.newBuilder()
                .build(key -> {
                    LOGGER.info("loading reading from db ...");
                    Integer v = db.get();
                    LOGGER.info("loading read from db get:{}",v);
                    Thread.sleep(1000L); // return only after 1 second, to simulate the conditions for a dirty cache
                    LOGGER.info("loading Read from db return : {}",v);
                    return v;
                });

        Thread t2 = new Thread(() -> {
            try {
                Thread.sleep(500L);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            LOGGER.info("Writing to db ...");
            db.set(2);
            LOGGER.info("Wrote to db");
            cache.invalidate("k");
            LOGGER.info("Invalidated cached");
        });

        t2.start();

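        // Trigger the cache load here, before t2 invalidates the entry.
        // The sleep in the loader makes sure the load returns only after the invalidate is issued,
        // so the value returned by this first get is dirty data.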
        LOGGER.info("fire loading cache");
        LOGGER.info("get from cache: {}",cache.get("k"));

        t2.join();

        for(int i=0;i<3;i++){
            LOGGER.info("get from cache: {}",cache.get("k"));
        }
    }

Output

16:05:10.141 [main] INFO com.example.demo.CacheTest - fire loading cache
16:05:10.153 [main] INFO com.example.demo.CacheTest - loading reading from db ...
16:05:10.153 [main] INFO com.example.demo.CacheTest - loading read from db get:1
16:05:10.634 [Thread-1] INFO com.example.demo.CacheTest - Writing to db ...
16:05:10.635 [Thread-1] INFO com.example.demo.CacheTest - Wrote to db
16:05:11.172 [main] INFO com.example.demo.CacheTest - loading Read from db return : 1
16:05:11.172 [main] INFO com.example.demo.CacheTest - get from cache: 1
16:05:11.172 [Thread-1] INFO com.example.demo.CacheTest - Invalidated cached
16:05:11.172 [main] INFO com.example.demo.CacheTest - loading reading from db ...
16:05:11.172 [main] INFO com.example.demo.CacheTest - loading read from db get:2
16:05:12.177 [main] INFO com.example.demo.CacheTest - loading Read from db return : 2
16:05:12.177 [main] INFO com.example.demo.CacheTest - get from cache: 2
16:05:12.177 [main] INFO com.example.demo.CacheTest - get from cache: 2
16:05:12.177 [main] INFO com.example.demo.CacheTest - get from cache: 2

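The log shows that the invalidate did not return until the in-flight load had finished. It then removed the stale entry, so the next get triggered a fresh load and read the new value 2. The dirty data was therefore cleared.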
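
One plausible way to picture this behaviour is with a bare ConcurrentHashMap, which Caffeine builds on: a remove for a key waits while a computeIfAbsent for the same key is still running, and then removes whatever that computation stored. The sketch below is only an analogy under that assumption and not Caffeine's actual implementation; the class BlockingInvalidateSketch and its helper names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: a bare ConcurrentHashMap stands in for the cache.
class BlockingInvalidateSketch {

    // Races one slow load against a write + invalidate.
    // Returns {value of the first load, value removed by the invalidate, value of the next load}.
    static Integer[] run() {
        AtomicReference<Integer> db = new AtomicReference<>(1);
        ConcurrentHashMap<String, Integer> cache = new ConcurrentHashMap<>();
        CountDownLatch loadStarted = new CountDownLatch(1);
        AtomicReference<Integer> removed = new AtomicReference<>();

        Thread writer = new Thread(() -> {
            try {
                loadStarted.await();            // wait until the load is in flight
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            db.set(2);                          // write to the "database"
            removed.set(cache.remove("k"));     // invalidate: waits for the in-flight load
        });
        writer.start();

        Integer first = cache.computeIfAbsent("k", key -> {
            Integer v = db.get();               // reads the old value 1
            loadStarted.countDown();            // let the writer proceed
            sleepQuietly(300L);                 // slow load; the write happens meanwhile
            return v;
        });

        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        // The stale entry is gone, so this load runs again and sees the new value.
        Integer second = cache.computeIfAbsent("k", key -> db.get());
        return new Integer[] {first, removed.get(), second};
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Integer[] r = run();
        System.out.println("first load: " + r[0] + ", removed: " + r[1] + ", next load: " + r[2]);
    }
}
```

The first load still returns the old value 1, as in the Caffeine log, but the invalidate removes that entry once the load has finished, and the next read loads 2.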
