Redis Gossip (1): Building a Knowledge Map

  Cache, Database, Middleware, redis

Scene: Redis Interview

Interviewer: I see on your resume that you are skilled with Redis. So what would you say Redis is used for?

Xiao Ming: (Secretly pleased: isn’t Redis just a cache?) Redis is mainly used as a cache. It keeps data in memory, so it stores non-persistent data very efficiently.

Interviewer: Can Redis be used as persistent storage?

Xiao Ming: Um … it should be possible …

Interviewer: How does Redis persist data?

Xiao Ming: Um … I’m not too clear on that.

Interviewer: What memory eviction mechanisms does Redis have?

Xiao Ming: Um … I don’t know.

Interviewer: What else can we do with Redis? Which Redis commands did you use for each case?

Xiao Ming: All I know is that Redis can also be used for distributed locks and message queues …

Interviewer: OK, let’s move on to the next topic …

Thinking: Obviously, Xiao Ming’s answers about Redis did not go well. Redis is something we use every day at work, so why does it fall apart as soon as we get to an interview?

As developers, we are used to relying on what the experts have packaged for us so that we can focus on business development, but we often do not know how these everyday tools are implemented underneath. So even though we use them comfortably in daily work, we still cannot impress the interviewer when it counts.

This article summarizes some key points about Redis, covering both principles and applications, in the hope that it helps everyone.

I. What is Redis

REmote DIctionary Server (Redis) is a key-value storage system written by Salvatore Sanfilippo.

Redis is an open source, log-type, key-value database written in ANSI C and released under the BSD license. It is accessible over the network, is memory-based with optional persistence, and provides APIs in many languages.

Here I quote the description of Redis from a Redis tutorial, which is rather formal but accurate:

**A log-type, key-value database that can be memory-based and can also be persistent.**

I think this description is appropriate and comprehensive.

1.1 Redis’ Industry Status

Redis is one of the most widely used storage middleware products in the field of Internet technology. It stands out among storage systems and is widely praised for its very high performance, thorough documentation, versatile features, and rich client support. Its read and write speed in particular has made it the most popular middleware of its kind. Most software companies use Redis, including many large Internet companies such as JD.com, Alibaba, Tencent, and GitHub. As a result, Redis has become an essential skill for back-end developers.

1.2 Knowledge Map

In my opinion, learning any technology requires a clear outline and structure; otherwise you do not know what you have learned and how much is still left. It is like a book: without a table of contents, it loses its soul.

Therefore, I tried to sum up a knowledge map of Redis, also known as a mind map. As shown in the following figure, the knowledge points may not be complete and will be updated continuously.

(Figure: Redis knowledge map)

The topics in this series of articles will largely follow this mind map. This article first introduces the basics of Redis, and subsequent articles will cover Redis data structures, applications, and persistence in detail.

II. Advantages of Redis

2.1 Fast Speed

As a caching tool, Redis is best known for its speed. How fast is it? A single Redis instance can reach a QPS (queries per second) of about 110,000 for reads and about 81,000 for writes.
So, why is Redis so fast?

  • The vast majority of requests are pure memory operations, which are very fast;
  • The data structures used for storage are specially designed and very fast to search. For example, a hash table offers O(1) average time complexity for lookup and insertion;
  • Commands run on a single thread. This avoids unnecessary context switching and race conditions, avoids the CPU cost of switching between processes or threads, and removes the need for locks, so there is no locking and unlocking overhead and no performance loss from possible deadlocks;
  • A non-blocking I/O multiplexing mechanism is used.
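
To make the last two points more concrete, here is a minimal sketch of one thread serving several connections through I/O multiplexing. It uses Python's standard selectors module, with socket.socketpair() standing in for client connections. The helper name serve_ready and the toy uppercase reply are assumptions made for this illustration; this is not Redis' actual event loop or wire protocol.

```python
import selectors
import socket


def serve_ready(server_socks):
    """Answer one request per socket from a single thread (hypothetical helper)."""
    sel = selectors.DefaultSelector()
    for sock in server_socks:
        sock.setblocking(False)                  # never block on a single client
        sel.register(sock, selectors.EVENT_READ)
    served = 0
    while served < len(server_socks):
        events = sel.select(timeout=1.0)         # wait until any socket is readable
        if not events:
            break                                # nothing became ready in time
        for key, _ in events:
            sock = key.fileobj
            request = sock.recv(1024)
            sock.sendall(b"+" + request.upper())  # toy reply, not the Redis protocol
            sel.unregister(sock)
            served += 1
    sel.close()
    return served


# Three "clients" send a request before the server thread looks at any of them.
pairs = [socket.socketpair() for _ in range(3)]
for i, (client, _) in enumerate(pairs):
    client.sendall(f"ping{i}".encode())

served = serve_ready([server for _, server in pairs])
answers = [client.recv(1024) for client, _ in pairs]

for client, server in pairs:
    client.close()
    server.close()
```

The single loop only touches sockets that already have data waiting, so no thread ever sits idle on a slow client and no locks are needed around shared state.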

2.2 Rich Data Types

Redis has five common data types: String, List, Hash, Set, and Zset (sorted set), each of which has its own uses.

2.3 Atomicity and Transaction Support

Every individual Redis command is atomic. Redis also supports transactions, which let several commands be grouped together and executed as one unit (MULTI/EXEC).

2.4 Rich Features

Redis has rich features. For example, it can be used as a distributed lock, it can persist data, and it can serve as a message queue, a leaderboard, or a counter. It also supports publish/subscribe, notifications, key expiration, and more. When we use middleware to solve practical problems, Redis can almost always play a part.

III. Comparison Between Redis and Memcache

Memcache and Redis are both excellent, high-performance in-memory stores. When we talk about Redis, we usually compare it with Memcache. (Why make the comparison? To show off how good Redis is, of course. Without comparison, nobody gets hurt ~) The comparison covers:

3.1 Storage Methods

  • Memcache stores all data in memory, so the data is lost after a power failure. It cannot persist data, and the data cannot exceed the memory size.
  • Redis can store part of its data on disk, which makes the data persistent.

3.2 Data Support Types

  • Memcache supports only a relatively simple data type: String.
  • Redis has rich data types, including String, List, Hash, Set, and Zset.

3.3 Underlying Model

  • The two differ in their underlying implementation and in the application protocol used to communicate with clients.
  • Early versions of Redis built their own virtual memory (VM) mechanism, because going through general-purpose system calls spends extra time on moving data and making requests.

3.4 Storage Value Size

  • A single Redis value can be up to 512MB, while a Memcache value is limited to 1MB by default.

Having read this far, do you think Redis is especially good, full of advantages, and perfect? In fact, Redis still has many shortcomings. How can we overcome them?

IV. Problems Existing in Redis and Solutions

4.1 Double-Write Consistency Between Cache and Database

Problem: Consistency is a common problem in distributed systems. It is generally divided into two types: strong consistency and eventual consistency. Redis cannot give us strong consistency, because writing to both the database and the cache always leaves a window in which the two can disagree. With Redis as a cache we can only ensure eventual consistency.

Solution: How can we ensure eventual consistency?

  • The first way is to set an expiration time on cached entries. After an entry expires, the next read queries the database and reloads the cache, which brings the cache back in line with the database.
  • If no expiration time is set, we must first choose the correct update strategy: update the database first, then delete the cache. Deleting the cache may fail, however, so we put the key to be deleted into a message queue and retry until the deletion succeeds.
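
The second strategy can be sketched with in-memory stand-ins. In the sketch below, a plain dict plays the database, the FlakyCache class plays a cache whose delete sometimes fails, and a deque plays the message queue. All of these names, as well as update_then_invalidate and drain_retries, are hypothetical and chosen only for illustration; a real system would use an actual cache client and message broker.

```python
from collections import deque


class FlakyCache:
    """In-memory stand-in for a cache whose delete fails a given number of times."""

    def __init__(self, failures=0):
        self.data = {}
        self.failures = failures

    def delete(self, key):
        if self.failures > 0:
            self.failures -= 1
            raise ConnectionError("cache unavailable")
        self.data.pop(key, None)


def update_then_invalidate(db, cache, retry_queue, key, value):
    db[key] = value                   # 1. update the database first
    try:
        cache.delete(key)             # 2. then delete the cached copy
    except ConnectionError:
        retry_queue.append(key)       # 3. on failure, queue the key for a retry


def drain_retries(cache, retry_queue, max_attempts=10):
    """Retry queued deletions until they succeed or max_attempts is reached."""
    attempts = 0
    while retry_queue and attempts < max_attempts:
        attempts += 1
        key = retry_queue.popleft()
        try:
            cache.delete(key)
        except ConnectionError:
            retry_queue.append(key)   # still failing, try again later
    return attempts


db = {"user:1": "old"}
cache = FlakyCache(failures=2)        # the first two deletes will fail
cache.data["user:1"] = "old"
retry_queue = deque()

update_then_invalidate(db, cache, retry_queue, "user:1", "new")
attempts = drain_retries(cache, retry_queue)
```

After the retries finish, the stale entry is gone from the cache, so the next read would load the new value from the database.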

4.2 Cache Avalanches

Problem: We have all seen avalanches in movies. Everything is calm at first, then it all collapses in an instant, with enormous destructive force. The same thing happens here. If our code gives many cache entries the same expiration time, they all expire at the same moment, and all the requests go back to the database at once to reload the data. The database can then collapse under too many connections and too much load.

Solution:

  • Add a random value to the expiration time when setting the cache.
  • Use a double cache: cache 1 has an expiration time and cache 2 does not. When cache 1 expires, return the value from cache 2 directly and start a background task to refresh both cache 1 and cache 2.
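
The first remedy is small enough to show in full. The sketch below adds a random offset to a base TTL so that entries written at the same moment do not all expire together. The helper name ttl_with_jitter and the values 3600 and 300 are assumptions for illustration only.

```python
import random


def ttl_with_jitter(base_seconds, max_jitter_seconds, rng=random):
    """Return the base TTL plus a random offset in [0, max_jitter_seconds]."""
    return base_seconds + rng.randint(0, max_jitter_seconds)


rng = random.Random(42)               # seeded only to make the demo repeatable
ttls = [ttl_with_jitter(3600, 300, rng) for _ in range(1000)]
```

Each computed value would then be used as the expiration time when writing the corresponding key, spreading the reloads over a five-minute window instead of a single instant.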

4.3 Cache Penetration Problem

Problem: Cache penetration means that some abnormal users (hackers) deliberately request data that exists in neither the cache nor the database. Every such request misses the cache and goes to the database, which can exhaust the database connections.

Solution:

  • Use a mutex lock. When a cache lookup misses, the request must not go straight to the database; it first has to acquire the lock. A request that fails to get the lock sleeps for a while and then retries.
  • Use an asynchronous update strategy. Return immediately whether or not the key has a value. Each value carries its own cache expiration time; if that time has passed, a separate thread reads the database and updates the cache asynchronously. This requires cache warm-up (loading the cache before the project starts).
  • Provide an interception mechanism that can quickly judge whether a request is valid. For example, use a Bloom filter that holds the set of all valid keys to quickly check whether the key carried by a request is valid, and return immediately if it is not.
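
The Bloom filter idea can be sketched as follows. This is a minimal, unoptimized filter built on hashlib, with plain dicts standing in for the cache and the database. The class BloomFilter, the function lookup, and the sizes chosen are all assumptions for illustration. A Bloom filter can report false positives, never false negatives, so a few invalid keys may still reach the database.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, key):
        # Derive several bit positions from salted SHA-256 digests of the key.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def lookup(key, bloom, cache, db, stats):
    if not bloom.might_contain(key):
        return None                   # rejected before touching cache or database
    if key in cache:
        return cache[key]
    stats["db_reads"] += 1            # only plausible keys reach the database
    value = db.get(key)
    if value is not None:
        cache[key] = value
    return value


db = {f"user:{i}": i for i in range(100)}
bloom = BloomFilter()
for key in db:
    bloom.add(key)

cache, stats = {}, {"db_reads": 0}
valid = lookup("user:7", bloom, cache, db, stats)
bogus = [lookup(f"missing:{i}", bloom, cache, db, stats) for i in range(1000)]
```

Almost all of the 1,000 bogus requests are turned away by the filter, so the database sees roughly one read for the valid key instead of a thousand pointless ones.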

4.4 Cache Concurrency Competition

Problem:

Cache concurrency contention mainly occurs when multiple threads set the same key at the same time, which produces inconsistent data.

For example, suppose Redis holds a key named amount whose value is 100. Two threads each add 100 to the value and write it back. The correct result should be 300. However, if both threads read the value while it is still 100, each writes back 200, and the final result is 200. This is the cache concurrency contention problem.

Solution:

  • If the operations from multiple threads do not need to run in a particular order, we can use a distributed lock. The threads compete for the lock, and whoever grabs it first executes first. The distributed lock can be implemented with ZooKeeper or with Redis itself.
  • Use an atomic Redis command such as INCR or INCRBY, so that the read, add, and write happen in one step.
  • If the operations must run in a particular order, we can set up a message queue, add the operations to it, and execute them strictly in queue order.
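
The amount example can be replayed in a few lines. In the sketch below, the Counter class is a hypothetical in-memory stand-in for a single Redis key. Its incr_by method uses a threading.Lock to imitate the single-step behaviour of an atomic increment command. It does not talk to Redis.

```python
import threading


class Counter:
    """In-memory stand-in for one integer key (hypothetical, for illustration)."""

    def __init__(self, value):
        self.value = value
        self._lock = threading.Lock()

    def get(self):
        return self.value

    def set(self, value):
        self.value = value

    def incr_by(self, delta):
        # Read, add, and write as one indivisible step.
        with self._lock:
            self.value += delta
            return self.value


# Lost update: both clients read 100 before either one writes.
unsafe = Counter(100)
seen_a = unsafe.get()
seen_b = unsafe.get()
unsafe.set(seen_a + 100)
unsafe.set(seen_b + 100)              # overwrites the first update

# Atomic increments: no update is lost, whatever the thread order.
safe = Counter(100)
threads = [threading.Thread(target=safe.incr_by, args=(100,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Here unsafe ends at 200 while safe ends at 300, matching the wrong and the correct results described above.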

V. Redis’ Expiration Policy

As the amount of data grows, Redis’ memory usage keeps increasing. We might expect keys to be deleted once they reach their expiration time, yet after that time the memory usage is still very high. Why is this?

Redis removes expired keys with a combination of periodic deletion and lazy deletion.

5.1 Periodic Deletion

Timed deletion and periodic deletion are different:

  • Timed deletion means that a key is deleted exactly at its expiration time, which requires a timer that continuously polls all keys to decide whether they need to be deleted. This consumes a lot of CPU and lowers resource utilization, so Redis uses periodic deletion instead.
  • With periodic deletion, the check interval is configurable. Redis can check every 100ms, for example, but it still cannot check every key, because that would stall it. It only checks a random sample of keys each time, so some expired keys are not deleted on time. This is where lazy deletion comes in.

5.2 Lazy Deletion

Here is a simple example. In high school there was sometimes too much homework to finish. The teacher would say, “We will go over this paper next class. Has everyone finished it?” In fact, many people had not, so they had to rush to complete it just before the next class.

Lazy deletion works the same way. The value should already be gone, but it is still there. Only when you try to read the key does Redis notice that it has expired, delete it right away, and return an empty result, as if to say “no value, it has expired!”
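
The two strategies described so far can be sketched with a small in-memory store. The ExpiringStore class, its method names, the injected clock, and the sample size of 20 are all assumptions made for this illustration. Redis' real expiration cycle is more elaborate than a single random sample.

```python
import random


class ExpiringStore:
    def __init__(self, clock):
        self.clock = clock            # callable returning the current time in seconds
        self.data = {}                # key -> (value, expire_at)

    def set(self, key, value, ttl):
        self.data[key] = (value, self.clock() + ttl)

    def get(self, key):
        item = self.data.get(key)
        if item is None:
            return None
        value, expire_at = item
        if expire_at <= self.clock():
            del self.data[key]        # lazy deletion: removed only when accessed
            return None
        return value

    def periodic_sweep(self, sample_size=20, rng=random):
        # Periodic deletion: check a random sample of keys, not all of them.
        keys = rng.sample(list(self.data), min(sample_size, len(self.data)))
        now = self.clock()
        removed = 0
        for key in keys:
            if self.data[key][1] <= now:
                del self.data[key]
                removed += 1
        return removed


now = [0.0]                           # fake clock so the demo needs no sleeping
store = ExpiringStore(clock=lambda: now[0])
for i in range(100):
    store.set(f"k{i}", i, ttl=10)

now[0] = 11.0                         # every key is now past its expiration time
removed = store.periodic_sweep(sample_size=20, rng=random.Random(1))
leftover = len(store.data)            # expired keys that still occupy memory
values = [store.get(f"k{i}") for i in range(100)]
```

One sweep removes only the sampled keys, so the remaining expired keys stay in memory until something reads them, at which point lazy deletion cleans them up.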

Now that we have the expiration policy of periodic deletion plus lazy deletion, can we rest easy? Not quite. If an expired key is missed by the periodic check and is never accessed again, it stays in memory indefinitely, which is also unreasonable. This is where the memory eviction mechanism comes in.

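5.3 Redis’ Memory Eviction Mechanism

Redis has six classic memory eviction policies, as shown in the following figure:

(Figure: Redis memory eviction policies: noeviction, allkeys-lru, allkeys-random, volatile-lru, volatile-random, volatile-ttl)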

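So how do we configure Redis’ memory eviction policy?

We can configure it in redis.conf with the maxmemory-policy directive. The line is commented out by default, so remove the leading # to enable it: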

# maxmemory-policy allkeys-lru
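
To give a feel for what an allkeys-lru style policy does, here is a minimal sketch of least-recently-used eviction built on collections.OrderedDict. The LRUCache class and its max_keys limit are assumptions for illustration. This toy version limits the number of keys and tracks exact recency. Redis limits memory in bytes and approximates LRU by sampling keys.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.data = OrderedDict()     # oldest access first, newest access last

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)    # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        while len(self.data) > self.max_keys:
            self.data.popitem(last=False)   # evict the least recently used key


cache = LRUCache(max_keys=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")                        # "a" is now more recent than "b"
cache.set("c", 3)                     # over the limit, so "b" is evicted
remaining = list(cache.data)
```

Because "a" was read just before "c" was written, "b" is the key that gets evicted.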

VI. Summary

This article explored Redis and roughly sorted out a Redis knowledge map, which shows how many knowledge points there are to learn. We then analyzed the advantages of Redis, including its memory-based read/write speed and rich data types, and looked at how to deal with problems such as data consistency, cache avalanche, and cache penetration. Finally, we learned about Redis’ expiration policy and memory eviction mechanism.

I hope everyone now has a better understanding of Redis. In the next article, we will analyze Redis’ data structures, how each data type is implemented, and what the corresponding commands are.

By Yang Heng
Source: Yixin Institute of Technology