Redis Topic (2): Exploring the Bottom of Redis Data Structure

  Database, redis

Preface

Last articleRedis Gossip (1): Building Knowledge MapThis paper introduces the basic concept, advantages and disadvantages of redis and its memory elimination mechanism. I believe everyone has a preliminary understanding of redis. Redis is found in many application scenarios of the Internet, and what it can do is far beyond our imagination. What is the underlying data structure of Redis and why can it do so many things? This article will explore the underlying data structure and common commands of Redis.

The brain map of this article is as follows:

图片描述

I. Redis’ Data Model

Use key-value pairsName: "Xiao Ming"To show Redis’ data model is as follows:

图片描述

  • dictEntry:In some programming languages, the data structure of key-value pairs is called a dictionary, while in Redis, each key-value key-value pair is assigned a dictionary entity, namely “dictionary”. DicEntry consists of three parts:Key pointer, val pointer, next pointer, the next pointer points to the next dicteEntry entry to form a linked list, and the next pointer can link a plurality of key value pairs with the same hash value together.The problem of hash conflict is solved by chain address method.
  • sdsSimple Dynamic StringA simple dynamic string that stores string data.
  • redisObject: The five common types of Redis are all stored in RedisObject, in redisObjecttypeThe field indicates the data type of the value (that is, the 5 basic types).ptrThe field points to the address where the object is located.

RedisObject object is very important, Redis.Type of objectInternal codingMemory reclamationShared objectAnd other functions, are based on the RedisObject object to achieve.

The advantage of this design is that it can set up a variety of different data structures for 5 common types according to different usage scenarios, thus optimizing the usage efficiency of objects in different scenarios.

Redis uses jemalloc as the default memory allocator to reduce memory fragmentation. Jemalloc divides the memory space into three areas: small, large and large in 64-bit system. Many small memory block units are divided in each range. When Redis stores data, it will select the memory block with the most appropriate size for storage.

II. Data Structure Supported by Redis

What data structures does Redis support?

If the answer is String, List, Hash, Set, Zset, these five are the common basic data types of redis, and each data type also contains a variety of data structures.

The encoding instruction is used to look at the data structure of a value. For example:

127.0.0.1:6379> set name tom
OK
127.0.0.1:6379> object encoding name
"embstr"

The name value set here is tom, and its data structure is embstr, which will be explained in detail when the string is introduced below.

127.0.0.1:6379> set age 18
OK
127.0.0.1:6379> object encoding age
"int"

The following table summarizes all data structure types in Redis:

Bottom data structure Encoding constant Object encoding instruction output
Integer type REDIS_ENCODING_INT “int”
Embstr string type REDIS_ENCODING_EMBSTR “embstr”
Simple dynamic string REDIS_ENCODING_RAW “raw”
Dictionary type REDIS_ENCODING_HT “hashtable”
Double-ended linked list REDIS_ENCODING_LINKEDLIST “linkedlist”
Compressed list REDIS_ENCODING_ZIPLIST “ziplist”
Integer set REDIS_ENCODING_INTSET “intset”
Jump tables and dictionaries REDIS_ENCODING_SKIPLIST “skiplist”

additional remarks

If the interviewer asks: What are the data types of redis?

Answer: String, list, hash, set, zet

In general, this answer is correct. As mentioned earlier, redis’ data types do include these five types, but careful students must have discovered what they said before“Commonly Used”5 data types of. In fact, with the continuous updating and improvement of Redis, there are already more than 5 types of data for Redis.

Login to redis’s official website to open the official data type introduction:

https://redis.io/topics/data-types-intro

图片描述

It is found that Redis supports more than 5 data structures, but 8 data structures. The latter three types are:

  • Bit array (or bitmap for short): special commands can be used to process string values, such as bit array: you can set and clear individual bits, set all bits to 1, find the first bit or unset bits, etc.
  • HyperLogLogs: This is a probabilistic data structure used to estimate the cardinality of a set. Don’t be afraid, it is simpler than it looks.
  • Streams: Only additional map-like item sets are attached, providing abstract log data types.

This article mainly introduces five commonly used data types, and the above three types will be explored together later.

2.1 string string

String type is the most commonly used data type in redis. In Redis, strings can be modified and exist as byte arrays at the bottom.

The string in Redis is called simple dynamic string “SDS”. This structure is very similar to ArrayList in Java and its length is dynamically variable.

struct SDS<T> {
  T capacity; // 数组容量
  T len; // 数组长度
  byte[] content; // 数组内容
}

图片描述

content[]What is stored is the content of the character string.capacityIndicates the length of the array allocation.lenRepresents the actual length of a string.

The encoding types of strings are int, embstr and raw. as shown in the above table, what are the differences among these three encoding types?

  • Int code: Stores integer values that can be represented by long.
  • Raw encoding: Saves strings longer than 44 bytes (39 bytes before redis3.2 and 44 bytes after).
  • Embstr coding: Saves strings less than 44 bytes in length (39 bytes before redis3.2 and 44 bytes after).

Set a value to test:

127.0.0.1:6379> set num 300
127.0.0.1:6379> object encoding num
"int"
127.0.0.1:6379> set key1 wealwaysbyhappyhahaha
OK
127.0.0.1:6379> object encoding key1
"embstr"
127.0.0.1:6379> set key2 hahahahahahahaahahahahahahahahahahahaha
OK
127.0.0.1:6379> strlen key2
(integer) 39
127.0.0.1:6379> object encoding key2
"embstr"
127.0.0.1:6379> set key2 hahahahahahahaahahahahahahahahahahahahahahaha
OK
127.0.0.1:6379> object encoding key2
"raw"
127.0.0.1:6379> strlen key2
(integer) 45

Raw type vs. embstr type

Embstr coding structure:

图片描述

Raw encoding structure:

图片描述

Embstr and raw are both composed of redisObject and sds. The difference is that the redisObject and sds of embstr are continuous and only need to be used.mallocAllocate memory once; Raw needs to allocate memory for redisObject and sds respectively, that is, it needs to allocate memory twice.

In comparison, embstr allocates less memory once, which is more convenient. However, embstr also has obvious disadvantages: to increase the length, both redisObject and sds need to reallocate memory.

The structural differences between embstr and raw are described above. Here comes the point ~
Why did you choose 44 as the dividing point between the two codes? Why was it 39 before version 3.2? How do you get these two values?

1) calculate the byte size occupied by RedisObject

struct RedisObject {
    int4 type; // 4bits
    int4 encoding; // 4bits
    int24 lru; // 24bits
    int32 refcount; // 4bytes = 32bits
    void *ptr; // 8bytes,64-bit system
}
  • Type: Different redis objects have different data types (string, list, hash, etc.), and type record types are used4bits.
  • Encoding: Stores the encoded form and uses4bits.
  • Lru: use24bitsRecord LRU information for the object.
  • Refcount: reference counter, used32bits.
  • *ptr: Pointer to specific content of object, required64bits.

Calculation: 4+4+24+32+64 = 128bits =16bytes

The first step is completed, and the RedisObject object header information will be occupied.16 bytesThe size of is usually fixed.

2) calculation of byte size occupied by SDS

Old version:

struct SDS {
    unsigned int capacity; // 4byte
    unsigned int len; // 4byte
    byte[] content; // 内联数组,长度为 capacity
}

Hereunsigned intA 4 byte, which adds up to 8 bytes.

If the memory allocated by the memory allocator jemalloc exceeds 64 bytes, it is considered as a large string and raw encoding will be used.

As mentioned earlier, the string of content in SDS structure is a string ending in byte 0. The reason for adding such a byte is to facilitate the direct use of glibc string processing function and the debugging and printout of the string. So we have to subtract 1 byte
64byte – 16byte – 8byte – 1byte = 39byte

New version:

struct SDS {
    int8 capacity; // 1byte
    int8 len; // 1byte
    int8 flags; // 1byte
    byte[] content; // 内联数组,长度为 capacity
}

Here unsigned int is changed into uint8_t, uint16 _ t. and a char flags flag is added, which takes only 3 bytes in total. It is equivalent to optimizing the memory usage of sds, and the corresponding memory for storing strings will become larger.

Then calculate:

图片描述

64byte – 16byte -3byte -1byte = 44byte.

Summary:

Therefore, the maximum string length embstr can hold after redis version 3.2 is 44, compared with 39 before. The reason for the length change is the optimization of memory in SDS.

2.2 List

The bottom layer of List object in Redis is implemented by quicklist (Quick List). Quick List supports adding elements from the head and tail of linked list, and can obtain the element content at the specified location.

So, how is the bottom layer of the quick list implemented? Why can such fast performance be achieved?

Rome was not built in one day, nor was quicklist implemented in one day. At first, the bottom of redis’ list was ziplist or linkedlist. First, introduce the two data structures respectively.

Ziplist compressed list

When a list contains only a small number of list items and is a small integer value or a short string, redis uses ziplist (compressed list) as the bottom implementation of list keys.

Testing:

127.0.0.1:6379> rpush dotahero sf qop doom
(integer) 3
127.0.0.1:6379> object encoding dotahero
"ziplist"

The old version of redis is used for testing here. qop Pain Queen, sf Shadowman and doom Doom are added to the dota Hero List. ziplist is used for data structure coding.

As the name implies, the compressed list is compressed. There is no pointer between each node, but multiple elements are adjacent and there is no gap.So ziplist was developed by Redis to save memory., is a sequential data structure composed of a series of specially coded continuous memory blocks. The specific structure is relatively complicated, so you can have a deeper understanding if you are interested.

struct ziplist<T> {
    int32 zlbytes; // 整个压缩列表占用字节数
    int32 zltail_offset; // 最后一个元素距离压缩列表起始位置的偏移量,用于快速定位到最后一个节点
    int16 zllength; // 元素个数
    T[] entries; // 元素内容列表,挨个挨个紧凑存储
    int8 zlend; // 标志压缩列表的结束,值恒为 0xFF
}

图片描述

Linkedlist

The double-ended list is familiar to everyone. The double-ended list here is very similar to the linkedlist in java.

图片描述

It can be seen from the figure that Redis’ linkedlist double-ended linked list has the following characteristics: nodes are provided with prev, next pointer, head pointer and tail pointer, and the complexity of obtaining front node, rear node, header node and tail node and obtaining length is O(1).

Compressed list takes up less memory, but it is a sequential data structure, and the operation of inserting and deleting elements is more complicated. Therefore, compressed list is suitable for the case of relatively small data. When there is more data, efficient insertion and deletion of double-ended list is still a better choice.

In the eyes of Redis developers, the choice of data structure has to be maximized in time and space, so they combined the compressed list and the double-ended list to createQuicklist. Like hashmap in java, it combines the advantages of arrays and linked lists.

Quicklist

  • rpush: listAddNodeHead —O(1)
  • lpush: listAddNodeTail —O(1)
  • push:listInsertNode —O(1)
  • index : listIndex —O(N)
  • pop:ListFirst/listLast —O(1)
  • llen:listLength —O(N)

图片描述

struct ziplist {
    ...
}
struct ziplist_compressed {
    int32 size;
    byte[] compressed_data;
}
struct quicklistNode {
    quicklistNode* prev;
    quicklistNode* next;
    ziplist* zl; // 指向压缩列表
    int32 size; // ziplist 的字节总数
    int16 count; // ziplist 中的元素数量
    int2 encoding; // 存储形式 2bit,原生字节数组还是 LZF 压缩存储
    ...
}
struct quicklist {
    quicklistNode* head;
    quicklistNode* tail;
    long count; // 元素总数
    int nodes; // ziplist 节点的个数
    int compressDepth; // LZF 算法压缩深度
    ...
}

The default compression depth for quicklist is 0, which means no compression. The actual depth of compression is determined by the configuration parameter list-compress-depth. In order to support fast push/pop operation, the first and second ziplist of quicklist are not compressed, and the depth is 1. If the depth is 2, it means that the first ziplist at the beginning and the second ziplist at the end of the quicklist are not compressed.

2.3 Hash

The underlying implementation of Hash data types is a ziplist (compressed list) or dictionary (also known as hashtable or hash table). The choice of compressed list or dictionary here is also determined by the number of elements.

图片描述

As shown in figure hset, there are three key-value pairs. when the number of bytes in each value does not exceed 64, the default data structure used isziplist.

图片描述

When we add data with more than 64 bytes, the default data structure has become hashtable.

A Hash object will only use a ziplist if both of the following conditions are met:

  • The number of elements in the hash is less than 512;
  • The key and value strings of all key-value pairs in the hash are less than 64 bytes in length.

As you have just seen from the compressed list, hashtables is similar to hashmap before jdk1.7. Hashmap adopts the chain address method to solve the problem of hash conflict. If you want to know more, please refer to a previous blog:
Hashmap, do you really understand

Dictionary in Redis

The dict structure in redis contains two hashtable, and usually only one hashtable has a value. However, when dict expands and shrinks its capacity, it needs to allocate a new hashtable, and then carry out gradual relocation. At this time, the two hashtable stores the old hashtable and the new hashtable respectively. After the relocation, the old hashtable will be deleted and replaced by the new hashtable.

图片描述

2.4 Set

The bottom layer of the Set data type can beintset(set of integers) orhashtable(hash table is also called hash table).

When the data are integers and the quantity is small, intset is used as the underlying data structure. When there is data other than integers or the amount of data increases, hashtable is used as the underlying data structure.

127.0.0.1:6379> sadd myset 111 222 333
(integer) 3
127.0.0.1:6379> object encoding myset
"intset"
127.0.0.1:6379> sadd myset hahaha
(integer) 1
127.0.0.1:6379> object encoding myset
"hashtable"

Inset’s data structure is:

typedef struct intset {
    // 编码方式
    uint32_t encoding;
    // 集合包含的元素数量
    uint32_t length;
    // 保存元素的数组
    int8_t contents[];
} intset;

The bottom layer of intset is implemented as an ordered array without repeating numbers. Integer types of intset can be 16-bit, 32-bit, 64-bit. If all the integers in the array are 16-bit long and a 32-bit integer is added, the entire 16-bit array will be upgraded to a 32-bit array. Upgrade can improve the flexibility of intset and save memory, but it is irreversible.

2.5 Zset

Zset in Redis, also known asOrdered set. Its bottom layer is ziplist orskiplist(Jumping Table).

The compressed list has already been described above, and is also used when the number of elements is relatively small. The jump list is mainly introduced here.

Jump table

Jump list, as its name implies, can jump, jump to query the elements you want to find. People may be unfamiliar with this data structure. Although they seldom contact it at ordinary times, it doesIt is a data structure with good performance in all aspects. It can support fast query, insert and delete operations. It is also much easier to develop than red and black trees..

Why is the watch jumping so high? How does it “jump”? Jumping tables use the idea of dichotomy, which can be used to search quickly in arrays and in linked lists.

For example, the linked list is as follows:

图片描述

Let’s assume that to find 10 nodes, we need to traverse one by one to determine whether we are looking for nodes. How can efficiency be improved? Mysql Index is believed to be familiar to everyone and can improve efficiency. It can also be used here. Pull out an index layer:

图片描述

In this way, it is only necessary to find 9 and then 10, thus greatly saving the searching time.

You can also extract another layer of indexes to save time better:

图片描述

This “binary search” based on linked list supports fast insertion and deletion with time complexity of O(logn).

Due to the fast lookup efficiency of skip tables and the simplicity and readability of implementation. So Redis gave up the red and black trees and chose a simpler jump table.

Jumping table in Redis:

typedef struct zskiplist {
     // 表头节点和表尾节点
    struct zskiplistNode *header, *tail;
    // 表中节点的数量
    unsigned long length;
    // 表中层数最大的节点的层数
    int level;
 } zskiplist;
typedef struct zskiplistNode {
    // 成员对象
    robj *obj;
    // 分值
    double score;
     // 后退指针
    struct zskiplistNode *backward;
    // 层
    struct zskiplistLevel {
        // 前进指针
        struct zskiplistNode *forward;
         // 跨度---前进指针所指向节点与当前节点的距离
        unsigned int span;
    } level[];
} zskiplistNode;

Zadd—zslinsert— average O(logN), worst O(N)

Zrem—zsldelete— average O(logN), worst O(N)

Zrank–zslGetRank— average O(logN), worst O(N)

Summary

This article briefly introduces the underlying implementations of five common data types of Redis, hoping that you can learn more about them in combination with source code and data.

The beauty of data structure is fully reflected in Redis. From String to compressed list, fast list, hash list and skip list, these data structures are applicable to different places and perform their respective duties.

Not only that, Redis upgrades and combines these data structures to maximize the efficiency and performance of memory storage. It is for this reason that Redis can become an indispensable high-performance, second-class key-value memory database for many Internet companies.

By Yang Heng

Expand reading:Redis Gossip (1): Building Knowledge Map

Source: Yixin Institute of Technology