Note：本文实际绑定的版本是branch5.0（2018-7-25）。

数据结构

SDS 简单动态字符串

基本特点

涉及字符串的存储基本都基于SDS，例如set nickname Luffy，就创建了nicknmae和Luffy2个SDS
不止用于字符串，还用于缓冲区：AOF缓冲区、客户端状态中的输入缓冲区
减少修改字符串长度时所需的内存重分配次数
杜绝缓冲区溢出问题
二进制安全
兼容部分C字符串函数

源码位置

sds.h
sds.c

结构定义

typedef char *sds;

/* Note: sdshdr5 is never used, we just access the flags byte directly.
 * However is here to document the layout of type 5 SDS strings. */
struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};

sds被定义为5种，根据sds的len属性的bit数量划分：5、8、16、32、64bits，5bits的结构没有被使用。
紧凑对齐：attribute ((packed)) 告诉编译器取消结构在编译过程中的优化对齐,按照实际占用字节数进行对齐，是GCC特有的语法。
char buf[]很有意思，这样子声明buf，是不会增大sizeof(sdshdr#T)的，因为编译器不知道buf的长度，默认0，而如果声明成指针，就会占4或8字节。
综上，5个sds结构的size分别为1、3、5、9、17
typedef char *sds，指向的是buf字段，如果要访问sdshdr的len、alloc、flags，是用宏SDS_HDR来定位sdshdr的首字节。
sds依然遵守C语言用'\0'（null terminator）结尾的习惯，使得sds可以使用C字符串函数库。
len相当于strlen(buf) - 1，记录实际使用了buf的多少字节
alloc记录buf的容量，不含'\0'
'\0'由sds自动处理，用户不会感知到'\0'；len、alloc也是
因为有len字段，获取sds长度时间复杂度为O(1)
len、alloc字段能防止缓冲区溢出
alloc - len = avail, avail有效地降低了频繁内存重分配的开销。这种策略叫空间预分配、惰性空间释放。

接口说明

sds.h直接定义的简单接口：

size_t sdslen(const sds s)：获得s的len字段
size_t sdsavail(const sds s)：返回这个s的空闲空间字节长度
sdssetlen(sds s, size_t newlen)：直接设置s的len字段（不验证上限的）
sdsinclen(sds s, size_t inc)：s的len增加inc（不验证上限的）
size_t sdsalloc(const sds s)：获得s的alloc字段（并不是分配一个sds）
sdssetalloc(sds s, size_t newlen)：直接设置s的alloc字段

在sds.c定义的简单接口：

int sdsHdrSize(char type)：根据SDS_TYPE_xx，返回对应的sdshdr结构的sizeof
char sdsReqType(size_t string_size)：根据给定大小，返回对应的sdshdr的类型SDS_TYPE_xx，策略是从小到大匹配
size_t sdsAllocSize(sds s)：等于sdsalloc(s) + 头部长度 + 1，即sds总共占了多少内存
void *sdsAllocPtr(sds s)：即得到s的sdshdr头部指针
sds sdsempty(void)：实际调用sdsnewlen("",0);
sds sdsnew(const char *init)：把C字符串转成sds
sds sdsdup(const sds s)：克隆一个sds

核心接口：

sds sdsnewlen(const void *init, size_t initlen)：创建一个新的sds对象，内容为init，len字段为initlen
void sdsfree(sds s)：销毁一个sds
void sdsupdatelen(sds s)：更新s的len为strlen的结果，如果sds某个字节被置0，就会导致strlen长度发生变化。
void sdsclear(sds s)：清空sds，但不释放sds内存
sds sdsMakeRoomFor(sds s, size_t addlen)：增大sds的alloc，但不影响已存内容和len字段，会返回新的sds(地址发生改变)
sds sdsRemoveFreeSpace(sds s)：缩小sds的alloc，使得没有avail空间（100%利用率），会返回新的sds
void sdsIncrLen(sds s, ssize_t incr)：和sdsinclen类似，区别在于加了assert防止超过上限，以及会把newlen字节置0
sds sdsgrowzero(sds s, size_t len)：oldlen增大到len，len - oldlen这段空间会自动置0
sds sdscatlen(sds s, const void *t, size_t len)：即concat操作
sds sdscat(sds s, const char *t)：用strlen(t)去调用sdscatlen
sds sdscatsds(sds s, const sds t)：concat2个sds
sds sdscpylen(sds s, const char *t, size_t len)：把t复制进sds
sds sdscpy(sds s, const char *t)：用strlen(t)去调用sdscpylen

其他接口：

int sdsll2str(char *s, long long value)：long long转成字符串并放进s缓冲区
int sdsull2str(char *s, unsigned long long v)：同上
sds sdsfromlonglong(long long value) ：long long转成sds
sds sdscatvprintf(sds s, const char *fmt, va_list ap) ：打印相关
sds sdscatprintf(sds s, const char *fmt, ...)：打印相关
sds sdscatfmt(sds s, char const *fmt, ...)：格式化相关
sds sdstrim(sds s, const char *cset) ：trim掉头尾连续字符串，用cset识别
void sdsrange(sds s, ssize_t start, ssize_t end)：返回s的子串
void sdstolower(sds s)：转小写
void sdstoupper(sds s)：转大写
int sdscmp(const sds s1, const sds s2)：比较两个sds
sds *sdssplitlen(const char *s, ssize_t len, const char *sep, int seplen, int *count)：用指定的sep字符串切割sds
void sdsfreesplitres(sds *tokens, int count)：sdssplitlen、sdssplitargs后要调用这个释放数组
sds sdscatrepr(sds s, const char *p, size_t len) ：concat p串，p会被转成escaped的串
int is_hex_digit(char c)：判断是不是hex字符
int hex_digit_to_int(char c)：hex字符转int
sds *sdssplitargs(const char *line, int *argc)：把用空格间隔的参数列表转成sds数组
sds sdsmapchars(sds s, const char *from, const char *to, size_t setlen) ：sdsmapchars(mystring, "ho", "01", 2)，"hello" -> "0ell1"
sds sdsjoin(char **argv, int argc, char *sep)：join一组C字符串
sds sdsjoinsds(sds *argv, int argc, const char *sep, size_t seplen)：join一组SDS对象

adlist 泛型双向链表

基本特点

增加一个len字段用来记录链表长度，使得获取链表长度O(1)

泛型原理：

value字段用void*类型，使得可以存任意类型的对象
list结构里存了3个函数指针：dup、free、match，使得不同类型的list可以自定义节点的复制、销毁、匹配函数

源码位置

adlist.h
adlist.c

结构定义

/* Node, List, and Iterator are the only data structures used currently. */

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

typedef struct listIter {
    listNode *next;
    int direction;
} listIter;

typedef struct list {
    listNode *head;
    listNode *tail;
    void *(*dup)(void *ptr);
    void (*free)(void *ptr);
    int (*match)(void *ptr, void *key);
    unsigned long len;
} list;

接口说明

list *listCreate(void); 创建新的list对象
void listEmpty(list *list); 清空list里的所有元素
void listRelease(list *list); 调用了listEmpty(list)、zfree(list)
list *listAddNodeHead(list *list, void *value); 创建一个存value的node并插在链表头
list *listAddNodeTail(list *list, void *value);创建一个存value的node并插在链表尾
list *listInsertNode(list *list, listNode *old_node, void *value, int after); 创建一个存value的node并插在old_node的前或后
void listDelNode(list *list, listNode *node); 删node
listIter *listGetIterator(list *list, int direction); 创建迭代器
listNode *listNext(listIter *iter); 根据迭代方向步进1
void listReleaseIterator(listIter *iter);删迭代器
void listRewind(list *list, listIter *li); 重置迭代器到链表头
void listRewindTail(list *list, listIter *li);重置迭代器到链表尾
list *listDup(list *orig); 复制整个链表，会调用dup接口
listNode *listSearchKey(list *list, void *key); 找出包含key的节点，会调用match接口
listNode *listIndex(list *list, long index); 模拟数组下标操作
void listRotate(list *list); 把tail移到head
void listJoin(list *l, list *o); 把整个链表o插到l末尾，并empty掉o

dict 泛型字典

基本特点

字典基于散列表
每个dict有2个dictht，用来实现rehashing，即扩容操作
rehashing是渐进式的，不是在瞬间做完
键冲突的解决方案是chaining，于是每个dictEntry都有一个next指针，用来构成同哈希键链表
因为只有next指针，所以chain是单向的，为了加速插入性能，后加入的kv，会插到链表头而不是链表尾
rehash和chaining的特点，使得rehashing时，新的表项都插入到dt[1]，而dt用chaing没有容量限制，于是rehashing过程总能完成

源码位置

dict.h
dict.c
siphash.c

结构定义

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

// dictType实现泛型，在server.c里有各种dictType的定义
typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

typedef struct dictht {
    dictEntry **table; // 表项
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;


typedef struct dictIterator {
    dict *d;
    long index;
    int table, safe;
    dictEntry *entry, *nextEntry;
    /* unsafe iterator fingerprint for misuse detection. */
    long long fingerprint;
} dictIterator;

typedef void (dictScanFunction)(void *privdata, const dictEntry *de);
typedef void (dictScanBucketFunction)(void *privdata, dictEntry **bucketref);

/* This is the initial size of every hash table */
#define DICT_HT_INITIAL_SIZE     4

关键逻辑

rehashing

扩容和收缩是通过dictExpand。

dictExpand让一个dict扩大或缩小到size（实际是_dictNextPower(size)），前提条件是used数要小于size，不然就意味着新的size不足以容纳现有的元素。

dictExpand最终会设置d->ht[1] = n 和 d->rehashidx = 0; 启动rehashing程序。

之后，dict.c里有大量地方用dictIsRehashing宏，走不同的逻辑，处理rehashing。

rehashing的迭代接口是dictRehash，参数n表示此次要迭代多少步。

调用dictRehash的地方：

_dictRehashStep里调用dictRehash(d,1) dictRehashMilliseconds里调用dictRehash(d,100)，超时了就结束迭代。dictRehashMilliseconds只在server.c里调用。

调用dictExpand的地方：

dictResize：把dict缩小到used大小，使得空间利用率尽量靠近1。dictResize并不在dict.c调用，所以是由上层策略决定的，实际上是根据htNeedsResize，即ratio小于0.1时，就会开始收缩。
_dictKeyIndex：每次调用都会执行_dictExpandIfNeeded，如果used已经大于size且两者之比ratio大于5，就调用dictExpand开始扩容（说明有很多chain了）

每次dictExpand，都会创建一个dictht，并分配空间：

n.table = zcalloc(realsize*sizeof(dictEntry*)); // realsize个桶

dictRehash的rehashing完成时：

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }

缺省哈希函数：siphash

缺省哈希函数是siphash，以前是murmurhash2。

跟踪dictType的hashFunction调用，最终会找到dict.c里的：

uint64_t siphash(const uint8_t *in, const size_t inlen, const uint8_t *k);
uint64_t siphash_nocase(const uint8_t *in, const size_t inlen, const uint8_t *k);

这2个函数是在 siphash.c里定义的，siphash.c有且只有这2个函数。

k是dict_hash_function_seed，一个dict.c里的静态全局变量，在服务启动时初始化。

zskiplist 跳跃表

基本特点

用来存放有序集合（zset）
集合元素不能重复，但分值score可以重复
最高level是64，旧版本是32（ZSKIPLIST_MAXLEVEL）

源码位置

server.h
t_zset.c

结构定义

typedef struct zskiplistNode {
    sds ele;
    double score;
    struct zskiplistNode *backward;
    struct zskiplistLevel {
        struct zskiplistNode *forward;
        unsigned long span;
    } level[];
} zskiplistNode;

typedef struct zskiplist {
    struct zskiplistNode *header, *tail;
    unsigned long length;
    int level;
} zskiplist;

表头节点是特殊的，不存数据（ele、score无用）。

资料

精简的skiplist实现：

https://github.com/begeekmyfriend/skiplist

intset 整数集合

基本特点

用紧凑的数组结构来存放集合元素，缓存命中率高
插入、删除性能不会太好，会触发realloc，如果是中间元素，还会触发大段memcpy
查找性能极好，因为数据有序，是基于数组的迭代式的二分查找
元素不能重复，省却了大量插入开销
升级逻辑使得存放int16、int32的开销要比int64小得多，节约内存
不支持降级

源码位置

intset.h
intset.c

结构定义

typedef struct intset {
    uint32_t encoding;
    uint32_t length;
    int8_t contents[];
} intset;

接口说明

内部接口：

uint8_t _intsetValueEncoding(int64_t v) 根据v的值范围，返回INTSET_ENC_INT64/INTSET_ENC_INT32/INTSET_ENC_INT16

对外接口：

intset *intsetNew(void); zmalloc一个intset，缺省编码是INTSET_ENC_INT16，contents为空
intset *intsetAdd(intset *is, int64_t value, uint8_t *success); 看下文。
intset *intsetRemove(intset *is, int64_t value, int *success); 移除一个元素，会引起该元素之后的所有元素的整体移动；不会降级
uint8_t intsetFind(intset *is, int64_t value); 二分查找一个元素
int64_t intsetRandom(intset *is); 随机返回一个元素，基于random()
uint8_t intsetGet(intset *is, uint32_t pos, int64_t *value); 返回指定元素
uint32_t intsetLen(const intset *is); 返回is->length
size_t intsetBlobLen(intset *is); 返回is总共占内存多少字节

关键逻辑

intsetAdd(intset is, int64_t value, uint8_t success);

先获取value的编码，然后对比is的编码，如果value编码大于is的编码，那么转而执行intsetUpgradeAndAdd(is,value)，没有传入success是因为value肯定不在集合中，毕竟越界了；如果value编码小于等于is的编码，就要检查value是不是已经在集合中（调用intsetSearch(is,value,&pos)），如果在，那么中止，如果不在，pos会设成value可以插入的位置，intsetResize这个is到length + 1，并判断pos是不是小于newlength - 1，是的话意味着插入位置后面还有元素，那么要把pos后的元素全体向右挪1（intsetMoveTail）。

最后，调用_intsetSet(is,pos,value)，把value放进pos位置，并更新is->length。

intset intsetUpgradeAndAdd(intset is, int64_t value)

先intsetResize扩容到length+1，然后从后到前把旧数据刷到新的位置，防止覆盖；刷完后，必然剩一个位置，或者在最前或者在最后，根据value的正负，把value插到头或尾即可。

升级后就再不会降级的。

ifbe

在这个intset.c里经常看到这个后缀，含义是 if big endian（是否是大端字节序）。

ziplist 压缩列表

基本特点

通过在每个entry维护一个prevlen字段，使得逆向遍历链表变成可能（双向链表）
删除节点可能会引发连锁更新，因为prevlen不一定够空间存长度，这样会导致连锁性的内存重分配
压缩列表的压缩体现在对于小整数的存储是高度优化的，减少了大量空间浪费

源码位置

ziplist.h
ziplist.c

结构定义

typedef struct zlentry {
    unsigned int prevrawlensize; // 上一节点头部长度
    unsigned int prevrawlen;    // 上一节点内容部分的长度
    unsigned int lensize;       // 当前节点头部长度
    unsigned int len;           // 当前节点内容部分的长度
    unsigned int headersize;     // 上一节点头部长度 + 当前节点头部长度
    unsigned char encoding;      // 编码
    unsigned char *p;            /* points to prev-entry-len field
} zlentry;

实际上zlentry不是直接用的，这相当于是个结构说明。

表头没有结构，而是用三个宏访问：

/* Return total bytes a ziplist is composed of. */
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))

/* Return the offset of the last item inside the ziplist. */
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))

/* Return the length of a ziplist, or UINT16_MAX if the length cannot be
 * determined without scanning the whole ziplist. */
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))

表头：

zlbytes：表示这个ziplist占用内存多少字节
zltail：表示这个ziplist头到最后一个entry的偏移，初始偏移量等于头部大小（10字节）
zllen：表示这个ziplist中节点的数量

表头占4+4+2 = 10字节，表尾1字节（255)。

注释翻译

ziplist.c有一大段注释值得一看，翻译下。

/*
 * ZIPLIST ENTRIES 列表项
 * ===============
 *
 * Every entry in the ziplist is prefixed by metadata that contains two pieces
 * of information. First, the length of the previous entry is stored to be
 * able to traverse the list from back to front. Second, the entry encoding is
 * provided. It represents the entry type, integer or string, and in the case
 * of strings it also represents the length of the string payload.
 * So a complete entry is stored like this:
 * ziplist的每一entry都带了一个metadata前缀，metadata里有2个信息。一个是，上一entry
 * 的长度，用于从后向前遍历列表；另一个是entry的编码encoding，编码记录了entry类型
 * （整数/字符串），如果是字符串的话那么还记录了字符串的长度。
 * 所有完整的entry的存储结构如下：
 * <prevlen> <encoding> <entry-data>
 *
 * Sometimes the encoding represents the entry itself, like for small integers
 * as we'll see later. In such a case the <entry-data> part is missing, and we
 * could have just:
 * 有时候编码已经表示了entry本身，例如小整数，这种情况下，entry-data是没有的：
 * <prevlen> <encoding>
 *
 * The length of the previous entry, <prevlen>, is encoded in the following way:
 * If this length is smaller than 254 bytes, it will only consume a single
 * byte representing the length as an unsinged 8 bit integer. When the length
 * is greater than or equal to 254, it will consume 5 bytes. The first byte is
 * set to 254 (FE) to indicate a larger value is following. The remaining 4
 * bytes take the length of the previous entry as value.
 * prevlen被用如下方式编码： 
 * 如果长度小于254字节，用1个字节就足够表示了；当长度大于等于254，那么需要5字节，
 * 第1个字节被设为254(0xFE)，表示余下4字节才是真正存了长度信息的。
 * (这里说明下，不是以255为分界的原因是，255是用来表达ziplist的结尾的，有特殊含义。)
 * So practically an entry is encoded in the following way:
 * 所以entry实际上是这样编码的：
 *
 * <prevlen from 0 to 253> <encoding> <entry>
 *
 * Or alternatively if the previous entry length is greater than 253 bytes
 * the following encoding is used:
 * 或者这样（上一entry长度大于等于254字节时）：
 *
 * 0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
 *
 * The encoding field of the entry depends on the content of the
 * entry. When the entry is a string, the first 2 bits of the encoding first
 * byte will hold the type of encoding used to store the length of the string,
 * followed by the actual length of the string. When the entry is an integer
 * the first 2 bits are both set to 1. The following 2 bits are used to specify
 * what kind of integer will be stored after this header. An overview of the
 * different types and encodings is as follows. The first byte is always enough
 * to determine the kind of entry.
 * encoding字段的最左2bits，保存了编码类型，00/01/10是字符串类型，11是整数类型。
 * 如果是11，那么接下来的2bits，保存了整数子类型。
 * encoding的总长度是动态的，取决于encoding首字节里的定义。
 *
 * |00pppppp| - 1 byte
 *      String value with length less than or equal to 63 bytes (6 bits).
 *      "pppppp" represents the unsigned 6 bit length.
 *      00：字符串长度小于等于63字节（6bits能表示的范围），长度信息存在pppppp里。
 * |01pppppp|qqqqqqqq| - 2 bytes
 *      String value with length less than or equal to 16383 bytes (14 bits).
 *      IMPORTANT: The 14 bit number is stored in big endian.
 *      01：字符串长度小于等于16383字节（14bits能表示的范围）
 *      注意，这个14bits数字是用大端字节序的
 * |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
 *      String value with length greater than or equal to 16384 bytes.
 *      Only the 4 bytes following the first byte represents the length
 *      up to 32^2-1. The 6 lower bits of the first byte are not used and
 *      are set to zero.
 *      IMPORTANT: The 32 bit number is stored in big endian.
 *      10：字符串长度大于16384字节。
 *      不过这次pppppp没有被使用，而是置0，另外加了4个字节来存长度
 *      注意，4字节的长度是用大端字节序的
 * |11 00 0000| - 3 bytes
 *      Integer encoded as int16_t (2 bytes).
 *      双字节的整数
 * |11 01 0000| - 5 bytes
 *      Integer encoded as int32_t (4 bytes).
 *      4字节的整数
 * |11 10 0000| - 9 bytes
 *      Integer encoded as int64_t (8 bytes).
 *      8字节的整数
 * |11 11 0000| - 4 bytes
 *      Integer encoded as 24 bit signed (3 bytes).
 *      3字节的整数
 * |11 11 1110| - 2 bytes
 *      Integer encoded as 8 bit signed (1 byte).
 *      1字节的整数
 * |11 11 xxxx| - (with xxxx between 0000 and 1101) immediate 4 bit integer.
 *      Unsigned integer from 0 to 12. The encoded value is actually from
 *      1 to 13 because 0000 and 1111 can not be used, so 1 should be
 *      subtracted from the encoded 4 bit value to obtain the right value.
 *      后4bits如果不是0000/1110，那么就是表示[0，12]范围内的某一个数。
 *      因为0000有特殊用途了，所以这4bits实际上是存了[1,13]范围的总共13个数。
 *      为了能表示[0,12]，需要做一个-1的操作。所以实际是存[0001, 1101]
 * |11 11 1111| - End of ziplist special entry.
 *      这个就不是什么整数了，而是ziplist的end标识
 *
 * Like for the ziplist header, all the integers are represented in little
 * endian byte order, even when this code is compiled in big endian systems.
 * 所有证书都用小端字节序，即使是在大端字节序系统里编译也是。
 *
 * EXAMPLES OF ACTUAL ZIPLISTS
 * ===========================
 * 实例：
 *
 *  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
 *        |             |          |       |       |     |
 *     zlbytes        zltail    entries   "2"     "5"   end
 * （这个图是前3个[]用大端字节序表示的，[]里的最左边字节才是最低位的）
 * zlbytes是0x0000000f，表示这个ziplist占内存15字节。
 * zltail是0x0000000c，表示从ziplist起始位置到最后1个entry（02 f6），距离是12字节。
 * zllen是0x0002，表示总共存了2个元素。
 * 接下里的就是元素列表了，[00 f3]，00是prevlen，因为是首个元素，所以为0；
 * f3是encoding，|1111 0011|，根据上面的编码规则，可以知道这是个小整数，
 * 0011需要再减1，得到2。
 * 接下来是[02 f6]，02是prevlen，表示上一entry占2个字节；
 * f6是encoding，|1111 0110|，根据上面的编码规则，可以知道这是个小整数，
 * 0110需要再减1，得到5。
 * 最后，[ff]表示ziplist到此结束。
 *
 * 再演示怎么编码一个短字符串"Hello World"。
 * 把下面这串东西插入到上面的[02 f6]后面：
 *
 * [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
 *
 * [02]表示prevlen；[0b]是encoding |0000 1011|，前2bits表示这是个字符串，
 * 后6bits表示字符串长度11字节；
 * 后面的那串就是正文"Hello World"了。
 */

quicklist 泛型双重双向链表

基本特点

基于ziplist的扩展list，平衡了内存连续性和随机插入删除问题
支持对中间节点做压缩，依据是list通常只对表头表尾做操作而不是中间节点

源码位置

quicklist.h
quicklist.c

quicklistNode

32字节对齐
双向链表有prev、next指针
zl：指向一个ziplist或者指向一个quicklistLZF
sz：存了ziplist的占用字节数，是压缩前大小
count：记录ziplist里元素数量，不超过32k
encodig：就是标识是否有压缩
container：
recompress：1表示这个node是暂时被解压缩出来使用的
attempted_compress：1表示这个节点大小太小了，不能压缩
extra：备用

quicklistLZF

typedef struct quicklistLZF {
    unsigned int sz; /* LZF size in bytes*/
    char compressed[];
} quicklistLZF;

sz记录compressed字节长度，即压缩后大小
用LZF算法压缩

quicklist

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count; // 总entry个数
    unsigned long len; // quicklistNodes个数
    int fill : 16; // list-max-ziplist-size
    unsigned int compress : 16; // list-compress-depth
} quicklist;

参考资料：

https://blog.csdn.net/z69183787/article/details/81121779

(未经授权禁止转载)

Written on July 24, 2018

写作不易，您的支持是我写作的动力！

redis5.0源码学习笔记（1）基础数据结构

Tags: redis

数据结构

SDS 简单动态字符串

基本特点

源码位置

结构定义

接口说明

adlist 泛型双向链表

基本特点

源码位置

结构定义

接口说明

dict 泛型字典

基本特点

源码位置

结构定义

关键逻辑

rehashing

缺省哈希函数：siphash

zskiplist 跳跃表

基本特点

源码位置

结构定义

资料

intset 整数集合

基本特点

源码位置

结构定义

接口说明

关键逻辑

intsetAdd(intset *is, int64_t value, uint8_t *success);

intset *intsetUpgradeAndAdd(intset *is, int64_t value)

ifbe

ziplist 压缩列表

基本特点

源码位置

结构定义

注释翻译

quicklist 泛型双重双向链表

基本特点

源码位置

quicklistNode

quicklistLZF

quicklist

参考资料：

intsetAdd(intset is, int64_t value, uint8_t success);

intset intsetUpgradeAndAdd(intset is, int64_t value)