redis源码分析——7、对象object
一、对象的类型与编码
1. 对象的定义
redis并没有直接使用前面介绍的数据结构,而是数据结构之上封装了一层struct,定义如下:
typedef struct redisObject {
unsigned type:4; // 数据类型,常见的有5种
unsigned encoding:4; // 每种类型的至少有两种编码
unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
* LFU data (least significant 8 bits frequency
* and most significant 16 bits access time). */
int refcount; // 引用计数
void *ptr; // 具体的数据
} robj;
可以看到,用type
做了区分,具体的数据用一个void *
来存储,这么做的好处是对外提供了统一的接口,类似于面向对象中的接口。
2. 对象的类型
我们熟知redis有5种基本类型,这5种类型对应了上述结构中的5种type
/* The actual Redis Object */
#define OBJ_STRING 0 /* String object. */
#define OBJ_LIST 1 /* List object. */
#define OBJ_SET 2 /* Set object. */
#define OBJ_ZSET 3 /* Sorted set object. */
#define OBJ_HASH 4 /* Hash object. */
除了这5种以为,还有module
、stream
等扩展类型。我们可以用type
命令查看对象是什么类型。
3. 对象的编码
redis是一内存数据库,对内存要求严格,为了节省内存,对于redis的5种结构,每一种至少有2种编码方式,除了string
以外,其余结构都有相应的zip**结构编码。
#define OBJ_ENCODING_RAW 0 /* Raw representation 原始字符串 */
#define OBJ_ENCODING_INT 1 /* Encoded as integer 整形 */
#define OBJ_ENCODING_HT 2 /* Encoded as hash table 字典*/
#define OBJ_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6 /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7 /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8 /* Embedded sds string encoding 内嵌字符串 */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
我们可以用object encoding
来查处对象的编码类型。
二、字符串对象
字符串有3种编码方式——int、raw和embstr。
- int: 如果一个对象是整数,且这个数可以用long表示,那么它的编码就是int
- embstr: 如果一个字符串长度小于44,那么它的编码就是embstr,embstr是只读的
- 除此之外的其他场景都是raw编码
1. embstr的最大长度
redis 6.0的源码中定义了#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
,也就是说embstr的最大长度是44
typedef struct redisObject {
unsigned type:4;
unsigned encoding:4;
unsigned lru:LRU_BITS;
int refcount;
void *ptr;
} robj
对于robj
结构,在64位机下占用的内存是16字节,用jemalloc申请内存时,一次申请了64字节,这么一来相当于是多了64-16=48字节。
struct __attribute__ ((__packed__)) sdshdr8 {
uint8_t len; /* used */
uint8_t alloc; /* excluding the header and null terminator */
unsigned char flags; /* 3 lsb of type, 5 unused bits */
char buf[];
};
sds的sdshdr
结构占了3字节,所以此时可用的空间就是64-16-3=45字节。而redis的字符串为了兼容C字符串,末尾也会加\0
,所以能够有效存储数据的就是64-16-3-1=44字节了。
需要注意的是,早期的redis版本中sds只有一种结构,且长度是8字节,所以embstr的最大长度就是64-16-8-1=39字节,所以定义了#define REDIS_ENCODING_EMBSTR_SIZE_LIMIT 39
。
2. 编码转换
int和embstr都是string的紧凑存储,他们在一定的条件下会发生转换。
127.0.0.1:6379> set age 10
OK
127.0.0.1:6379> object encoding age
"int"
127.0.0.1:6379> incr age
(integer) 11
127.0.0.1:6379> get age
"11"
127.0.0.1:6379> object encoding age
"int"
127.0.0.1:6379> append age 0
(integer) 3
127.0.0.1:6379> get age
"110"
127.0.0.1:6379> object encoding age
"raw"
127.0.0.1:6379> decr age
(integer) 109
127.0.0.1:6379> get age
"109"
127.0.0.1:6379> object encoding age
"int"
127.0.0.1:6379> set name zhangsan
OK
127.0.0.1:6379> object encoding name
"embstr"
127.0.0.1:6379> append name 'hh'
(integer) 10
127.0.0.1:6379> get name
"zhangsanhh"
127.0.0.1:6379> object encoding name
"raw"
127.0.0.1:6379>
注意的是embstr其实是只读类型,对embstr任何操作都会转换成raw格式。
3. 源码分析
三、列表对象
1. 编码转换
早期的redis版本中,列表可以由ziplist和两种编码方式,但是后来这两种都没有了,只有一种quicklist编码,所以也就不存在编码转换的问题。
2. 源码分析
void listTypeInsert(listTypeEntry *entry, robj *value, int where) {
if (entry->li->encoding == OBJ_ENCODING_QUICKLIST) {
value = getDecodedObject(value);
sds str = value->ptr;
size_t len = sdslen(str);
if (where == LIST_TAIL) {
quicklistInsertAfter((quicklist *)entry->entry.quicklist,
&entry->entry, str, len);
} else if (where == LIST_HEAD) {
quicklistInsertBefore((quicklist *)entry->entry.quicklist,
&entry->entry, str, len);
}
decrRefCount(value);
} else {
serverPanic("Unknown list encoding");
}
}
可以看到,函数入口处判断了encoding类型,这种判断存在于所有的命令中。
四、哈希对象
哈希对象编码有ziplist和hashtable两种
1. 编码转换
使用ziplist需要同时满足两个条件,只要有一个条件不满足,就会升级成hashtable编码
- 哈希对象的键值对个数小于512
- 哈希对象的所有键和值的长度都小于64字节
以上两个条件的的阈值其实是可以修改的,配置文件中分别有hash-max-ziplist-entries
和hash-max-ziplist-value
两个选项
# Hashes are encoded using a memory efficient data structure when they have a
# small number of entries, and the biggest entry does not exceed a given
# threshold. These thresholds can be configured using the following directives.
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
需要注意的ziplist和hashtable之间的转换是单向的,只能从ziplist升级到hashtable,反之则不行。
127.0.0.1:6379> hset h1 name zhangsan
(integer) 1
127.0.0.1:6379> object encoding h1
"ziplist"
127.0.0.1:6379> hset h1 address aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
(integer) 1
127.0.0.1:6379> object encoding h1
"hashtable"
127.0.0.1:6379> hset h1 address shenzhen
(integer) 0
127.0.0.1:6379> object encoding h1
"hashtable"
127.0.0.1:6379>
2. 源码分析
/* Check the length of a number of objects to see if we need to convert a
* ziplist to a real hash. Note that we only check string encoded objects
* as their string length can be queried in constant time. */
void hashTypeTryConversion(robj *o, robj **argv, int start, int end) {
int i;
if (o->encoding != OBJ_ENCODING_ZIPLIST) return;
for (i = start; i <= end; i++) {
if (sdsEncodedObject(argv[i]) &&
sdslen(argv[i]->ptr) > server.hash_max_ziplist_value)
{
hashTypeConvert(o, OBJ_ENCODING_HT);
break;
}
}
}
void hsetCommand(client *c) {
int i, created = 0;
robj *o;
if ((c->argc % 2) == 1) {
addReplyError(c,"wrong number of arguments for HMSET");
return;
}
if ((o = hashTypeLookupWriteOrCreate(c,c->argv[1])) == NULL) return;
hashTypeTryConversion(o,c->argv,2,c->argc-1);
for (i = 2; i < c->argc; i += 2)
created += !hashTypeSet(o,c->argv[i]->ptr,c->argv[i+1]->ptr,HASH_SET_COPY);
/* HMSET (deprecated) and HSET return value is different. */
char *cmdname = c->argv[0]->ptr;
if (cmdname[1] == 's' || cmdname[1] == 'S') {
/* HSET */
addReplyLongLong(c, created);
} else {
/* HMSET */
addReply(c, shared.ok);
}
signalModifiedKey(c,c->db,c->argv[1]);
notifyKeyspaceEvent(NOTIFY_HASH,"hset",c->argv[1],c->db->id);
server.dirty++;
}
五、集合对象
集合对象的编码有intset和hashtable两种
1. 编码转换
使用intset需要同时满足两个条件,只要有一个条件不满足,就会升级成hashtable编码
- 集合对象保存的所有元素都是小于64位整数
- 集合对象保存的元素个数少于152
元素个数少于512这个阈值是可以修改的,对应的配置文件中的set-max-intset-entries 512
选项
# Sets have a special encoding in just one case: when a set is composed
# of just strings that happen to be integers in radix 10 in the range
# of 64 bit signed integers.
# The following configuration setting sets the limit in the size of the
# set in order to use this special memory saving encoding.
set-max-intset-entries 512
需要注意的是intset和hashtable之间的转换是单向的,只能从ziplist升级到hashtable,反之则不行
127.0.0.1:6379> sadd s1 100 200
(integer) 2
127.0.0.1:6379> object encoding s1
"intset"
127.0.0.1:6379> sadd s1 zhangsan
(integer) 1
127.0.0.1:6379> object encoding s1
"hashtable"
127.0.0.1:6379>
2. 源码解析
/* Convert the set to specified encoding. The resulting dict (when converting
* to a hash table) is presized to hold the number of elements in the original
* set. */
void setTypeConvert(robj *setobj, int enc) {
setTypeIterator *si;
serverAssertWithInfo(NULL,setobj,setobj->type == OBJ_SET &&
setobj->encoding == OBJ_ENCODING_INTSET);
if (enc == OBJ_ENCODING_HT) {
int64_t intele;
dict *d = dictCreate(&setDictType,NULL);
sds element;
/* Presize the dict to avoid rehashing */
dictExpand(d,intsetLen(setobj->ptr));
/* To add the elements we extract integers and create redis objects */
si = setTypeInitIterator(setobj);
while (setTypeNext(si,&element,&intele) != -1) {
element = sdsfromlonglong(intele);
serverAssert(dictAdd(d,element,NULL) == DICT_OK);
}
setTypeReleaseIterator(si);
setobj->encoding = OBJ_ENCODING_HT;
zfree(setobj->ptr);
setobj->ptr = d;
} else {
serverPanic("Unsupported set conversion");
}
}
/* Add the specified value into a set.
*
* If the value was already member of the set, nothing is done and 0 is
* returned, otherwise the new element is added and 1 is returned. */
int setTypeAdd(robj *subject, sds value) {
long long llval;
if (subject->encoding == OBJ_ENCODING_HT) {
dict *ht = subject->ptr;
dictEntry *de = dictAddRaw(ht,value,NULL);
if (de) {
dictSetKey(ht,de,sdsdup(value));
dictSetVal(ht,de,NULL);
return 1;
}
} else if (subject->encoding == OBJ_ENCODING_INTSET) {
if (isSdsRepresentableAsLongLong(value,&llval) == C_OK) {
uint8_t success = 0;
subject->ptr = intsetAdd(subject->ptr,llval,&success);
if (success) {
/* Convert to regular set when the intset contains
* too many entries. */
if (intsetLen(subject->ptr) > server.set_max_intset_entries)
setTypeConvert(subject,OBJ_ENCODING_HT);
return 1;
}
} else {
/* Failed to get integer from object, convert to regular set. */
setTypeConvert(subject,OBJ_ENCODING_HT);
/* The set *was* an intset and this value is not integer
* encodable, so dictAdd should always work. */
serverAssert(dictAdd(subject->ptr,sdsdup(value),NULL) == DICT_OK);
return 1;
}
} else {
serverPanic("Unknown set encoding");
}
return 0;
}
六、有序集合对象
有序集合点编码有ziplist和skiplist两种,虽然有序集合对象的底层实现同时用了skiplist和hashtable,但是encoding只有skiplist
1. 编码转换
使用ziplist需要同时满足两个条件,只有有一个条件不满足,就会升级成skiplist+hashtable
- 有序集合对象保存的元素个数少于128
- 有序集合对象的所有元素成员的长度小于64字节
以上两个条件的阈值都可以修改的,在配置文件中分别对应zset-max-ziplist-entries
和zset-max-ziplist-value
# Similarly to hashes and lists, sorted sets are also specially encoded in
# order to save a lot of space. This encoding is only used when the length and
# elements of a sorted set are below the following limits:
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
需要注意的是ziplist和hashtable之间的转换是单向的,只能从ziplist升级到hashtable,反之则不行
127.0.0.1:6379> zadd z1 100 zhangsan
(integer) 1
127.0.0.1:6379> object encoding z1
"ziplist"
127.0.0.1:6379>
127.0.0.1:6379> zadd z1 99 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
(integer) 1
127.0.0.1:6379> object encoding z1
"skiplist"
127.0.0.1:6379>
2. 源码分析
/* Convert the sorted set object into a ziplist if it is not already a ziplist
* and if the number of elements and the maximum element size is within the
* expected ranges. */
void zsetConvertToZiplistIfNeeded(robj *zobj, size_t maxelelen) {
if (zobj->encoding == OBJ_ENCODING_ZIPLIST) return;
zset *zset = zobj->ptr;
if (zset->zsl->length <= server.zset_max_ziplist_entries &&
maxelelen <= server.zset_max_ziplist_value)
zsetConvert(zobj,OBJ_ENCODING_ZIPLIST);
}
上一篇: Redis的RESP协议
下一篇: MD5、对称加密、非对称加密