Redis source code: the string object
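The string, backed by sds, is the simplest of Redis's five data types. Redis does not use sds directly, though: it wraps every type in an object, so what actually gets passed around inside Redis is the object structure.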
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;
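As you can see, the structure contains the type and encoding fields. type is what we usually call the data type: string, list, hash, set, or zset (sorted set). How the data is actually stored internally is indicated by the encoding field.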
Take the simplest case, sds, as an example. The source uses createStringObject to wrap a string. The relevant code is:
/* Create a string object with EMBSTR encoding if it is smaller than
 * OBJ_ENCODING_EMBSTR_SIZE_LIMIT, otherwise the RAW encoding is
 * used.
 *
 * The current limit of 44 is chosen so that the biggest string object
 * we allocate as EMBSTR will still fit into the 64 byte arena of jemalloc. */
#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
        return createEmbeddedStringObject(ptr,len);
    else
        return createRawStringObject(ptr,len);
}
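The boundary of this dispatch is easy to misread, so here is a minimal standalone sketch of just the length check. The names demo_encoding and demo_choose_encoding are hypothetical and not part of Redis; only the limit of 44 and the `<=` comparison are taken from the code above.

```c
#include <stddef.h>

/* Same limit as OBJ_ENCODING_EMBSTR_SIZE_LIMIT in the excerpt above. */
#define DEMO_EMBSTR_SIZE_LIMIT 44

/* Hypothetical enum for illustration; Redis uses OBJ_ENCODING_* macros. */
typedef enum { DEMO_ENC_RAW, DEMO_ENC_EMBSTR } demo_encoding;

/* Mirrors the branch in createStringObject: the comparison is <=, so a
 * string of exactly 44 bytes is still embedded. */
demo_encoding demo_choose_encoding(size_t len) {
    return (len <= DEMO_EMBSTR_SIZE_LIMIT) ? DEMO_ENC_EMBSTR : DEMO_ENC_RAW;
}
```

Calling demo_choose_encoding(44) yields DEMO_ENC_EMBSTR, while demo_choose_encoding(45) yields DEMO_ENC_RAW.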
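As this shows, when the string length is <= 44 bytes, an embstr is created. So what exactly is an embstr? Looking at the robj structure, the ptr field points to the actual data area (the all-purpose void *). The usual approach would be to create the data object somewhere else and point ptr at it. Let's first look at how a raw string is created.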
/* Create a string object with encoding OBJ_ENCODING_RAW, that is a plain
 * string object where o->ptr points to a proper sds string. */
robj *createRawStringObject(const char *ptr, size_t len) {
    return createObject(OBJ_STRING, sdsnewlen(ptr,len));
}
... ...
robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;
    o->ptr = ptr;
    o->refcount = 1;
    /* Set the LRU to the current lruclock (minutes resolution), or
     * alternatively the LFU counter. */
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }
    return o;
}
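Just as expected: creating a raw string builds the sds with sdsnewlen, allocates the object separately, and points the object's ptr at the sds. This is the usual approach. So what about embstr? Let's look at the source.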
/* Create a string object with encoding OBJ_ENCODING_EMBSTR, that is
 * an object where the sds string is actually an unmodifiable string
 * allocated in the same chunk as the object itself. */
robj *createEmbeddedStringObject(const char *ptr, size_t len) {
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
    struct sdshdr8 *sh = (void*)(o+1);
    o->type = OBJ_STRING;
    o->encoding = OBJ_ENCODING_EMBSTR;
    o->ptr = sh+1;
    o->refcount = 1;
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }
    sh->len = len;
    sh->alloc = len;
    sh->flags = SDS_TYPE_8;
    if (ptr == SDS_NOINIT)
        sh->buf[len] = '\0';
    else if (ptr) {
        memcpy(sh->buf,ptr,len);
        sh->buf[len] = '\0';
    } else {
        memset(sh->buf,0,len+1);
    }
    return o;
}
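As we can see, when the object is created for an embstr, the allocation includes an extra sizeof(struct sdshdr8)+len+1 bytes on top of the robj itself. Why? It exploits the principle of locality: the object and its sds sit in one contiguous block, so reading the object brings the string data in with it. A raw string, by contrast, requires reading the object structure first, then following obj->ptr to the sds address and reading again. So embstr reduces the number of memory reads, doing in one operation what otherwise takes two.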
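To make the single-allocation layout concrete, the following is a minimal sketch, not Redis code. The types demo_obj and demo_hdr8 and the functions demo_embstr_new and demo_data_offset are hypothetical stand-ins for robj, sdshdr8 and createEmbeddedStringObject. LRU handling is omitted, plain malloc replaces zmalloc, the 24-bit lru width is an assumption about LRU_BITS, and the packed attribute assumes GCC or Clang.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of robj (an lru width of 24 bits is assumed). */
typedef struct demo_obj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:24;
    int refcount;
    void *ptr;
} demo_obj;

/* Hypothetical mirror of sdshdr8; packed so the header is exactly 3 bytes. */
struct __attribute__((__packed__)) demo_hdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

/* One malloc holds [demo_obj][demo_hdr8][bytes + '\0'] back to back.
 * Returns NULL if len does not fit the 8-bit header or malloc fails. */
demo_obj *demo_embstr_new(const char *s, size_t len) {
    if (len > UINT8_MAX) return NULL;
    demo_obj *o = malloc(sizeof(demo_obj) + sizeof(struct demo_hdr8) + len + 1);
    if (o == NULL) return NULL;
    struct demo_hdr8 *sh = (void *)(o + 1); /* header starts right after o */
    o->type = 0;                            /* OBJ_STRING */
    o->encoding = 8;                        /* OBJ_ENCODING_EMBSTR */
    o->lru = 0;                             /* LRU/LFU bookkeeping omitted */
    o->refcount = 1;
    o->ptr = sh + 1;                        /* first byte after the header */
    sh->len = (uint8_t)len;
    sh->alloc = (uint8_t)len;
    sh->flags = 1;                          /* SDS_TYPE_8 */
    memcpy(sh->buf, s, len);
    sh->buf[len] = '\0';
    return o;
}

/* Distance in bytes from the object header to its string data. */
ptrdiff_t demo_data_offset(const demo_obj *o) {
    return (const char *)o->ptr - (const char *)o;
}
```

On a typical 64-bit build demo_data_offset returns 19 (16 bytes of object header plus 3 bytes of sds header), and a single free(o) releases both the object and its string.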
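Why is OBJ_ENCODING_EMBSTR_SIZE_LIMIT defined as 44?
On a 64-bit build robj occupies 16 bytes. The sdshdr8 structure is shown below; buf is a flexible array member that takes no space, so the whole structure occupies only 3 bytes.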
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
In sizeof(robj)+sizeof(struct sdshdr8)+len+1, everything except len is known: size = 16 + 3 + len + 1 = 20 + len. When len is 44, size = 64. The source comment already says as much: "we allocate as EMBSTR will still fit into the 64 byte arena of jemalloc."
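The arithmetic can be restated as a small sketch. The constants and function names below are hypothetical; the values 16, 3 and 64 are simply the figures quoted in the text and in the source comment, and assume a 64-bit build.

```c
#include <stddef.h>

#define DEMO_ROBJ_SIZE    16 /* sizeof(robj) as stated in the text (64-bit) */
#define DEMO_SDSHDR8_SIZE 3  /* sizeof(struct sdshdr8): len, alloc, flags */
#define DEMO_ARENA_SIZE   64 /* jemalloc size named in the source comment */

/* Bytes requested by createEmbeddedStringObject for a string of len bytes. */
size_t demo_embstr_alloc_size(size_t len) {
    return DEMO_ROBJ_SIZE + DEMO_SDSHDR8_SIZE + len + 1; /* +1 for '\0' */
}

/* Largest len whose allocation still fits in DEMO_ARENA_SIZE bytes. */
size_t demo_max_embstr_len(void) {
    return DEMO_ARENA_SIZE - DEMO_ROBJ_SIZE - DEMO_SDSHDR8_SIZE - 1;
}
```

With these figures demo_embstr_alloc_size(44) is 64 and demo_embstr_alloc_size(45) is 65, so 44 is the longest string that still fits in 64 bytes.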
Finally, here are the values encoding can take in Redis:
/* Objects encoding. Some kind of objects like Strings and Hashes can be
* internally represented in multiple ways. The 'encoding' field of the object
* is set to one of this fields for this object. */
#define OBJ_ENCODING_RAW 0 /* Raw representation */
#define OBJ_ENCODING_INT 1 /* Encoded as integer */
#define OBJ_ENCODING_HT 2 /* Encoded as hash table */
#define OBJ_ENCODING_ZIPMAP 3 /* Encoded as zipmap */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6 /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7 /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8 /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
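As a reading aid, the numeric values above can be turned into short labels with a lookup table. This is only a sketch: demo_encoding_labels and demo_encoding_name are hypothetical names, and the label strings are chosen for illustration rather than taken from the source excerpted here.

```c
#include <stddef.h>

/* Labels indexed by the OBJ_ENCODING_* values listed above (0..10).
 * The label strings are illustrative, chosen for this sketch. */
static const char *const demo_encoding_labels[] = {
    "raw", "int", "hashtable", "zipmap", "linkedlist", "ziplist",
    "intset", "skiplist", "embstr", "quicklist", "stream"
};

/* Returns a label for an encoding value, or "unknown" if out of range. */
const char *demo_encoding_name(unsigned encoding) {
    size_t n = sizeof(demo_encoding_labels) / sizeof(demo_encoding_labels[0]);
    return (encoding < n) ? demo_encoding_labels[encoding] : "unknown";
}
```

For example, demo_encoding_name(8) gives "embstr", and any value above 10 gives "unknown".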