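Redis Source Code: The String Object

August 6, 2020, 04:36

Tag: Redis source code
Summary: We know that Redis has five common data types, but how are these structures actually used inside Redis?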


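The string is the simplest of Redis's five data types, and it is backed by sds (simple dynamic string). Redis does not use sds directly, though: it wraps every type in an object, so what actually gets passed around inside Redis is the object structure.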

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

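As you can see, the structure contains the type and encoding fields. type is what we usually call the data type: string, list, hash, set, or zset (sorted set). How the data is actually stored internally is identified by the encoding field.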

Take the simplest case, the string. The source uses createStringObject to wrap a string value. Part of the code is:

 /* Create a string object with EMBSTR encoding if it is smaller than
  * OBJ_ENCODING_EMBSTR_SIZE_LIMIT, otherwise the RAW encoding is
  * used.
  *
  * The current limit of 44 is chosen so that the biggest string object
  * we allocate as EMBSTR will still fit into the 64 byte arena of jemalloc. */
#define OBJ_ENCODING_EMBSTR_SIZE_LIMIT 44
robj *createStringObject(const char *ptr, size_t len) {
    if (len <= OBJ_ENCODING_EMBSTR_SIZE_LIMIT)
        return createEmbeddedStringObject(ptr,len);
    else
        return createRawStringObject(ptr,len);
}

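So when the string length is at most 44 bytes (len <= 44, not len < 44), an embstr is created. What exactly is an embstr? Looking at the robj structure, the ptr field points to the actual data area (the all-purpose void *). The usual approach would be to create the data somewhere else and simply point ptr at it. Let's first look at how a raw string is created.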
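To make the boundary concrete, here is a minimal sketch of the same decision in isolation. The names prefixed with sketch_ and SKETCH_ are hypothetical and exist only for this illustration; the numeric values are copied from the Redis definitions quoted in this article.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the Redis constants quoted in this article. */
#define SKETCH_EMBSTR_SIZE_LIMIT 44
#define SKETCH_ENCODING_RAW 0
#define SKETCH_ENCODING_EMBSTR 8

static_assert(SKETCH_ENCODING_RAW != SKETCH_ENCODING_EMBSTR,
              "the two encodings must be distinguishable");

/* Hypothetical helper: returns the encoding that the branch in
 * createStringObject would lead to for a string of the given length. */
int sketch_pick_encoding(size_t len) {
    return (len <= SKETCH_EMBSTR_SIZE_LIMIT) ? SKETCH_ENCODING_EMBSTR
                                             : SKETCH_ENCODING_RAW;
}
```

A 44-byte string still gets the embstr encoding; 45 bytes is the first length that falls back to raw.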

/* Create a string object with encoding OBJ_ENCODING_RAW, that is a plain
 * string object where o->ptr points to a proper sds string. */
robj *createRawStringObject(const char *ptr, size_t len) {
    return createObject(OBJ_STRING, sdsnewlen(ptr,len));
}

... ...

robj *createObject(int type, void *ptr) {
    robj *o = zmalloc(sizeof(*o));
    o->type = type;
    o->encoding = OBJ_ENCODING_RAW;
    o->ptr = ptr;
    o->refcount = 1;

    /* Set the LRU to the current lruclock (minutes resolution), or
     * alternatively the LFU counter. */
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }
    return o;
}

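As expected, creating a raw string allocates the sds and the object separately, then points the object's ptr at the sds. This is the usual approach. So what is embstr about? Let's look at the source.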

/* Create a string object with encoding OBJ_ENCODING_EMBSTR, that is
 * an object where the sds string is actually an unmodifiable string
 * allocated in the same chunk as the object itself. */
robj *createEmbeddedStringObject(const char *ptr, size_t len) {
    robj *o = zmalloc(sizeof(robj)+sizeof(struct sdshdr8)+len+1);
    struct sdshdr8 *sh = (void*)(o+1);

    o->type = OBJ_STRING;
    o->encoding = OBJ_ENCODING_EMBSTR;
    o->ptr = sh+1;
    o->refcount = 1;
    if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
        o->lru = (LFUGetTimeInMinutes()<<8) | LFU_INIT_VAL;
    } else {
        o->lru = LRU_CLOCK();
    }

    sh->len = len;
    sh->alloc = len;
    sh->flags = SDS_TYPE_8;
    if (ptr == SDS_NOINIT)
        sh->buf[len] = '\0';
    else if (ptr) {
        memcpy(sh->buf,ptr,len);
        sh->buf[len] = '\0';
    } else {
        memset(sh->buf,0,len+1);
    }
    return o;
}

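We can see that when the embstr object is created, the single zmalloc call allocates an extra sizeof(struct sdshdr8)+len+1 bytes on top of the robj. Why? It exploits the principle of locality: the object and the sds sit in one contiguous chunk of memory, so they can typically be loaded together. With a raw string, you first read the object structure, then follow obj->ptr to the address of the sds and read again. So embstr reduces the number of memory reads, doing in one step what would otherwise take two, and it also needs one allocation instead of two.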
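The difference between the two layouts can be sketched outside Redis with plain malloc. Everything below is a simplified stand-in, not the real Redis code: sketch_obj, sketch_hdr8 and the sketch_ functions are hypothetical names, the embedded field stands in for the encoding field, and strings are assumed to be at most 255 bytes so the length fits in a uint8_t.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins for robj and sdshdr8 (assumptions, not Redis types). */
typedef struct sketch_obj {
    int refcount;
    int embedded;  /* stands in for the encoding field */
    void *ptr;     /* points at the string bytes */
} sketch_obj;

typedef struct __attribute__((__packed__)) sketch_hdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
} sketch_hdr8;

/* Fill a header and copy the bytes, always adding a terminator. */
static void sketch_fill(sketch_hdr8 *sh, const char *s, size_t len) {
    assert(len <= UINT8_MAX);
    sh->len = (uint8_t)len;
    sh->alloc = (uint8_t)len;
    sh->flags = 0; /* type tag is unused in this sketch */
    memcpy(sh->buf, s, len);
    sh->buf[len] = '\0';
}

/* RAW style: the string and the object live in two separate allocations. */
sketch_obj *sketch_create_raw(const char *s, size_t len) {
    sketch_hdr8 *sh = malloc(sizeof(*sh) + len + 1);
    sketch_obj *o = malloc(sizeof(*o));
    if (sh == NULL || o == NULL) { free(sh); free(o); return NULL; }
    sketch_fill(sh, s, len);
    o->refcount = 1;
    o->embedded = 0;
    o->ptr = sh->buf;
    return o;
}

/* EMBSTR style: one allocation holds the object, the header and the bytes. */
sketch_obj *sketch_create_embedded(const char *s, size_t len) {
    sketch_obj *o = malloc(sizeof(sketch_obj) + sizeof(sketch_hdr8) + len + 1);
    if (o == NULL) return NULL;
    sketch_hdr8 *sh = (void *)(o + 1); /* header starts right after the object */
    sketch_fill(sh, s, len);
    o->refcount = 1;
    o->embedded = 1;
    o->ptr = sh->buf;
    return o;
}

/* True when the string bytes sit directly behind the object and header. */
int sketch_is_contiguous(const sketch_obj *o) {
    return (const char *)o->ptr ==
           (const char *)o + sizeof(sketch_obj) + sizeof(sketch_hdr8);
}

/* The raw layout needs two frees, the embedded layout only one. */
void sketch_free(sketch_obj *o) {
    if (o == NULL) return;
    if (!o->embedded) free((char *)o->ptr - sizeof(sketch_hdr8));
    free(o);
}
```

Calling sketch_create_embedded("hello", 5) yields an object for which sketch_is_contiguous returns 1, which is the property the embstr encoding relies on.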

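Why is OBJ_ENCODING_EMBSTR_SIZE_LIMIT defined as 44?

robj occupies 16 bytes (on a 64-bit build). The sdshdr8 structure is shown below; buf is a flexible array member, so the header itself takes only 3 bytes.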

struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};

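In sizeof(robj)+sizeof(struct sdshdr8)+len+1, everything except len is known: size = 16 + 3 + len + 1 = 20 + len. When len is 44, size = 64. The source comment already says as much: the biggest string object "we allocate as EMBSTR will still fit into the 64 byte arena of jemalloc."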
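The arithmetic can be checked with sizeof on mirror definitions of the two structures. This is a sketch under stated assumptions: sketch_robj and sketch_sdshdr8 are hypothetical copies of the layouts quoted in this article, the lru field is taken to be 24 bits wide (8 + 16, as the comment in redisObject describes), and the 16-byte object size holds on a typical 64-bit platform with 8-byte pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LRU_BITS 24 /* 8 bits frequency + 16 bits access time */

/* Mirror of the redisObject layout quoted above (hypothetical copy). */
typedef struct sketch_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:SKETCH_LRU_BITS;
    int refcount;
    void *ptr;
} sketch_robj;

/* Mirror of sdshdr8; packed, so the flexible array adds no size. */
typedef struct __attribute__((__packed__)) sketch_sdshdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
} sketch_sdshdr8;

static_assert(sizeof(sketch_sdshdr8) == 3,
              "packed header should be exactly 3 bytes");

/* Bytes requested for an embstr holding len bytes:
 * object + sds header + payload + terminating '\0'. */
size_t sketch_embstr_alloc_size(size_t len) {
    return sizeof(sketch_robj) + sizeof(sketch_sdshdr8) + len + 1;
}
```

On a 64-bit build, sketch_embstr_alloc_size(44) comes out at exactly 64, and a 45-byte string would need 65 bytes, which no longer fits the 64-byte size class.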

Finally, here are the values that encoding can take in Redis:

/* Objects encoding. Some kind of objects like Strings and Hashes can be
 * internally represented in multiple ways. The 'encoding' field of the object
 * is set to one of this fields for this object. */
#define OBJ_ENCODING_RAW 0     /* Raw representation */
#define OBJ_ENCODING_INT 1     /* Encoded as integer */
#define OBJ_ENCODING_HT 2      /* Encoded as hash table */
#define OBJ_ENCODING_ZIPMAP 3  /* Encoded as zipmap */
#define OBJ_ENCODING_LINKEDLIST 4 /* No longer used: old list encoding. */
#define OBJ_ENCODING_ZIPLIST 5 /* Encoded as ziplist */
#define OBJ_ENCODING_INTSET 6  /* Encoded as intset */
#define OBJ_ENCODING_SKIPLIST 7  /* Encoded as skiplist */
#define OBJ_ENCODING_EMBSTR 8  /* Embedded sds string encoding */
#define OBJ_ENCODING_QUICKLIST 9 /* Encoded as linked list of ziplists */
#define OBJ_ENCODING_STREAM 10 /* Encoded as a radix tree of listpacks */
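When debugging, it can help to turn these numeric values into readable labels. The following is a minimal sketch of such a lookup; sketch_encoding_name and the label strings are assumptions chosen for this illustration, indexed by the numeric values listed above, and are not taken from the Redis source.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative labels, indexed by the encoding values 0..10 listed above. */
static const char *const sketch_names[] = {
    "raw", "int", "hashtable", "zipmap", "linkedlist", "ziplist",
    "intset", "skiplist", "embstr", "quicklist", "stream"
};

static_assert(sizeof(sketch_names) / sizeof(sketch_names[0]) == 11,
              "one label per encoding value 0..10");

/* Hypothetical helper: map an encoding value to a label for debugging. */
const char *sketch_encoding_name(unsigned encoding) {
    size_t n = sizeof(sketch_names) / sizeof(sketch_names[0]);
    return (encoding < n) ? sketch_names[encoding] : "unknown";
}
```

For a string object, the interesting values are 0 (raw), 1 (int) and 8 (embstr); any value outside the table maps to "unknown".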
