Redis Source Code Analysis: The Dictionary (dict)

新進小設計 2021-06-25


The dict is the most central data structure in Redis, because it holds all of Redis's data. Put crudely, you can think of Redis as one big dict that stores every key-value pair.

At its core, the Redis dict is simply a hash table, so it has to answer the same questions as any hash table: how to organize key-value pairs, how to resolve hash collisions, when to expand, and how to expand. The implementation is in fact an ordinary hash table, with one notable addition: incremental (progressive) rehashing, which reduces the performance impact of expanding the table. Let's look at how the hash table is actually implemented in Redis.

The dict implementation in Redis

dict is defined in dict.h. Its fields and their meanings are as follows:

typedef struct dict {
    dictType *type;  // pointer to a dictType, which bundles function pointers for data operations so that a dict can handle any data type (similar to an interface in an object-oriented language, whose methods each type overrides)
    void *privdata;  // private data pointer, passed in by the caller when the dict is created
    dictht ht[2];  // two hash tables; ht[0] is the primary one, ht[1] is only used during incremental rehashing
    long rehashidx; /* index of the bucket the incremental rehash has reached; rehashidx == -1 means no rehash is in progress */
    unsigned long iterators; /* number of iterators currently running */
} dict;

The dictType *type field deserves a closer look (personally, I don't think "type" is a great name for it). Its purpose is to let a dict support arbitrary data types. Different data types need different operations; for example, a hash code is computed differently for a string than for an integer. dictType therefore wraps the operations for a given data type in function pointers. From an object-oriented point of view, dictType is the interface for the type-specific operations of a dict, and each data type only has to implement its own operations. dictType bundles the following function pointers.

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);  // compute the hash value of a key
    void *(*keyDup)(void *privdata, const void *key); // copy a key
    void *(*valDup)(void *privdata, const void *obj);  // copy a value
    int (*keyCompare)(void *privdata, const void *key1, const void *key2); // compare two keys
    void (*keyDestructor)(void *privdata, void *key); // destroy a key
    void (*valDestructor)(void *privdata, void *obj); // destroy a value
} dictType;
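
To make the interface analogy concrete, here is a minimal, self-contained sketch; it is not Redis code. toyType is a hypothetical, trimmed-down stand-in for dictType that keeps only the hashing and comparison hooks (and drops privdata), and the toy_* functions and the djb2-style hash are assumptions chosen for brevity. It shows how generic code can support both case-sensitive and case-insensitive string keys simply by swapping the table of function pointers.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down analogue of dictType: hashing and comparison only. */
typedef struct toyType {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(const void *key1, const void *key2); /* non-zero when equal */
} toyType;

/* djb2-style hash for NUL-terminated strings (case-sensitive). */
static uint64_t toy_str_hash(const void *key) {
    const unsigned char *s = key;
    uint64_t h = 5381;
    while (*s) h = h * 33 + *s++;
    return h;
}

static int toy_str_equal(const void *a, const void *b) {
    return strcmp(a, b) == 0;
}

/* Case-insensitive variants: fold every byte to lower case first. */
static uint64_t toy_str_case_hash(const void *key) {
    const unsigned char *s = key;
    uint64_t h = 5381;
    while (*s) h = h * 33 + (unsigned char)tolower(*s++);
    return h;
}

static int toy_str_case_equal(const void *a, const void *b) {
    const unsigned char *x = a, *y = b;
    while (*x && *y) {
        if (tolower(*x) != tolower(*y)) return 0;
        x++;
        y++;
    }
    return *x == *y; /* equal only if both strings ended together */
}

static const toyType toy_string_type = { toy_str_hash, toy_str_equal };
static const toyType toy_case_string_type = { toy_str_case_hash, toy_str_case_equal };

/* Generic code only ever goes through the function pointers. */
static int toy_same_key(const toyType *t, const void *a, const void *b) {
    return t->hashFunction(a) == t->hashFunction(b) && t->keyCompare(a, b);
}
```

With this sketch, toy_same_key(&toy_case_string_type, "foo", "FOO") is true while the same call with &toy_string_type is false; the calling code is identical in both cases, which is the point of the function-pointer table.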

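Another important field in dict is dictht ht[2]. dictht is the hash table itself, so why are there two of them? This is where incremental rehashing comes in: a dict does not expand its hash table in one go. It first allocates a new, larger hash table in ht[1] and then gradually migrates the data from ht[0] to ht[1]; rehashidx records how far the migration of ht[0] has progressed. Incremental rehashing is covered in detail later in this article.

Here is the definition of dictht: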

typedef struct dictht {
    dictEntry **table;  // contiguous array of buckets (pointers to entries)
    unsigned long size; // size of the table
    unsigned long sizemask;  // mask applied to the hash code (size - 1)
    unsigned long used; // number of entries stored
} dictht;
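
The role of sizemask can be illustrated with a small sketch. Because the table size is always a power of two and sizemask is size - 1 (both visible in dictExpand later in this article), masking the hash gives the same bucket index as hash % size, without a division. toy_bucket_index is a hypothetical helper written for this illustration, not a Redis function.

```c
#include <assert.h>
#include <stdint.h>

/* Bucket index for a table whose size is a power of two. */
static unsigned long toy_bucket_index(uint64_t hash, unsigned long size) {
    unsigned long sizemask = size - 1;      /* e.g. size 8 -> mask 0b111 */
    return (unsigned long)(hash & sizemask);
}
```

For example, with size 8 a hash of 13 lands in bucket 13 & 7, the same bucket that 13 % 8 would select.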

dictEntry wraps each key-value pair in the dict. Besides the key and the value, it holds some additional information:

typedef struct dictEntry {
    void *key;
    union {   // a dictEntry stores different kinds of data depending on how it is used
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next; // next pointer of the singly linked list used for chaining on hash collisions
} dictEntry;

The hash table in dict resolves hash collisions by separate chaining: if several entries fall into the same bucket, they are linked together and stored as a singly linked list.
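
Separate chaining can be sketched in a few lines. The following toy table is an assumption-laden illustration, not Redis code: it uses int keys, a fixed size of 4 buckets, the key itself as its hash, and caller-owned entries. All names (toyEntry, toyTable, toy_insert, toy_find, toy_chain_demo) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toyEntry {
    int key;
    int val;
    struct toyEntry *next;   /* next entry in the same bucket */
} toyEntry;

#define TOY_SIZE 4           /* power of two, so TOY_SIZE - 1 works as a mask */

typedef struct toyTable {
    toyEntry *table[TOY_SIZE];
} toyTable;

/* Link a caller-owned entry at the head of its bucket's chain. */
static void toy_insert(toyTable *t, toyEntry *e) {
    unsigned idx = (unsigned)e->key & (TOY_SIZE - 1);
    e->next = t->table[idx];
    t->table[idx] = e;
}

/* Walk the bucket's chain until the key matches; NULL if it is absent. */
static toyEntry *toy_find(toyTable *t, int key) {
    toyEntry *e = t->table[(unsigned)key & (TOY_SIZE - 1)];
    while (e && e->key != key) e = e->next;
    return e;
}

/* Keys 1, 5 and 9 all collide in bucket 1; returns the length of that chain. */
static int toy_chain_demo(void) {
    toyTable t = {{NULL}};
    toyEntry a = {1, 10, NULL}, b = {5, 50, NULL}, c = {9, 90, NULL};
    toy_insert(&t, &a);
    toy_insert(&t, &b);
    toy_insert(&t, &c);
    assert(t.table[1] == &c);            /* the newest entry sits at the head */
    assert(toy_find(&t, 5)->val == 50);  /* found by walking the chain */
    assert(toy_find(&t, 2) == NULL);     /* bucket 2 is empty */
    int len = 0;
    for (toyEntry *e = t.table[1]; e; e = e->next) len++;
    return len;
}
```

The three colliding keys end up in one chain of length 3, with the most recently inserted entry first.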

If we draw how a dict is laid out in memory, it looks like the figure below.
[Figure: memory layout of a dict]

Expansion

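Before looking at the core dict APIs, let's look at how a dict expands, that is, Redis's incremental rehashing. What is incremental rehashing? Why does Redis use it? And how is it implemented?

To answer these questions, first consider what expanding a hash table involves. If you know Java, you may know that a HashMap expands once the number of elements reaches a threshold: it allocates a larger table, moves all the old data over in one go, and only then carries on with subsequent operations. With a large amount of data this is very time-consuming, which is why some coding guidelines recommend specifying the capacity when creating a HashMap, to avoid automatic expansion.

Redis, however, cannot know how much data a dict will hold when it creates it. If it expanded the way Java's HashMap does, then because Redis is single-threaded it could do nothing else during the expansion; subsequent requests would be blocked, and the performance of Redis as a whole would suffer. The solution is simple: break the work up. One large expansion is split into many small steps that are carried out one at a time, which reduces the impact of the expansion on other operations. It is implemented as follows.

As we saw above, the definition of dict contains dictht ht[2]. During expansion the dict has two hash tables, stored in ht[0] and ht[1]: ht[0] is the old table and ht[1] is the new, larger one.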

/* Check whether the dict needs to expand */
static int _dictExpandIfNeeded(dict *d)
{
    /* An incremental rehash is already in progress, so just return */
    if (dictIsRehashing(d)) return DICT_OK;

    /* If the hash table is empty expand it to the initial size. */
    if (d->ht[0].size == 0) return dictExpand(d, DICT_HT_INITIAL_SIZE);

    /* If resizing is allowed, expand once the load factor (used/size) reaches 1.
     * If resizing is disabled, still force an expansion once used/size exceeds
     * dict_force_resize_ratio (5). */
    if (d->ht[0].used >= d->ht[0].size &&
        (dict_can_resize ||
         d->ht[0].used/d->ht[0].size > dict_force_resize_ratio))
    {
        return dictExpand(d, d->ht[0].used*2); // request room for twice the current number of entries
    }
    return DICT_OK;
}

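Each time Redis looks up the bucket index for a key, it checks whether ht[0] needs to expand. If resizing is allowed, expansion is triggered once the load factor (used/size) reaches 100%; otherwise expansion is forced once the load factor exceeds 500%. The code that performs the expansion is as follows: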
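
The trigger condition can be isolated into a small predicate. The sketch below is not Redis code: toy_needs_expand is a hypothetical function that mirrors the checks in _dictExpandIfNeeded shown above, with the resize switch and the force ratio passed in as parameters instead of being read from globals. One detail it makes visible is that used/size is an integer division, so with a ratio of 5 the forced expansion only fires once used reaches 6 times size.

```c
#include <assert.h>

/* Returns 1 when a table with `used` entries and `size` buckets should grow. */
static int toy_needs_expand(unsigned long used, unsigned long size,
                            int can_resize, unsigned long force_ratio) {
    if (size == 0) return 1;    /* empty table: always allocate the initial size */
    if (used < size) return 0;  /* load factor still below 1 */
    /* Integer division: 23/4 == 5 is not > 5, but 24/4 == 6 is. */
    return can_resize || used / size > force_ratio;
}
```

For a 4-bucket table with resizing disabled and a ratio of 5, 23 entries do not trigger an expansion but 24 do.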

/* Create or expand the dict's hash table */
int dictExpand(dict *d, unsigned long size)
{
    /* Return an error if a rehash is already in progress, or if size is smaller
     * than the number of elements already in the hash table (size is invalid) */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    dictht n; /* the new hash table */
    // the new capacity is the smallest power of two greater than or equal to size, subject to an upper bound
    unsigned long realsize = _dictNextPower(size);

    // if the new capacity equals the old one there is nothing to do, so return an error
    if (realsize == d->ht[0].size) return DICT_ERR;

    /* Allocate the new, larger hash table */
    n.size = realsize;
    n.sizemask = realsize-1;
    n.table = zcalloc(realsize*sizeof(dictEntry*));
    n.used = 0;

    // if the dict is being initialized, simply assign the new hash table to ht[0]
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    // otherwise assign the new table to ht[1] and set rehashidx to 0
    d->ht[1] = n;
    d->rehashidx = 0; // rehashidx is the index in ht[0] that the rehash has reached
    return DICT_OK;
}
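
The article does not show _dictNextPower, so the following is only a sketch of what such a helper might look like, based on the comment above (smallest power of two not below the requested size, with an upper bound). toy_next_power and TOY_INITIAL_SIZE are hypothetical names, and the starting size of 4 is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>

#define TOY_INITIAL_SIZE 4  /* assumed smallest table size, for illustration */

/* Smallest power of two >= size, never below TOY_INITIAL_SIZE and capped
 * at LONG_MAX + 1 so that the doubling below cannot overflow. */
static unsigned long toy_next_power(unsigned long size) {
    unsigned long i = TOY_INITIAL_SIZE;
    if (size >= LONG_MAX) return LONG_MAX + 1LU;
    while (i < size) i *= 2;
    return i;
}
```

Under these assumptions, a table holding 4 entries that asks for used*2 = 8 gets exactly 8 buckets, while a request for 5 is also rounded up to 8.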

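Note that dictExpand only allocates the new table and sets rehashidx to 0 (rehashidx == -1 means no rehash is in progress); it does not move any data from ht[0] to ht[1]. The migration is driven by _dictRehashStep(), which is called during dict lookups, insertions and deletions. Each call migrates at most one non-empty bucket, and may migrate none if it only runs into empty buckets before reaching the empty_visits limit. _dictRehashStep() delegates the actual work to dictRehash(), whose code is as follows: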

/* Redis incremental rehash: entries are moved from ht[0] to ht[1] in batches, bucket by bucket,
 * which avoids the performance problems of migrating a large amount of data in one go when the
 * hash table expands.
 * The parameter n is the maximum number of buckets to migrate in this call */
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* maximum number of empty buckets to visit; after empty_visits empty buckets this call stops */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1; // stop after visiting empty_visits empty buckets
        }
        // walk the chain in the current bucket and move it to the new hash table
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys from the old bucket to the new hash table */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index of the key in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check whether the whole table has been rehashed */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);  // free the memory used by the old table
        d->ht[0] = d->ht[1];  // ht[0] is always the table in use and ht[1] the new one; once everything has moved, ht[1] takes over as ht[0]
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;  // return 0 when the whole table has been rehashed
    }

    /* Return 1 if there is more rehashing to do */
    return 1;
}

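As you can see, rehashing moves the data in ht[0] to ht[1] in batches, splitting one large operation into many small ones that are carried out step by step. This prevents Redis from becoming momentarily unavailable when a dict expands. The downside is that during the expansion both tables occupy memory, and they do so for a relatively long time.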
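
The whole mechanism can be condensed into a toy model. The sketch below is a simplified illustration under stated assumptions, not the Redis implementation: keys are ints that serve as their own hash, there is no empty_visits limit, and every name (toyDict, toy_add, toy_start_rehash, toy_rehash_step, toy_contains, toy_rehash_demo) is hypothetical. It shows the three ingredients discussed above: starting a rehash without moving anything, migrating one bucket per step, and searching both tables while the rehash is in progress.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toyEntry { int key; struct toyEntry *next; } toyEntry;
typedef struct toyHt { toyEntry **table; unsigned long size, used; } toyHt;
typedef struct toyDict { toyHt ht[2]; long rehashidx; } toyDict; /* -1: not rehashing */

static toyHt toy_ht_new(unsigned long size) {   /* size must be a power of two */
    toyHt h = { calloc(size, sizeof(toyEntry *)), size, 0 };
    return h;
}

/* Head-insert a key; while rehashing, new keys go straight into ht[1]. */
static void toy_add(toyDict *d, int key) {
    toyHt *h = d->rehashidx != -1 ? &d->ht[1] : &d->ht[0];
    unsigned long idx = (unsigned long)key & (h->size - 1);
    toyEntry *e = malloc(sizeof(*e));
    e->key = key;
    e->next = h->table[idx];
    h->table[idx] = e;
    h->used++;
}

/* Allocate the bigger table and mark the rehash as started; nothing moves yet. */
static void toy_start_rehash(toyDict *d, unsigned long newsize) {
    d->ht[1] = toy_ht_new(newsize);
    d->rehashidx = 0;
}

/* Migrate one non-empty bucket from ht[0] to ht[1]. Returns 1 if work remains. */
static int toy_rehash_step(toyDict *d) {
    if (d->rehashidx == -1) return 0;
    if (d->ht[0].used != 0) {
        /* used != 0 guarantees a non-empty bucket at or after rehashidx. */
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;
        toyEntry *e = d->ht[0].table[d->rehashidx];
        while (e) {
            toyEntry *next = e->next;
            unsigned long idx = (unsigned long)e->key & (d->ht[1].size - 1);
            e->next = d->ht[1].table[idx];
            d->ht[1].table[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].table[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) {            /* done: ht[1] takes over as ht[0] */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1] = (toyHt){ NULL, 0, 0 };
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}

/* A key may live in either table while a rehash is in progress. */
static int toy_contains(const toyDict *d, int key) {
    if (d->ht[0].used + d->ht[1].used == 0) return 0;
    for (int t = 0; t <= 1; t++) {
        const toyHt *h = &d->ht[t];
        unsigned long idx = (unsigned long)key & (h->size - 1);
        for (toyEntry *e = h->table[idx]; e; e = e->next)
            if (e->key == key) return 1;
        if (d->rehashidx == -1) break;   /* ht[1] is unused when idle */
    }
    return 0;
}

/* 4 keys in 4 buckets, grown to 8 buckets; returns how many steps were needed. */
static int toy_rehash_demo(void) {
    toyDict d;
    d.ht[0] = toy_ht_new(4);
    d.ht[1] = (toyHt){ NULL, 0, 0 };
    d.rehashidx = -1;
    for (int k = 0; k < 4; k++) toy_add(&d, k);

    toy_start_rehash(&d, 8);
    toy_add(&d, 100);                    /* lands in ht[1] during the rehash */
    int steps = 0, more;
    do {
        more = toy_rehash_step(&d);
        steps++;
        for (int k = 0; k < 4; k++) assert(toy_contains(&d, k)); /* never lost */
        assert(toy_contains(&d, 100));
    } while (more);
    assert(d.ht[0].size == 8 && d.ht[0].used == 5 && d.ht[1].table == NULL);

    for (unsigned long i = 0; i < d.ht[0].size; i++)     /* cleanup */
        for (toyEntry *e = d.ht[0].table[i], *n; e; e = n) { n = e->next; free(e); }
    free(d.ht[0].table);
    return steps;
}
```

In the demo each of the four original buckets holds one key, so the migration takes four steps, and every key stays reachable after each step. Both tables are allocated for the whole duration, which is the memory cost mentioned above.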

Core APIs

Insertion

/* Add an element to the dict */
int dictAdd(dict *d, void *key, void *val)
{
    dictEntry *entry = dictAddRaw(d,key,NULL);
    // a NULL entry means the key already exists
    if (!entry) return DICT_ERR;
    dictSetVal(d, entry, val);
    return DICT_OK;
}

/* Low-level function behind add and find-or-add:
 * It only returns the entry for the key and does not set the value for the key;
 * setting the value is left to the caller.
 *
 * It is also exposed directly as an API, mainly so that callers can store
 * non-pointer values in the dict, for example:
 * entry = dictAddRaw(dict,mykey,NULL);
 * if (entry != NULL) dictSetSignedIntegerVal(entry,1000);
 *
 * Return value:
 * If the key already exists in the dict, NULL is returned and, if existing is
 * not NULL, *existing is set to the existing entry. Otherwise a new entry is
 * created for the key and a pointer to it is returned.
*/
dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing)
{
    long index;
    dictEntry *entry;
    dictht *ht;

    if (dictIsRehashing(d)) _dictRehashStep(d);

    /* Get the bucket index for the new element; -1 means the key already exists in the dict, so return NULL */
    if ((index = _dictKeyIndex(d, key, dictHashKey(d,key), existing)) == -1)
        return NULL;

    /* Otherwise allocate memory for the new element and insert it at the head of the list
     * (on the assumption that recently inserted data is accessed more often) */
    ht = dictIsRehashing(d) ? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    /* Fill in the key of the newly created entry */
    dictSetKey(d, entry, key);
    return entry;
}

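Insertion is also fairly simple: locate the bucket index, then insert the entry at the head of the singly linked list. Rehashing has to be taken into account here as well: if a rehash is in progress, new data is always inserted into ht[1].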

Lookup

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    if (dictSize(d) == 0) return NULL; /* the dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    // a rehash may be in progress during the lookup, so both the old and the new hash table have to be checked
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        // if the key was not found in ht[0] and no rehash is in progress, there is no need to search ht[1]
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

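Lookup is straightforward: use the hash code to locate the bucket, then walk the singly linked list. Note, however, that while a rehash is in progress the key may be in either table, so both ht[0] and ht[1] may have to be searched.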

Deletion

/* Find and remove an element; helper for dictDelete() and dictUnlink(). */
static dictEntry *dictGenericDelete(dict *d, const void *key, int nofree) {
    uint64_t h, idx;
    dictEntry *he, *prevHe;
    int table;

    if (d->ht[0].used == 0 && d->ht[1].used == 0) return NULL;

    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);

    // rehashing matters here too: the key may be in either ht[0] or ht[1], so both tables are searched
    for (table = 0; table <= 1; table++) {
        idx = h & d->ht[table].sizemask;
        he = d->ht[table].table[idx];
        prevHe = NULL;
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key)) {
                /* Unlink the element from the list */
                if (prevHe)
                    prevHe->next = he->next;
                else
                    d->ht[table].table[idx] = he->next;
                // if nofree is 0, free the memory of the key, the value and the entry itself
                if (!nofree) {
                    dictFreeKey(d, he);
                    dictFreeVal(d, he);
                    zfree(he);
                }
                d->ht[table].used--;
                return he; // when nofree is 0, he has already been freed: treat it only as a "found" flag
            }
            prevHe = he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) break;
    }
    return NULL; /* no data found for the key */
}
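
The prevHe bookkeeping in the loop above is the standard way to remove a node from a singly linked list. A stripped-down sketch of just that part follows; it assumes int keys and caller-owned entries, frees nothing, and uses hypothetical names (toyEntry, toy_unlink, toy_unlink_demo) that are not part of Redis.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toyEntry { int key; struct toyEntry *next; } toyEntry;

/* Remove the entry with `key` from a bucket's chain and return it, or NULL
 * if no entry has that key. The entry itself is not freed. */
static toyEntry *toy_unlink(toyEntry **bucket, int key) {
    toyEntry *he = *bucket, *prevHe = NULL;
    while (he) {
        if (he->key == key) {
            if (prevHe) prevHe->next = he->next;  /* middle or tail of the chain */
            else *bucket = he->next;              /* head: the bucket itself changes */
            he->next = NULL;
            return he;
        }
        prevHe = he;
        he = he->next;
    }
    return NULL;
}

/* Builds the chain 1 -> 2 -> 3, unlinks 2 and then 1; returns the remaining key. */
static int toy_unlink_demo(void) {
    toyEntry c = {3, NULL}, b = {2, &c}, a = {1, &b};
    toyEntry *bucket = &a;
    assert(toy_unlink(&bucket, 2) == &b && a.next == &c);  /* middle entry */
    assert(toy_unlink(&bucket, 1) == &a && bucket == &c);  /* head entry */
    assert(toy_unlink(&bucket, 9) == NULL);                /* missing key */
    return bucket->key;
}
```

Returning the detached entry instead of freeing it corresponds to the nofree != 0 path above, where the caller decides when the entry is released.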

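Other APIs

The other APIs are fairly simple to implement. I have added extensive comments to the dict.c source, which you are welcome to read if you are interested; here I only list the functions and briefly describe what each one does.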

dict *dictCreate(dictType *type, void *privDataPtr);  // create a dict
int dictExpand(dict *d, unsigned long size);  // expand (resize) the hash table
int dictAdd(dict *d, void *key, void *val);  // add a key-value pair
dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing); // add a key and return its dictEntry; the caller sets the value
dictEntry *dictAddOrFind(dict *d, void *key); // add the key, or find it if it already exists
int dictReplace(dict *d, void *key, void *val); // replace the value for a key, or add a new key-value pair if the key is absent
int dictDelete(dict *d, const void *key);  // delete the data for a key
dictEntry *dictUnlink(dict *ht, const void *key); // detach the entry for a key without freeing it
void dictFreeUnlinkedEntry(dict *d, dictEntry *he); // free an entry that was detached with dictUnlink
void dictRelease(dict *d);  // release the whole dict
dictEntry * dictFind(dict *d, const void *key);  // look up an entry
void *dictFetchValue(dict *d, const void *key);  // get the value for a key
int dictResize(dict *d);  // resize the dict, mainly used for shrinking
/************    iterators     *********** */
dictIterator *dictGetIterator(dict *d);
dictIterator *dictGetSafeIterator(dict *d);
dictEntry *dictNext(dictIterator *iter);
void dictReleaseIterator(dictIterator *iter);
/************    end of iterators     *********** */
dictEntry *dictGetRandomKey(dict *d);  // return a random entry
dictEntry *dictGetFairRandomKey(dict *d);   // return a random entry, with a more uniform probability across entries
unsigned int dictGetSomeKeys(dict *d, dictEntry **des, unsigned int count); // get a sample of the entries in the dict

For the remaining APIs, see dict.c and dict.h.

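This article is part of a series analyzing the Redis source code. There is also a companion copy of the Redis source annotated in Chinese; if you want to study Redis in depth, you are welcome to star and follow it.
Chinese-annotated Redis repository: https://github.com/xindoo/Redis
Redis source code analysis column: https:///s/1h
If you found this article useful, please like, bookmark, and share it.
This article comes from https://blog.csdn.net/xindoo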
