一致性Hash算法(KetamaHash)的c#實現(xiàn)

命運之輪 2011-12-21

展開全文

最近在研究"一致性HASH算法"（Consistent Hashing），用于解決memcached集群中當服務器出現(xiàn)增減變動時對散列值的影響。后來在JAVAEYE上的一篇文章中，找到了其中的 KetamaHash 算法的JAVA實現(xiàn)（一種基于虛擬結(jié)點的HASH算法），于是為了加深理解，對照 JAVA版本，用C#重寫了一個。放到這里，如果大家感興趣的話，可以下載測試一下，如果發(fā)現(xiàn)寫法有問題請及時告之我，以便我及時修正。

下面是對Ketama的介紹：

Ketama is an implementation of a consistent hashing algorithm, meaning you can add or remove servers from the memcached pool without causing a complete remap of all keys.
Here’s how it works:
* Take your list of servers (eg: 1.2.3.4:11211, 5.6.7.8:11211, 9.8.7.6:11211)
* Hash each server string to several (100-200) unsigned ints
* Conceptually, these numbers are placed on a circle called the continuum. (imagine a clock face that goes from 0 to 2^32)
* Each number links to the server it was hashed from, so servers appear at several points on the continuum, by each of the numbers they hashed to.
* To map a key->server, hash your key to a single unsigned int, and find the next biggest number on the continuum. The server linked to that number is the correct server for that key.
* If you hash your key to a value near 2^32 and there are no points on the continuum greater than your hash, return the first server in the continuum.
If you then add or remove a server from the list, only a small proportion of keys end up mapping to different servers.

我的理解，這類方法其實有點像大學里的微積分的思想（通過連續(xù)函數(shù)的取值區(qū)間來計算圖形面積）。通過把已知的實結(jié)點（memcached服務IP端口）列表結(jié)成一個圓，然后在兩兩實結(jié)點之間“放置”盡可能多的虛擬節(jié)點(上面文中的unsigned ints)，假設用戶數(shù)據(jù)映射在虛擬節(jié)點上（用戶數(shù)據(jù)真正存儲位置是在該虛擬節(jié)點代表的實際物理服務器上），這樣就能抑制分布不均勻，最大限度地減小服務器增減時的緩存重新分布。如下圖：

     下面是添加結(jié)點時：

      如這篇文章所說（總結(jié)一致性哈希(Consistent Hashing) ）：

     Consistent Hashing最大限度地抑制了hash鍵的重新分布。另外要取得比較好的負載均衡的效果，往往在服務器數(shù)量比較少的時候需要增加虛擬節(jié)點來保證服務器能均勻的分布在圓環(huán)上。因為使用一般的hash方法，服務器的映射地點的分布非常不均勻。使用虛擬節(jié)點的思想，為每個物理節(jié)點（服務器）在圓上分配100～200個點。這樣就能抑制分布不均勻，最大限度地減小服務器增減時的緩存重新分布。用戶數(shù)據(jù)映射在虛擬節(jié)點上，就表示用戶數(shù)據(jù)真正存儲位置是在該虛擬節(jié)點代表的實際物理服務器上。

了解了原理，下面看一下具體實現(xiàn)。

       JAVA實現(xiàn)代碼取自Spy Memcached client中的類，原文的作者也盡可能的對代碼進行注釋說明，所以讓我剩了不少時間。

       下面是相應的.NET實現(xiàn)（注釋參考JAVA版本）:

public class KetamaNodeLocator
{
    //原文中的JAVA類TreeMap實現(xiàn)了Comparator方法，這里我圖省事，直接用了net下的SortedList，其中Comparer接口方法）
    private SortedList<long, string> ketamaNodes = new SortedList<long, string>();
    private HashAlgorithm hashAlg;
    private int numReps = 160;

    //此處參數(shù)與JAVA版中有區(qū)別，因為使用的靜態(tài)方法，所以不再傳遞HashAlgorithm alg參數(shù)
    public KetamaNodeLocator(List<string> nodes, int nodeCopies)
    {
        ketamaNodes = new SortedList<long, string>();

        numReps = nodeCopies;
        //對所有節(jié)點，生成nCopies個虛擬結(jié)點
        foreach (string node in nodes)
        {
            //每四個虛擬結(jié)點為一組
            for (int i = 0; i < numReps / 4; i++)
            {
                //getKeyForNode方法為這組虛擬結(jié)點得到惟一名稱
                byte[] digest = HashAlgorithm.computeMd5(node + i);
                /** Md5是一個16字節(jié)長度的數(shù)組，將16字節(jié)的數(shù)組每四個字節(jié)一組，分別對應一個虛擬結(jié)點，這就是為什么上面把虛擬結(jié)點四個劃分一組的原因*/
                for (int h = 0; h < 4; h++)
                {
                    long m = HashAlgorithm.hash(digest, h);
                    ketamaNodes[m] = node;
                }
            }
        }
    }

    public string GetPrimary(string k)
    {
        byte[] digest = HashAlgorithm.computeMd5(k);
        string rv = GetNodeForKey(HashAlgorithm.hash(digest, 0));
        return rv;
    }

    string GetNodeForKey(long hash)
    {
        string rv;
        long key = hash;
        //如果找到這個節(jié)點，直接取節(jié)點，返回
        if (!ketamaNodes.ContainsKey(key))
        {
            //得到大于當前key的那個子Map，然后從中取出第一個key，就是大于且離它最近的那個key 說明詳見: http://www./topic/684087
            var tailMap = from coll in ketamaNodes
                          where coll.Key > hash
                          select new { coll.Key };
            if (tailMap == null || tailMap.Count() == 0)
                key = ketamaNodes.FirstOrDefault().Key;
            else
                key = tailMap.FirstOrDefault().Key;
        }
        rv = ketamaNodes[key];
        return rv;
    }
}

      下面的代碼與JAVA中有所不同，它使用靜態(tài)方法實現(xiàn)：

public class HashAlgorithm
{
    public static long hash(byte[] digest, int nTime)
    {
        long rv = ((long)(digest[3 + nTime * 4] & 0xFF) << 24)
                | ((long)(digest[2 + nTime * 4] & 0xFF) << 16)
                | ((long)(digest[1 + nTime * 4] & 0xFF) << 8)
                | ((long)digest[0 + nTime * 4] & 0xFF);

        return rv & 0xffffffffL; /* Truncate to 32-bits */
    }

    /**
     * Get the md5 of the given key.
     */
    public static byte[] computeMd5(string k)
    {
        MD5 md5 = new MD5CryptoServiceProvider();

        byte[] keyBytes = md5.ComputeHash(Encoding.UTF8.GetBytes(k));
        md5.Clear();
        //md5.update(keyBytes);
        //return md5.digest();
        return keyBytes;
    }
}

     下面是.net版本下的測試結(jié)果

       分布平均性測試:測試隨機生成的眾多key是否會平均分布到各個結(jié)點上測試結(jié)果如下:



      最上面一行是參數(shù)說明，節(jié)點數(shù)目，總共有多少key，每個節(jié)點應該分配key的比例是多少。下面是每個結(jié)點分配到key的數(shù)目和比例。多次測試后發(fā)現(xiàn)，這個Hash算法的節(jié)點分布都在標準比例左右徘徊。

節(jié)點增刪測試:在環(huán)上插入N個結(jié)點，每個節(jié)點nCopies個虛擬結(jié)點。隨機生成眾多key，在增刪節(jié)點時，測試同一個key選擇相同節(jié)點的概率，測試如果如下:

上面三行分別是正常情況，節(jié)點增加，節(jié)點刪除情況下的節(jié)點數(shù)目。下面兩行表示在節(jié)點增加和刪除情況下，同一個key分配在相同節(jié)點上的比例(命中率)。

     后來我修改了幾次增刪結(jié)點的數(shù)量，基本驗證了JAVA那位仁兄所說的那樣：

     多次測試后發(fā)現(xiàn)，命中率與結(jié)點數(shù)目和增減的節(jié)點數(shù)量有關。同樣增刪結(jié)點數(shù)目情況下，結(jié)點多時命中率高。同樣節(jié)點數(shù)目，增刪結(jié)點越少，命中率越高。這些都與實際情況相符。

   這里還有一些鏈接，都是介紹和討論Consistent Hashing的，有興趣的朋友可以看一下，呵呵：)

     總結(jié)一致性哈希(Consistent Hashing)

     Ketama一致性Hash算法學習(含Java代碼)