Common SQL Optimization Points

小豬窩969 2015-07-20

# ###########################################
# Indexes
# ###########################################
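– A query (or an update or delete, which can be rewritten as a query) does not use an index
    This is the most basic check: run explain on the SQL and see whether the execution plan uses an index. Pay particular attention to rows showing type=ALL and key=NULL.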
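    The check above can be sketched in a runnable form. This is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in (its EXPLAIN QUERY PLAN reports SCAN where mysql would report type=ALL), and the offer table layout and the index name idx_offer_member are assumptions.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table, used only for illustration.
conn.execute(
    "CREATE TABLE offer (id INTEGER PRIMARY KEY, member_id INTEGER, subject TEXT)"
)

query = "SELECT * FROM offer WHERE member_id = ?"
before = plan(conn, query, (234,))  # no index yet: a full scan
conn.execute("CREATE INDEX idx_offer_member ON offer (member_id)")
after = plan(conn, query, (234,))   # now an index search

print(before)
print(after)
```

    The same habit applies to mysql: run explain before and after adding an index and compare the two plans.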
– Applying a function to an indexed column
    to_char(gmt_created, 'mmdd') = '0101'
    The correct form, here for 1 January 2009:
    gmt_created between to_date('20090101', 'yyyymmdd') and to_date('20090102', 'yyyymmdd')
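    A small sketch of the same effect, assuming SQLite (via Python's sqlite3) as a stand-in and dates stored as ISO text. The table, the index name idx_offer_created and the sample rows are made up for illustration; strftime plays the role that to_char plays above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offer (id INTEGER PRIMARY KEY, gmt_created TEXT)")
conn.execute("CREATE INDEX idx_offer_created ON offer (gmt_created)")
conn.executemany(
    "INSERT INTO offer (gmt_created) VALUES (?)",
    [("2008-12-31 23:59:59",), ("2009-01-01 08:30:00",), ("2009-01-02 00:00:01",)],
)

def plan(sql):
    """Return the plan details for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Function wrapped around the indexed column: every entry must be evaluated.
wrapped = "SELECT id FROM offer WHERE strftime('%m%d', gmt_created) = '0101'"
# Bare column compared against a range: the index can be searched directly.
ranged = (
    "SELECT id FROM offer "
    "WHERE gmt_created >= '2009-01-01' AND gmt_created < '2009-01-02'"
)

wrapped_plan, ranged_plan = plan(wrapped), plan(ranged)
wrapped_rows = conn.execute(wrapped).fetchall()
ranged_rows = conn.execute(ranged).fetchall()
```

    Both statements return the same row for this data, but only the range form lets the engine seek into the index.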
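– Using a full wildcard match on an indexed column
    member_id like '%alibab%'
    A B-tree cannot serve this kind of query; consider a search engine instead.
    However, member_id like 'alibab%' can use the index.
    In fact, like '%xxxx%' on any column is poor practice, and reviews should be able to catch this misuse.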
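    Why a prefix match can use a B-tree and a substring match cannot is easier to see on a plain sorted list, which orders keys the same way an index does. The sketch below is an analogy in pure Python, not database code; the sample member ids are invented.

```python
from bisect import bisect_left

# A sorted list stands in for the ordered keys of a B-tree index.
member_ids = sorted(["alibaba", "alibaba_cn", "alipay", "my_alibaba", "taobao"])

def prefix_search(keys, prefix):
    """LIKE 'prefix%': all matches form one contiguous range of sorted keys."""
    lo = bisect_left(keys, prefix)
    hi = bisect_left(keys, prefix + "\uffff")  # sentinel just past the prefix
    return keys[lo:hi]

def substring_search(keys, needle):
    """LIKE '%needle%': matches are scattered, so every key is examined."""
    return [key for key in keys if needle in key]

print(prefix_search(member_ids, "alibab"))     # two binary searches
print(substring_search(member_ids, "alibab"))  # a full pass over the keys
```

    The prefix search touches only the boundaries of one range, while the substring search has no ordering to exploit and must visit every key.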
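– A multi-column index whose leading column is not used
    Index: (member_id, group_id)
    where group_id=9234
    This condition cannot use the index above.
    This is a very common mistake. To see why the index cannot be used, you need to understand how mysql builds a multi-column index.
    An index is a B-tree. For a multi-column index, mysql in effect concatenates the indexed columns, in the order given when the index was created, into a single composite key, and the B-tree is built on that composite key. So if the query condition does not include the leading column, there is no way to descend the B-tree of the multi-column index.
    The index should be: (group_id, member_id)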
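    The leading-column rule can be observed with a small experiment. The sketch assumes SQLite (via Python's sqlite3) as a stand-in for mysql; the offer table and both index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offer ("
    "id INTEGER PRIMARY KEY, member_id INTEGER, group_id INTEGER, subject TEXT)"
)
conn.execute("CREATE INDEX idx_member_group ON offer (member_id, group_id)")

def plan(sql):
    """Return the plan details for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT * FROM offer WHERE group_id = 9234"

# Only (member_id, group_id) exists: group_id is not the leading column.
without_leading = plan(query)

# Add an index that leads with group_id and look at the plan again.
conn.execute("CREATE INDEX idx_group_member ON offer (group_id, member_id)")
with_leading = plan(query)

print(without_leading)
print(with_leading)
```

    With only the first index the plan is a scan; once an index leading with group_id exists, the same statement becomes an index search.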
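– Reading columns outside the index
    Index: (member_id, subject)
    select subject from offer where member_id=234
    When many rows have member_id=234, this outperforms
    select subject, gmt_created from offer where member_id=234
    The reason is that the second statement has to visit the table rows through the rowids found in the index, while the first gets its result from an index range scan alone.
    If a statement runs very often but the columns it reads are not covered by an index, a covering index may be worth creating.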
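    The difference between a covered and an uncovered read shows up directly in a query plan. A minimal sketch, assuming SQLite (via Python's sqlite3) as a stand-in; the table definition and the index name idx_member_subject are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offer ("
    "id INTEGER PRIMARY KEY, member_id INTEGER, subject TEXT, gmt_created TEXT)"
)
conn.execute("CREATE INDEX idx_member_subject ON offer (member_id, subject)")

def plan(sql):
    """Return the plan details for a statement."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

# Every column the statement reads is in the index: no table access needed.
covered = plan("SELECT subject FROM offer WHERE member_id = 234")

# gmt_created is not in the index, so each match also visits the table row.
uncovered = plan("SELECT subject, gmt_created FROM offer WHERE member_id = 234")

print(covered)
print(uncovered)
```

    In mysql the equivalent signal is "Using index" in the Extra column of explain.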
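– count(id) is sometimes slower than count(*)
    count(id) === count(1) where id is not null
    Without an index on (id), count(id) falls back to a full table scan, whereas count(*) picks the best available index and does a fast full index scan.
    Use count(*) for all counting.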
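    The equivalence stated above is standard SQL behaviour and can be checked on any engine. The sketch uses SQLite through Python's sqlite3 with a throwaway table t whose id column allows NULL, purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is deliberately nullable here so the difference becomes visible.
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t (id) VALUES (?)", [(1,), (None,), (3,)])

count_star = conn.execute("SELECT count(*) FROM t").fetchone()[0]
count_id = conn.execute("SELECT count(id) FROM t").fetchone()[0]
count_not_null = conn.execute(
    "SELECT count(1) FROM t WHERE id IS NOT NULL"
).fetchone()[0]

print(count_star, count_id, count_not_null)
```

    count(id) has to test each value for NULL, which is why it is tied to that column, while count(*) only counts rows and leaves the engine free to pick any index.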
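– Use the stop mechanism correctly
    To check whether member_id has any rows in the offer table:
    select 1 from offer where member_id=234 limit 1
    is better than
    select count(*) from offer where member_id=234
    The reason is that the first statement stops as soon as it finds the first matching row. Appending limit 1 to the count(*) statement does not help: count(*) still reads every matching row, and limit only trims the one-row result.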
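    An existence check of this kind might look like the sketch below. It runs against SQLite through Python's sqlite3; the helper name member_has_offers and the sample rows are hypothetical. It also shows that limit 1 applied to count(*) still produces the full count, so the limit is not what stops the scan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE offer (id INTEGER PRIMARY KEY, member_id INTEGER)")
conn.execute("CREATE INDEX idx_offer_member ON offer (member_id)")
conn.executemany(
    "INSERT INTO offer (member_id) VALUES (?)", [(234,), (234,), (234,), (7,)]
)

def member_has_offers(member_id):
    """Return True as soon as one matching row is found."""
    row = conn.execute(
        "SELECT 1 FROM offer WHERE member_id = ? LIMIT 1", (member_id,)
    ).fetchone()
    return row is not None

# LIMIT applies to the single aggregated row, so all matches are still counted.
counted = conn.execute(
    "SELECT count(*) FROM offer WHERE member_id = 234 LIMIT 1"
).fetchone()[0]

print(member_has_offers(234), member_has_offers(999), counted)
```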
    
# ###########################################
# Efficient pagination
# ###########################################
– Efficient pagination
    Use a join: find the matching row ids through the index, build them into a small temporary table, and join that small table with the original table.
    select *
    from
    (
        select t.*, rownum as rn
        from
            (select * from blog.blog_article
            where domain_id=1
            and draft=0
            order by domain_id, draft, gmt_created desc) t
        where rownum <= 3
    ) a
    where a.rn >= 2
    should be rewritten as
    select b.*
    from
    (
        select rid, rownum as rn
        from
        (
        select rowid as rid from blog.blog_article
        where domain_id=1
        and draft=0
        order by domain_id, draft, gmt_created desc
        ) t
        where rownum <= 3
    ) a, blog.blog_article b
    where a.rn >= 2
    and a.rid = b.rowid
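    The same idea carries over to engines that page with limit and offset instead of rownum. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3, a simplified blog_article table, an invented index name, and ten generated rows, and it only checks that the deferred join returns the same page as the direct query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog_article ("
    "id INTEGER PRIMARY KEY, domain_id INTEGER, draft INTEGER, "
    "gmt_created TEXT, body TEXT)"
)
conn.execute(
    "CREATE INDEX idx_article_page ON blog_article (domain_id, draft, gmt_created)"
)
conn.executemany(
    "INSERT INTO blog_article VALUES (?, 1, 0, ?, ?)",
    [(i, f"2015-07-{i:02d}", f"body {i}") for i in range(1, 11)],
)

# Direct form: full rows are read for every skipped position as well.
direct = conn.execute(
    "SELECT * FROM blog_article WHERE domain_id = 1 AND draft = 0 "
    "ORDER BY gmt_created DESC LIMIT 2 OFFSET 1"
).fetchall()

# Deferred join: page through ids using the index, then fetch only those rows.
deferred = conn.execute(
    "SELECT a.* FROM ("
    "  SELECT id FROM blog_article WHERE domain_id = 1 AND draft = 0 "
    "  ORDER BY gmt_created DESC LIMIT 2 OFFSET 1"
    ") AS page JOIN blog_article AS a ON a.id = page.id "
    "ORDER BY a.gmt_created DESC"
).fetchall()

print([row[0] for row in deferred])
```

    The gain grows with row width and offset size, since the skipped positions are walked in the narrow index rather than in the table.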
– order by does not use an index
    Given an index (a, b, c):
    Mixed sort directions
    ORDER BY a ASC, b DESC, c DESC /* mixed sort direction */
    Leading column missing
    WHERE g = const ORDER BY b, c /* a prefix is missing */
    Middle column missing
    WHERE a = const ORDER BY c /* b is missing */
    Sorting on a column that is not in the index
    WHERE a = const ORDER BY a, d /* d is not part of index */
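    Whether a sort is served by the index can be read off the query plan. The sketch below assumes SQLite (via Python's sqlite3) as a stand-in, where an extra sort step appears as a "TEMP B-TREE" line; in mysql the counterpart is "Using filesort" in explain. The table t, the index name and the helper needs_sort are hypothetical, and the engines do not agree on every case, so only two of the patterns above are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a, b, c, d)")
conn.execute("CREATE INDEX idx_abc ON t (a, b, c)")

def needs_sort(sql):
    """True if the plan contains a separate sort step."""
    details = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]
    return any("TEMP B-TREE" in detail for detail in details)

# The leading column is fixed, the remaining columns follow index order.
in_index_order = needs_sort("SELECT * FROM t WHERE a = 1 ORDER BY b, c")

# b is skipped, so rows with a = 1 are not ordered by c inside the index.
middle_missing = needs_sort("SELECT * FROM t WHERE a = 1 ORDER BY c")

print(in_index_order, middle_missing)
```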
    
# ###########################################
# Using the primary key efficiently
# ###########################################
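– Random queries
    A wrong approach:
    select *
    from title
    where kind_id=1
    order by rand()
    limit 1;
    create index k on title(kind_id);
    Executing this statement requires a full scan and stores the data in a temporary table, which is a very time-consuming operation.
    An improved approach, using an offset:
    select floor(rand() * count(*))
    from title
    where kind_id=1;
    select *
    from title
    where kind_id=1
    limit 1 offset $random;
    create index k on title(kind_id);
    Compared with the approach above, this one can use the index on kind_id, which reduces the number of data blocks scanned. But if the offset is very large, the number of blocks scanned is also very large; in the extreme case every block of index k is scanned.
    The best approach, a range lookup on the primary key:
    select floor(rand() * count(*))
    from title
    where kind_id=1;
    select *
    from title
    where kind_id = 1 and id > $random
    limit 1;
    This statement does a range query on the primary key, is served entirely from the index, and reads only one row, so it is very fast. The limitation is that the primary key must be an int and must be continuous and auto-incrementing.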
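    The two-step lookup can be sketched as follows, with the random number drawn in application code. It is an illustration under the limitation stated above: SQLite through Python's sqlite3 stands in for mysql, the title table holds 100 generated rows with consecutive ids that all have kind_id = 1, and random_title is a hypothetical helper.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE title ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, kind_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO title (kind_id, name) VALUES (1, ?)",
    [(f"title-{i}",) for i in range(1, 101)],
)

def random_title(rng):
    """Pick one row with a primary key range lookup instead of ORDER BY rand()."""
    (count,) = conn.execute(
        "SELECT count(*) FROM title WHERE kind_id = 1"
    ).fetchone()
    threshold = rng.randrange(count)  # 0 .. count-1, so id > threshold matches
    return conn.execute(
        "SELECT id, name FROM title WHERE kind_id = 1 AND id > ? "
        "ORDER BY id LIMIT 1",
        (threshold,),
    ).fetchone()

row = random_title(random.Random(0))
print(row)
```

    Drawing from 0 to count-1 keeps the threshold strictly below the largest id, so the range lookup always finds a row.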

# ###########################################
# Efficient joins
# ###########################################
– Let the small table drive the large table in a join
– Avoid subqueries
    Subqueries are a latent performance risk. Rewrite the SQL with a join instead.
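    A typical rewrite turns an IN subquery into a join. The sketch below only demonstrates that the two forms return the same rows when the joined key is unique; the member and offer tables and their contents are invented, and SQLite through Python's sqlite3 is used as a stand-in. How much the rewrite helps depends on the engine and its version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE member (id INTEGER PRIMARY KEY, group_id INTEGER)")
conn.execute("CREATE TABLE offer (id INTEGER PRIMARY KEY, member_id INTEGER)")
conn.executemany("INSERT INTO member VALUES (?, ?)", [(1, 9), (2, 9), (3, 5)])
conn.executemany(
    "INSERT INTO offer VALUES (?, ?)", [(10, 1), (11, 2), (12, 3), (13, 1)]
)

# Subquery form.
with_subquery = conn.execute(
    "SELECT id, member_id FROM offer "
    "WHERE member_id IN (SELECT id FROM member WHERE group_id = 9) "
    "ORDER BY id"
).fetchall()

# Join form: equivalent because member.id is unique, so no row is duplicated.
with_join = conn.execute(
    "SELECT o.id, o.member_id FROM offer AS o "
    "JOIN member AS m ON m.id = o.member_id "
    "WHERE m.group_id = 9 ORDER BY o.id"
).fetchall()

print(with_join)
```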

# ###########################################
# Data types
# ###########################################
– Avoid implicit conversion
    CREATE TABLE `user` (
    `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
    `account` char(11) NOT NULL COMMENT '',
    `email` varchar(128),
    PRIMARY KEY (`id`),
    UNIQUE KEY `username` (`account`)
    ) ENGINE=InnoDB CHARSET=utf8;
    mysql>  explain select * from user where account=123 \G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: user
             type: ALL
    possible_keys: username
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 2
            Extra: Using where
    1 row in set (0.00 sec)
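    As shown, the condition account=123 does not use the unique index `username`. The mysql server reads every row from the storage engine, converts each row's account to a number (in effect a to_number() call), and compares the converted number with the parameter. The test table has 2 rows, rows in the execution plan is also 2, and type is ALL, which confirms that the index `username` is not used.
    mysql> explain select * from user where account='123' \G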
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: user
             type: const
    possible_keys: username
              key: username
          key_len: 33
              ref: const
             rows: 1
            Extra:
    1 row in set (0.00 sec)
    With a string parameter, the index `username` is used.
    Passing a number for a string column is a frequent mistake.
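    Why the string index cannot serve a numeric comparison is easier to see with a few values. The sketch is an analogy in pure Python: float() stands in for the string-to-number conversion the server performs, and the account values are invented.

```python
# Hypothetical stored values of a char column.
accounts = ["123", "0123", "123.0", "1.23e2", "124"]

# Numeric comparison: each string is converted first, and several distinct
# strings collapse to the same number, so all of them must be examined.
numeric_matches = [acct for acct in accounts if float(acct) == 123]

# String comparison: exactly one key can match, so an index lookup is enough.
string_matches = [acct for acct in accounts if acct == "123"]

print(numeric_matches)
print(string_matches)
```

    Because many different strings equal the same number, the ordering of the string index says nothing about where the numeric matches are, and the server falls back to converting every row.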
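– The primary key is not an auto-increment column
    An auto-increment primary key has several advantages:
    High insert performance.
    Less page fragmentation.
    Better secondary index performance and smaller secondary indexes, because a secondary index stores the primary key value, not the row id within the page.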