|
---|
|
|
| step1,一個(gè)簡(jiǎn)單的聯(lián)系人表 Java代碼  - CREATE TABLE `contact784` (
- `cid` bigint AUTO_INCREMENT NOT NULL,
- `uid` bigint NOT NULL,
- `email` varchar(128) NOT NULL,
- `name` varchar(64) NOT NULL,
- `mobile` varchar(16) NULL,
- `atime` timestamp NULL,
- `type` enum('BLACK','WHITE','NORMAL') NOT NULl default 'NORMAL',
- `info` text NULL,
- `memo` varchar(1024) NULL,
- PRIMARY key(`cid`)
- ) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT = 100;
- ALTER TABLE `contact784` ADD UNIQUE INDEX uniq_uid_email(`uid`,`email`);
step2,插入了100W數(shù)據(jù): Java代碼  - # -*- coding: utf-8 -*-
- #@author python.han@gmail.com
-
- import MySQLdb
- import random
- import string
- import threading
- import time
-
- domains = ['org','com.cn','qq.com','yahoo.com','163.com','com','cn','sina.cn','sina.com']
- host = "localhost"
- user = "xx"
- pwd = "xx"
- db = "t3"
-
- def getRandomValue():
- email = ""
- s = ""
- for x in range(random.randint(1,10)):
- s += random.choice(string.letters)
- b = list(s)
- domain = ''.join(b)+"."+random.choice(domains)
- email = s+"@"+domain
- return email,s
-
-
- def insert(count):
- conn=MySQLdb.connect(host=host,user=user,passwd=pwd,db=db)
- cursor=conn.cursor()
- for cid in xrange(count):
- uid = random.randint(1000000000,9999999999)
- email,name = getRandomValue()
- sql = "insert into contact784(uid,email,name) values (%d,'%s', '%s')" %(uid,email,name)
- n=cursor.execute(sql)
- cursor.close()
- conn.commit ()
- conn.close()
-
-
- if __name__=='__main__':
-
- start = time.clock()
- for i in range(100):
- worker = threading.Thread(target = insert(10000))
- worker.start()
- end = time.clock()
- print "elsaped:%s" %(end-start)
step3,要重新單線程插入,需要把數(shù)據(jù)清空. 因?yàn)閜ython多線程由于GIL的關(guān)系,實(shí)際上上面的100個(gè)線程只產(chǎn)生了一個(gè)連接,需要測(cè)試一下純單線程插入是不是要快些:) 執(zhí)行:delete from contact784 半小時(shí)沒有執(zhí)行完畢! 診斷方式: 1,iostat ,top等查看磁盤io很大 2,inotifywatch發(fā)現(xiàn)io的事件非常多 原因:在大表上使用delete from 清空一個(gè)表是非常慢的。因?yàn)镮nnoDB必須處理表中的每一行,根據(jù)InnoDB的事務(wù)設(shè)計(jì)原則,首先需要把“刪除動(dòng)作”寫入“事務(wù)日志”,然后寫入實(shí)際的表。所以,清空大表的時(shí)候,最好直接drop table然后重建。 注: 在delete from 執(zhí)行的過程中: 用:select count(*) from contact784;發(fā)現(xiàn)表的數(shù)據(jù)量一直是100行 用:explain select count(*) from contact784;可以發(fā)現(xiàn)數(shù)量一直在減少,顯示當(dāng)前 784是是因?yàn)榍懊孢@個(gè)文章的原因“ http://hanyh./blog/431323 |
|