大家好,我是鄧飛。 前幾天一直學(xué)習(xí)孟德爾隨機(jī)化的理論知識(shí),寫了幾篇博客,迷迷糊糊感覺入門了,今天跑代碼試了試,看著結(jié)果和圖表,感覺理解更深入了。果然,看書百遍,不如一練,今天分享一下實(shí)操代碼。 前幾天的博客: 孟德爾隨機(jī)化的術(shù)語理解
孟德爾隨機(jī)化:工具變量三大假設(shè)
從一篇孟德爾隨機(jī)化文章看MR常見結(jié)果形式
示例數(shù)據(jù)使用官網(wǎng)的數(shù)據(jù),進(jìn)行了一點(diǎn)補(bǔ)充,對(duì)結(jié)果進(jìn)行了可視化。(https://mrcieu./TwoSampleMR/articles/introduction.html)
整個(gè)步驟: 
步驟1:提取暴露數(shù)據(jù)的GWAS

> ## 1, 安裝TwoSampleMR,如果已安裝,可以忽略 > > # library(remotes) > # install_github("MRCIEU/TwoSampleMR") > > ## 2, 載入TwoSampleMR包 > library(TwoSampleMR) > > ## 3,從數(shù)據(jù)庫中提取暴露的GWAS summary數(shù)據(jù) > exposure_dat = extract_instruments("ieu-a-2") > dim(exposure_dat) [1] 79 15
共有79行15列的暴露數(shù)據(jù)結(jié)果。
步驟2:提取結(jié)局?jǐn)?shù)據(jù)的GWAS 
> ## 4,從數(shù)據(jù)庫中提取結(jié)局變量的的GWAS summary數(shù)據(jù),SNP用暴露數(shù)據(jù)的結(jié)果 > # Get effects of instruments on outcome > outcome_dat = extract_outcome_data(snps=exposure_dat$SNP, outcomes = "ieu-a-7") Extracting data for 79 SNP(s) from 1 GWAS(s) > dim(outcome_dat) [1] 79 16
共79行15列的結(jié)局?jǐn)?shù)據(jù),注意,這里直接使用暴露數(shù)據(jù)質(zhì)控后的SNP,提取結(jié)局?jǐn)?shù)據(jù)得到的結(jié)果,所以位點(diǎn)數(shù)是一樣的。 步驟3:合并暴露數(shù)據(jù)和結(jié)局?jǐn)?shù)據(jù) 
> ## 5,將暴露數(shù)據(jù)和結(jié)局?jǐn)?shù)據(jù)合并 > # Harmonise the exposure and outcome data > dat = harmonise_data(exposure_dat, outcome_dat) Harmonising Body mass index || id:ieu-a-2 (ieu-a-2) and Coronary heart disease || id:ieu-a-7 (ieu-a-7) > dim(dat) [1] 79 36
合并的數(shù)據(jù),共79行,36列,這些數(shù)據(jù)可以用于孟德爾隨機(jī)化的分析。
步驟4:孟德爾隨機(jī)化分析及結(jié)果可視化 
> ## 6,進(jìn)行孟德爾隨機(jī)化分析 > res = mr(dat) Analysing 'ieu-a-2' on 'ieu-a-7' > ## 7,異質(zhì)化分析 > mr_heterogeneity(dat) id.exposure id.outcome outcome exposure method Q 1 ieu-a-2 ieu-a-7 Coronary heart disease || id:ieu-a-7 Body mass index || id:ieu-a-2 MR Egger 143.3046 2 ieu-a-2 ieu-a-7 Coronary heart disease || id:ieu-a-7 Body mass index || id:ieu-a-2 Inverse variance weighted 143.6508 Q_df Q_pval 1 77 6.841585e-06 2 78 8.728420e-06 > ## 8,水平多效性分析 > mr_pleiotropy_test(dat) id.exposure id.outcome outcome exposure egger_intercept se pval 1 ieu-a-2 ieu-a-7 Coronary heart disease || id:ieu-a-7 Body mass index || id:ieu-a-2 -0.001719304 0.003985962 0.6674266 > ## 9,留一法分析 > res_loo = mr_leaveoneout(dat) > mr_leaveoneout_plot(res_loo) $`ieu-a-2.ieu-a-7`
attr(,"split_type") [1] "data.frame" attr(,"split_labels") id.exposure id.outcome 1 ieu-a-2 ieu-a-7 > ## 10,散點(diǎn)圖 > > p1 = mr_scatter_plot(res, dat) > p1 $`ieu-a-2.ieu-a-7`
attr(,"split_type") [1] "data.frame" attr(,"split_labels") id.exposure id.outcome 1 ieu-a-2 ieu-a-7 > > ## 11,森林圖 > res_single = mr_singlesnp(dat) > mr_forest_plot(res_single) $`ieu-a-2.ieu-a-7`
attr(,"split_type") [1] "data.frame" attr(,"split_labels") id.exposure id.outcome 1 ieu-a-2 ieu-a-7 Warning messages: 1: Removed 1 row containing missing values or values outside the scale range (`geom_errorbarh()`). 2: Removed 1 row containing missing values or values outside the scale range (`geom_point()`). > > ## 12,漏斗圖 > mr_funnel_plot(res_single) $`ieu-a-2.ieu-a-7`
attr(,"split_type") [1] "data.frame" attr(,"split_labels") id.exposure id.outcome 1 ieu-a-2 ieu-a-7
下圖是留一法的森林圖: 
下圖是孟德爾隨機(jī)化的森林圖:

下圖是孟德爾隨機(jī)化的散點(diǎn)圖: 
下圖是孟德爾隨機(jī)化的漏斗圖: 
還有哪些需要研究的?
如何讀取自己的GWAS summary結(jié)果,并將格式整理為TwoSampleMR的格式? 如何對(duì)暴露數(shù)據(jù)GWAS結(jié)果進(jìn)行質(zhì)控,包括LD質(zhì)控,F(xiàn)值質(zhì)控,R2質(zhì)控等?
怎么對(duì)已發(fā)表的文章進(jìn)行結(jié)果圖標(biāo)的復(fù)現(xiàn)?
這些都是細(xì)枝末節(jié),等我后續(xù)一一完成博客的文章,歡迎繼續(xù)關(guān)注。
上面分析完整的代碼匯總: ## 1, 安裝TwoSampleMR,如果已安裝,可以忽略
# library(remotes) # install_github("MRCIEU/TwoSampleMR")
## 2, 載入TwoSampleMR包 library(TwoSampleMR)
## 3,從數(shù)據(jù)庫中提取暴露的GWAS summary數(shù)據(jù) exposure_dat = extract_instruments("ieu-a-2") dim(exposure_dat)
## 4,從數(shù)據(jù)庫中提取結(jié)局變量的的GWAS summary數(shù)據(jù),SNP用暴露數(shù)據(jù)的結(jié)果 # Get effects of instruments on outcome outcome_dat = extract_outcome_data(snps=exposure_dat$SNP, outcomes = "ieu-a-7") dim(outcome_dat)
## 5,將暴露數(shù)據(jù)和結(jié)局?jǐn)?shù)據(jù)合并 # Harmonise the exposure and outcome data dat = harmonise_data(exposure_dat, outcome_dat) dim(dat)
## 6,進(jìn)行孟德爾隨機(jī)化分析 res = mr(dat)
## 7,異質(zhì)化分析 mr_heterogeneity(dat)
## 8,水平多效性分析 mr_pleiotropy_test(dat)
## 9,留一法分析 res_loo = mr_leaveoneout(dat) mr_leaveoneout_plot(res_loo)
## 10,散點(diǎn)圖
p1 = mr_scatter_plot(res, dat) p1
## 11,森林圖 res_single = mr_singlesnp(dat) mr_forest_plot(res_single)
## 12,漏斗圖 mr_funnel_plot(res_single)
想要更好的學(xué)習(xí)和交流,快來加入飛哥的知識(shí)星球,這是一個(gè)生物統(tǒng)計(jì)+數(shù)量遺傳學(xué)+GWAS+GS的社區(qū),在這里你可以向飛哥提問、幫你制定學(xué)習(xí)計(jì)劃、跟著飛哥一起做實(shí)戰(zhàn)項(xiàng)目,沖沖沖。點(diǎn)擊這里加入吧:飛哥的學(xué)習(xí)圈子
|