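Editor's note: How large does the reference population need to be for genomic selection? Using both real and simulated data, this paper shows that a genotyped reference population of at least 500 animals is needed before genomic prediction pays off. It also shows that multi-trait ssGBLUP outperforms single-trait ssGBLUP when the traits are strongly genetically correlated. So a solid grounding in classical quantitative genetics still helps with genomic selection.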
1 Abstract
We simulated two traits with heritabilities 0.1, 0.3, and with high genetic correlation 0.7, our results also showed that the prediction accuracies were low for GBLUP compared with other three methods with different genotyped reference population sizes and the accuracies were improved with increasing the genotyped reference population size. However, the increase was small for ssGBLUP compared with BLUP when the genotyped reference population size was <500. Our results also demonstrated that the accuracies of genomic prediction can be further improved by implementing two-trait ssGBLUP model, the maximum gain on accuracy was 2 and 2.6% for trait of chest width compared to single-trait ssGBLUP and traditional BLUP, while the gain was decreased with the weakness of genetic correlation. Two-trait ssGBLUP even performed worse than single trait analysis in the scenario of low genetic correlation.

Key points:
In genomic prediction, the single-step method has many advantages over multi-step methods.
This study used seven body measurement traits in a Yorkshire pig population, together with simulated data, to compare the efficiency of different ssGBLUP strategies.
In the Yorkshire population, 592 animals were genotyped with the PorcineSNP80 chip.
Four methods were compared: conventional ABLUP, single-trait ssGBLUP, multi-trait ssGBLUP, and GBLUP.
GBLUP was less accurate than ABLUP.
ssGBLUP improved accuracy by 1% over conventional ABLUP. The gain was small mainly because the genotyped reference population was small (592 animals).
In the simulation with a small number of genotyped animals, the gain of ssGBLUP over ABLUP was also small, at 0.6%.
When the two traits had a high genetic correlation (0.7), two-trait ssGBLUP improved accuracy further, by up to 2% over single-trait ssGBLUP and 2.6% over traditional BLUP.
When the genetic correlation was low, two-trait ssGBLUP had no advantage over single-trait ssGBLUP and even performed worse.
2 Methods
2.1 GBLUP vs. Bayesian methods
Since the historic work of Meuwissen et al. (2001), combining genome data with corresponding statistical models has been successfully applied to genome selection. The key issue of genomic selection is to predict individual genomic breeding values (GEBV) using genome-wide marker information. Many statistical methods have been developed to predict GEBV, which are basically different in the assumption of distribution of SNP effects. The linear BLUP models (at either the SNP level or the individual animal level) assume that effects of all SNP are normally distributed with same variance (Meuwissen et al., 2001; VanRaden, 2008). On the other hand, the Bayesian Alphabet methods (e.g., BayesA, BayesB, and BayesCpi) (Meuwissen et al., 2001; Habier et al., 2011) allow each SNP effect to have its own variance. Many studies have reported that Bayesian methods performed similar to genomic BLUP (GBLUP) model in real data (Hayes et al., 2009a) and GBLUP is also simpler and lower computation-demanding than the Bayesian Alphabet methods.
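The key step in genomic selection is predicting genomic breeding values (GEBV), and many statistical models can do this.
BLUP-type methods (GBLUP, ssGBLUP) assume that all marker effects share the same variance.
Bayesian methods (BayesA, BayesB, BayesCpi, and so on) let each SNP have its own variance.
Studies show that on real data the Bayesian methods perform about as well as GBLUP, while GBLUP is faster to compute and simpler to use.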
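The equal-variance assumption is what allows GBLUP to work at the animal level through a genomic relationship matrix G. As a minimal sketch, assuming the first method of VanRaden (2008) and allele frequencies estimated from the genotyped animals themselves, G could be built as follows. The function name `vanraden_g` and the toy genotypes are hypothetical and are not taken from the paper.

```python
import numpy as np

def vanraden_g(genotypes: np.ndarray) -> np.ndarray:
    """Genomic relationship matrix, VanRaden (2008) method 1.

    genotypes: (n_animals, n_snps) array of allele counts coded 0/1/2.
    Allele frequencies are estimated from this sample (an assumption).
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # observed allele frequency per SNP
    Z = M - 2.0 * p                       # centre each SNP column by 2p
    scale = 2.0 * np.sum(p * (1.0 - p))   # expected total SNP variance
    return Z @ Z.T / scale

# Hypothetical toy data: 5 animals, 200 SNPs drawn at random.
rng = np.random.default_rng(0)
toy_genotypes = rng.integers(0, 3, size=(5, 200))
G = vanraden_g(toy_genotypes)
```

Because every SNP contributes to G with the same weight, this corresponds to the "same variance for all SNP effects" assumption. The Bayesian methods would instead weight SNPs differently.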
2.2 Single-step and multi-trait single-step methods
Generally, genomic prediction utilizes information of genotyped animals. In practice, however, only a subset of individuals can be genotyped. Furthermore, in order to make use of phenotype information of non-genotyped individuals, a single-step GBLUP (ssGBLUP) has been developed by constructing H matrix using marker genotypes and pedigree jointly instead of G matrix or pedigree-based relationship matrix alone (Legarra et al., 2009; Christensen and Lund, 2010). Field data of cattle, pigs and chickens indicated that single-step method leads to higher accuracy and much simpler than multi-step genomic selection methods (Aguilar et al., 2011; Chen et al., 2011; Forni et al., 2011; Christensen et al., 2012; Simeone et al., 2012; Li et al., 2014; Song et al., 2017).
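Genotyping every offspring is too expensive, so in practice only part of the population is genotyped. Pedigree and genotypes are then combined into an H matrix for a single-step evaluation (ssGBLUP), which is more practical.
Analyses of field data in cattle, pigs and chickens show that the single-step method is more accurate and simpler than multi-step genomic selection methods.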
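To make the H matrix more concrete, here is a minimal sketch. It assumes the commonly used form of its inverse, in which the inverse pedigree matrix is corrected only in the block of genotyped animals (A22) by adding inv(G) - inv(A22). The helper names, the toy pedigree and the stand-in G are hypothetical, and the paper may compute H differently.

```python
import numpy as np

def pedigree_a(sires, dams):
    """Numerator relationship matrix by the tabular method.

    Parents must be listed before their offspring; -1 marks an unknown parent.
    """
    n = len(sires)
    A = np.zeros((n, n))
    for i in range(n):
        s, d = sires[i], dams[i]
        # Diagonal: 1 + inbreeding (half the relationship between parents).
        A[i, i] = 1.0 + (0.5 * A[s, d] if s >= 0 and d >= 0 else 0.0)
        for j in range(i):
            a = 0.0
            if s >= 0:
                a += 0.5 * A[j, s]
            if d >= 0:
                a += 0.5 * A[j, d]
            A[i, j] = A[j, i] = a
    return A

def h_inverse(A, G, genotyped):
    """inv(H) = inv(A) + [[0, 0], [0, inv(G) - inv(A22)]]."""
    idx = np.asarray(genotyped)
    block = np.ix_(idx, idx)
    H_inv = np.linalg.inv(A)
    H_inv[block] += np.linalg.inv(G) - np.linalg.inv(A[block])
    return H_inv

# Hypothetical pedigree: animals 0 and 1 are founders, 2 and 3 are full sibs,
# and animal 4 is the offspring of 2 and 3. Only animals 2, 3 and 4 are genotyped.
sires = [-1, -1, 0, 0, 2]
dams = [-1, -1, 1, 1, 3]
A = pedigree_a(sires, dams)
genotyped = [2, 3, 4]
# Stand-in for a marker-based G (in practice it is built from SNP genotypes).
G_toy = A[np.ix_(genotyped, genotyped)] + 0.05 * np.eye(3)
H_inv = h_inverse(A, G_toy, genotyped)
```

A useful property of this construction is that the genotyped block of H equals G, while non-genotyped animals receive genomic information only through their pedigree links to genotyped relatives.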
Genomic selection usually handles a single trait only. However, many traits are genetically correlated. As in traditional genetic evaluation, a multi-trait model is expected to increase the accuracy of the GEBV by making use of information from genetically correlated traits which will be more profound for traits with low heritability or with a small number of phenotypic records (Jia and Jannink, 2012; Guo et al., 2014). Many studies report multi-trait model for genetically correlated traits could lead to more accurate predictions than single trait genomic prediction (Calus and Veerkamp, 2011; Jia and Jannink, 2012; Guo et al., 2014; Wang et al., 2017).
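2.3 When multi-trait ssGBLUP is useful
A multi-trait model works better when the traits are genetically correlated, especially when heritability is low or when few phenotypic records are available.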
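The information shared between traits enters a multi-trait model through the genetic covariance matrix between traits. As a minimal sketch, the code below uses the simulated scenario quoted in the abstract (heritabilities 0.1 and 0.3, genetic correlation 0.7) to build that 2 x 2 matrix and draw correlated breeding values. The phenotypic variance of 1 for both traits and the unrelated animals are assumptions made only for illustration. The paper's own simulation procedure is not described in this post.

```python
import numpy as np

h2 = np.array([0.1, 0.3])   # heritabilities of Trait A and Trait B
rg = 0.7                    # genetic correlation between the two traits

# Assumption: phenotypic variance is 1 for both traits, so the additive
# genetic variance of each trait equals its heritability.
sd_a = np.sqrt(h2)
cov_ab = rg * sd_a[0] * sd_a[1]
G0 = np.array([[h2[0], cov_ab],
               [cov_ab, h2[1]]])

# Draw breeding values for unrelated animals (another simplifying assumption;
# with relatives the covariance would be the Kronecker product of G0 with A or H).
rng = np.random.default_rng(1)
breeding_values = rng.multivariate_normal(np.zeros(2), G0, size=20000)
estimated_rg = np.corrcoef(breeding_values.T)[0, 1]
```

The off-diagonal element of `G0` is what lets records on one trait inform the breeding value of the other. As it approaches zero, the two-trait model has little to borrow.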
2.4 ABLUP
Fixed effects: barn + herd-year-season + sex
Random effect: additive genetic effect
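As a rough illustration of how an animal model of this kind is solved, here is a minimal sketch of Henderson's mixed model equations. It uses a single fixed effect (sex) rather than the three listed above, treats the heritability as known, and uses four unrelated animals. The function name and all data values are hypothetical.

```python
import numpy as np

def solve_animal_model(y, X, Z, A_inv, h2):
    """Solve the mixed model equations for y = Xb + Za + e.

    Returns (fixed-effect solutions, predicted breeding values).
    """
    lam = (1.0 - h2) / h2   # ratio of residual to additive genetic variance
    lhs = np.block([[X.T @ X, X.T @ Z],
                    [Z.T @ X, Z.T @ Z + lam * A_inv]])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    solution = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return solution[:n_fixed], solution[n_fixed:]

# Hypothetical toy data: two males and two females, one record each.
y = np.array([10.0, 12.0, 20.0, 24.0])
X = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])      # sex as the only fixed effect
Z = np.eye(4)                   # each record belongs to one animal
A_inv = np.eye(4)               # unrelated animals, so A is the identity
b_hat, ebv = solve_animal_model(y, X, Z, A_inv, h2=0.3)
```

Replacing `A_inv` with the inverse of G or of H gives the GBLUP and ssGBLUP counterparts of the same equations. The rest of the system stays unchanged.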
2.5 GBLUP
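2.6 Single-trait ssGBLUP: adjusting the H matrix
For some traits, G cannot explain all of the genetic variation, so G is set to explain 95% of it and the pedigree the remaining 5%.
The parameters alpha and beta are computed from equations on the diagonal and off-diagonal elements of the G and A22 matrices.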
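The two adjustments described here can be sketched as follows. The post does not reproduce the paper's equations, so the exact form is an assumption: alpha and beta are chosen so that beta * G + alpha matches A22 in both its mean diagonal and its mean off-diagonal element, and the adjusted G is then blended with A22 using weights 0.95 and 0.05. The function name, the order of the two steps and the toy matrices are hypothetical.

```python
import numpy as np

def tune_and_blend(G, A22, pedigree_weight=0.05):
    """Scale G towards A22, then blend the result with A22.

    Solves (assumed form of the equations):
        mean(diag(G)) * beta + alpha = mean(diag(A22))
        mean(offdiag(G)) * beta + alpha = mean(offdiag(A22))
    """
    n = G.shape[0]
    off = ~np.eye(n, dtype=bool)   # mask selecting off-diagonal elements
    coef = np.array([[np.mean(np.diag(G)), 1.0],
                     [G[off].mean(), 1.0]])
    target = np.array([np.mean(np.diag(A22)), A22[off].mean()])
    beta, alpha = np.linalg.solve(coef, target)
    G_tuned = beta * G + alpha
    G_blended = (1.0 - pedigree_weight) * G_tuned + pedigree_weight * A22
    return G_blended, alpha, beta

# Hypothetical 3 x 3 matrices for three genotyped animals.
A22 = np.array([[1.00, 0.50, 0.75],
                [0.50, 1.00, 0.75],
                [0.75, 0.75, 1.25]])
G = np.array([[0.9, 0.4, 0.6],
              [0.4, 1.1, 0.7],
              [0.6, 0.7, 1.2]])
G_w, alpha, beta = tune_and_blend(G, A22)
```

Blending with A22 also helps keep the matrix invertible, which matters because ssGBLUP needs the inverse of the adjusted G.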
2.7 Two-trait ssGBLUP
The model is similar to the conventional multi-trait ABLUP model.
3 Results
3.1 Heritability, genetic correlation, and phenotypic correlation
Diagonal: heritabilities. Upper triangle: genetic correlations. Lower triangle: phenotypic correlations.
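3.2 ABLUP vs. GBLUP vs. ssGBLUP
This part compares accuracy and reliability across methods and traits. GBLUP is less accurate than ABLUP, and the gain from ssGBLUP is very limited. The main reason is that the reference population is too small.
3.3 Single-trait ssGBLUP vs. multi-trait ssGBLUP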
Compared to single-trait model, as shown in Table 4, the accuracies of genomic prediction for CW from two-trait ssGBLUP were increased from 0.684 (Table 3) to 0.703 and 0.697 in the scenarios of high and medium correlations but slightly decreased to 0.676 in low genetic correlation. The gain on accuracy was 2% in situation with high genetic correlation, 1.3% in medium genetic correlation.
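For traits with high or medium genetic correlation, the two-trait analysis improves accuracy.
For traits with low genetic correlation, the two-trait analysis brings no clear improvement, and accuracy drops slightly.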
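The gains quoted above can be checked directly from the reported accuracies for chest width (CW). The short calculation below suggests that the percentages are differences in accuracy expressed in percentage points, not relative increases. The variable names are hypothetical, and the accuracies are the ones quoted from Tables 3 and 4.

```python
# Accuracies for chest width (CW) quoted from the paper.
single_trait_accuracy = 0.684
two_trait_accuracy = {"high": 0.703, "medium": 0.697, "low": 0.676}

# Difference from single-trait ssGBLUP, in percentage points.
gain_points = {
    scenario: round((accuracy - single_trait_accuracy) * 100, 1)
    for scenario, accuracy in two_trait_accuracy.items()
}

for scenario, gain in gain_points.items():
    print(f"{scenario} genetic correlation: {gain:+.1f} points")
```

The high-correlation gain works out to 1.9 points, which the paper rounds to 2%. The medium-correlation gain is 1.3 points, and the low-correlation scenario loses 0.8 points.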
3.4 Effect of reference population size on accuracy
As shown in Figure 2, the same validation population was predicted through different genotyped reference population sizes using BLUP, GBLUP, ssGBLUP, and two-trait ssGBLUP methods. The accuracies of prediction were low for GBLUP compared with other three methods in all scenarios, while the accuracy of genomic prediction from GBLUP was rapidly increased with increasing the reference population size, especially when the reference population size was enlarged over 500. The accuracy of traditional BLUP with the same reference and validation population as ssGBLUP was also shown in Figure 2. Generally, ssGBLUP provided higher accuracies of predictions than traditional BLUP in different genotyped reference population sizes for Trait A and Trait B, however, the increase was tiny especially in Trait A with low heritability of 0.1 when the genotyped reference population size was below 500, this was consistent with the results of real pig data in this study. In all scenarios, two-trait ssGBLUP produced the highest accuracy for Trait A and Trait B with high genetic correlation of 0.7, but the scope of improvement was low for Trait B with heritability of 0.3.
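In the simulated data, GBLUP was less accurate than ABLUP and ssGBLUP, but its accuracy rose markedly as the reference population grew, especially beyond 500 animals.
Overall, the accuracy ranking was two-trait ssGBLUP > single-trait ssGBLUP > ABLUP.
With fewer than 500 genotyped reference animals, ssGBLUP improved little on ABLUP, which matches the results from the real pig data.
In all simulated scenarios, two-trait ssGBLUP was better than single-trait ssGBLUP. The benefit of the two-trait model shrinks as the genetic correlation between the traits weakens.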
Reference: Song H, Zhang J, Zhang Q and Ding X (2019) Using Different Single-Step Strategies to Improve the Efficiency of Genomic Prediction on Body Measurement Traits in Pig. Front. Genet. 9:730. doi: 10.3389/fgene.2018.00730