“ No one knows everything, and you don't have to.” --free傻孩子本節(jié)為大家介紹一種常用的排序方法NMDS分析,全稱為非度量多維尺度分析 (non-metric multidimensional scaling)。NMDS是一種將多維空間的研究對象(樣本或變量)簡化到低維空間進(jìn)行定位、分析和歸類,同時又保留對象間原始關(guān)系的數(shù)據(jù)分析方法。目前網(wǎng)絡(luò)上關(guān)于NMDS分析的介紹和論述已經(jīng)很多了,本公眾號就不再贅述了。本節(jié)的關(guān)注點(diǎn)是如何繪制漂亮的NMDS散點(diǎn)圖。
也許大家已經(jīng)發(fā)現(xiàn)NMDS分析大多情況下是用來展示物種數(shù)據(jù)的一種分析方法,為什么呢?這是因為當(dāng)前流行使用的物種數(shù)據(jù)大多數(shù)為OTU或ASV測序數(shù)據(jù),這類數(shù)據(jù)包含豐富的0值。NMDS分析一般使用的是Bray-Curtis距離算法,該算法對0值不敏感,換句話說即使有很多的0值的情況下也能獲得較為穩(wěn)健的結(jié)果。因此,NMDS分析適宜于測序數(shù)據(jù)的分析。然而,因為該方法使用的是非參數(shù)的方法,所以不能給出每一軸對于數(shù)據(jù)分布的解釋量(如有錯誤請指正),這是該方法存在的局限性。 加載NMDS分析所需要的包,如下: library(tidyverse) library(vegan) library(MASS) library(readxl) 數(shù)據(jù)導(dǎo)入 readxlsx <- function(file = "file.xlsx", n =3) { require(readxl) dat <- list() i = 0 while (i < n) { i = i+1 dat[[i]] <- read_excel(file, sheet = i, col_names = T) } return(dat) }
otu <- readxlsx(file = "1-16S.xlsx", n =5) 數(shù)據(jù)處理 ITS <- otu[[4]] %>% data.frame()#tibble data change to dataframe data rownames(ITS) <- ITS$OTUid#defined row names ITS2 <- ITS[,-1] %>% t() %>% #transposition data.frame()#row names are sample names, and colnum names are OTU id 數(shù)據(jù)格式如下: 
ITS.nmds<-metaMDS(ITS2) ITS.nmds#The smaller the value of stress, the better the goodness of fit #當(dāng)stress > 0.2時表明使用該方法不合適,建議使用其它方法對數(shù)據(jù)進(jìn)行#分析 stressplot(ITS.nmds)#檢查觀測值非相似性與排序距離之間的關(guān)系 
擬合結(jié)果顯示,沒有點(diǎn)出現(xiàn)在距離線段很遠(yuǎn)的位置,意味著該數(shù)據(jù)可以使用NMDS分析。 簡單出圖 1). 只顯示樣方點(diǎn) ordiplot(ITS.nmds, type = "text",display = "sites") 
2). 只顯示物種信息 ordiplot(ITS.nmds, type = "text",display = "species") 
提取樣方點(diǎn)
#提取樣方點(diǎn) ITS.scores <- as.data.frame(scores(ITS.nmds)) #提取點(diǎn) ITS.scores %>% as_tibble(rownames = "sample") ->ITS_sites ITS_sites 
根據(jù)處理給數(shù)據(jù)分組 otu[[3]] %>% dplyr::select(Code,Tdiff) %>% mutate(group = if_else(Tdiff>0,"warmer", if_else(Tdiff<0,"colder","in_situ"))) ->group group 
將分組信息添加到NMDS數(shù)據(jù)樣點(diǎn)中 ITS_sites %>% left_join(group, by = c("sample" = "Code")) %>% filter(group!="NA")->ITS_sites2 ITS_sites2 
因為我用的是已發(fā)表文章中的數(shù)據(jù),數(shù)據(jù)給出的樣方信息和分組信息數(shù)量不匹配所以我過濾掉了不匹配的部分,如果處理自己的數(shù)據(jù)則不必使用filter函數(shù)。 繪圖背景等參數(shù)設(shè)置(直接粘貼并運(yùn)行) main_theme = theme(panel.background=element_blank(), panel.grid=element_blank(), axis.line.x=element_line(size=0.5, colour="black"), axis.line.y=element_line(size=0.5, colour="black"), axis.ticks=element_line(color="black"), axis.text=element_text(color="black", size=12), legend.position="right", legend.background=element_blank(), legend.key=element_blank(), legend.text= element_text(size=12), text=element_text(family="sans", size=12), plot.title=element_text(hjust = 0.5,vjust=0.5,size=12), plot.subtitle=element_text(size=12)) 繪圖 ggplot(data=ITS_sites2,aes(NMDS1,NMDS2)) + geom_hline(aes(yintercept=0),colour="#d8d6d6",linetype=5)+ geom_vline(aes(xintercept=0),colour="#d8d6d6",linetype=5)+ geom_point(aes(color = group),shape = 19,size = 3.5)+ scale_color_manual(values = c("#2166ac","#f4a582","#e31a1c"))+ #scale_x_continuous(breaks = seq(-0.59,0.66,0.2),limits = c(-0.59,0.66))+ #scale_y_continuous(breaks = seq(-0.60,0.45,0.15),limits = c(-0.60,0.45))+ labs(x= "NMDS1", y = "NMDS2",color = "Treatments")+ theme_bw() + main_theme 
分組NMDS ggplot(data=ITS_sites2,aes(NMDS1,NMDS2)) + geom_hline(aes(yintercept=0),colour="#d8d6d6",linetype=5)+ geom_vline(aes(xintercept=0),colour="#d8d6d6",linetype=5)+ geom_point(aes(color = group),shape = 19,size = 3.5)+ scale_color_manual(values = c("#2166ac","#f4a582","#e31a1c"))+ #scale_x_continuous(breaks = seq(-0.59,0.66,0.2),limits = c(-0.59,0.66))+ #scale_y_continuous(breaks = seq(-0.60,0.45,0.15),limits = c(-0.60,0.45))+ stat_ellipse(aes(fill=group),geom="polygon",level=0.95,alpha=0.15)+ labs(x= "NMDS1", y = "NMDS2", color = "Treatments",fill = "Treatments")+ theme_bw() + main_theme 
“等溫線”NMDS “等溫線”NMDS適用于處理比較多的情況,如梯度等 1)重新構(gòu)建分組 ITS_sites2 %>% mutate(group2=if_else(Tdiff< -5.7,"very cold", if_else(Tdiff< 0,"cold", if_else(Tdiff<5.7, "in situ", if_else(Tdiff <9.6, "warm","hot"))))) ->ITS_sites3 ITS_sites3 
ggplot(data=ITS_sites3,aes(NMDS1,NMDS2)) + geom_hline(aes(yintercept=0),colour="#d8d6d6",linetype=5)+ geom_vline(aes(xintercept=0),colour="#d8d6d6",linetype=5)+ geom_point(aes(color = group2),shape = 19,size = 3.5)+ #scale_color_manual(values = c("#2166ac","#f4a582","#e31a1c"))+ #scale_x_continuous(breaks = seq(-0.59,0.66,0.2),limits = c(-0.59,0.66))+ #scale_y_continuous(breaks = seq(-0.60,0.45,0.15),limits = c(-0.60,0.45))+ stat_density2d(aes(color = group2),size = 0.6)+ labs(x= "NMDS1", y = "NMDS2", color = "Treatments")+ theme_bw() + main_theme 
鏈接:https://www./s/CFvkjJ3nECi 如有問題,可以加入我們的群聊一起討論,如下:
|