ggalluvial: alluvial diagrams for showing changes between groups, time series, and complex multi-attribute data

 萌小芊 2018-02-20

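Thanks to 李海敏 and 沈偉 from the "宏基因組0" chat group for recommending this package. It draws stacked bar charts with bands connecting the components, which highlights changes in species abundance between groups.

An alluvial diagram is a type of flow diagram, originally developed to represent changes in network structure over time.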

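Example 1. Neuroscience coalesced from other related disciplines to form its own field. From PLoS ONE 5(1): e8694 (2010)

Example 2. In the Science cover article on the gut microbiome of the Hadza, panels C and D of Figure 1 use three alluvial diagrams. For details, see the post "What separates a 3-point paper from a 30-point paper?"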

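ggalluvial is an extension package built on ggplot2, dedicated to quickly drawing alluvial diagrams. Some people also call these Sankey diagrams, but the two differ slightly; a future post will cover drawing Sankey diagrams with the riverplot package.

The source code is on GitHub: https://github.com/corybrunson/ggalluvial

Official CRAN vignette: https://cran./web/packages/ggalluvial/vignettes/ggalluvial.html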

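Installation

Choose one of the following three installation methods:

# Users in mainland China: the Tsinghua mirror is recommended
site = 'https://mirrors.tuna./CRAN'
# Install the stable release (recommended)
install.packages('ggalluvial', repos = site)
# Install the development version (the connection to GitHub is unstable and downloads sometimes fail; retrying a few times usually works)
devtools::install_github('corybrunson/ggalluvial', build_vignettes = TRUE)
# Install the 'optimization' branch with the newest features
devtools::install_github('corybrunson/ggalluvial', ref = 'optimization')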

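Viewing the documentation

Use vignette() to open the tutorial

# Open the tutorial
vignette(topic = 'ggalluvial', package = 'ggalluvial')

All of the demonstrations below are based on this official vignette; my main contributions are the translation and the code comments.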

Alluvial diagrams in ggplot2

Original author: Jason Cory Brunson; last updated: 2018-02-11

1. The simplest example

Using passenger counts from the Titanic disaster, plot the relationships among class, sex, and age.

# Load the package
library(ggalluvial)
# Convert the built-in dataset to a data frame (wide format)
titanic_wide <- data.frame(Titanic)
# Show the data format
head(titanic_wide)
#>   Class    Sex   Age Survived Freq
#> 1   1st   Male Child       No    0
#> 2   2nd   Male Child       No    0
#> 3   3rd   Male Child       No   35
#> 4  Crew   Male Child       No    0
#> 5   1st Female Child       No    0
#> 6   2nd Female Child       No    0
# Plot the relationships among class, sex, and age
ggplot(data = titanic_wide,
       aes(axis1 = Class, axis2 = Sex, axis3 = Age,
           weight = Freq)) +
  scale_x_discrete(limits = c('Class', 'Sex', 'Age'), expand = c(.1, .05)) +
  geom_alluvium(aes(fill = Survived)) +
  geom_stratum() + geom_text(stat = 'stratum', label.strata = TRUE) +
  theme_minimal() +
  ggtitle('passengers on the maiden voyage of the Titanic',
          'stratified by demographics and survival')

Parameter notes: data sets the data source; the axis aesthetics (axis1, axis2, axis3) set which columns are drawn as bars; weight is the numeric value; geom_alluvium() draws the bands that connect the groups, filled here by survival; geom_stratum() draws the bar for each axis; geom_text() shows the labels inside the bars; theme_minimal() is one of the available theme styles; ggtitle() sets the plot title.
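As a rough numeric check on what the plot encodes, the height of each stratum on an axis should be the sum of the weight (here Freq) over all rows in that category. The following is a minimal base-R sketch under that assumption; the helper name stratum_heights is hypothetical and not part of ggalluvial.

```r
# Wide-format Titanic counts: one row per Class/Sex/Age/Survived combination
titanic_wide <- as.data.frame(Titanic)

# Hypothetical helper (not part of ggalluvial):
# total weight per category on one axis
stratum_heights <- function(data, axis, weight = "Freq") {
  tapply(data[[weight]], data[[axis]], sum)
}

class_heights <- stratum_heights(titanic_wide, "Class")
sex_heights   <- stratum_heights(titanic_wide, "Sex")
age_heights   <- stratum_heights(titanic_wide, "Age")

print(class_heights)
print(sex_heights)
print(age_heights)
```

Every axis adds up to the same total number of people, which is why the three bars in the figure have the same overall height.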

Figure 1. Relationships among class, sex, and age, with the proportion of survivors

Note that the figure above was drawn from data in wide format, whereas ggplot2 normally works with long format. How do we convert?

Converting to long format with to_lodes

# Long format: to_lodes combines the chosen axes and generates the alluvium and stratum columns. The axis names are stored in the column named by key
titanic_long <- to_lodes(data.frame(Titanic),
                         key = 'Demographic',
                         axes = 1:3)
head(titanic_long)
ggplot(data = titanic_long,
       aes(x = Demographic, stratum = stratum, alluvium = alluvium,
           weight = Freq, label = stratum)) +
  geom_alluvium(aes(fill = Survived)) +
  geom_stratum() + geom_text(stat = 'stratum') +
  theme_minimal() +
  ggtitle('passengers on the maiden voyage of the Titanic',
          'stratified by demographics and survival')

This produces the same figure as above; only the format of the input data differs.
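To make the long layout concrete, here is an illustrative base-R reshaping that stacks one block of rows per axis and tags each original row with an id. It is only a sketch of the resulting table shape, not how to_lodes is implemented; the column names mirror those described above.

```r
wide <- as.data.frame(Titanic)
wide$alluvium <- seq_len(nrow(wide))   # one id per wide-format row

axes <- c("Class", "Sex", "Age")       # columns 1:3 of the wide table

# Stack one block of rows per axis; each block keeps Survived, Freq, and the id
long <- do.call(rbind, lapply(axes, function(axis) {
  data.frame(
    Survived    = wide$Survived,
    Freq        = wide$Freq,
    alluvium    = wide$alluvium,
    Demographic = axis,
    stratum     = as.character(wide[[axis]]),
    stringsAsFactors = FALSE
  )
}))
long$Demographic <- factor(long$Demographic, levels = axes)

head(long)
dim(long)
```

Each of the 32 wide rows appears three times in the long table, once per axis, so the same Freq value is carried along to every axis it passes through.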

2. Input data formats

Defining an alluvial wide table

# Show the data format
head(as.data.frame(UCBAdmissions), n = 12)
##       Admit Gender Dept Freq
## 1  Admitted   Male    A  512
## 2  Rejected   Male    A  313
## 3  Admitted Female    A   89
## 4  Rejected Female    A   19
## 5  Admitted   Male    B  353
## 6  Rejected   Male    B  207
## 7  Admitted Female    B   17
## 8  Rejected Female    B    8
## 9  Admitted   Male    C  120
## 10 Rejected   Male    C  205
## 11 Admitted Female    C  202
## 12 Rejected Female    C  391
# Check the data format
is_alluvial(as.data.frame(UCBAdmissions), logical = FALSE, silent = TRUE)
## [1] 'alluvia'

Examine the relationship between gender and department, grouped by admission outcome

ggplot(as.data.frame(UCBAdmissions),
       aes(weight = Freq, axis1 = Gender, axis2 = Dept)) +
  geom_alluvium(aes(fill = Admit), width = 1/12) +
  geom_stratum(width = 1/12, fill = 'black', color = 'grey') +
  geom_label(stat = 'stratum', label.strata = TRUE) +
  scale_x_continuous(breaks = 1:2, labels = c('Gender', 'Dept')) +
  scale_fill_brewer(type = 'qual', palette = 'Set1') +
  ggtitle('UC Berkeley admissions and rejections, by sex and department')
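The band widths in this plot correspond to applicant counts per gender and department, split by admission outcome. A small base-R cross-tabulation, offered as a sketch of the numbers behind the picture (the object names are illustrative):

```r
ucb <- as.data.frame(UCBAdmissions)

# Applicants per gender and department (the totals shared between the two axes)
applicants <- xtabs(Freq ~ Gender + Dept, data = ucb)

# Admitted applicants only, in the same layout
admitted <- xtabs(Freq ~ Gender + Dept, data = ucb[ucb$Admit == "Admitted", ])

# Share admitted within each gender/department cell
admit_rate <- round(admitted / applicants, 2)

print(applicants)
print(admit_rate)
```

Reading the table alongside the plot helps separate two effects that the diagram shows together: how many people applied to each department and what fraction of them were admitted.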

3. Relationships among three variables, colored by the variable of interest

Classify the Titanic data by survival, sex, and class, and fill the bands by class

ggplot(as.data.frame(Titanic),
       aes(weight = Freq,
           axis1 = Survived, axis2 = Sex, axis3 = Class)) +
  geom_alluvium(aes(fill = Class),
                width = 0, knot.pos = 0, reverse = FALSE) +
  guides(fill = FALSE) +
  geom_stratum(width = 1/8, reverse = FALSE) +
  geom_text(stat = 'stratum', label.strata = TRUE, reverse = FALSE) +
  scale_x_continuous(breaks = 1:3, labels = c('Survived', 'Sex', 'Class')) +
  coord_flip() +
  ggtitle('Titanic survival by class and sex')

4. Long-format data

# Convert to long format with to_lodes
UCB_lodes <- to_lodes(as.data.frame(UCBAdmissions), axes = 1:3)
head(UCB_lodes, n = 12)
##    Freq alluvium     x  stratum
## 1   512        1 Admit Admitted
## 2   313        2 Admit Rejected
## 3    89        3 Admit Admitted
## 4    19        4 Admit Rejected
## 5   353        5 Admit Admitted
## 6   207        6 Admit Rejected
## 7    17        7 Admit Admitted
## 8     8        8 Admit Rejected
## 9   120        9 Admit Admitted
## 10  205       10 Admit Rejected
## 11  202       11 Admit Admitted
## 12  391       12 Admit Rejected
# Check whether the data meet the format requirements
is_alluvial(UCB_lodes, logical = FALSE, silent = TRUE)

Key columns:

  • x: the main category, that is, each bar along the x axis

  • stratum: the group within each main category

  • alluvium: the index of each connecting band
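One way to read the three columns above: a band (alluvium) may pass through each bar (x) at most once. The check below is a simplified base-R sketch of that single condition, using the first two bands from the UCBAdmissions table; is_long_form is a hypothetical helper, not the package's is_alluvial.

```r
# Hypothetical helper: TRUE if no alluvium appears more than once at the same x
is_long_form <- function(data, x = "x", alluvium = "alluvium") {
  !any(duplicated(data[, c(alluvium, x)]))
}

# Two bands from the UCBAdmissions example, written out by hand
toy <- data.frame(
  x        = c("Admit", "Gender", "Dept", "Admit", "Gender", "Dept"),
  stratum  = c("Admitted", "Male", "A", "Rejected", "Male", "A"),
  alluvium = c(1, 1, 1, 2, 2, 2),
  Freq     = c(512, 512, 512, 313, 313, 313)
)

ok  <- is_long_form(toy)                   # each band visits each bar once
bad <- is_long_form(rbind(toy, toy[1, ]))  # band 1 now visits "Admit" twice

print(ok)
print(bad)
```

A duplicated alluvium/x pair usually means rows were stacked twice or the id column was built incorrectly, so a check like this is cheap to run before plotting.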

5. Alluvial diagrams with unequal heights

Using refugee data from several countries as an example, observe how the number of refugees from each country changes over time

data(Refugees, package = 'alluvial')
country_regions <- c(
  Afghanistan = 'Middle East',
  Burundi = 'Central Africa',
  `Congo DRC` = 'Central Africa',
  Iraq = 'Middle East',
  Myanmar = 'Southeast Asia',
  Palestine = 'Middle East',
  Somalia = 'Horn of Africa',
  Sudan = 'Central Africa',
  Syria = 'Middle East',
  Vietnam = 'Southeast Asia'
)
Refugees$region <- country_regions[Refugees$country]
ggplot(data = Refugees,
       aes(x = year, weight = refugees, alluvium = country)) +
  geom_alluvium(aes(fill = country, colour = country),
                alpha = .75, decreasing = FALSE) +
  scale_x_continuous(breaks = seq(2003, 2013, 2)) +
  theme(axis.text.x = element_text(angle = -30, hjust = 0)) +
  scale_fill_brewer(type = 'qual', palette = 'Set3') +
  scale_color_brewer(type = 'qual', palette = 'Set3') +
  facet_wrap(~ region, scales = 'fixed') +
  ggtitle('refugee volume by country and region of origin')
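The region lookup above indexes a named vector by the country column. If that column is stored as a factor (an assumption; it depends on how the dataset was built), R indexes by the factor's integer codes rather than by its labels, so the result is only right when the lookup vector is in the same order as the factor levels. A toy base-R sketch of the difference, with illustrative values taken from the mapping above:

```r
# Toy lookup table and data (a subset of the mapping, for illustration only)
country_regions <- c(
  Afghanistan = "Middle East",
  Burundi     = "Central Africa",
  Vietnam     = "Southeast Asia"
)
country <- factor(c("Vietnam", "Afghanistan", "Vietnam"))

# Indexing by a factor uses its integer codes (here 2, 1, 2), not its labels
by_code <- unname(country_regions[country])

# Indexing by character matches on the names of the lookup vector
by_name <- unname(country_regions[as.character(country)])

print(by_code)
print(by_name)
```

Wrapping the index in as.character() is a defensive choice that gives the same answer whether the column is a factor or a character vector.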

6. Equal heights with changing composition

Changes in the curricula that students follow across semesters

data(majors)
majors$curriculum <- as.factor(majors$curriculum)
ggplot(majors,
       aes(x = semester, stratum = curriculum, alluvium = student,
           fill = curriculum, label = curriculum)) +
  scale_fill_brewer(type = 'qual', palette = 'Set2') +
  geom_flow(stat = 'alluvium', lode.guidance = 'rightleft',
            color = 'darkgray') +
  geom_stratum() +
  theme(legend.position = 'bottom') +
  ggtitle('student curricula across several semesters')

7. Status changes over time

data(vaccinations)
levels(vaccinations$response) <- rev(levels(vaccinations$response))
ggplot(vaccinations,
       aes(x = survey, stratum = response, alluvium = subject,
           weight = freq,
           fill = response, label = response)) +
  geom_flow() +
  geom_stratum(alpha = .5) +
  geom_text(stat = 'stratum', size = 3) +
  theme(legend.position = 'none') +
  ggtitle('vaccination survey responses at three points in time')

8. Hands-on: relative abundance at the phylum level

# Hands-on 1. Abundance changes between groups
# Create test data
df = data.frame(
  Phylum = c('Ruminococcaceae', 'Bacteroidaceae', 'Eubacteriaceae', 'Lachnospiraceae', 'Porphyromonadaceae'),
  GroupA = c(37.7397, 31.34317, 222.08827, 5.08956, 3.7393),
  GroupB = c(113.2191, 94.02951, 66.26481, 15.26868, 11.2179),
  GroupC = c(123.2191, 94.02951, 46.26481, 35.26868, 1.2179),
  GroupD = c(37.7397, 31.34317, 222.08827, 5.08956, 3.7393))
# Convert the data to long format
library(reshape2)
melt_df = melt(df, id.vars = 'Phylum')
# Plot which taxa correspond to which groups; somewhat like a circos plot
ggplot(data = melt_df,
       aes(axis1 = Phylum, axis2 = variable,
           weight = value)) +
  scale_x_discrete(limits = c('Phylum', 'variable'), expand = c(.1, .05)) +
  geom_alluvium(aes(fill = Phylum)) +
  geom_stratum() + geom_text(stat = 'stratum', label.strata = TRUE) +
  theme_minimal() +
  ggtitle('Phylum abundance in each group')

Taxa mapped to groups; somewhat like a circos plot

# Abundance changes among groups
ggplot(data = melt_df,
       aes(x = variable, weight = value, alluvium = Phylum)) +
  geom_alluvium(aes(fill = Phylum, colour = Phylum),
                alpha = .75, decreasing = FALSE) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = -30, hjust = 0)) +
  ggtitle('Phylum change among groups')

Abundance changes among groups; the effect is even better when the groups are time points
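The test values above add up to about 300 in each group rather than 100, so they are not yet percentages. If relative abundance is what should be plotted, one option is to rescale each group before reshaping. The sketch below does this in base R and builds the same Phylum/variable/value layout that melt produces; the object names pct and long_pct are illustrative.

```r
df <- data.frame(
  Phylum = c("Ruminococcaceae", "Bacteroidaceae", "Eubacteriaceae",
             "Lachnospiraceae", "Porphyromonadaceae"),
  GroupA = c(37.7397, 31.34317, 222.08827, 5.08956, 3.7393),
  GroupB = c(113.2191, 94.02951, 66.26481, 15.26868, 11.2179),
  GroupC = c(123.2191, 94.02951, 46.26481, 35.26868, 1.2179),
  GroupD = c(37.7397, 31.34317, 222.08827, 5.08956, 3.7393)
)
group_cols <- setdiff(names(df), "Phylum")

# Rescale every group column so that it sums to 100 (percent)
pct <- df
pct[group_cols] <- lapply(df[group_cols], function(v) 100 * v / sum(v))

# Long format with the same columns as the melted table: Phylum, variable, value
long_pct <- data.frame(
  Phylum   = rep(pct$Phylum, times = length(group_cols)),
  variable = factor(rep(group_cols, each = nrow(pct)), levels = group_cols),
  value    = unlist(pct[group_cols], use.names = FALSE)
)

print(colSums(pct[group_cols]))
head(long_pct)
```

long_pct can then be passed to the plotting code in place of melt_df, in which case every bar has the same total height and only the composition changes.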

Reference

# How to cite
citation('ggalluvial')

Jason Cory Brunson (2017). ggalluvial: Alluvial Diagrams in ‘ggplot2’. R package version 0.5.0.
 https://CRAN.R-project.org/package=ggalluvial

https://en./wiki/Alluvial_diagram

ggalluvial package source: http://corybrunson./ggalluvial/index.html

Official example, Alluvial Diagrams in ggplot2: https://cran./web/packages/ggalluvial/vignettes/ggalluvial.html
