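One of R's great strengths is drawing polished statistical graphics, and the ggplot2 package was built specifically for plotting.
This post is a quick tour of ggplot2's basic syntax.

Layers
First, the plotting logic. Much like Photoshop, ggplot2 builds a figure by stacking layers. Layers are joined with '+', and together they make up the final plot.
library(ggplot2)
ggplot(data = mtcars, aes(x = wt, y = mpg))
Using mtcars as an example, ggplot() first creates the base layer, with wt on the x axis and mpg on the y axis.
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()
Adding geom_point() to the base layer with '+' gives a simple scatter plot. More layers can be added in the same way to build more complex figures.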
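Because each '+' returns a new plot object, the layering described above can also be done step by step. The following is a minimal sketch under that assumption. The object names (base, with_points, with_trend) are illustrative only, and layer_data() is the ggplot2 helper that returns the data a given layer will draw.

```r
library(ggplot2)

# Base layer: data and axis mappings only, nothing is drawn yet
base <- ggplot(data = mtcars, aes(x = wt, y = mpg))

# Each '+' returns a new plot object with one more layer on top
with_points <- base + geom_point()
with_trend  <- with_points + geom_smooth(method = "lm", formula = y ~ x)

# Layer 1 (points) holds one row per car in mtcars
n_points <- nrow(layer_data(with_trend, 1))

# Layer 2 (fitted line) holds the grid of values computed by the smoother
n_fitted <- nrow(layer_data(with_trend, 2))
```

Printing with_trend (or calling print(with_trend) in a script) draws both layers together. The intermediate objects stay usable, so several variants can share the same base.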
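
Mapping
Mapping here means visual-channel mapping: put simply, a variable in the data is tied to one component of the plot, and the data are then displayed in that form.
It is done with aes(). Besides the basic x and y mappings, other aesthetics include color, fill, alpha, size, shape, linetype and more.
ggplot(data = mtcars, aes(x = wt, y = mpg, color = as.factor(am))) +
  geom_point()
With x and y unchanged, adding a color mapping and passing a factor gives a scatter plot in two colors, one per class.
ggplot(data = mtcars, aes(x = wt, y = mpg, shape = as.factor(cyl))) +
  geom_point()
With x and y unchanged, adding a shape mapping and passing the three-level cyl variable gives a scatter plot with three point shapes.
ggplot(data = mtcars, aes(x = wt, y = mpg, size = qsec, alpha = hp)) +
  geom_point()
Two mappings that do not interfere with each other can also be combined, here transparency (alpha) and point size.
Only point aesthetics are shown here. Lines, areas and text have their own mappings, which will be introduced as they come up.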
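One way to check what a mapping actually produced is to look at the built layer data. The sketch below reuses the color and shape examples above and counts the distinct values ggplot2 assigned. The variable names are illustrative, and the column names colour and shape are those used in ggplot2's built data.

```r
library(ggplot2)

# Color mapped to a two-level factor (am: automatic / manual)
p_color <- ggplot(mtcars, aes(wt, mpg, color = as.factor(am))) +
  geom_point()

# Shape mapped to a three-level factor (cyl: 4 / 6 / 8)
p_shape <- ggplot(mtcars, aes(wt, mpg, shape = as.factor(cyl))) +
  geom_point()

# Count the distinct colors and shapes assigned to the points
n_colors <- length(unique(layer_data(p_color)$colour))
n_shapes <- length(unique(layer_data(p_shape)$shape))
```

Each level of the factor gets its own value, so the counts match the number of groups in am and cyl.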

Geoms and statistical transformations
The geom_ family of functions draws a specific type of statistical chart.
Chart          Function
Scatter plot   geom_point()
Line chart     geom_line()
Box plot       geom_boxplot()
Density plot   geom_density()
Bar chart      geom_bar()
Violin plot    geom_violin()
...            ...
library(gridExtra)
library(ggplot2)
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(economics, aes(date, unemploy)) + geom_line()
p3 <- ggplot(mpg, aes(class, hwy)) + geom_boxplot()
p4 <- ggplot(diamonds, aes(carat)) + geom_density()
p5 <- ggplot(mpg, aes(class)) + geom_bar()
p6 <- ggplot(mtcars, aes(mpg, factor(cyl))) + geom_violin()
grid.arrange(p1, p2, p3, p4, p5, p6, nrow = 2, ncol = 3)
These are examples of six common chart types.
Taking geom_point() as an example, here is a brief look at its parameters.
## ?geom_point
geom_point(mapping = NULL, data = NULL, stat = 'identity', position = 'identity', ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
mapping, inherit.aes = TRUE: mappings for this layer are specified with aes(); with inherit.aes = TRUE they are combined with the global ones.
stat: the statistical transformation; the default 'identity' applies none.
ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(shape = factor(cyl)), color = 'green', size = 3) +
  geom_smooth()
Inside geom_point() the mappings can be specified again (the global versus local relationship), and arguments can be passed to change the appearance of the points. A smooth curve, geom_smooth(), can also be laid over the points.
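The global versus local relationship can be made concrete with a small sketch. It assumes two variants of the plot above: one overrides a global color mapping with a local aes(), the other sets color to a constant outside aes(). The object names are illustrative.

```r
library(ggplot2)

# Global mapping: color follows cyl (three groups)
base <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl)))

# Local mapping inside the layer replaces the global one: color now follows am
p_local <- base + geom_point(aes(color = factor(am)))

# A constant given outside aes() is a fixed setting, not a mapping,
# and it takes precedence over the global color mapping for this layer
p_fixed <- base + geom_point(color = "green", size = 3)

local_colors <- unique(layer_data(p_local, 1)$colour)
fixed_colors <- unique(layer_data(p_fixed, 1)$colour)
```

In the first case the points take two colors (one per level of am). In the second every point is green and the layer no longer depends on cyl for its color.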
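Both geom_ (geometric objects) and stat_ (statistical transformations) can be used to add a layer, and the two produce similar results.
geom_ functions focus on drawing the shape and take a stat argument to choose the statistical method.
stat_ functions focus on the statistical transformation and take a geom argument to choose the type of drawing.
The two can be converted into each other, and the help pages document both forms together.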
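As a quick check of that equivalence, the sketch below builds the same bar chart once through geom_bar() and once through stat_count(), then compares the counts each layer computed. The variable names are illustrative.

```r
library(ggplot2)

# Same chart, approached from the geom side and from the stat side
p_geom <- ggplot(mpg, aes(class)) + geom_bar()    # stat defaults to "count"
p_stat <- ggplot(mpg, aes(class)) + stat_count()  # geom defaults to "bar"

# The count transformation produces one frequency per class in both cases
counts_geom <- layer_data(p_geom)$count
counts_stat <- layer_data(p_stat)$count
```

Both layers hold identical frequencies, and those frequencies add up to the number of rows in mpg.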
geom_bar(mapping = NULL, data = NULL, stat = 'count', position = 'stack', ..., width = NULL, na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE)
stat_count(mapping = NULL, data = NULL, geom = 'bar', position = 'stack', ..., width = NULL, na.rm = FALSE, orientation = NA, show.legend = NA, inherit.aes = TRUE)
geom_bar with stat = 'count' and stat_count with geom = 'bar' produce the same bar chart.
geom_bar draws bars and uses the count transformation to turn the data into frequencies.
stat_count applies the count transformation by default and uses the bar geom to draw the result.

Scales
Most of the plots so far were produced with default settings. The scale_ family of functions adjusts the details of a plot.

Axes
1. Axis ticks and labels
For continuous variables, the usual functions are scale_x_continuous and scale_y_continuous.
scale_x_continuous(name = waiver(), breaks = waiver(), minor_breaks = waiver(), n.breaks = NULL, labels = waiver(), limits = NULL, expand = waiver(), oob = censor, na.value = NA_real_, trans = 'identity', guide = waiver(), position = 'bottom', sec.axis = waiver())
name: changes the axis title; labs() achieves the same result.
library(gridExtra)
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + scale_x_continuous(name = 'AAA')
p2 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + labs(x = 'BBB')
grid.arrange(p1, p2, ncol = 2)
breaks: sets where the axis ticks fall; combined with labels, the tick labels can be renamed.
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_x_continuous(breaks = c(2, 4, 6), labels = c('two', 'four', 'six'))
limits: restricts the range of the axis, in the same way as the xlim() function.
library(gridExtra)
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + scale_x_continuous(name = 'AAA', limits = c(1, 7))  ## x axis limited to 1 to 7
p2 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + xlim(1, 8)  ## x axis limited to 1 to 8
grid.arrange(p1, p2, ncol = 2)
labels: controls how the tick labels are formatted.
library(gridExtra)
df <- data.frame(
  x = rnorm(10) * 100000,
  y = seq(0, 1, length.out = 10)
)
p1 <- ggplot(df, aes(y, x)) + geom_point() + scale_x_continuous(labels = scales::percent, name = 'percent')
p2 <- ggplot(df, aes(y, x)) + geom_point() + scale_x_continuous(labels = scales::dollar, name = 'dollar')
grid.arrange(p1, p2, ncol = 2)
scales::percent and scales::dollar format the x-axis tick labels as percentages and as US dollars respectively.
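The labels argument also accepts any function that takes the break values and returns one string per break, so a custom format can be written by hand. The sketch below assumes a hypothetical helper, label_thousands, that is not part of ggplot2 or scales, and a small made-up data frame used only for illustration.

```r
library(ggplot2)

# Hypothetical helper (not from any package): show values as "<n>k"
label_thousands <- function(x) paste0(x / 1000, "k")

# Illustrative data only
df <- data.frame(x = c(1000, 25000, 50000), y = c(1, 2, 3))

p <- ggplot(df, aes(x, y)) +
  geom_point() +
  scale_x_continuous(labels = label_thousands, name = "thousands")

# The helper can be tried on its own before handing it to the scale
example_labels <- label_thousands(c(1000, 25000))
```

Here example_labels holds "1k" and "25k". The scale calls the same function on whatever breaks it chooses for the axis.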
trans: the transformation applied to the axis. The built-in options are:
'asn', 'atanh', 'boxcox', 'date', 'exp', 'hms', 'identity', 'log', 'log10', 'log1p', 'log2', 'logit', 'modulus', 'probability', 'probit', 'pseudo_log', 'reciprocal', 'reverse', 'sqrt', 'time'
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + scale_x_continuous(name = 'None')
p2 <- ggplot(mtcars, aes(wt, mpg)) + geom_point() + scale_x_continuous(name = 'log2', trans = 'log2')
grid.arrange(p1, p2, ncol = 2)
position: sets where the axis is drawn, 'top' or 'bottom' for the x axis, 'left' or 'right' for the y axis.
For discrete data
scale_x_discrete() and scale_y_discrete() are used much like their continuous counterparts, and most arguments carry over.
ggplot(diamonds, aes(cut)) +
  geom_bar() +
  scale_x_discrete('Cut', labels = c('Fair' = 'F', 'Good' = 'G', 'Very Good' = 'VG', 'Perfect' = 'P', 'Ideal' = 'I'))  ## replaces the axis title and the tick labels
For date variables
The usual functions are scale_x_date() and scale_y_date().
scale_x_date(name = waiver(), breaks = waiver(), date_breaks = waiver(), labels = waiver(), date_labels = waiver(), minor_breaks = waiver(), date_minor_breaks = waiver(), limits = NULL, expand = waiver(), oob = censor, guide = waiver(), position = 'bottom', sec.axis = waiver())
library(gridExtra)
last_month <- Sys.Date() - 0:29  ## the 30 days up to and including today
df <- data.frame(
  date = last_month,
  price = runif(30)
)  ## one random value for each of the 30 dates
base <- ggplot(df, aes(date, price)) + geom_line()
p1 <- base + scale_x_date(date_labels = '%b %d') + labs(title = 'p1')
p2 <- base + scale_x_date(date_breaks = '1 week', date_labels = '%W') + labs(title = 'p2')
p3 <- base + scale_x_date(date_minor_breaks = '1 day') + labs(title = 'p3')
p4 <- base + scale_x_date(limits = c(Sys.Date() - 7, NA)) + labs(title = 'p4')
grid.arrange(p1, p2, p3, p4, ncol = 2)
In p1, date_labels sets the output format of the dates.
In p2, date_breaks sets the interval between major ticks.
In p3, date_minor_breaks sets the interval between minor ticks.
In p4, limits restricts the range of the x axis.

Plot titles
labs() changes or adds the text elements of a plot.
labs(..., title = waiver(), subtitle = waiver(), caption = waiver(), tag = waiver(), alt = waiver(), alt_insight = waiver())
p <- ggplot(mtcars, aes(mpg, wt, colour = cyl)) + geom_point()
p
p_re <- p + labs(x = 'XXX', y = 'YYY', title = 'title', subtitle = 'subtitle',
                 tag = 'tag', caption = 'caption', colour = 'colour', alt = 'This is alt')
p_re
##############
> get_alt_text(p_re)
[1] "This is alt"
Comparing the two plots shows where tag, title, subtitle and the other elements are placed.
The alt argument is a text description of the plot. It is not drawn in the figure and has to be retrieved with get_alt_text().

Colors
1. Color gradients
scale_colour_gradient(..., low = '#132B43', high = '#56B1F7', space = 'Lab', na.value = 'grey50', guide = 'colourbar', aesthetics = 'colour')
guide: the form of the legend, 'colourbar' for continuous values or 'legend' for discrete ones.
aesthetics: the aesthetic the scale applies to, 'fill' or 'colour'.
ggplot(mpg, aes(displ, hwy, color = hwy)) +
  geom_point() +
  scale_color_gradient(low = '#132B43', high = '#56B1F7', guide = 'colourbar')
This gives a scatter plot shaded along a gradient from '#132B43' to '#56B1F7'.
2. Using palette colors
scale_colour_brewer(..., type = 'seq', palette = 1, direction = 1, aesthetics = 'colour')
type: seq (sequential), div (diverging) or qual (qualitative).
palette: the palette to use, given either by name or by a number that indexes into the list of palettes (the order of that list is not obvious). You can also create your own palette.
library(RColorBrewer)
display.brewer.all()  ## shows the palettes available by default
direction: the direction of the color order, 1 for forward and -1 for reversed.
library(gridExtra)
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]
d  <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + labs(title = 'default')
d1 <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + scale_color_brewer(palette = 'BuGn') + labs(title = 'BuGn')
d2 <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + scale_color_brewer(palette = 'Set1') + labs(title = 'Set1')
d3 <- ggplot(dsamp, aes(carat, price)) + geom_point(aes(colour = clarity)) + scale_color_brewer(palette = 'PiYG') + labs(title = 'PiYG')
grid.arrange(d, d1, d2, d3, nrow = 2)

Adjusting other mapped aesthetics
scale_alpha(), scale_shape() and scale_size()
p1 <- ggplot(mpg, aes(displ, hwy)) + geom_point(aes(alpha = year))
p2 <- ggplot(mpg, aes(displ, hwy)) + geom_point(aes(alpha = year)) + scale_alpha(range = c(0.4, 0.8))
grid.arrange(p1, p2)
This restricts the mapped transparency to the range 0.4 to 0.8.
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
d  <- ggplot(dsmall, aes(carat, price)) + geom_point(aes(shape = cut))
d1 <- ggplot(dsmall, aes(carat, price)) + geom_point(aes(shape = cut)) + scale_shape(solid = FALSE)
grid.arrange(d, d1)
The solid argument switches between filled and hollow points.
p1 <- ggplot(mpg, aes(displ, hwy, size = hwy)) + geom_point()
p2 <- ggplot(mpg, aes(displ, hwy, size = hwy)) + geom_point() + scale_size(range = c(0, 10))
p3 <- ggplot(mpg, aes(displ, hwy, size = hwy)) + geom_point() + scale_size_binned()
grid.arrange(p1, p2, p3)
The range argument sets the minimum and maximum point size.
scale_size_binned() bins the legend, which makes it easier to read.

Coordinate systems
ggplot2 provides a number of functions for changing the coordinate system.
1. coord_cartesian
The default Cartesian coordinate system.
coord_cartesian(xlim = NULL, ylim = NULL, expand = TRUE, default = FALSE, clip = 'on')
expand: whether a small margin is added around the data; with FALSE the axes stop exactly at the limits.
p1 <- ggplot(mtcars, aes(disp, wt)) + geom_point() + geom_smooth()
p2 <- ggplot(mtcars, aes(disp, wt)) + geom_point() + geom_smooth() + coord_cartesian(expand = FALSE)
grid.arrange(p1, p2)
clip: whether drawing is clipped to the plot panel. With the default 'on', anything outside the panel is hidden; with 'off' it is shown.
p1 <- ggplot(mtcars, aes(disp, wt)) + geom_point() + geom_smooth() + coord_cartesian(expand = FALSE, clip = 'off') + labs(title = "clip='off'")
p2 <- ggplot(mtcars, aes(disp, wt)) + geom_point() + geom_smooth() + coord_cartesian(expand = FALSE) + labs(title = "clip='on'")
grid.arrange(p1, p2)
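The xlim and ylim arguments of coord_cartesian() are not demonstrated above, so here is a minimal sketch of how they appear to differ from the limits argument of a scale: the scale replaces out-of-range values with NA before anything is drawn or computed, while the coordinate system only zooms the view. The range 3 to 4 and the variable names are arbitrary choices for illustration.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Scale limits: values outside 3..4 are replaced by NA in the built data
p_scale <- base + scale_x_continuous(limits = c(3, 4))

# Coordinate limits: the view is zoomed, the data stay untouched
p_coord <- base + coord_cartesian(xlim = c(3, 4))

# Count the x values that were turned into NA in each case
dropped_by_scale <- sum(is.na(layer_data(p_scale)$x))
dropped_by_coord <- sum(is.na(layer_data(p_coord)$x))
```

The difference matters most for layers that compute something from the data, such as geom_smooth(): with scale limits the fit only sees the remaining points, while with coord_cartesian() it is computed from all of them.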
2. coord_fixed()
Fixes the ratio between the length of one unit on the y axis and one unit on the x axis.
coord_fixed(ratio = 1, xlim = NULL, ylim = NULL, expand = TRUE, clip = 'on')
ratio: the default is 1, meaning that one unit on the x axis has the same length as one unit on the y axis.
p1 <- ggplot(mtcars, aes(mpg, wt)) + geom_point() + labs(title = 'default')
p2 <- ggplot(mtcars, aes(mpg, wt)) + geom_point() + coord_fixed() + labs(title = 'ratio=1')
p3 <- ggplot(mtcars, aes(mpg, wt)) + geom_point() + coord_fixed(ratio = 5) + labs(title = 'ratio=5')
grid.arrange(p1, p2, p3)
3. coord_flip()
Swaps the x and y axes.
p1 <- ggplot(diamonds, aes(cut, price)) + geom_boxplot()
p2 <- ggplot(diamonds, aes(cut, price)) + geom_boxplot() + coord_flip()
grid.arrange(p1, p2)
4. coord_polar
Transforms the plot to polar coordinates.
coord_polar(theta = 'x', start = 0, direction = 1, clip = 'on')
pie <- ggplot(mtcars, aes(x = factor(1), fill = factor(cyl))) + geom_bar(width = 1)
p1 <- pie + coord_polar()
p2 <- pie + coord_polar(theta = 'y')
grid.arrange(p1, p2)
Converting a bar chart to polar coordinates and mapping the y values to the angle (theta = 'y') gives a pie chart.
direction: the drawing order, 1 for clockwise and -1 for counterclockwise.
A rose chart is drawn by converting a bar chart to polar coordinates.
p1 <- ggplot(mpg, aes(class, fill = model)) + geom_bar() + theme(legend.position = 'none')
p2 <- ggplot(mpg, aes(class, fill = model)) + geom_bar() + coord_polar() + theme(legend.position = 'none')
grid.arrange(p1, p2)

Themes
By default ggplot2 draws plots on a gray background. The built-in theme_ functions offer ready-made themes to choose from, and theme() lets you define your own.
library(patchwork)
mtcars2 <- within(mtcars, {
  vs <- factor(vs, labels = c('V-shaped', 'Straight'))
  am <- factor(am, labels = c('Automatic', 'Manual'))
  cyl <- factor(cyl)
  gear <- factor(gear)
})
p <- ggplot(mtcars2) + geom_point(aes(x = wt, y = mpg, colour = gear))
p1 <- p + theme_gray() + labs(title = 'theme_gray')
p2 <- p + theme_bw() + labs(title = 'theme_bw')
p3 <- p + theme_linedraw() + labs(title = 'theme_linedraw')
p4 <- p + theme_light() + labs(title = 'theme_light')
p5 <- p + theme_dark() + labs(title = 'theme_dark')
p6 <- p + theme_minimal() + labs(title = 'theme_minimal')
p7 <- p + theme_classic() + labs(title = 'theme_classic')
p8 <- p + theme_void() + labs(title = 'theme_void')
p9 <- p + theme_test() + labs(title = 'theme_test')
(p1 / p4 / p7) | (p2 / p5 / p8) | (p3 / p6 / p9)
These are nine preset themes.
theme(......)
The range of things that can be customized is too large to cover here, so it is left for a separate post.

Annotations
An annotation is a special layer that does not inherit the global settings. Use annotate() to annotate a plot.
annotate(geom, x = NULL, y = NULL, xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, xend = NULL, yend = NULL, ..., na.rm = FALSE)
The main argument is geom, which sets the type of annotation to add, such as text ('text') or a rectangle ('rect'). The remaining arguments depend on the geom.
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  annotate('text', x = 4, y = 20, label = 'text', size = 10, colour = 'green')
This simply adds the text 'text' at (4, 20); the text style can be customized as well.
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  annotate('rect', xmin = 3, xmax = 4.2, ymin = 12, ymax = 21, alpha = .2, fill = 'green')
This draws a rectangle with a given transparency and fill color.
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  annotate('segment', x = 2.5, xend = 4, y = 15, yend = 25, colour = 'blue', size = 1)  ## from ggplot2 3.4 on, line width is set with linewidth
'segment' draws a line segment between two given points.
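To see what "does not inherit the global settings" means in practice, the sketch below compares annotate() with an ordinary geom_text() layer given the same constants. It is an illustration under the assumption that the goal is a single label; the object names are made up.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# annotate() builds its own one-row data set from the values given
p_annotate <- base + annotate("text", x = 4, y = 20, label = "text")

# geom_text() inherits the plot data, so the same label is drawn
# once for every row of mtcars, all stacked at the same position
p_geom <- base + geom_text(x = 4, y = 20, label = "text")

n_annotate <- nrow(layer_data(p_annotate, 2))
n_geom     <- nrow(layer_data(p_geom, 2))
```

The annotate() layer holds a single row, while the geom_text() layer holds one row per car. The overplotted copies usually show up as heavier, jagged-looking text.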
Among the geom_ functions, geom_abline() (given a slope and an intercept), geom_hline() (horizontal line) and geom_vline() (vertical line) can also draw specified straight lines.

Legends
1. Continuous variables
guide_colourbar() or guide_colorbar()
guide_colorbar(title = waiver(), title.position = NULL, title.theme = NULL, title.hjust = NULL, title.vjust = NULL, label = TRUE, label.position = NULL, label.theme = NULL, label.hjust = NULL, label.vjust = NULL, barwidth = NULL, barheight = NULL, nbin = 300, raster = TRUE, frame.colour = NULL, frame.linewidth = 0.5, frame.linetype = 1, ticks = TRUE, ticks.colour = 'white', ticks.linewidth = 0.5, draw.ulim = TRUE, draw.llim = TRUE, direction = NULL, default.unit = 'line', reverse = FALSE, order = 0, available_aes = c('colour', 'color', 'fill'), ...)
barwidth, barheight: adjust the width and height of the continuous legend.
p <- ggplot(mtcars, aes(drat, mpg, fill = qsec)) + geom_point(shape = 21)  ## shape 21 has a fill, so the fill mapping is visible on the points
p1 <- p + guides(fill = guide_colourbar(title = 'title', label = FALSE,
                                        title.position = 'bottom', barwidth = 1,
                                        frame.colour = 'black', ticks = FALSE))
grid.arrange(p, p1)
The guide function has to be nested inside guides(), with the mapped aesthetic named explicitly.
2. Discrete variables
guide_legend(title = waiver(), title.position = NULL, title.theme = NULL, title.hjust = NULL, title.vjust = NULL, label = TRUE, label.position = NULL, label.theme = NULL, label.hjust = NULL, label.vjust = NULL, keywidth = NULL, keyheight = NULL, direction = NULL, default.unit = 'line', override.aes = list(), nrow = NULL, ncol = NULL, byrow = FALSE, reverse = FALSE, order = 0, ...)
keywidth, keyheight: the size of the key box around each legend symbol.
nrow, ncol: change how multiple legend entries are arranged.
The other arguments are similar to those for continuous variables.
p1 <- ggplot(mtcars, aes(drat, mpg, colour = factor(cyl))) + geom_point()
p2 <- ggplot(mtcars, aes(drat, mpg, colour = factor(cyl))) + geom_point() + guides(colour = guide_legend(title = 'title', keyheight = 2))
grid.arrange(p1, p2)

Faceting
Faceting draws several sub-plots according to grouping information in the data, so that high-dimensional data can be shown in a lower-dimensional form.
Two functions create facets: facet_grid() and facet_wrap().
1. facet_grid()
Lays out the facets as a grid, with variables defining the rows and columns.
facet_grid(rows = NULL, cols = NULL, scales = 'fixed', space = 'fixed', shrink = TRUE, labeller = 'label_value', as.table = TRUE, switch = NULL, drop = TRUE, margins = FALSE, facets = NULL)
rows, cols: the variables that define the row and column facets, given with vars() or with the shorthand formula row_variable ~ column_variable, where '.' stands for an empty side.
scales: whether the panels share axes; 'fixed' shares them and 'free' does not.
p <- ggplot(mpg, aes(cty, hwy)) + geom_point(size = 2, alpha = 0.4)
p1 <- p + facet_grid(rows = vars(fl))
p2 <- p + facet_grid(. ~ fl)
p3 <- p + facet_grid(vars(drv), vars(fl))
p4 <- p + facet_grid(drv ~ fl, scales = 'free')
grid.arrange(p1, p2, p3, p4, ncol = 2)
2. facet_wrap()
First creates one sub-plot per group, then arranges the sub-plots in order.
Shorthand formula: ~ variable1 + variable2
facet_wrap(facets, nrow = NULL, ncol = NULL, scales = 'fixed', shrink = TRUE, labeller = 'label_value', as.table = TRUE, switch = NULL, drop = TRUE, dir = 'h', strip.position = 'top')
p <- ggplot(mpg, aes(displ, hwy)) + geom_point()
p1 <- p + facet_wrap(vars(class))
p2 <- p + facet_wrap(vars(class), nrow = 4)
p3 <- p + facet_wrap(~ cyl + drv)
p4 <- p + facet_wrap(vars(cyl, drv), labeller = 'label_both')
grid.arrange(p1, p2, p3, p4, nrow = 2)
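A small sketch of how facet_wrap() decides how many panels to draw: one per group that actually occurs in the data. It counts the panels through the PANEL column of the built layer data; the variable names are illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# One panel per vehicle class
p_class <- p + facet_wrap(vars(class))

# One panel per combination of cyl and drv that occurs in the data
p_combo <- p + facet_wrap(~ cyl + drv)

n_panels_class <- length(unique(layer_data(p_class)$PANEL))
n_panels_combo <- length(unique(layer_data(p_combo)$PANEL))

# Number of cyl/drv combinations present in mpg, for comparison
n_combos <- nrow(unique(mpg[c("cyl", "drv")]))
```

Combinations of cyl and drv that never occur together get no panel, which is why facet_wrap() can be more compact than the full grid drawn by facet_grid().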
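That is all for this brief introduction to the basic syntax.
There are many more details in ggplot2, such as custom themes and the specifics of each geom_ function. I will write them up as they come up in practice, or look into them separately later.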