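This post goes back to the Cancer Cell paper we covered earlier. Readers were curious about some of its analyses that I had not mentioned before, so here is one of them. The paper describes the result like this: "Notably, based on RNA-seq ontology graphic user environment (ROGUE)......our clusters demonstrated high internal homogeneity......". It refers to a method called ROGUE, which assesses how homogeneous a group of cells is, in other words the purity of a cell population.

(reference: Cross-tissue human fibroblast atlas reveals myofibroblast subtypes with distinct roles in immune modulation)

I looked it up, and the package has been around for a while: it came out in 2020 from Academician Zemin Zhang's team, who have contributed a lot of methods to the single-cell field. The package paper was published in Nature Communications: An entropy-based metric for assessing the purity of single cell populations. The package accurately quantifies the purity of cell clusters identified in single-cell RNA sequencing (scRNA-seq) data. The authors demonstrate that ROGUE is broadly applicable and gives accurate, sensitive, and robust purity estimates for clusters in a range of simulated and real datasets. They also show that ROGUE can identify additional cell subtypes and helps detect precise biological signals in specific subpopulations.

That says a lot and nothing at the same time: what is it actually good for? Rather than repeat the official wording, here is my own understanding; corrections are welcome. There are two main applications:

- First: the ROGUE value can be used to identify cell subtypes of high purity. Most people run into this problem: after extracting a major cell type for sub-clustering, how many clusters is the right number? Many people just leave it to fate. The ROGUE value gives us a reference for that decision. The higher the ROGUE value, the closer it is to 1, the purer the cell population. A low value means the population is heterogeneous and can be subdivided further, which lets us separate out additional subclusters.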
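- Second: ROGUE can be used to assess the impact of batch effects. This mainly applies when integrating multiple datasets or data from different sources. For example, we have seen a paper that integrated data from several public databases. Suppose a reviewer asks how you know the batch effect between datasets has been removed: a UMAP plot on its own may not convince them. You can then bring in the ROGUE algorithm and compute the purity of each cell type within each source or sample. If ROGUE is high, the cell population is pure and the batch effect is weak!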
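To make the first use case concrete, here is a minimal sketch of how a table of ROGUE values could be turned into a shortlist of clusters worth splitting further. Everything in it is an assumption for illustration: the numbers are made up, the table layout (rows are samples, columns are clusters) is simply the shape I expect from a per-sample ROGUE run, and the cutoff `purity_cutoff` is a hypothetical value, not a recommendation from the ROGUE authors.

```r
# Toy ROGUE table: rows = samples, columns = clusters (values are made up).
rogue_toy <- data.frame(
  C0 = c(0.91, 0.88, 0.93),
  C1 = c(0.62, 0.70, NA),   # NA: no value available for that sample
  C2 = c(0.85, 0.79, 0.83)
)

# Hypothetical cutoff; choose one that suits your own data.
purity_cutoff <- 0.8

# Median ROGUE per cluster, ignoring missing values.
cluster_median <- sapply(rogue_toy, median, na.rm = TRUE)

# Clusters whose median falls below the cutoff are candidates for sub-clustering.
to_split <- names(cluster_median)[cluster_median < purity_cutoff]

print(round(cluster_median, 2))
print(to_split)
```

The median is used here only because it is robust to a single odd sample; a mean or a minimum would work just as well for a quick look.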
# GitHub link: https:///PaulingLiu/ROGUE
# Tutorial link: https://htmlpreview.github.io/?https:///PaulingLiu/ROGUE/blob/master/vignettes/ROGUE_Tutorials.html
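# install.packages("tidyverse")  # a dependency; install it first if you do not have it
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("PaulingLiu/ROGUE")
# If the download or installation fails, you can also install from a local copy. Download link for the package:
# https://codeload.github.com/PaulingLiu/ROGUE/legacy.tar.gz/HEAD

Load the data (our demo data is an initial sub-clustering of Epi cells), extract the count matrix and the metadata, then filter out low-quality cells and lowly expressed genes.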
setwd('D:\\KS項目\\公眾號文章\\ROUGE單細胞純度分析')
# Install the package
# if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
# devtools::install_github("PaulingLiu/ROGUE")

library(Seurat)
library(ROGUE)
library(tidyverse)
library(ggplot2)
library(ggrastr)
#-------------------------------------------------------------------------------
# Epi is a Seurat object that must already be loaded in the session
expr <- GetAssayData(Epi, assay = 'RNA', layer = 'counts') %>% as.matrix()
meta <- Epi@meta.data

#filter genes and cells
expr <- matr.filter(expr, min.cells = 10, min.genes = 10)
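For readers wondering what the filtering step does, here is a small base R sketch on a toy matrix. It rests on my assumption about what the two arguments mean (keep genes detected in at least `min.cells` cells, and cells with at least `min.genes` detected genes); `filter_sketch` is a hypothetical helper written for this post, not the package's own code.

```r
# Toy count matrix: 4 genes (rows) x 5 cells (columns); values are made up.
toy <- matrix(
  c(5, 0, 3, 2, 0,
    0, 0, 1, 0, 0,
    2, 4, 0, 1, 0,
    1, 1, 2, 3, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:5))
)

# Hypothetical helper mimicking the assumed behaviour of matr.filter().
filter_sketch <- function(expr, min.cells, min.genes) {
  cells_per_gene <- rowSums(expr > 0)  # number of cells in which each gene is detected
  genes_per_cell <- colSums(expr > 0)  # number of genes detected in each cell
  expr[cells_per_gene >= min.cells, genes_per_cell >= min.genes, drop = FALSE]
}

filtered <- filter_sketch(toy, min.cells = 2, min.genes = 2)
print(dim(filtered))
print(rownames(filtered))
```

In this toy case gene2 (detected in one cell) and cell5 (no detected genes) are dropped. With real data the thresholds of 10 or 15 used in this post play the same role.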
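ent.res <- SE_fun(expr)
SEplot(ent.res)

ROGUE calculation. This one is computed over the whole Epi population. The final value is about 0.34, which is low and means the Epi population is highly heterogeneous. That fits: Epi can be split into subclusters to begin with, and the Epi cells in this demo data come from both healthy donors and tumor patients, so the heterogeneity is naturally even greater.

rogue.value <- CalculateRogue(ent.res, platform = "UMI")
#[1] 0.339205

To get an accurate purity estimate for each cluster, compute the ROGUE value of each cell type within each sample, then visualize the result with a boxplot!

rogue.res <- rogue(expr, labels = meta$seurat_clusters, samples = meta$orig.ident, platform = "UMI", span = 0.6)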
# This palette comes from the Cancer Cell paper; worth saving
myColor <- c(
  "#E41B1B", "#4376AC", "#48A75A", "#87638F", "#D87F32", "#737690", "#D690C6",
  "#B17A7D", "#847A74", "#4285BF", "#204B75", "#588257", "#B6DB7B", "#E3BC06",
  "#FA9B93", "#E9358B", "#A0094E", "#999999", "#6FCDDC", "#BD5E95"
)
# Convert the wide table to long format for ggplot
plotData <- rogue.res %>%
  tidyr::gather(key = clusters, value = ROGUE) %>%
  filter(!is.na(ROGUE))
# Boxplot with jittered points
ggplot(data = plotData, aes(clusters, ROGUE, color = clusters)) +
  geom_boxplot(outlier.shape = NA) + # add the boxes
  geom_jitter_rast(shape = 16, position = position_jitter(0.2)) + # add jittered points
  scale_color_manual(values = myColor) +
  theme_classic() +
  theme(
    axis.text = element_text(size = 12, colour = "black"),
    axis.title = element_text(size = 13, colour = "black")
  ) +
  labs(x = "", y = "ROGUE index") +
  ylim(0, 1)
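For some intuition about why an entropy-style statistic tracks purity, here is a toy sketch in base R. It is my own simplification, not the package's implementation: I am assuming the per-gene statistic is the mean of log(count + 1) across cells, that the index has the form 1 - sum(ds) / (sum(ds) + K), and that K = 45 is the default for UMI data. Please check the paper and the package source before relying on any of these. The names `expr_entropy` and `rogue_sketch` are hypothetical helpers.

```r
# Assumed per-gene statistic: mean of log(count + 1) across cells.
expr_entropy <- function(x) mean(log(x + 1))

# Two made-up genes with the same mean count (4) across six cells.
gene_uniform <- c(4, 4, 4, 4, 4, 4)  # expressed evenly in all cells
gene_patchy  <- c(0, 0, 0, 8, 8, 8)  # switched on in only half of the cells

s_uniform <- expr_entropy(gene_uniform)
s_patchy  <- expr_entropy(gene_patchy)

# By Jensen's inequality the statistic cannot exceed log(mean + 1),
# and it reaches that bound only when every cell has the same count.
s_max <- log(mean(gene_uniform) + 1)

# Gap between the bound and the observed value. The package, as far as I can
# tell, takes the expected value from a curve fitted over all genes instead.
ds_uniform <- s_max - s_uniform  # essentially 0 for the evenly expressed gene
ds_patchy  <- s_max - s_patchy   # positive for the patchy gene

# Assumed form of the final index (K = 45 is an assumption for UMI data).
rogue_sketch <- function(ds_sig, k = 45) 1 - sum(ds_sig) / (sum(ds_sig) + k)

print(rogue_sketch(ds_uniform))            # no heterogeneous genes: index stays at 1
print(rogue_sketch(rep(ds_patchy, 100)))   # many patchy genes pull the index down
```

The point is only the direction of the effect: genes that are expressed unevenly within a population lose entropy relative to what their mean expression would predict, and the more such genes there are, the lower the index.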
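Above we used a sub-clustering dataset. Some of the examples may not be a perfect fit, but they still work as a reference. We also demonstrate a second dataset, this time about batch effects. It is a large dataset that merges single-cell data of the same tissue from several databases and sources, and here we use ROGUE to check it. When merging public data, the biggest worry is that batch effects or differences between datasets end up producing wrong results. The analysis is the same as before and very simple. For this dataset we computed cluster purity per sample and per database, and found that the ROGUE values were decent, which suggests the batch effect is small.

library(Seurat)
library(ROGUE)
library(tidyverse)
library(ggplot2)
library(ggrastr)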
# Rce <- subset(sce, sequencing=='scRNA_seq') # single-cell Seurat object
#-------------------------------------------------------------------------------
expr <- GetAssayData(Rce, assay = 'RNA', layer = 'counts') %>% as.matrix()
meta <- Rce@meta.data

#filter genes and cells
expr <- matr.filter(expr, min.cells = 15, min.genes = 15)
#-------------------------------------------------------------------------------

ent.res <- SE_fun(expr)
SEplot(ent.res)

#-------------------------------------------------------------------------------
#ROGUE calculation
rogue.value <- CalculateRogue(ent.res, platform = "UMI")
rogue.res.sample <- rogue(expr, labels = meta$celltype, samples = meta$orig.ident, platform = "UMI", span = 0.6)
rogue.res.database <- rogue(expr, labels = meta$celltype, samples = meta$database, platform = "UMI", span = 0.6)

write.csv(rogue.res.sample, file = 'rogue.res.sample.csv')
write.csv(rogue.res.database, file = 'rogue.res.database.csv')
rogue_sample <- read.csv('rogue.res.sample.csv', header = TRUE, row.names = 1)
myColor <- c(
  "#E41B1B", "#4376AC", "#48A75A", "#87638F", "#D87F32", "#737690", "#D690C6",
  "#B17A7D", "#847A74", "#4285BF", "#204B75", "#588257", "#B6DB7B", "#E3BC06",
  "#FA9B93", "#E9358B", "#A0094E", "#999999", "#6FCDDC", "#BD5E95"
)
plot_rogue_sample <- rogue_sample %>%
  tidyr::gather(key = clusters, value = ROGUE) %>%
  filter(!is.na(ROGUE))
ggplot(data = plot_rogue_sample, aes(clusters, ROGUE, color = clusters)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter_rast(shape = 16, position = position_jitter(0.2)) +
  scale_color_manual(values = myColor) +
  theme_classic() +
  theme(
    axis.text = element_text(size = 12, colour = "black"),
    axis.title = element_text(size = 13, colour = "black")
  ) +
  labs(x = "", y = "ROGUE index") +
  ylim(0, 1)

rogue_database <- read.csv('rogue.res.database.csv', header = TRUE, row.names = 1)
plot_rogue_database <- rogue_database %>%
  tidyr::gather(key = clusters, value = ROGUE) %>%
  filter(!is.na(ROGUE))
ggplot(data = plot_rogue_database, aes(clusters, ROGUE, color = clusters)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter_rast(shape = 16, position = position_jitter(0.2)) +
  scale_color_manual(values = myColor) +
  theme_classic() +
  theme(
    axis.text = element_text(size = 12, colour = "black"),
    axis.title = element_text(size = 13, colour = "black")
  ) +
  labs(x = "", y = "ROGUE index") +
  ylim(0, 1)
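Besides looking at the two boxplots, it can help to put the per-sample and per-database results side by side as numbers. The sketch below is one possible way to do that in base R, under my own assumptions: the two toy tables stand in for the CSV files written above (rows are samples or databases, columns are cell types), all values and the names `DB_A`, `DB_B`, `Tcell`, and `Fibro` are made up, and the "drop" column is just a convenience I introduce here, not a statistic defined by ROGUE.

```r
# Made-up ROGUE tables: rows = grouping units, columns = cell types.
by_sample <- data.frame(
  Tcell = c(0.90, 0.92, 0.89, 0.91),
  Fibro = c(0.84, 0.80, 0.86, 0.82)
)
by_database <- data.frame(
  Tcell = c(0.89, 0.90),
  Fibro = c(0.70, 0.81),
  row.names = c("DB_A", "DB_B")  # hypothetical database names
)

# Median ROGUE per cell type, ignoring missing values.
col_median <- function(df) sapply(df, median, na.rm = TRUE)

celltypes <- names(by_sample)
summary_tab <- data.frame(
  celltype        = celltypes,
  median_sample   = unname(col_median(by_sample)[celltypes]),
  median_database = unname(col_median(by_database)[celltypes])
)

# Positive drop: purity is lower once cells are pooled per database.
summary_tab$drop <- summary_tab$median_sample - summary_tab$median_database

# Database with the lowest ROGUE value for each cell type.
summary_tab$lowest_db <- unname(sapply(
  by_database[celltypes],
  function(v) rownames(by_database)[which.min(v)]
))

print(summary_tab)
```

A cell type whose purity is fine within individual samples but clearly lower within one database is a reasonable place to start looking for a residual batch or dataset difference.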