This guide shows how to efficiently select the genes common to two data frames in R. It gives step-by-step instructions that help researchers and biologists streamline their work by identifying the genes shared between two data sets. It is suitable for beginners and experienced practitioners alike.
R script for choosing common genes from two data frames
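# Set the working directory (replace with your own path)
cur_dir <- "E:/your_directory"
setwd(cur_dir)
# Load the two tab-delimited tables; the gene IDs are expected in the first column, named "id"
data <- read.delim(file = "data_file.txt", sep = "\t", stringsAsFactors = FALSE)
data1 <- read.delim(file = "data1_file.txt", sep = "\t", stringsAsFactors = FALSE)
# Data processing: keep only the first row for each gene ID
data <- data[!duplicated(data$id), ]
data1 <- data1[!duplicated(data1$id), ]
# Use the gene IDs as row names, then drop the id column
# (drop = FALSE keeps a data frame even if only one column remains)
rownames(data) <- data$id
rownames(data1) <- data1$id
data <- data[, -1, drop = FALSE]
data1 <- data1[, -1, drop = FALSE]
# Genes present in both data frames, and the matching rows of each
common_genes <- intersect(rownames(data), rownames(data1))
data_common <- data[common_genes, , drop = FALSE]
data1_common <- data1[common_genes, , drop = FALSE]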
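To see what the script produces without needing input files, here is a minimal sketch that runs the same steps on two small data frames built in memory. The gene IDs, the sample column names (s1, s2, t1) and the values are made up for illustration; only the "id" column name comes from the script above.

```r
# Toy data: hypothetical gene IDs and expression values, for illustration only.
# TP53 appears twice in the first data frame to mimic a duplicated gene.
data <- data.frame(id = c("TP53", "BRCA1", "EGFR", "TP53"),
                   s1 = c(1.2, 3.4, 5.6, 7.8),
                   s2 = c(2.1, 4.3, 6.5, 8.7),
                   stringsAsFactors = FALSE)
data1 <- data.frame(id = c("EGFR", "MYC", "TP53"),
                    t1 = c(10, 20, 30),
                    stringsAsFactors = FALSE)

# Keep the first row of each gene, then move the IDs into the row names
data <- data[!duplicated(data$id), ]
data1 <- data1[!duplicated(data1$id), ]
rownames(data) <- data$id
rownames(data1) <- data1$id
data <- data[, -1, drop = FALSE]
data1 <- data1[, -1, drop = FALSE]

# Genes found in both data frames; the order follows the first data frame
common_genes <- intersect(rownames(data), rownames(data1))
data_common <- data[common_genes, , drop = FALSE]
data1_common <- data1[common_genes, , drop = FALSE]

print(common_genes)
print(data_common)
print(data1_common)
```

Because both results are indexed with the same common_genes vector, their rows come out in the same gene order, so they can be compared or combined with cbind() directly.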
Take a look at the R script above. Doesn't it seem pretty easy? Let me give you a few caveats.
A few caveats
I did not use the "row.names=1" option when loading the data, because datasets from RNA sequencing and microarrays can contain duplicated gene IDs, and R does not allow duplicate row names. Instead, I used "!duplicated(...)", which keeps the first row for each gene and removes the later duplicates.
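The caveat can be checked with a short sketch. The tab-separated text below is a made-up input with one repeated gene ID; it is read from a string through the "text" argument instead of from a file, so the example is self-contained.

```r
# Hypothetical tab-separated input in which TP53 appears twice
txt <- "id\ts1\ts2\nTP53\t1.2\t2.1\nBRCA1\t3.4\t4.3\nTP53\t7.8\t8.7\n"

# With row.names = 1 the load fails, because row names must be unique.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  read.delim(text = txt, row.names = 1),
  error = function(e) conditionMessage(e)
)
print(res)

# Loading without row.names works. duplicated() marks every repeat
# after the first occurrence, so !duplicated() keeps the first row per gene.
df <- read.delim(text = txt, stringsAsFactors = FALSE)
dup_flags <- duplicated(df$id)
print(dup_flags)

df <- df[!dup_flags, ]
rownames(df) <- df$id
print(df)
```

Note that this approach simply discards the later rows of a duplicated gene. If those rows carry different values (for example, several probes for one gene), keeping the first one is a simplification, and another rule, such as averaging the rows, may suit your data better.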