MolecularExpression_and_R_program

How to Efficiently Select Common Genes from Two Data Frames

by bioExplorer 2023. 6. 27.

Unlock the power of bioinformatics with this guide on how to efficiently select common genes from two data frames. It provides step-by-step instructions in R to help researchers and biologists streamline their work by identifying the genes shared between different data sets. It is suitable for beginners and experienced practitioners alike.

R script for choosing common genes from two data frames

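# Set the working directory (replace with your own path)
cur_dir <- "E:/your_directory"
setwd(cur_dir)
# Load both tab-delimited files, keeping strings as character vectors
data <- read.delim(file = "data_file.txt", sep = "\t", as.is = TRUE, stringsAsFactors = FALSE)
data1 <- read.delim(file = "data1_file.txt", sep = "\t", as.is = TRUE, stringsAsFactors = FALSE)
# Data processing: keep only the first row for each gene id
data <- data[!duplicated(data$id), ]
data1 <- data1[!duplicated(data1$id), ]
# Use the gene ids as row names, then drop the id column (the first column)
rownames(data) <- data$id
rownames(data1) <- data1$id
# drop = FALSE keeps a data frame even if only one column remains
data <- data[, -1, drop = FALSE]
data1 <- data1[, -1, drop = FALSE]
# Keep only the genes present in both data frames, in the same row order
common_genes <- intersect(rownames(data), rownames(data1))
data_common <- data[common_genes, , drop = FALSE]
data1_common <- data1[common_genes, , drop = FALSE]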
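
To try the same steps without any input files, here is a minimal sketch that runs them on two small in-memory data frames. The gene ids, sample names, and values are made up for illustration and do not come from a real data set.

```r
# Hypothetical toy data standing in for the two input files.
# TP53 appears twice in the first data frame on purpose.
data <- data.frame(id = c("TP53", "BRCA1", "TP53", "EGFR"),
                   sample1 = c(5.1, 3.2, 4.8, 7.0),
                   stringsAsFactors = FALSE)
data1 <- data.frame(id = c("EGFR", "MYC", "TP53"),
                    sampleA = c(6.5, 2.2, 5.0),
                    stringsAsFactors = FALSE)

# Keep the first row for each gene id
data <- data[!duplicated(data$id), ]
data1 <- data1[!duplicated(data1$id), ]

# Move the ids into the row names and drop the id column
rownames(data) <- data$id
rownames(data1) <- data1$id
data <- data[, -1, drop = FALSE]
data1 <- data1[, -1, drop = FALSE]

# Select the shared genes; both results get the same row order
common_genes <- intersect(rownames(data), rownames(data1))
data_common <- data[common_genes, , drop = FALSE]
data1_common <- data1[common_genes, , drop = FALSE]

print(common_genes)
print(data_common)
print(data1_common)
```

Because both data frames are indexed with the same common_genes vector, their rows line up gene by gene, which is convenient if you later want to combine them with cbind().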

Have you looked through the R script? It seems pretty easy, doesn't it? Still, let me give you a few caveats.

A few caveats

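I did not use the "row.names=1" option when loading the data, because data sets from RNA sequencing and microarrays often contain duplicated genes, and R does not allow duplicated row names. Instead, I loaded the id column as an ordinary column and used "!duplicated" to remove the duplicated genes. Note that this keeps only the first row for each gene and discards the later ones.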
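
The difference between the two approaches can be seen in a minimal sketch. It reads a small made-up table from a string instead of a file, so the text and its values are assumptions for illustration only.

```r
# Hypothetical tab-delimited text with a repeated gene id (TP53)
txt <- "id\tsample1\nTP53\t5.1\nBRCA1\t3.2\nTP53\t4.8\n"

# With row.names = 1 the load fails, because row names must be unique.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  read.delim(text = txt, row.names = 1),
  error = function(e) conditionMessage(e)
)
print(res)

# Loading the id as an ordinary column and filtering afterwards works.
dat <- read.delim(text = txt, stringsAsFactors = FALSE)
dat <- dat[!duplicated(dat$id), ]
print(dat)
```

If the first occurrence is not the row you want to keep (for example, if you would rather keep the row with the highest expression), sort or aggregate the data frame before applying "!duplicated".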
