Why are there different numbers of gene_id and exon_id in a txdb database from UCSC?

Dbuch

Junior Member

Join Date: Apr 2012

Posts: 3
#1


04-20-2012, 11:09 AM

I'm trying to run DEXSeq but am having trouble with the newExonCountSet function because my gene_id list and exon_id list are different sizes. When I unlist the gene_id column from the mm10 UCSC database annotated with Ensembl Genes (ensGene), I get 351,862 gene_ids, even though the corresponding exon_id column contains only 350,630 IDs. I had a similar problem when building the database from the UCSC RefSeq Genes track (tablename = 'refGene'). Any thoughts on why there is a difference, and how I can get the gene_id and exon_id datasets to match up? Thanks!

Code:

library("GenomicRanges")
library("Rsamtools")
library("GenomicFeatures")
library("rtracklayer")

txdb <- makeTranscriptDbFromUCSC(genome="mm10", tablename = "ensGene")

#Getting rid of strand information
exonRangesList <- exons(txdb)
strand(exonRangesList) <- '*'
exonRangesNoStrand <- split(exonRangesList)

ExonIDs <- exons(txdb, vals=NULL, columns=c('gene_id', 'exon_id'))
GeneIDList <- elementMetadata(ExonIDs)
GeneIDVector <- GeneIDList$gene_id
ExonIDVector <- GeneIDList$exon_id
GeneIDVector2 <-unlist(GeneIDVector)
Dbuch

#2

05-09-2012, 07:32 AM

Update: When I build the txdb with the code below (pulling the data from Ensembl instead of UCSC), I get equal numbers of gene_id and exon_id.

Code:

txdb <- makeTranscriptDbFromBiomart(biomart="ensembl", dataset="mmusculus_gene_ensembl")
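One possible explanation for the mismatch, not confirmed anywhere in this thread: the gene_id column is list-like, with one entry per exon, and an exon can be attached to more than one gene (or to none). Unlisting it then gives one element per exon-gene pair rather than one per exon. The sketch below uses plain base R with made-up IDs (a toy stand-in, not the real mm10 data or the GenomicFeatures objects) to show the effect and one way to expand exon_id so the two vectors line up.

```r
# Toy stand-in for the gene_id column: a list with one character vector
# per exon. All IDs here are hypothetical.
exon_id <- 1:5
gene_id <- list(
  "geneA",
  c("geneA", "geneB"),  # exon shared by two genes
  "geneB",
  character(0),         # exon with no gene assigned
  c("geneC", "geneD")   # another shared exon
)

# Number of genes attached to each exon
n_genes <- vapply(gene_id, length, integer(1))

# unlist() flattens the list, so its length is the number of
# exon-gene pairs, not the number of exons
flat_gene_id <- unlist(gene_id)
cat("exons:", length(exon_id), " unlisted gene_ids:", length(flat_gene_id), "\n")

# Repeat each exon_id once per attached gene so both vectors have
# the same length and stay aligned
flat_exon_id <- rep(exon_id, times = n_genes)
exon_gene <- data.frame(exon_id = flat_exon_id,
                        gene_id = flat_gene_id,
                        stringsAsFactors = FALSE)
print(exon_gene)
```

If this is what is happening in the UCSC-derived database, checking the per-exon lengths of the real gene_id column would show some values other than 1. Note that exons with no gene drop out of the expanded table, so whether to keep them is a separate decision.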