How does one convert gene symbols into Entrez Gene IDs so that the data can be used with GAGE?
Entrez ID for GAGE
-
See this thread (you will need to use the suggestion in post #3 in reverse): http://seqanswers.com/forums/showthread.php?t=9390
NCBI's e-Utilities may also help: http://www.ncbi.nlm.nih.gov/books/NBK179288/
-
The pathview package has a function, id2eg, which converts various types of gene IDs to Entrez Gene IDs for major research species. Check the help info:
library(pathview)
?id2eg
The gage package also has a dedicated vignette, "Gene set and data preparation"; see section 5, "gene or transcript ID conversion".
-
The id2eg function in the pathview package works only if an annotation package exists for the species, which is not the case for S. pombe.
If you just need your gene set data in Entrez Gene IDs, you can use the kegg.gsets function in the gage package:
> grep("pombe", korg[,2])
[1] 126
> korg[126,]
kegg.code scientific.name
"spo" "Schizosaccharomyces pombe"
common.name entrez.gnodes
"fission yeast" "0"
kegg.geneid ncbi.geneid
"SPAC144.03" "2542823"
> kg.spo = kegg.gsets(species = "spo", id.type = "entrez")
…
If you need to convert the gene IDs in your input data, you can follow the thread GenoMax referred to above and download the gene_info data file from the NCBI FTP site:
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz
Under a Unix/Linux shell, run:
gunzip gene_info.gz
# Match the tax_id column exactly; a bare '^4896' would also match IDs such as 48960.
awk -F'\t' '$1 == "4896"' gene_info > sp.gene_info.txt
Columns 2-6 are (Entrez) GeneID, Symbol, LocusTag, Synonyms and dbXrefs. Note that the S. pombe taxonomy ID is 4896.
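To make the gene_info route concrete, here is a minimal sketch of the lookup step, assuming the column layout described above (GeneID in column 2, Symbol in column 3, LocusTag in column 4). The function name symbols_to_entrez and the file my_gene_ids.txt are hypothetical, made up for illustration; they are not part of gage or of any NCBI tool.

```shell
# Hypothetical helper: map gene symbols or locus tags to Entrez Gene IDs
# using a species-filtered gene_info table (tab-separated).
# Usage: symbols_to_entrez GENE_INFO_FILE IDS_FILE
#   IDS_FILE holds one symbol or locus tag per line.
# Prints "input_id<TAB>entrez_id", with NA when no match is found.
symbols_to_entrez() {
  awk -F'\t' '
    # First file: build lookup tables from the gene_info columns.
    NR == FNR {
      if ($1 ~ /^#/) next                  # skip a header line if present
      if ($3 != "-") by_symbol[$3] = $2    # Symbol   -> GeneID
      if ($4 != "-") by_tag[$4]    = $2    # LocusTag -> GeneID
      next
    }
    # Second file: look each ID up, trying Symbol first, then LocusTag.
    {
      id = $1
      if (id in by_symbol)   print id "\t" by_symbol[id]
      else if (id in by_tag) print id "\t" by_tag[id]
      else                   print id "\tNA"
    }
  ' "$1" "$2"
}

# Example call (my_gene_ids.txt is an assumed input file):
# symbols_to_entrez sp.gene_info.txt my_gene_ids.txt > ids_with_entrez.txt
```

If a symbol appears on more than one row, the last row wins in this sketch, so it is worth checking for duplicates before relying on the result. Synonyms (column 5) are not searched here.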
Alternatively, you can use the Bioconductor biomaRt package to do the ID conversion.
-
Thanks bigmw! Could you be a bit clearer about where in the process I should apply these commands? I am not sure whether I need Entrez IDs or not. I am sure that if I want to use my Cufflinks data I have to convert the IDs, but is it the same if I want to do the analysis with DESeq2, for instance?
Also, part 3.2 starts with:
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
I need help finding the corresponding package for S. pombe to use instead of "TxDb.Hsapiens.UCSC.hg19.knownGene"!
Sorry, I am totally confused by all the IDs and libraries! I would appreciate it if you could give me some more help.