  • Entrez ID for GAGE

    How does one convert gene symbols into Entrez Gene IDs for using the data with GAGE?

  • #2
    See this thread (you will need to use the suggestion in post #3 in reverse): http://seqanswers.com/forums/showthread.php?t=9390

    NCBI's e-Utilities may also help: http://www.ncbi.nlm.nih.gov/books/NBK179288/



    • #3
      The pathview package has a function, id2eg, which converts various types of gene IDs to Entrez Gene IDs for major research species. Check the help info:
      library(pathview)
      ?id2eg

      Meanwhile, the gage package has a dedicated vignette on "Gene set and data preparation"; check section 5, "gene or transcript ID conversion".



      • #4
        Thanks. I work on S. pombe and I cannot find an annotation package for it on Bioconductor.
        What should I do? And what should I put for org?

        > gnames.eg=pathview::id2eg(gnames, category="symbol", org="????")



        • #5
          The id2eg function in the pathview package works only if an annotation package exists for the species, which is not the case for S. pombe.
          If you just need your gene set data in Entrez Gene IDs, you can use the kegg.gsets function in the gage package:
          > library(gage)
          > data(korg)
          > grep("pombe", korg[,2])
          [1] 126
          > korg[126,]
                            kegg.code             scientific.name
                                "spo" "Schizosaccharomyces pombe"
                          common.name               entrez.gnodes
                      "fission yeast"                         "0"
                          kegg.geneid                 ncbi.geneid
                         "SPAC144.03"                   "2542823"
          > kg.spo=kegg.gsets(species="spo", id.type="entrez")


          If you need to convert the gene IDs in your input data, you can follow the thread GenoMax referred to above and download the gene_info data file from the NCBI FTP site:
          ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz
          Then, in a Unix/Linux shell, do:
          gunzip gene_info.gz
          awk -F'\t' '$1 == "4896"' gene_info > sp.gene_info.txt

          Matching the first column (tax_id) exactly avoids picking up other taxonomy IDs that merely begin with 4896.
          Columns 2-6 are (Entrez) GeneID, Symbol, LocusTag, Synonyms, dbXrefs. Note that the S. pombe taxonomy ID is 4896.
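Once sp.gene_info.txt exists, the lookup itself can be sketched in the shell as below. This is only a minimal illustration under a few assumptions: the function name map_symbols, the input file symbols.txt (one gene symbol or systematic ID per line), and the output file symbol2entrez.txt are hypothetical names, and the two demo rows are made-up placeholders rather than real NCBI records. The sketch looks up each input name against the Symbol column first and falls back to the LocusTag column, since S. pombe data often carries systematic IDs rather than symbols.

```shell
#!/bin/sh
# Sketch: map gene symbols (or locus tags) to Entrez Gene IDs using a
# taxon-filtered gene_info file. All file and function names are illustrative.

# map_symbols GENE_INFO SYMBOLS
#   GENE_INFO: tab-separated, columns tax_id, GeneID, Symbol, LocusTag, ...
#   SYMBOLS:   one gene name per line
# Prints "name<TAB>GeneID", or "name<TAB>NA" when no match is found.
map_symbols() {
    awk -F'\t' -v OFS='\t' '
        # First file: build the lookup table (Symbol wins over LocusTag).
        NR == FNR { id[$3] = $2; if (!($4 in id)) id[$4] = $2; next }
        # Second file: report the GeneID for each requested name.
        { print $1, (($1 in id) ? id[$1] : "NA") }
    ' "$1" "$2"
}

# Demo with placeholder data in a temporary directory (remove it when done).
demo_dir=$(mktemp -d)
printf '4896\t1001\tgeneA\tLOCUS_A\t-\t-\n'    >  "$demo_dir/sp.gene_info.txt"
printf '4896\t1002\tgeneB\tLOCUS_B\taltB\t-\n' >> "$demo_dir/sp.gene_info.txt"
printf 'geneB\nLOCUS_A\nunknownX\n'            >  "$demo_dir/symbols.txt"

map_symbols "$demo_dir/sp.gene_info.txt" "$demo_dir/symbols.txt" \
    > "$demo_dir/symbol2entrez.txt"
cat "$demo_dir/symbol2entrez.txt"
```

With real data, the call would be something like map_symbols sp.gene_info.txt symbols.txt > symbol2entrez.txt; names reported as NA are worth checking against the Synonyms column by hand, which this sketch does not search.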

          Alternatively, you can use the Bioconductor biomaRt package to do the ID conversion.



          • #6
            Thanks bigmw! Could you be a bit clearer about where in the process I should apply these commands? I am not sure whether I need Entrez IDs or not. I am fairly sure that I have to convert the IDs if I want to use my Cufflinks data, but is it the same if I want to do the analysis with DESeq2, for instance?

            Also, part 3.2 starts with:

            > library(TxDb.Hsapiens.UCSC.hg19.knownGene)

            I need help with finding the corresponding package for S. pombe instead of "TxDb.Hsapiens.UCSC.hg19.knownGene"!

            Sorry, I am totally confused by all the IDs and libraries! I would appreciate it if you could give me some more help.
