  • moushengxu@gmail.com
    replied
    How to do the opposite?

    Originally posted by peachgil:
    In Bioconductor, just use the following code:

    > library(org.Hs.eg.db)
    > library(annotate)
    > lookUp('3815', 'org.Hs.eg', 'SYMBOL')
    $`3815`
    [1] "KIT"

    > lookUp('3815', 'org.Hs.eg', 'REFSEQ')
    $`3815`
    [1] "NM_000222" "NM_001093772" "NP_000213" "NP_001087241"

    I have a set of HGNC gene symbols, and I want to convert them to Entrez Gene IDs.

    Thanks much!


  • zaclown
    replied
    Thank you all guys


  • jmw86069
    replied
    I'm always a fan of the Linux one-liner; here is an example for the human ACTB gene using hg18:

    mysql -h genome-mysql.cse.ucsc.edu -A -u genome -D hg18 -e "select k2ll.value as entrezGeneId, kx.refseq as refseqMrna, kx.geneSymbol as entrezGeneSymbol, kx.description as entrezGeneDesc from kgXref kx, knownToLocusLink k2ll where k2ll.name=kx.kgID and kx.geneSymbol='ACTB';"
    UCSC's C. elegans tables don't include the knownGene and kg% tables, but some poking around (using "show tables like '%locus%';") led me to this MySQL query, which takes a locusLinkId as input and prints the gene symbol, RefSeq mRNA, description, etc.

    mysql -h genome-mysql.cse.ucsc.edu -A -u genome -D ce6 -e "select rl.locusLinkId, rl.name as geneName, rl.product as geneDescription, rl.mrnaAcc as refseqMrna, rl.protAcc as refseqProt from refLink rl where rl.locusLinkId=174288;"
    The bummer is that you have to tell it to use "ce6" -- it isn't generic enough to sniff out what organism and version to use a priori. But you'll know which one to use right? :-) And you can of course change the "=174288" to "IN (174288, 174289,174290)" for more of a bulk-input-experience, depending upon what you need. If you end up batch-scripting some geneID conversions, I'd definitely use the "IN" clause instead of querying them one-by-one. Markedly faster.
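    The "IN" approach described above can be scripted. The following is only a sketch: build_reflink_query is a hypothetical helper name (not part of any UCSC tooling), and it assumes the refLink columns used in the ce6 query above. It rejects non-numeric IDs, joins the rest with commas, and prints a single query string.

```shell
#!/usr/bin/env bash
# Build one batched refLink query from a list of Entrez (LocusLink) IDs.
# build_reflink_query is a hypothetical helper name used for illustration only.
build_reflink_query() {
    local id ids
    # Accept only plain integers so nothing unexpected ends up in the SQL.
    for id in "$@"; do
        case $id in
            ''|*[!0-9]*) echo "not a numeric gene ID: $id" >&2; return 1 ;;
        esac
    done
    # Join all arguments with commas: 174288 174289 -> 174288,174289
    ids=$(IFS=,; echo "$*")
    printf 'select rl.locusLinkId, rl.name as geneName, rl.product as geneDescription, rl.mrnaAcc as refseqMrna, rl.protAcc as refseqProt from refLink rl where rl.locusLinkId IN (%s);' "$ids"
}

query=$(build_reflink_query 174288 174289 174290)
echo "$query"
```

    To run the result, pass it to the same client call as in the one-liners above, for example: mysql -h genome-mysql.cse.ucsc.edu -A -u genome -D ce6 -e "$query"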

    DAVID is in theory a great resource, but could be opened up to increase the API limits, or to allow direct data downloads.


  • MDonlin
    replied
    You can also do ID conversion using BioMart at EBI.


  • peachgil
    replied
    In Bioconductor, just use the following code:

    > library(org.Hs.eg.db)
    > library(annotate)
    > lookUp('3815', 'org.Hs.eg', 'SYMBOL')
    $`3815`
    [1] "KIT"

    > lookUp('3815', 'org.Hs.eg', 'REFSEQ')
    $`3815`
    [1] "NM_000222" "NM_001093772" "NP_000213" "NP_001087241"


  • rdu
    replied
    The Bioconductor package "biomaRt" can also do it.


  • Fuad
    replied
    DAVID has a Gene ID Conversion tool:



    Fuad


  • Richard Finney
    replied
    NCBI maintains flat files of gene annotations that contain the information you're after:
    ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz
    ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz
    ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2refseq.gz
    [ There are other interesting files in that directory ]


    The tax_id (taxonomy ID) for C. elegans is 6239 [from the Taxonomy browser, http://www.ncbi.nlm.nih.gov/taxonomy].

    You can type "wget -nc ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_info.gz" at the command line, or download the file via a browser. Decompress the files with gunzip before running the commands below.

    Example using this data :
    bash-3.00$ awk -F'\t' '$1 == 6239' gene2refseq | head
    6239 171590 REVIEWED NM_058260.3 193203640 NP_490660.1 17510631 NC_003279.6 193203938 4123 10231 - -
    6239 171591 REVIEWED NM_058259.3 193203639 NP_490661.1 17510629 NC_003279.6 193203938 11498 16830 + -
    6239 171592 REVIEWED NM_058261.3 133902001 NP_490662.1 17510633 NC_003279.6 193203938 17496 26780 - -
    6239 171592 REVIEWED NM_058262.3 86561628 NP_490663.1 17510635 NC_003279.6 193203938 17496 26780 - -
    6239 171593 REVIEWED NM_058263.3 115533565 NP_490664.2 115533566 NC_003279.6 193203938 27594 32481 - -
    6239 171594 REVIEWED NM_058265.3 71995026 NP_490666.2 25143331 NC_003279.6 193203938 49918 54359 + -
    6239 171595 REVIEWED NM_058267.4 115533567 NP_490668.4 115533568 NC_003279.6 193203938 55315 64020 - -
    6239 171597 REVIEWED NM_058269.2 71995034 NP_490670.1 17510145 NC_003279.6 193203938 85044 86283 - -
    6239 171599 REVIEWED NM_058271.6 212645149 NP_490672.2 25143337 NC_003279.6 193203938 93030 94880 + -
    6239 171600 REVIEWED NM_058272.4 212645150 NP_490673.1 17510147 NC_003279.6 193203938 96478 100612 - -
    bash-3.00$ awk -F'\t' '$1 == 6239 && $2 == 171590' gene_info
    6239 171590 Y74C9A.3 Y74C9A.3 - WormBase:WBGene00022277 I - hypothetical protein protein-coding - - - - 20101017
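
    For more than one gene, the same lookup can be wrapped in a small function. This is a sketch under stated assumptions: entrez_to_symbol is a hypothetical helper name, the file is tab-separated with tax_id, GeneID and Symbol in columns 1 to 3 (as in gene_info), and the sample file is a cut-down stand-in built from the two genes mentioned in this thread, with 9606 as the human tax_id.

```shell
#!/usr/bin/env bash
# Look up symbols for a list of Entrez Gene IDs in a gene_info-style table.
# Assumes tab-separated columns: 1 = tax_id, 2 = GeneID, 3 = Symbol.

# entrez_to_symbol TAX_ID FILE ID...   (hypothetical helper name)
entrez_to_symbol() {
    local tax_id=$1 file=$2
    shift 2
    # Hand the wanted IDs to awk as one space-separated string,
    # load them into a lookup table, then scan the file once.
    awk -F'\t' -v tax="$tax_id" -v ids="$*" '
        BEGIN { n = split(ids, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
        $1 == tax && ($2 in want) { print $2 "\t" $3 }
    ' "$file"
}

# Tiny stand-in for the real gene_info file (first three columns only).
sample=$(mktemp)
printf '6239\t171590\tY74C9A.3\n9606\t3815\tKIT\n' > "$sample"

entrez_to_symbol 6239 "$sample" 171590
rm -f "$sample"
```

    Setting the field separator to a tab matters here because the description column of gene_info contains spaces. Against the real data, decompress gene_info.gz first and pass the resulting file as the second argument.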


  • gaffa
    replied
    Try UniProt's online conversion service: http://www.uniprot.org -> "ID Mapping" tab


  • zaclown
    started a topic: entrez ID conversion

    Hello,

    Does anyone know how to convert Entrez IDs to either RefSeq IDs or gene symbols?
    I have found resources on RefSeq to gene symbol conversion, but I can't find anything on Entrez IDs.
    The genome I work with is C. elegans.
    Thanks in advance for any suggestions.
