Downloading data from ncbi


  • Downloading data from ncbi

    Hello
    I've been trying to get the hang of NCBI's esearch suite because I want to download their gene summary paragraphs. Could anyone clarify the correct command format to output a file of the form:

    Gene Summary
    PCSK9 'This gene encodes...'

    Either all genes or a list of genes that I provide would work. Thanks.
    Last edited by Cannon; 01-13-2022, 04:30 PM.

  • #2
    Using Entrez Direct (EDirect):

    Code:
    $ esearch -db gene -query "PCSK9 [GENE] AND human [ORGN]" | efetch -format acc
    
    1. PCSK9
    Official Symbol: PCSK9 and Name: proprotein convertase subtilisin/kexin type 9 [Homo sapiens (human)]
    Other Aliases: FH3, FHCL3, HCHOLA3, LDLCQ1, NARC-1, NARC1, PC9
    Other Designations: proprotein convertase subtilisin/kexin type 9; convertase subtilisin/kexin type 9 preproprotein; neural apoptosis regulated convertase 1; subtilisin/kexin-like protease PC9
    Chromosome: 1; Location: 1p32.3
    Annotation: Chromosome 1 NC_000001.11 (55039548..55064852)
    MIM: 607786
    ID: 255738
    
    2. PCSK9
    Official Symbol: PCSK9 and Name: proprotein convertase subtilisin/kexin type 9 [Homo sapiens (human)]
    Other Aliases: FH3, HCHOLA3, NARC-1, NARC1
    Other Designations: Hypercholesterolemia, familial, 3; hypercholesterolemia, autosomal dominant 3
    Chromosome: 1; Location: 1p34.1-p32
    This record was replaced with GeneID: 255738
    ID: 353175
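
The second hit above is a discontinued record ("This record was replaced with GeneID: 255738"). A minimal sketch of one way to skip such hits, assuming the Gene database's `alive [PROP]` filter does what you want here. `gene_query` is a hypothetical helper name, not part of EDirect; it only builds the query string.

```shell
#!/bin/sh
# Hypothetical helper: print an Entrez Gene query for one gene symbol.
# Assumption: "alive [PROP]" limits hits to current (not replaced) records.
# $1 = gene symbol, $2 = organism (defaults to human).
gene_query() {
    printf '%s [GENE] AND %s [ORGN] AND alive [PROP]\n' "$1" "${2:-human}"
}

# Usage (needs EDirect on PATH and network access):
#   esearch -db gene -query "$(gene_query PCSK9)" | efetch -format acc
```

If the filter behaves as assumed, only the current record (ID 255738) should come back for PCSK9; it is worth checking the result on a gene you know before relying on it.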



    • #3
      Another variation:

      Code:
      $ esearch -db gene -query "PCSK9 [GENE] AND human [ORGN]" | esummary | xtract -pattern DocumentSummary -element Name,Summary
      PCSK9	This gene encodes a member of the subtilisin-like proprotein convertase family, which includes proteases that process protein and peptide precursors trafficking through regulated or constitutive branches of the secretory pathway. The encoded protein undergoes an autocatalytic processing event with its prosegment in the ER and is constitutively secreted as an inactive protease into the extracellular matrix and trans-Golgi network. It is expressed in liver, intestine and kidney tissues and escorts specific receptors for lysosomal degradation. It plays a role in cholesterol and fatty acid metabolism. Mutations in this gene have been associated with autosomal dominant familial hypercholesterolemia. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Feb 2014]
      For more than one gene, put the symbols in a file, one per line:

      Code:
      $ more id
      BRCA2
      TP53
      PCSK9
      
      $ for i in $(cat id); do esearch -db gene -query "${i} [GENE] AND human [ORGN]" | esummary | xtract -pattern DocumentSummary -element Name,Summary; done
      BRCA2	Inherited mutations in BRCA1 and this gene, BRCA2, confer increased lifetime risk of developing breast or ovarian cancer. Both BRCA1 and BRCA2 are involved in maintenance of genome stability, specifically the homologous recombination pathway for double-strand DNA repair. The largest exon in both genes is exon 11, which harbors the most important and frequent mutations in breast cancer patients. The BRCA2 gene was found on chromosome 13q12.3 in human. The BRCA2 protein contains several copies of a 70 aa motif called the BRC motif, and these motifs mediate binding to the RAD51 recombinase which functions in DNA repair. BRCA2 is considered a tumor suppressor gene, as tumors with BRCA2 mutations generally exhibit loss of heterozygosity (LOH) of the wild-type allele. [provided by RefSeq, May 2020]
      TP53	This gene encodes a tumor suppressor protein containing transcriptional activation, DNA binding, and oligomerization domains. The encoded protein responds to diverse cellular stresses to regulate expression of target genes, thereby inducing cell cycle arrest, apoptosis, senescence, DNA repair, or changes in metabolism. Mutations in this gene are associated with a variety of human cancers, including hereditary cancers such as Li-Fraumeni syndrome. Alternative splicing of this gene and the use of alternate promoters result in multiple transcript variants and isoforms. Additional isoforms have also been shown to result from the use of alternate translation initiation codons from identical transcript variants (PMIDs: 12032546, 20937277). [provided by RefSeq, Dec 2016]
      PCSK9	This gene encodes a member of the subtilisin-like proprotein convertase family, which includes proteases that process protein and peptide precursors trafficking through regulated or constitutive branches of the secretory pathway. The encoded protein undergoes an autocatalytic processing event with its prosegment in the ER and is constitutively secreted as an inactive protease into the extracellular matrix and trans-Golgi network. It is expressed in liver, intestine and kidney tissues and escorts specific receptors for lysosomal degradation. It plays a role in cholesterol and fatty acid metabolism. Mutations in this gene have been associated with autosomal dominant familial hypercholesterolemia. Alternative splicing results in multiple transcript variants. [provided by RefSeq, Feb 2014]
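
To get the file the original question asked for (a "Gene / Summary" header followed by one row per gene), the loop above can be wrapped roughly as follows. This is a sketch, not a tested pipeline: it assumes EDirect (`esearch`, `esummary`, `xtract`) is on your PATH, and the function names `fetch_summary` and `write_summaries` are made up for illustration.

```shell
#!/bin/sh
# Sketch: write a tab-separated "Gene<TAB>Summary" table for a list of symbols.
# Assumption: EDirect tools are installed; function names are illustrative.

# Print one "Name<TAB>Summary" line per matching record for symbol $1.
# esearch gets </dev/null so it cannot swallow the loop's input file.
fetch_summary() {
    esearch -db gene -query "$1 [GENE] AND human [ORGN]" < /dev/null |
        esummary |
        xtract -pattern DocumentSummary -element Name,Summary
}

# $1 = file with one gene symbol per line, $2 = output file.
write_summaries() {
    printf 'Gene\tSummary\n' > "$2"        # header row
    while IFS= read -r symbol; do
        [ -n "$symbol" ] || continue       # skip blank lines
        fetch_summary "$symbol" >> "$2"
    done < "$1"
}
```

With the `id` file from above, `write_summaries id gene_summaries.tsv` should leave the table in `gene_summaries.tsv` (the output file name is just an example).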
      Last edited by GenoMax; 01-14-2022, 06:47 PM.



      • #4
        Hello, I am also trying to get the hang of NCBI's esearch suite because I want to download their gene summary paragraphs. I was searching online and was glad to find the answer to my question in your post. I will try these commands.
        Last edited by RuthDFelix; 02-21-2022, 06:08 AM.
