  • How to download gene annotation from NCBI?

    The NCBI Map Viewer has the latest pig genome build and shows the
    locations of all the genes. I would like to download this gene
    annotation so I can load it into my own GBrowse genome browser.
    So I need the NCBI gene annotation for the latest pig genome build in
    gff3 format, and the way to do it seems to be to download an asn.1
    file from NCBI, convert it to genbank format, and then use the bioperl
    script bp_genbank2gff3.pl to convert from genbank to gff3.
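The core of any GenBank-to-GFF3 conversion is turning a GenBank feature location into start, end, and strand columns. The sketch below shows that step only, under stated assumptions: it handles plain ranges, `join(...)`, and an outer `complement(...)`, and the helper name `genbank_location_to_gff3` is hypothetical. It is not how bp_genbank2gff3.pl is implemented, just an illustration of the idea.

```python
import re

def genbank_location_to_gff3(location):
    """Return (start, end, strand, segments) for a simple GenBank location.

    Assumption: only 'a..b', 'join(...)' and an outer 'complement(...)' are
    handled. Single-base and mixed-strand locations are not supported here.
    """
    loc = location.strip()
    strand = "+"
    if loc.startswith("complement(") and loc.endswith(")"):
        strand = "-"
        loc = loc[len("complement("):-1]
    # '<' and '>' mark partial ends in GenBank; they are ignored in this sketch.
    segments = [(int(a), int(b)) for a, b in re.findall(r"<?(\d+)\.\.>?(\d+)", loc)]
    if not segments:
        raise ValueError(f"unsupported location: {location!r}")
    # GenBank and GFF3 both use 1-based inclusive coordinates, so no shift is needed.
    start = min(s for s, _ in segments)
    end = max(e for _, e in segments)
    return start, end, strand, segments

if __name__ == "__main__":
    print(genbank_location_to_gff3("complement(join(100..200,300..400))"))
```

The segment list could be written out as exon rows with a shared Parent attribute, while the overall start and end give the gene or mRNA row.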

    I downloaded the gene annotation for the pig genome from the NCBI ftp site at
    ftp://ftp.ncbi.nlm.nih.gov/gene/DATA..._scrofa.ags.gz

    I downloaded the asn2gb conversion program from
    ftp://ftp.ncbi.nlm.nih.gov/asn1-conv...latform/linux/

    I run ./linux.asn2gb -i Sus_scrofa.ags -b T
    and get the error "Asn io_failure for input file 'Sus_scrofa.ags'"
    I've tried all the options for the -a and -t flags without luck.

    I'm able to convert the Sus_scrofa.ags file to xml format using the
    gene2xml program, but I don't know of any tool that can convert from
    XML to gff3.
    I downloaded a genbank format file of pig genes from
    ftp://ftp.ncbi.nlm.nih.gov/genomes/S...RNA/rna.gbk.gz but the
    file doesn't give chromosome coordinates for the genes, so I can't
    make a gff3 file out of it.

    Any pointers on how to use the asn tools properly, or how to get NCBI
    annotation in gff format in general, would be much appreciated.

    Thanks

    -John

  • #2
    I managed to run:

    ./gene2xml.linux -i Sus_scrofa.ags -b T -c T

    This prints XML output. Strangely Sus_scrofa.ags had to be gzipped and named Sus_scrofa.ags.gz.

    XML to GFF conversion should be fairly easy, but I do not know of a tool yet. You may check:
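As a rough illustration of the XML-to-GFF3 step, here is a minimal Python sketch. It makes several assumptions that should be checked against real gene2xml output: the element names are taken from the Entrezgene XML layout as I understand it, the embedded record is a hypothetical, heavily trimmed example, and the interval coordinates are assumed to be 0-based (so they are shifted by one for GFF3). Real records usually carry several locus entries and a separate accession version, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed record for illustration only. Element names are
# assumed to match gene2xml output and should be verified on real data.
SAMPLE_XML = """<Entrezgene-Set>
  <Entrezgene>
    <Entrezgene_track-info><Gene-track>
      <Gene-track_geneid>12345</Gene-track_geneid>
    </Gene-track></Entrezgene_track-info>
    <Entrezgene_gene><Gene-ref>
      <Gene-ref_locus>GENE1</Gene-ref_locus>
    </Gene-ref></Entrezgene_gene>
    <Entrezgene_locus><Gene-commentary>
      <Gene-commentary_accession>NC_000001</Gene-commentary_accession>
      <Gene-commentary_seqs><Seq-loc><Seq-loc_int><Seq-interval>
        <Seq-interval_from>999</Seq-interval_from>
        <Seq-interval_to>1999</Seq-interval_to>
        <Seq-interval_strand><Na-strand value="minus"/></Seq-interval_strand>
      </Seq-interval></Seq-loc_int></Seq-loc></Gene-commentary_seqs>
    </Gene-commentary></Entrezgene_locus>
  </Entrezgene>
</Entrezgene-Set>"""

STRAND = {"plus": "+", "minus": "-"}

def entrezgene_to_gff3(xml_text):
    """Return one GFF3 'gene' line per record that has a usable location."""
    lines = []
    root = ET.fromstring(xml_text)
    for gene in root.iter("Entrezgene"):
        gene_id = gene.findtext(".//Gene-track_geneid")
        symbol = gene.findtext(".//Gene-ref_locus")
        # Only the first locus entry is used in this sketch.
        locus = gene.find("Entrezgene_locus/Gene-commentary")
        if locus is None:
            continue
        interval = locus.find(
            "Gene-commentary_seqs/Seq-loc/Seq-loc_int/Seq-interval")
        if interval is None:
            continue
        seqid = locus.findtext("Gene-commentary_accession")
        # Assumed 0-based inclusive in the XML; GFF3 is 1-based inclusive.
        start = int(interval.findtext("Seq-interval_from")) + 1
        end = int(interval.findtext("Seq-interval_to")) + 1
        strand_el = interval.find("Seq-interval_strand/Na-strand")
        strand = "."
        if strand_el is not None:
            strand = STRAND.get(strand_el.get("value"), ".")
        attrs = f"ID=gene{gene_id};Name={symbol}"
        lines.append("\t".join(
            [seqid, "NCBI", "gene", str(start), str(end), ".", strand, ".", attrs]))
    return lines

if __name__ == "__main__":
    print("##gff-version 3")
    for line in entrezgene_to_gff3(SAMPLE_XML):
        print(line)
```

For a full .ags-derived XML file, which can be large, streaming with `ET.iterparse` instead of `ET.fromstring` would likely be the better choice.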


    • #3
      John,

      I'm running into the same problem that you had. The NCBI Sus scrofa genome FTP site provides .asn, .fa, .gbk, .gbs, and .mfa files for each chromosome (last updated 10-12-2011).

      Were you able to convert the .asn data to .gff3 or .gtf format for annotation? I'd be interested to hear the best method you found for generating the annotation file that corresponds to the most recent S. scrofa genome.

      Thanks in advance,
      jjw


      • #4
        I was not able to figure out how to convert any of the NCBI annotation data into a usable form. I sent an email to NCBI but didn't get a useful reply from them. Thankfully another group has generated a good gene build for Sscr10.2. As described here: http://animalgenome.org/pig/newsletter/No.110.html, you can download annotation at this site: http://gbi.agrsci.dk/pig/sscrofa10_2_annotation/
        Alternatively, Ensembl is running 10.2 through their pipeline and should have a gene build available in two or three months. If you can wait that long that would be another good alternative to NCBI's annotation.


        • #5
          Thanks for the quick reply, John.

          No doubt, you've saved me a lot of frustration. I appreciate it.

          jjw


          • #6
            Originally posted by jgarbe:
            "Any pointers on how to use the asn tools properly, or how to get NCBI annotation in gff format in general, would be much appreciated."

            The NCBI are currently revising all their GFF3 output (it hadn't been compliant with the standards), so this should be much easier now/soon.

            Try ftp://ftp.ncbi.nlm.nih.gov/genomes/Sus_scrofa/GFF/ for the NCBI RefSeq annotation of pig Sscrofa10.2
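Once a GFF3 file has been downloaded, a quick sanity check before loading it into a genome browser is to parse the nine tab-separated columns and pull out the gene rows. The following is a minimal sketch, not NCBI's or GBrowse's own parser. The sample lines and the function names `parse_gff3_line` and `gene_features` are hypothetical and used only for illustration.

```python
from urllib.parse import unquote

# Hypothetical sample lines in GFF3 layout, not taken from an NCBI file.
SAMPLE = (
    "##gff-version 3\n"
    "chr1\tRefSeq\tgene\t1000\t2000\t.\t+\t.\tID=gene0;Name=GENE1\n"
    "chr1\tRefSeq\tmRNA\t1000\t2000\t.\t+\t.\tID=rna0;Parent=gene0\n"
)

def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict; return None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attr_text = cols
    attrs = {}
    for pair in attr_text.split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = unquote(value)  # GFF3 attribute values are URL-escaped
    return {
        "seqid": seqid, "source": source, "type": ftype,
        "start": int(start), "end": int(end),
        "score": score, "strand": strand, "phase": phase,
        "attributes": attrs,
    }

def gene_features(text):
    """Return the parsed features whose type column is 'gene'."""
    parsed = (parse_gff3_line(line) for line in text.splitlines())
    return [f for f in parsed if f is not None and f["type"] == "gene"]

if __name__ == "__main__":
    for feature in gene_features(SAMPLE):
        print(feature["seqid"], feature["start"], feature["end"],
              feature["attributes"].get("Name"))
```

A nine-column check like this is a cheap way to spot non-compliant rows early, since they raise a `ValueError` instead of silently loading.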


            • #7
              Thanks Peter,

              I'll take a look.

              jjw14


              • #8
                Hi all,
                I need a buffalo GFF or GFF3 file from NCBI, but I do not know how to get it.
                Could anyone help me?
                Thanks


                • #9
                  Bubalus bubalis? ftp://ftp.ncbi.nlm.nih.gov/genomes/Bubalus_bubalis/GFF/
                  Last edited by maubp; 08-12-2014, 02:00 AM.


                  • #10
                    Thanks Peter
