  • GenBank to .tbl (Sequin format)

    Hi everyone,

    I'm working on submitting a set of whole genome shotgun sequencing projects to GenBank/NCBI. For this set of genomes, I have annotations which were generated using the RAST system (in GenBank and FFF format). However, in order to submit to GenBank/NCBI, these annotations need to be converted to what NCBI calls a 'feature table' (Sequin format/.tbl file). The file format is detailed here: http://www.ncbi.nlm.nih.gov/Sequin/table.html

    I've searched the web for parsers to create the required table format using either GenBank or FFF formatted files, and have asked the NCBI support staff if they know of such a parser. However, I have not been able to find one. Does anyone know where I can find something to convert between GenBank or FFF and the NCBI feature table format?

    Thanks in advance!

    Sincerely,
    Erin
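
For readers unfamiliar with the feature table layout, here is a minimal Python sketch of how one block of a .tbl file is put together, based on my reading of the NCBI page linked above. The function name, the dictionary keys and the example locus tags are hypothetical, chosen only for illustration; this is not any of the scripts mentioned in this thread.

```python
def format_feature_table(seq_id, features):
    """Return a Sequin-style feature table block for one sequence.

    `features` is a list of dicts with hypothetical keys:
    start, end (1-based, start <= end), strand (+1 or -1),
    key (e.g. "CDS", "tRNA") and qualifiers (list of (name, value)).
    """
    lines = [f">Feature {seq_id}"]
    for feat in features:
        start, end = feat["start"], feat["end"]
        # Minus-strand features are written with the coordinates swapped.
        if feat["strand"] == -1:
            start, end = end, start
        lines.append(f"{start}\t{end}\t{feat['key']}")
        # Qualifier lines begin with three empty tab-separated columns.
        for name, value in feat["qualifiers"]:
            lines.append(f"\t\t\t{name}\t{value}")
    return "\n".join(lines) + "\n"


# Made-up example data; the locus tags are placeholders.
example = [
    {"start": 1, "end": 300, "strand": 1, "key": "CDS",
     "qualifiers": [("locus_tag", "ABC_0001"),
                    ("product", "hypothetical protein")]},
    {"start": 400, "end": 475, "strand": -1, "key": "tRNA",
     "qualifiers": [("locus_tag", "ABC_0002"),
                    ("product", "tRNA-Ala")]},
]
print(format_feature_table("contig_1", example), end="")
```

Each sequence gets its own `>Feature` header line, so a multi-contig genome is simply several of these blocks written one after another. A real submission needs more than this (partial markers, protein_id qualifiers and so on).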

  • #2
    I thought you could give them GenBank/EMBL files too? Maybe I'm thinking of EMBL not the NCBI...

    P.S. What is this "FFF format"? I thought it was a typo for GFF, but you did it three times.


    • #3
      Originally posted by maubp:
      I thought you could give them GenBank/EMBL files too? Maybe I'm thinking of EMBL not the NCBI...
      Unfortunately not. Bit crazy. But it's easy to write a conversion script between the two. I've got one somewhere.


      • #4
        I asked and they won't accept GenBank files. It seems a bit crazy, since that's what they're going to make out of the .tbl/Sequin file anyway.

        I'm sure I could write my own conversion script, but I'm a bit new to this whole scripting business, so it may take me a while. I thought it was worth checking with the community to see if someone had one handy before I go through the trouble.

        And yes, FFF was a typo for GFF. Guess my thinking cap was a bit loose at the end of the day. Sorry for the confusion.


        • #5
          Just found one parser that claims to convert between GenBank and Sequin, but it appears to work for only one contig at a time (created table ends after the last gene of the first contig) and ignores tRNAs.


          • #6
            I'll try and dig out my script.

            If it's any help, Torsten Seemann's automated annotation pipeline can output sequin and/or table format:


            • #7
              Thanks nickloman, we've thought about just re-doing the annotations through NCBI's pipeline, but the problem is we already used the annotations we have for all of our analyses and want to have them associated with the genomes when we submit them. I'm working on seeing if I can use the parser I posted above if I pre-split the files into contigs and add the tRNAs/rRNAs by hand, but I'll keep an eye out in case you find your script first!
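
The pre-splitting step mentioned above can be sketched as follows, assuming a multi-record GenBank file in which every record ends with a line containing only `//`. The function name is hypothetical, and the sketch only cuts the text into records; it does not parse the annotations.

```python
def split_genbank_records(text):
    """Split multi-record GenBank text into one string per record.

    Assumes each record ends with a line containing only "//".
    """
    records, current = [], []
    for line in text.splitlines(keepends=True):
        current.append(line)
        if line.strip() == "//":
            records.append("".join(current))
            current = []
    # Keep a trailing record that lacks the terminator, if there is one.
    if "".join(current).strip():
        records.append("".join(current))
    return records


# Tiny made-up input with two (empty) records.
demo = "LOCUS       contig_1\nORIGIN\n//\nLOCUS       contig_2\nORIGIN\n//\n"
for number, record in enumerate(split_genbank_records(demo), start=1):
    print(number, record.splitlines()[0])
```

Each returned string can then be written to its own file and passed to a converter that only handles one contig at a time.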


              • #8
                Found it! Hope it's vaguely useful:

                genbank_to_tbl.py (GitHub Gist)


                • #9
                  Great! Thanks!

                  Erin


                  • #10
                    Hey nickloman,

                    Just as an fyi and a note for potential future users of your script, the code you linked to broke at the first CDS feature in my GBK. I made a couple of minor changes and it seems to work now, although it doesn't pick up the annotations for the tRNAs/rRNAs. At this point I figure it's relatively trivial to go through and add those in by hand for a small number of genomes. In the future I will be submitting an additional ~70 genomes, and will (hopefully) post an updated script with that feature fixed.

                    I've attached my edits as a plain text file (the forum won't accept a .py file).

                    Thank you again!

                    Erin
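
One fiddly part of any GenBank-to-.tbl conversion is turning location strings into interval lines. The sketch below is an illustration under stated assumptions, not the change described above: it handles only plain ranges, join(...) and an outer complement(...), and it refuses partial (< or >) locations. As I understand the format, minus-strand intervals are listed from the highest coordinate down, with start greater than stop. The function names are hypothetical.

```python
import re


def location_to_intervals(location):
    """Convert a simple GenBank location string to (start, stop) pairs
    in feature-table order."""
    if "<" in location or ">" in location:
        raise ValueError("partial locations are not handled in this sketch")
    minus = location.startswith("complement(")
    spans = [(int(a), int(b))
             for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]
    if not spans:
        raise ValueError(f"no a..b range found in {location!r}")
    if minus:
        # Reverse the interval order and swap each pair so start > stop.
        spans = [(b, a) for a, b in reversed(spans)]
    return spans


def interval_lines(location, key):
    """Return the tab-separated interval lines for one feature.

    Only the first interval line carries the feature key."""
    spans = location_to_intervals(location)
    first = f"{spans[0][0]}\t{spans[0][1]}\t{key}"
    rest = [f"{a}\t{b}" for a, b in spans[1:]]
    return [first] + rest


for loc in ("200..500", "join(1..50,60..100)",
            "complement(join(1..50,60..100))"):
    print(loc, "->", interval_lines(loc, "CDS"))
```

Mixed-strand joins, single-base locations and partial features would all need extra handling before this could be used on real annotations.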


                    • #11
                      Ah OK, well it's like most scripts - you get it working for your problem and then you forget about it. But glad you could make it run for you!


                      • #12
                        Have either of you found a GFF to Sequin format/.tbl file converter?


                        • #13
                          Originally posted by oudacontrol:
                          Have either of you found a GFF to Sequin format/.tbl file converter?
                          nickloman's script works fine for the format conversion itself, but then there are a myriad of changes that must be made to your original annotations to conform with GenBank naming conventions. For the number of genomes I'm submitting, I found it easier to just submit the fasta files for re-submission through NCBI's pipeline, which spits out Sequin formatted files.


                          • #14
                            Hi everyone.

                            For people who have Artemis installed on their computer, you can also open the .gbk with the software and use the 'SAVE AS' menu to save it in the Sequin/.tbl format.

                            All features are kept, as well as tRNA and rRNA information.

                            hope it may help.

                            seb.


                            • #15
                              Thanks, it helps me, but Artemis can only read and convert the first contig in a multi-GenBank file.

                              Originally posted by seb.lees:
                              Hi everyone.

                              For people who have Artemis installed on their computer, you can also open the .gbk with the software and use the 'SAVE AS' menu to save it in the Sequin/.tbl format.

                              All features are kept, as well as tRNA and rRNA information.

                              hope it may help.

                              seb.
                              Last edited by wanyu; 06-15-2015, 03:26 AM.
