  • Convert SAM to GFF

    Hi,

    Does anyone know of an easy way to convert files in the SAM format to GFF (or GFF3)?

    I do RNA-seq and have output from TopHat that I'd like to visualize in GBrowse (v1.7), which as far as I can tell does not support SAM. I know I could upgrade to GBrowse2, but I haven't got the courage just yet.

    Any help is appreciated,
    Darwin

  • #2
    Do not try to use GFF for Next Gen sequencing data. You will melt your CPU into a smoking pile of ash. O.K., not really; the truth is your browser will time out before GBrowse ever returns.

    GBrowse 1.7 can use the Bio::DB::Sam backend. See the poster at this link.

    But really, make the switch to GBrowse 2.0. It is now very easy to install; you install both BioPerl 1.6 and GBrowse 2 through CPAN. Configuring multiple data sources for a single browser makes life much easier.


    • #3
      Thanks for the reply.

      I'm a bench biologist just getting my toes wet on the command line and so prefer to keep it as uncomplicated as possible. But I think you're right that there's a shiny new GBrowse in my future.

      Have you or anybody else installed GBrowse2 on OS 10.6 or would I need to set up a virtual ubuntu box? It took me *a while* to install GBrowse 1.7 on this system hence my hesitation to switch.


      • #4
        Originally posted by Darwin
        Thanks for the reply. Have you or anybody else installed GBrowse2 on OS 10.6 or would I need to set up a virtual ubuntu box? It took me *a while* to install GBrowse 1.7 on this system hence my hesitation to switch.
        The more times you install BioPerl and GBrowse, the faster you get at it.

        I have installed GBrowse on my Mac, which is running 10.5. When installing on a Mac, the thing to remember is that the standard locations for the web server document and cgi-bin directories are different from the standard Linux locations. You will need to know these locations when you are configuring GBrowse on your Mac.

        Also, if you are installing via the CPAN shell, make sure to set the shell to 'verbose', otherwise you will not see the configuration queries and it will appear that the install is stalled. This is true for both Mac and Linux.


        • #5
          If you are not limited to GBrowse, you may be interested in trying other browsers (IGV, IGB, CisGenome, etc). A lot of them load SAM/BAM files directly. You can find more info in this thread for instance.


          • #6
            I would like to stay on GBrowse because it seems versatile (it can host all sorts of genomic features) and that's where I have accumulated whatever experience I now have.

            IGV does look fairly easy to set up, so I'll keep that in mind.

            Thanks for your help guys!


            • #7
              Sam to gff3

              Darwin, have you found a way to convert SAM files to GFF3? If so, please share.

              I have included my email in case you want to send it to me: [email protected]


              • #8
                SAM to GFF

                Has anybody figured out how to convert SAM to GFF? Please share.


                • #9
                  GBrowse supports BAM nowadays, so I don't quite see the need... But it should be quite easy to write a converter (e.g. in Perl or Python), depending on which fields you wish to keep. A good exercise for learning a scripting language!
                  Otherwise you can convert the SAM to sorted BAM, use the BAM to BED converter in BEDTools, and convert BED to GFF in Galaxy...
                  Last edited by arvid; 01-30-2012, 12:14 AM.
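
For anyone who wants to try the "write a converter" route from #9, here is a minimal Python sketch, not a tested production tool. It reads plain-text SAM and writes one GFF3 line per aligned block, splitting spliced reads at N (intron) operations in the CIGAR string. Several choices are assumptions made for illustration and do not come from this thread: the function names (aligned_blocks, sam_to_gff3), the source label "sam2gff", the feature type "match", using MAPQ as the GFF score, and storing the read name in a Name attribute. Unmapped reads are skipped, and mate-pair information is ignored.

```python
import re
import sys
from urllib.parse import quote

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 1-based inclusive (start, end) reference blocks for one read.

    M, D, = and X advance along the reference; N (a skipped region such as
    an intron) closes the current block and opens a new one after the gap.
    I, S, H and P do not consume the reference and are ignored.
    """
    blocks = []
    start = cursor = pos
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            if cursor > start:
                blocks.append((start, cursor - 1))
            cursor += length
            start = cursor
        elif op in "MD=X":
            cursor += length
    if cursor > start:
        blocks.append((start, cursor - 1))
    return blocks


def sam_to_gff3(lines, source="sam2gff", feature_type="match"):
    """Yield GFF3 lines (without newlines) for the mapped reads in `lines`."""
    yield "##gff-version 3"
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        qname, flag, rname, pos, mapq, cigar = fields[:6]
        flag = int(flag)
        if flag & 0x4 or rname == "*":
            continue  # read is unmapped
        strand = "-" if flag & 0x10 else "+"
        # Percent-encode characters that are reserved in GFF3 attributes.
        name = quote(qname, safe=":/#|+")
        for start, end in aligned_blocks(int(pos), cigar):
            yield "\t".join([rname, source, feature_type, str(start),
                             str(end), mapq, strand, ".", "Name=" + name])


# Only runs when a SAM path is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        for record in sam_to_gff3(handle):
            print(record)
```

Saved under a name of your choosing (say sam2gff3.py), it could be run as `python sam2gff3.py input.sam > output.gff3`. The warning in #2 still applies: a GFF file with one line per read gets very large for a full RNA-seq run, so this is probably only practical for a small region or a subsample.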
