  • SAM: a generic alignment format

    For NGS data analysis, an aligner tends to be successful when it comes with utilities for comprehensive downstream analyses such as reference-based assembly, SNP/indel calling and alignment viewing. Eland/GAPipeline, Soap and Maq are such examples. Unfortunately, implementing all these downstream analyses is non-trivial, and reimplementing them for each aligner would waste time and human resources. Ideally we want to separate alignment from the downstream analyses. To achieve this, we need a generic alignment format that works for all aligners. NovoAlign and Bowtie can output the Maq alignment format to take advantage of Maq's downstream data processing. However, the Maq format does not really suit the goal: it supports neither longer reads nor alignments with more than one indel, and it is too specific to Maq. To solve this problem, the 1000 Genomes Project committee decided to develop a generic alignment format, and the first version of the specification and implementation is now out.

    The new alignment format, SAM (Sequence Alignment/Map), is the collaborative result of several major genome centres. It eliminates the major defects of the Maq format while retaining its advantages. We also migrated and improved various downstream data processing tools implemented in Maq/Maqview, such as indexing, pileup, the viewer and the consensus caller. For more information, please check the website:



    I hope samtools will help aligner developers promote their own software: once a program can generate alignments in SAM format, Maq-like downstream analysis becomes available immediately.
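
For readers new to the format, here is a minimal Python sketch of reading one SAM alignment line. It assumes the 11 mandatory tab-separated fields of the published SAM specification; the field names follow the current specification (early versions may name the mate fields differently), and `SamRecord`, `parse_sam_line`, `cigar_ops` and `count_indels` are hypothetical helper names, not part of samtools. The example record is made up for illustration. It also shows the point made above about indels: a CIGAR string can describe more than one indel in a single alignment.

```python
import re
from typing import List, NamedTuple, Tuple


class SamRecord(NamedTuple):
    qname: str        # read name
    flag: int         # bitwise flag
    rname: str        # reference sequence name
    pos: int          # 1-based leftmost mapping position
    mapq: int         # mapping quality
    cigar: str        # CIGAR string
    rnext: str        # reference name of the mate
    pnext: int        # position of the mate
    tlen: int         # template length / insert size
    seq: str          # read sequence
    qual: str         # base qualities
    tags: List[str]   # optional fields, left unparsed here


# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_sam_line(line: str) -> SamRecord:
    """Split one alignment line (not a header line) into its fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-separated fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], fields[11:],
    )


def cigar_ops(cigar: str) -> List[Tuple[int, str]]:
    """Return the CIGAR string as (length, operation) pairs."""
    if cigar == "*":
        return []
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def count_indels(cigar: str) -> int:
    """Count insertion (I) and deletion (D) operations."""
    return sum(1 for _, op in cigar_ops(cigar) if op in "ID")


# Illustrative record: 36 bases with one insertion and one deletion.
example = "\t".join(
    ["read1", "0", "chr1", "100", "37", "10M1I5M2D20M", "*", "0", "0",
     "A" * 36, "I" * 36]
)
rec = parse_sam_line(example)
print(rec.rname, rec.pos, count_indels(rec.cigar))  # chr1 100 2
```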

  • #2
    Thanks Heng.
    This looks very useful, and it should make it easy to try the various upcoming tools.

    Is it possible to have a workflow like MAQ's easyrun that walks a user through a use case for SAM/BAM?
    --
    bioinfosm



    • #3
      Hey lh3,

      Thanks for posting this here. I'm going to sticky it in the Bioinformatics forum for a while to make sure everyone sees it!



      • #4
        The documentation notes that "Only MAQ->SAM converter is implemented." However, I could not find anywhere that referenced this conversion utility. Is there software to perform this conversion?



        • #5
          To lparsons:

          After you compile samtools with "make", you will find "maq2sam-short" and "maq2sam-long" in the "misc/" directory. There is also a script, "export2sam.pl", that converts Illumina's export format to SAM. I have not thoroughly tested this script on all export files, though.



          • #6
            I downloaded samtools-0.1.1 but could not find the "wgsim" or "wgsim_eval.pl" programs, which are mentioned in the bwa-0.3.0 documentation.
            How can I get these programs?



            • #7
              To corthay:

              You are quick. I am planning a new bwa release as I realized that I could improve it a little without much work (PS: the new version is released now). Wgsim, wgsim_eval.pl and converters for soap and bowtie are available from SVN only:

              svn co https://samtools.svn.sourceforge.net...s/dev/samtools samtools
              Last edited by lh3; 01-06-2009, 07:34 AM.



              • #8
                indelpe vs samtools indels

                Hi Heng Li.
                Could you comment on how the indel detection works in SAM pileups vs MAQ indelpe? I am seeing many more indels in my SAM pileup generated from a MAQ alignment (as compared to the output from indelpe). Is there a good filtering strategy for these?

                Thanks,

                Ryan



                • #9
                  I am planning to release samtools-0.1.2, which fixes some bugs in the old version and adds new features. For now you can check out the source code from SVN; it should be quite close to 0.1.2.

                  The new version comes with a Bayesian indel caller, although it is just a prototype at present. The strength of the samtools caller is that it makes use of reads mapped without indels, which helps to reduce false negatives. In addition, the new caller gives a genotype rather than just reporting that there is an indel; you cannot easily tell from maq's indelpe whether an indel is heterozygous or homozygous. With the new caller, the filters could be: a) the indel score; b) two indels should not be too close to each other.
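
The two filters suggested above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `IndelCall` record, the `filter_indels` function and the thresholds `min_score` and `min_gap` are all hypothetical, and the post does not say what values to use or how the caller's output is laid out. The sketch applies the score filter first and then drops every indel that has another surviving indel closer than `min_gap` on the same reference.

```python
from typing import Iterable, List, NamedTuple


class IndelCall(NamedTuple):
    chrom: str   # reference sequence name
    pos: int     # position of the indel
    score: int   # indel score reported by the caller


def filter_indels(calls: Iterable[IndelCall],
                  min_score: int = 50,
                  min_gap: int = 10) -> List[IndelCall]:
    # Filter a): keep only calls whose score reaches the (assumed) threshold.
    passed = sorted(
        (c for c in calls if c.score >= min_score),
        key=lambda c: (c.chrom, c.pos),
    )
    # Filter b): drop calls that sit too close to a neighbouring call.
    kept = []
    for i, call in enumerate(passed):
        close_prev = (
            i > 0
            and passed[i - 1].chrom == call.chrom
            and call.pos - passed[i - 1].pos < min_gap
        )
        close_next = (
            i + 1 < len(passed)
            and passed[i + 1].chrom == call.chrom
            and passed[i + 1].pos - call.pos < min_gap
        )
        if not (close_prev or close_next):
            kept.append(call)
    return kept


# Made-up calls for illustration only.
calls = [
    IndelCall("chr1", 100, 80),
    IndelCall("chr1", 105, 90),   # within 10 bp of the previous call
    IndelCall("chr1", 500, 20),   # below the score threshold
    IndelCall("chr1", 900, 70),
    IndelCall("chr2", 902, 60),   # different reference, so not "close"
]
filtered = filter_indels(calls)
```

Whether to check proximity before or after the score filter is a design choice; checking afterwards, as here, means a low-scoring call cannot knock out a confident neighbour.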



                  • #10
                    What's the difference between maq2sam-short and -long?

                    Also, short seems to segfault on 64-bit versions of Red Hat and Ubuntu... Am I missing something?



                    • #11
                      maq2sam-short is for the .map files generated by maq-0.6.x, while maq2sam-long is for files generated by maq-0.7.x. Sorry for the confusion; one of the aims of SAM is to avoid such confusion in future.



                      • #12
                        samtools index seg fault

                        I am using the most current version of samtools from svn.
                        I successfully ran the "samtools import" command on my .sam file from bwa.
                        When I then run "samtools index" on the .bam file, it seg faults.
                        Let me know if you need more information to determine what is causing this.
                        Last edited by webbrewer; 03-05-2009, 08:28 PM.



                        • #13
                          samtools import

                          samtools import is for making a .bam file from a .sam file. Why are you attempting to run this command on a .bam file?



                          • #14
                            Originally posted by myrna:
                            samtools import is for making a .bam file from a .sam file. Why are you attempting to run this command on a .bam file?
                            Oops. I meant to say that "samtools index" seg faults.



                            • #15
                              samtools index

                              Have you tried samtools view foo.bam?

                              If you get the SAM alignments back, then the file itself should be fine. I believe you get a warning if the .bam file is unsorted, so if you haven't already, try sorting it first:

                              samtools sort foo.bam bar.sort
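
To check sortedness yourself, a rough Python sketch is shown here. It reads the SAM text that "samtools view" prints and tests whether the alignments look coordinate-sorted: each reference name appears in one contiguous block, and positions never decrease within a block. The function name `is_coordinate_sorted` is hypothetical, and the check is a simplification: it does not compare the reference order against the header.

```python
from typing import Iterable


def is_coordinate_sorted(sam_lines: Iterable[str]) -> bool:
    seen = set()       # reference names already encountered
    current = None     # reference name of the block being read
    last_pos = 0       # last position seen in the current block
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue   # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        rname, pos = fields[2], int(fields[3])
        if rname != current:
            if rname in seen:
                return False   # a reference reappears after another one
            seen.add(rname)
            current, last_pos = rname, pos
        elif pos < last_pos:
            return False       # position went backwards within a block
        else:
            last_pos = pos
    return True


def make_line(rname: str, pos: int) -> str:
    """Build a minimal, made-up SAM line for illustration."""
    return "\t".join(["r", "0", rname, str(pos), "37", "4M", "*", "0", "0",
                      "ACGT", "IIII"])


sorted_lines = [make_line("chr1", 10), make_line("chr1", 25), make_line("chr2", 5)]
unsorted_lines = [make_line("chr1", 25), make_line("chr1", 10)]
print(is_coordinate_sorted(sorted_lines), is_coordinate_sorted(unsorted_lines))
```

Assuming the function is saved in a script that reads standard input, the output of "samtools view foo.bam" could be piped into it and the lines passed on as `sys.stdin`.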
