For NGS data analysis, an aligner tends to be successful when it comes with utilities for comprehensive downstream analyses, such as reference-based assembly, SNP/indel calling and an alignment viewer. Eland/GAPipeline, Soap and Maq are examples. Unfortunately, implementing all these downstream analyses is non-trivial, and reimplementing them for each aligner would waste time and human resources. Ideally, we want to separate alignment from the downstream analyses that follow it. To achieve this, we need a generic alignment format that every aligner can produce. NovoAlign and Bowtie can output the Maq alignment format to take advantage of Maq's downstream data processing. However, the Maq format does not really suit this goal: it supports neither longer reads nor alignments with more than one indel, and it is too specific to Maq. To solve this problem, the 1000 Genomes Project committee decided to develop a generic alignment format, and the first version of the specification and implementation is now available.
The new alignment format, SAM (Sequence Alignment/Map), is the collaborative result of several major genome centres. It eliminates the major defects of the Maq format while retaining its advantages. We have also migrated and improved various downstream data processing tools implemented in Maq/Maqview, such as indexing, pileup, the viewer and the consensus caller. For more information, please check the website:
I hope samtools will help aligner developers promote their own software: once a program can generate alignments in SAM format, Maq-like downstream analysis becomes available to it immediately.
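To give aligner developers a rough idea of what "generating alignments in SAM format" involves, here is a minimal Python sketch. It assumes the eleven mandatory tab-separated fields of a SAM record (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, mate reference, mate position, insert size, SEQ, QUAL). The function names `sam_record` and `cigar_query_length`, the read name and the sequence are hypothetical and only for illustration; please check the specification for the authoritative field definitions.

```python
import re

def cigar_query_length(cigar):
    """Number of read bases a CIGAR string accounts for.

    M, I, S, = and X consume read bases; D, N, H and P do not.
    """
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    return sum(int(n) for n, op in ops if op in "MIS=X")

def sam_record(qname, flag, rname, pos, mapq, cigar, seq, qual,
               mate_ref="*", mate_pos=0, insert_size=0):
    """Format one alignment as a SAM line with the 11 mandatory fields."""
    if cigar != "*" and cigar_query_length(cigar) != len(seq):
        raise ValueError("CIGAR does not match the read length")
    if qual != "*" and len(qual) != len(seq):
        raise ValueError("SEQ and QUAL differ in length")
    fields = [qname, flag, rname, pos, mapq, cigar,
              mate_ref, mate_pos, insert_size, seq, qual]
    return "\t".join(str(f) for f in fields)

# Hypothetical unpaired read whose alignment contains one insertion and
# one deletion, i.e. more than one indel in a single alignment.
line = sam_record("read1", 0, "chr1", 100, 60, "4M1I3M2D4M",
                  "ACGTTACGACGT", "IIIIIIIIIIII")
print(line)
```

An aligner that writes one such line per read produces, in principle, output that downstream tools can consume without knowing anything about the aligner itself.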