Hi folks, I have a question about the best way to merge paired-end reads coming from different samples. We align genome resequencing FASTQ files with bwa mem and convert the resulting .sam file to .bam.
If we have to merge two different samples, there are basically two ways. The first is to merge the FASTQ files and align the resulting file; the second is to align each sample separately and use samtools merge to combine the .bam files.
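To make the two options concrete, here is a rough sketch that only builds the command lines for each route and prints them, without running anything. The file names, the output prefix, and the helper function names are hypothetical. I have also assumed a coordinate sort with samtools sort before merging, and file names without spaces.

```python
# Sketch only: builds shell command strings, does not execute them.
# File names and helper names are made up for illustration.

def merge_fastq_then_align(ref, samples, out_prefix):
    """Route 1: concatenate the FASTQ files per mate, then align once."""
    # Keep the sample order identical for R1 and R2 so mates stay paired.
    r1_files = " ".join(fq1 for fq1, _ in samples)
    r2_files = " ".join(fq2 for _, fq2 in samples)
    merged_r1 = f"{out_prefix}_R1.fastq"
    merged_r2 = f"{out_prefix}_R2.fastq"
    return [
        f"cat {r1_files} > {merged_r1}",
        f"cat {r2_files} > {merged_r2}",
        f"bwa mem {ref} {merged_r1} {merged_r2} | samtools sort -o {out_prefix}.bam -",
    ]


def align_then_merge_bam(ref, samples, out_prefix):
    """Route 2: align each sample on its own, then merge the sorted BAMs."""
    commands = []
    bams = []
    for i, (fq1, fq2) in enumerate(samples, start=1):
        bam = f"{out_prefix}_s{i}.bam"
        commands.append(f"bwa mem {ref} {fq1} {fq2} | samtools sort -o {bam} -")
        bams.append(bam)
    commands.append(f"samtools merge {out_prefix}.bam {' '.join(bams)}")
    return commands


if __name__ == "__main__":
    samples = [("a_R1.fastq", "a_R2.fastq"), ("b_R1.fastq", "b_R2.fastq")]
    for route in (merge_fastq_then_align, align_then_merge_bam):
        print(f"# {route.__name__}")
        for command in route("ref.fa", samples, "merged"):
            print(command)
```

Both routes end with a single merged.bam; the only difference is whether the merge happens before or after bwa mem.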
My concern is whether the two procedures are equally valid, or whether there is some relevant difference in the outcome.
I think everything hinges on how the Burrows-Wheeler Transform (BWT) alignment works. I broadly understand how the BWT and the indexing are applied, but I still wonder whether the number of reads affects the results of the alignment, or whether each read is aligned independently.
Can anyone give me more insights on this?