I am a novice in NGS data analysis. I use SHRiMP2 to align SOLiD mate-pair reads. Because I need only the mapping information and not the mate-pair information, I aligned the F3 and R3 reads separately. Only the best hit is recorded for each read, using the parameters '--strata -o 1'. The read names in the resulting SAM files did not have the suffixes '_F3' and '_R3'. I then used 'samtools view -bS' to convert SAM to BAM, and 'samtools merge' and 'samtools sort' to merge and sort the files.
My question: the merged file is nearly twice the size of the sorted file. My guess is that, because the two reads of a mate pair have identical names, only one of them is kept in the sorted BAM file. Is that right? If so, how can I keep both reads in the sorted file?
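One way to test the guess that reads are being dropped would be to compare the number of alignment records before and after sorting, rather than the file sizes. The following is only a minimal sketch: it assumes both BAM files have been dumped back to SAM text with 'samtools view', and the function name `summarize_sam` and the example records are hypothetical, made up for illustration.

```python
from collections import Counter


def summarize_sam(lines):
    """Return (number of alignment records, Counter of read names).

    `lines` is any iterable of SAM text lines. Header lines (starting
    with '@') and blank lines are skipped.
    """
    total = 0
    names = Counter()
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        total += 1
        # The read name (QNAME) is the first tab-separated field.
        names[line.split("\t", 1)[0]] += 1
    return total, names


# Hypothetical records: read1 appears twice, as an F3 and an R3 alignment
# that share one name because the suffix is missing.
merged = [
    "@SQ\tSN:chr1\tLN:1000",
    "read1\t0\tchr1\t200\t255\t50M\t*\t0\t0\t*\t*",
    "read1\t16\tchr1\t100\t255\t50M\t*\t0\t0\t*\t*",
    "read2\t0\tchr1\t150\t255\t50M\t*\t0\t0\t*\t*",
]

total, names = summarize_sam(merged)
shared = sum(1 for count in names.values() if count > 1)
print(f"records: {total}, distinct names: {len(names)}, shared names: {shared}")
```

On real data, the same function could be run on the text output of each file, for example `summarize_sam(open("merged.sam"))` and `summarize_sam(open("sorted.sam"))`, where both file names are placeholders. If the two record totals match, no reads were lost and the size difference has another cause. For the totals alone, 'samtools view -c' on each BAM file prints the record count directly.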