BBMap will be publicly released soon, pending confirmation with LBL's legal department.
In the meantime, feel free to look at these graphs of its performance:
Note that this is a 50MB PowerPoint file. It contains graphs of the relative performance of BBMap and other short-read aligners (bwa, bowtie2, gsnap, smalt) when mapping synthetic data.
EDIT:
This thread is now closed; please use this one to post questions.
In practice, it should make very little difference, though. Using long kmers is important for assembly, as it helps span short repeats that would otherwise cause contigs to terminate. But normalization is much less sensitive to that issue, and very long kmers can cause problems in the presence of errors. With k=31, a 100bp read has 70 kmers, and a single error could give up to 31 of them a depth of 1. That is fewer than half, so the median depth would not be impacted. With k=63, the same read has only 38 kmers, and an error near the middle of the read would be spanned by all 38 of them, each with a depth of 1, so the median depth of the read would look like 1 instead of its correct value. And BBNorm normalizes based on the median kmer depth of a read.
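To make the arithmetic concrete, here is a rough Python sketch. It is not BBNorm's code; the function names are made up for illustration, and it assumes a toy model in which every kmer covering the error has a depth of exactly 1 and every other kmer has the read's true depth.

```python
from statistics import median

def kmers_spanning(read_len, k, error_pos):
    """Count the kmers of a read that cover the 0-based position error_pos."""
    total = read_len - k + 1                 # number of kmers in the read
    first = max(0, error_pos - k + 1)        # first kmer start that covers the error
    last = min(total - 1, error_pos)         # last kmer start that covers the error
    return max(0, last - first + 1)

def median_depth_with_error(read_len, k, error_pos, true_depth):
    """Median kmer depth of a read with one error, under the toy model above."""
    total = read_len - k + 1
    bad = kmers_spanning(read_len, k, error_pos)
    # Assumption: error kmers are unique (depth 1); the rest have true_depth.
    depths = [1] * bad + [true_depth] * (total - bad)
    return median(depths)

if __name__ == "__main__":
    for k in (31, 63):
        total = 100 - k + 1
        bad = kmers_spanning(100, k, 50)
        med = median_depth_with_error(100, k, 50, true_depth=40)
        print(f"k={k}: {bad} of {total} kmers span the error, median depth {med}")
```

For a 100bp read with an error at the midpoint and an assumed true depth of 40, the median stays at 40 with k=31 and drops to 1 with k=63.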