Hi,
I need to map about 400,000 reads from 454 sequencing onto a reference genome. The mean read length is 310 bp. Both the reads and the genome contain repeats. The goal is to call variants (SNPs and indels).
I would like to tune the bwasw algorithm using some of the options offered by the software:
bwa bwasw [-a matchScore] [-b mmPen] [-q gapOpenPen] [-r gapExtPen] [-t nThreads] [-w bandWidth] [-T thres] [-s hspIntv] [-z zBest] [-N nHspRev] [-c thresCoef] <in.db.fasta> <in.fq>
OPTIONS:
-a INT Score of a match [1]
-b INT Mismatch penalty [3]
-q INT Gap open penalty [5]
-r INT Gap extension penalty. The penalty for a contiguous gap of size k is q+k*r. [2]
-t INT Number of threads in the multi-threading mode [1]
-w INT Band width in the banded alignment [33]
-T INT Minimum score threshold divided by a [37]
-c FLOAT Coefficient for threshold adjustment according to query length. Given an l-long query, the threshold for a hit to be retained is a*max{T,c*log(l)}. [5.5]
-z INT Z-best heuristics. Higher -z increases accuracy at the cost of speed. [1]
-s INT Maximum SA interval size for initiating a seed. Higher -s increases accuracy at the cost of speed. [3]
-N INT Minimum number of seeds supporting the resultant alignment to skip reverse alignment. [5]
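To make the two formulas in the option list concrete, here is a minimal Python sketch that evaluates the gap penalty q+k*r and the hit threshold a*max{T,c*log(l)} with the default values quoted above. The function names are hypothetical, and it assumes that log is the natural logarithm, which the help text does not state.

```python
import math

def gap_penalty(k, q=5, r=2):
    """Penalty for a contiguous gap of size k, using the listed formula q + k*r."""
    return q + k * r

def hit_threshold(read_len, a=1, T=37, c=5.5):
    """Score a hit must reach to be retained: a * max(T, c * log(l)).

    Assumption: log is the natural logarithm; the option text does not say.
    """
    return a * max(T, c * math.log(read_len))

print(gap_penalty(3))       # 5 + 3*2 = 11
print(hit_threshold(310))   # c*ln(310) is about 31.6, so T = 37 dominates
```

Under these assumptions, for a 310 bp read the length-dependent term stays below the default -T, so the effective minimum score is 37 times the match score; the -c term only takes over for reads longer than roughly 830 bp.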
But I do not know which options to change or what values to give them.
Does anybody have experience with a similar project? If so, which parameters did you use?
What would be a sensible minimum score threshold?
Thanks in advance.
Best regards,
Sabrina.