Hi folks,
I've just upgraded our samtools after finding, to my delight, that changes and additions to the package now make it possible to generate consensus sequences from the output of mpileup. Previously, short of writing your own tools, this was only possible with pileup, which has its own set of issues.
There are three options for mpileup that I'm not sure I understand, given the descriptions on the SourceForge page (the same text is printed when you type samtools mpileup on the command line):
http://samtools.sourceforge.net/mpileup.shtml
These options are:
1) -d for max per-sample depth. The default for this parameter is 8,000. 8,000 of anything per sample seems rather low for NGS. Does this really mean a max of 8,000 reads per-base per-sample? If so, how smart is the program about choosing which 8,000 reads to use?
2) -D for output per-sample DP. How is this different from -d?
3) -S for output per-sample SP. Presumably this does something like adjusting for strand bias, so that a potential SNP is evaluated using an equal number of + and - reads?
Any insight would be much appreciated.
Cheers,
David