We have some mate-paired reads (50 bp) to map to a reference genome for SNP (and possibly indel) discovery.
The data I have are csfasta and qual files.
Could anyone tell me whether I need to pre-filter the sequences before mapping them with any software?
If I use Bowtie, I understand that I should remove the orphan reads (and maybe try to map the orphan reads separately with a different parameter set).
If I use BFAST, should I do the same?
If I do need to filter the sequences by quality score, what cut-off do people normally use? An average of Q10?
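To make the question concrete, this is roughly the kind of average-quality filter I have in mind. It is only a minimal sketch: it assumes each record in the qual file is a single line of space-separated integer scores, the function names are hypothetical, and the default of 10 is just the value from my question, not a recommended threshold.

```python
def mean_quality(qual_line):
    """Return the arithmetic mean of the integer scores on one qual line.

    Assumes the scores are separated by whitespace (an assumption about
    the file layout, not something checked against any specification).
    """
    scores = [int(token) for token in qual_line.split()]
    if not scores:
        raise ValueError("empty quality line")
    return sum(scores) / len(scores)


def passes_filter(qual_line, min_mean=10.0):
    """Keep a read only if its mean quality is at least min_mean.

    min_mean=10.0 is the hypothetical cut-off from the question.
    """
    return mean_quality(qual_line) >= min_mean
```

With this sketch, a read with scores "10 20 30" would be kept and one with scores "5 5 8" would be dropped.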
How do I translate the quality score into a % error rate, as with a Phred score?
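For comparison, the standard Phred definition is Q = -10 * log10(P), where P is the probability that the call is wrong, so P = 10^(-Q/10). If the values in my qual files follow the same scale (an assumption I have not confirmed for this data), the conversion would look like the sketch below; the function names are my own.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score to an error probability.

    Uses the standard Phred relation P = 10 ** (-Q / 10). Whether the
    qual file in question is Phred-scaled is an assumption.
    """
    return 10 ** (-q / 10)


def phred_to_error_percent(q):
    """Same conversion, expressed as a percentage error rate."""
    return 100 * phred_to_error_prob(q)
```

Under that definition, Q10 corresponds to a 10% error rate, Q20 to 1%, and Q30 to 0.1%.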