Hi there,
I have average quality scores from several amplicon FLX and Titanium runs. Based on these position-specific average quality scores (Q), I want to calculate position-specific error rates/probabilities (P). If it were Sanger sequencing, I could simply invert the Phred formula Q = -10*log10(P), but I'm not sure what to use for pyrosequencing reads. Could I safely use P = 10^(-Q/10)?
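To make the question concrete, here is the conversion I have in mind as a minimal Python sketch. It assumes the scores are truly Phred-scaled, which is exactly the part I am unsure about for 454 data; the function name and the example values are made up for illustration and are not real run data.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality score Q to an error probability P.

    Assumes Q = -10 * log10(P), i.e. the standard Phred definition.
    """
    return 10 ** (-q / 10.0)


# Hypothetical position-specific average quality scores (illustrative only)
avg_quality_by_position = [35.0, 30.0, 20.0]

# Position-specific error probabilities derived from the averages
error_prob_by_position = [phred_to_error_prob(q) for q in avg_quality_by_position]
```

One thing to keep in mind: because the mapping is non-linear, converting an average Q is not the same as averaging the per-read P values at that position.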
I read Brockman et al. (2008, Genome Research), and they say the initial quality score from the GS 20 software is based on the "...probability that the base is an overcall, given the observed signal intensity for the corresponding flow". They then propose a much more comprehensive way of scoring quality, e.g. involving observed noise in the whole read and homopolymer counts.
Does anyone know which quality scoring algorithm is actually used in FLX/Titanium these days? And does the older FLX quality scoring differ from the newer Titanium scoring?
Many thanks in advance for any help!
Regards,
Marcus