I'm wondering about the purpose of the --solid_recal_mode option in the GATK, and Google doesn't seem to be providing many answers. The relevant documentation is quoted below.
I'm a little confused about why the reference would be inserted in the first place - I'm guessing it has to do with the dependencies neighboring bases have on each other in color space?
My question is, when would you ever want to use REMOVE_REF_BIAS? It sounds like it would add noise, and the safest option would always be to use the default SET_Q_ZERO...
--solid_recal_mode / -sMode (SOLID_RECAL_MODE with default value SET_Q_ZERO)
How should we recalibrate solid bases in which the reference was inserted? Options = DO_NOTHING, SET_Q_ZERO, SET_Q_ZERO_BASE_N, or REMOVE_REF_BIAS. CountCovariates and TableRecalibration accept a --solid_recal_mode flag which governs how the recalibrator handles the reads which have had the reference inserted because of color space inconsistencies.
The --solid_recal_mode argument is an enumerated type (SOLID_RECAL_MODE), which can have one of the following values:
DO_NOTHING
Treat reference inserted bases as reference matching bases. Very unsafe!
SET_Q_ZERO
Set reference inserted bases and the previous base (because of color space alignment details) to Q0. This is the default option.
SET_Q_ZERO_BASE_N
In addition to setting the quality scores to zero, also set the base itself to 'N'. This is useful to visualize in IGV.
REMOVE_REF_BIAS
Look at the color quality scores and probabilistically decide to change the reference inserted base to be the base which is implied by the original color space instead of the reference.
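To check that I'm reading the four modes correctly, here is a rough Python sketch of how I understand them. This is not GATK code: the function name, the read representation (parallel lists plus a per-base "reference was inserted" flag), and especially the probabilistic rule for REMOVE_REF_BIAS (swap with probability equal to the chance the color call is correct) are my own assumptions. I've also assumed SET_Q_ZERO_BASE_N sets 'N' at the same positions whose quality is zeroed.

```python
import random

MODES = ("DO_NOTHING", "SET_Q_ZERO", "SET_Q_ZERO_BASE_N", "REMOVE_REF_BIAS")


def apply_solid_recal_mode(bases, quals, inserted, mode,
                           color_bases=None, color_quals=None, rng=None):
    """Toy model of the four modes (hypothetical, not the GATK implementation).

    bases       -- read bases as a string
    quals       -- per-base Phred qualities
    inserted    -- per-base flags, True where the reference base was inserted
    color_bases -- bases implied by the original color space (REMOVE_REF_BIAS only)
    color_quals -- Phred qualities of the color calls (REMOVE_REF_BIAS only)
    """
    if mode not in MODES:
        raise ValueError("unknown mode: %s" % mode)
    rng = rng or random.Random()
    bases, quals = list(bases), list(quals)

    for i, was_inserted in enumerate(inserted):
        if not was_inserted or mode == "DO_NOTHING":
            # DO_NOTHING: inserted bases are treated like any matching base.
            continue
        if mode in ("SET_Q_ZERO", "SET_Q_ZERO_BASE_N"):
            # Zero the inserted base and the one before it.
            for j in (i - 1, i):
                if j < 0:
                    continue
                quals[j] = 0
                if mode == "SET_Q_ZERO_BASE_N":
                    bases[j] = "N"  # assumption: same positions as the Q0 bases
        else:  # REMOVE_REF_BIAS
            # Assumed rule: trust the color call with probability 1 - 10^(-Q/10).
            p_color_ok = 1.0 - 10 ** (-color_quals[i] / 10.0)
            if rng.random() < p_color_ok:
                bases[i] = color_bases[i]

    return "".join(bases), quals
```

If this reading is roughly right, REMOVE_REF_BIAS is the only mode that puts a non-reference base back into the read while leaving its quality score untouched, which is where my worry about added noise comes from.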