Hi everyone,
I am having a bit of trouble reconciling the output of these two tools.
I have a whole genome realigned with GATK and am trying to get the alignment stats to report. I tried samtools and picard only to find different numbers reported:
Total Reads:
picard: 1206005136
samtools: 1209724136
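difference: 3719000
The same difference also shows up in the mapped/aligned totals. The percentages aligned are the same (99.7). Also, Picard reports exactly equal counts for the first and second read of each pair (603002568 each), while flagstat's read1 and read2 counts differ.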
I guess there is some difference in the way the programs handle flags.
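One way I could test this guess is to tally the records by SAM flag bit and see which categories account for the extra reads. Below is a minimal sketch: the bit values come from the SAM specification, but `tally_flags` and the example flag values are hypothetical and not taken from my BAM.

```python
from collections import Counter

# Flag bits as defined in the SAM specification.
FLAG_BITS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_unmapped": 0x8,
    "read1": 0x40,
    "read2": 0x80,
    "secondary": 0x100,
    "qc_fail": 0x200,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def tally_flags(flags):
    """Count how many records have each flag bit set (hypothetical helper)."""
    counts = Counter()
    for flag in flags:
        counts["total"] += 1
        for name, bit in FLAG_BITS.items():
            if flag & bit:
                counts[name] += 1
    return counts


# Made-up flag values for illustration only.
example_flags = [99, 147, 355, 1123, 77, 141]
counts = tally_flags(example_flags)
for name in ["total", "read1", "read2", "unmapped", "secondary", "duplicate"]:
    print(name, counts[name])
```

On a real file the flags could be taken from the second column of `samtools view` output. If one category (for example secondary alignments or QC-failed reads) adds up to the 3719000 difference, that would point to which records one tool counts and the other skips.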
Also, I tried removing the duplicates with both programs, samtools and Picard, but neither method made the numbers agree. Moreover, flagstat still reported duplicates after rmdup (but not after Picard filter reads).
Does anyone know why this is happening, what I can do to make the summaries agree, or how to choose between the two?
Thanks in advance for all the help.
PS: These are the numbers:
Flagstat:
1209724136 + 0 in total (QC-passed reads + QC-failed reads)
146098358 + 0 duplicates
1206039846 + 0 mapped (99.70%:-nan%)
1209724136 + 0 paired in sequencing
604530215 + 0 read1
605193921 + 0 read2
1172988734 + 0 properly paired (96.96%:-nan%)
1203916910 + 0 with itself and mate mapped
2122936 + 0 singletons (0.18%:-nan%)
23043913 + 0 with mate mapped to a different chr
11749717 + 0 with mate mapped to a different chr (mapQ>=5)
PICARD – METRICS
CATEGORY                    FIRST_OF_PAIR  SECOND_OF_PAIR  PAIR
TOTAL_READS                 603002568      603002568       1206005136
PF_READS                    603002568      603002568       1206005136
PCT_PF_READS                1              1               1
PF_NOISE_READS              0              0               0
PF_READS_ALIGNED            601970451      600350395       1202320846
PCT_PF_READS_ALIGNED        0.998288       0.995602        0.996945
PF_ALIGNED_BASES            60151224632    59727879851     119879104483
PF_HQ_ALIGNED_READS         561713306      559548407       1121261713
PF_HQ_ALIGNED_BASES         56471172468    56055351192     112526523660
PF_HQ_ALIGNED_Q20_BASES     55582307306    54668254292     110250561598
PF_HQ_MEDIAN_MISMATCHES     0              0               0
PF_MISMATCH_RATE            0.00499        0.005764        0.005376
PF_HQ_ERROR_RATE            0.003378       0.004162        0.003768
PF_INDEL_RATE               0.000263       0.000259        0.000261
MEAN_READ_LENGTH            101            101             101
READS_ALIGNED_IN_PAIRS      600149693      600149693       1200299386
PCT_READS_ALIGNED_IN_PAIRS  0.996975       0.999666        0.998319
BAD_CYCLES                  0              0               0
STRAND_BALANCE              0.500892       0.500784        0.500838
PCT_CHIMERAS                0.019654       0.019654        0.019654
PCT_ADAPTER                 0.000048       0.000024        0.000036
SAMPLE
LIBRARY
READ_GROUP
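For reference, here is a quick check of the arithmetic on the numbers above, as a small sketch (the dictionary keys are just labels I chose). It only restates the reported totals and does not explain where the extra reads come from.

```python
# Totals copied from the flagstat and Picard outputs above.
flagstat = {
    "total": 1209724136,
    "mapped": 1206039846,
    "read1": 604530215,
    "read2": 605193921,
}
picard = {
    "total": 1206005136,
    "aligned": 1202320846,
    "first_of_pair": 603002568,
    "second_of_pair": 603002568,
}

# Difference in total and in mapped/aligned reads between the two tools.
total_diff = flagstat["total"] - picard["total"]
mapped_diff = flagstat["mapped"] - picard["aligned"]

# Unmapped reads implied by each report (total minus mapped/aligned).
flagstat_unmapped = flagstat["total"] - flagstat["mapped"]
picard_unmapped = picard["total"] - picard["aligned"]

# How the difference splits between read1 and read2.
read1_diff = flagstat["read1"] - picard["first_of_pair"]
read2_diff = flagstat["read2"] - picard["second_of_pair"]

print("total difference:  ", total_diff)
print("mapped difference: ", mapped_diff)
print("unmapped (flagstat):", flagstat_unmapped)
print("unmapped (picard):  ", picard_unmapped)
print("read1 / read2 difference:", read1_diff, read2_diff)
```

The total and mapped differences are both 3719000 and the implied unmapped counts are identical, so all of the extra records that flagstat counts are mapped ones.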