Hi there,
I found a post on a similar topic here in the GATK forum (http://gatkforums.broadinstitute.org...-recalibration). The answer there was that the order does not matter much.
But in my results the order seems to matter a lot, so I hope someone can help me figure out where the problem lies.
I have a paired-end whole-genome sequencing dataset at 40x coverage.
As recommended on the GATK website, after sorting and indexing my BAM file with samtools, I used MarkDuplicates.jar from Picard to remove duplicates, then RealignerTargetCreator and IndelRealigner from GATK for local realignment.
Then I ran "samtools flagstat" on the two resulting BAM files (after dedup and after realignment). The mapping statistics seem to have gotten worse after realignment compared with the dedup output.
I have attached the two flagstat outputs. As you can see, the read counts differ only in line 8 and line 9.
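In case it helps anyone reproduce the comparison, this is roughly how the two flagstat outputs can be diffed line by line. It is only a sketch: the function names are my own, and it assumes every count line has the usual "N + M description" layout and that both files list the same categories in the same order.

```python
import re

# A flagstat count line looks like: "<QC-passed> + <QC-failed> <description>"
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.*)$")


def parse_flagstat(text):
    """Return a list of (qc_passed, qc_failed, description) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return rows


def diff_flagstat(before, after):
    """Return (line_number, description, qc_passed_delta) for rows whose counts changed.

    Rows are paired by position, so both reports must have the same layout.
    """
    changes = []
    pairs = zip(parse_flagstat(before), parse_flagstat(after))
    for number, (old, new) in enumerate(pairs, start=1):
        if old[:2] != new[:2]:
            changes.append((number, new[2], new[0] - old[0]))
    return changes
```

To use it, read each flagstat text file into a string and pass the dedup output as `before` and the realigned output as `after`.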
As I understand it, "with itself and mate mapped" is the number of paired reads for which both the read and its mate are mapped to the reference genome, and "singletons" is the number of reads that are themselves mapped but whose mate is unmapped.
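To make the two categories concrete, here is a small sketch of how a read's SAM FLAG value would land in one or the other. As far as I know, flagstat counts a read as a singleton when the read itself is mapped and its mate is not. The bit values come from the SAM specification; the function name is my own, and skipping secondary and supplementary alignments is an assumption based on recent samtools versions (older versions may count them).

```python
# FLAG bits from the SAM specification
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def flagstat_pair_category(flag):
    """Return the flagstat pair category for a SAM FLAG, or None if neither applies."""
    # Only primary alignments of paired reads are considered (assumption, see above).
    if not flag & PAIRED or flag & (SECONDARY | SUPPLEMENTARY):
        return None
    # An unmapped read is in neither category, whatever its mate's status.
    if flag & UNMAPPED:
        return None
    # The read is mapped; the mate's status decides the category.
    if flag & MATE_UNMAPPED:
        return "singletons"
    return "with itself and mate mapped"
```

So a read moves from "with itself and mate mapped" to "singletons" only if its mate's unmapped status changes while the read itself stays mapped.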
In my files, comparing the local realignment (LR) output with the MarkDuplicates (MD) output, the number of singletons increased while the number of reads in line 8 decreased.
Can anyone suggest what is going on?