  • jorhodes
    Junior Member
    • Nov 2012
    • 7

    Samtools flagstat 0% properly paired

    Hi,

    I have used BWA 0.7.5a to map paired-end (PE) WGS data to a reference, and samtools flagstat to check the resulting BAM file. My pipeline includes the usual sorting, fixing of malformed BAMs, and duplicate marking. I have run this pipeline many times before with no problems, but this one genome has presented something I haven't seen before. The flagstat results are as follows:

    36332416 + 0 in total (QC-passed reads + QC-failed reads)
    0 + 0 duplicates
    32959970 + 0 mapped (90.72%:nan%)
    36332416 + 0 paired in sequencing
    18166208 + 0 read1
    18166208 + 0 read2
    354 + 0 properly paired (0.00%:nan%)
    32959970 + 0 with itself and mate mapped
    0 + 0 singletons (0.00%:nan%)
    417710 + 0 with mate mapped to a different chr
    0 + 0 with mate mapped to a different chr (mapQ>=5)

    I'm quite confused: nearly 91% of my reads map to my reference, but barely any are properly paired. QC analysis (using FastQC) did not show anything out of the ordinary, and the library prep gives an average fragment size that we are used to seeing. Is this literally just a case of the reads having no overlap, so that we should re-run this particular genome, or can anyone suggest anything else I could try to get to the bottom of this?

    Thanks
  • lindenb
    Senior Member
    • Apr 2010
    • 143

    #2
    "36332416 + 0 "

    There are no reads from the second FASTQ file in your BAM. Did you use single-end instead of paired-end mapping?


    • dpryan
      Devon Ryan
      • Jul 2011
      • 3478

      #3
      Originally posted by lindenb
      "36332416 + 0 "

      There are no reads from the second FASTQ file in your BAM. Did you use single-end instead of paired-end mapping?
      You're misreading that; the "+ 0" counts reads with flag 0x200 (QC fail) set.
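For readers unfamiliar with the flag bits: flagstat prints each count as "QC-passed + QC-failed", and a record is counted in the second column only when bit 0x200 is set. A minimal Python sketch of that decoding follows. The bit values come from the SAM specification; the label strings and the function names `decode_flag` and `flagstat_column` are hypothetical, chosen only for illustration.

```python
# SAM FLAG bits as defined in the SAM specification.
# The label strings are illustrative names, not official identifiers.
SAM_FLAGS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


def flagstat_column(flag):
    """Column a record is counted in: 0 = QC-passed, 1 = QC-failed (0x200)."""
    return 1 if flag & 0x200 else 0
```

For example, `decode_flag(99)` lists the paired, properly paired, mate-reverse and read1 bits, and the same record would move to the "+ N" column only if 0x200 were also set.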

      Edit: I should add that one likely cause of this is an insert size too large for the aligner to flag the reads as properly paired. Have a look at some of them to see if this might be the case. Another possibility is that the paired reads became out of sync at some point (which is annoying as hell!).
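One way to "have a look at some of them" is to pull the observed template lengths (TLEN, column 9 of a SAM record) out of the alignments and summarise them. The sketch below assumes tab-separated SAM text lines, for example as printed by `samtools view`; the function names `template_lengths` and `summarize` are hypothetical, and this is only one possible check, not the method anyone in this thread used.

```python
import statistics


def template_lengths(sam_lines):
    """Collect |TLEN| from primary read1 records whose mate maps to the same reference."""
    lengths = []
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        rnext = fields[6]          # "=" means the mate is on the same reference
        tlen = int(fields[8])
        if flag & 0x900:           # skip secondary and supplementary records
            continue
        paired_read1 = (flag & 0x1) and (flag & 0x40)
        both_mapped = not (flag & 0x4) and not (flag & 0x8)
        if paired_read1 and both_mapped and rnext == "=":
            lengths.append(abs(tlen))
    return lengths


def summarize(lengths):
    """Return count, median and maximum, or None if nothing was collected."""
    if not lengths:
        return None
    return {
        "n": len(lengths),
        "median": statistics.median(lengths),
        "max": max(lengths),
    }
```

If the median comes out far above the fragment size expected from the library prep, an overly large insert size becomes a plausible explanation; if the values look normal, the pairing itself is the next thing to check.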


      • lindenb
        Senior Member
        • Apr 2010
        • 143

        #4
        Originally posted by dpryan
        You're misreading that, the "+0" are for reads with flag 0x200 set.
        I was talking about the first line:

        Code:
        36332416 + 0 in total (QC-passed reads + QC-failed reads)
        which is the same as:

        Code:
        36332416 + 0 paired in sequencing
        From the C code, the first line is the total number of reads (forward and reverse) that passed QC:
        Code:
        (...)
                printf("%lld + %lld in total (QC-passed reads + QC-failed reads)\n", s->n_reads[0], s->n_reads[1]);
                printf("%lld + %lld duplicates\n", s->n_dup[0], s->n_dup[1]);
                printf("%lld + %lld mapped (%.2f%%:%.2f%%)\n", s->n_mapped[0], s->n_mapped[1], (float)s->n_mapped[0] / s->n_reads[0] * 100.0, (float)s->n_mapped[1] / s->n_reads[1] * 100.0);
                printf("%lld + %lld paired in sequencing\n", s->n_pair_all[0], s->n_pair_all[1]);
           (...)
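The printf calls above also explain the "nan%" in the output: the second percentage divides the QC-failed mapped count by the QC-failed total, and with no QC-failed reads that is a floating-point 0/0. A small Python sketch of the same arithmetic, assuming the "N + M label" line layout shown in this thread (the names `parse_flagstat_line` and `percent` are hypothetical):

```python
import math
import re

# Matches lines such as "36332416 + 0 paired in sequencing".
LINE_RE = re.compile(r"^\s*(\d+) \+ (\d+) (.*)$")


def parse_flagstat_line(line):
    """Split a flagstat line into (qc_passed, qc_failed, label)."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a flagstat line: {line!r}")
    return int(match.group(1)), int(match.group(2)), match.group(3)


def percent(part, total):
    """Percentage that mimics the C float division: 0/0 gives nan instead of raising."""
    if total == 0:
        return math.nan if part == 0 else math.inf
    return part / total * 100.0
```

With the numbers from the first post, `percent(32959970, 36332416)` formats to 90.72 and `percent(354, 36332416)` formats to 0.00 at two decimal places, while `percent(0, 0)` gives nan for the empty QC-failed column.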


        • dpryan
          Devon Ryan
          • Jul 2011
          • 3478

          #5
          I think we're talking past each other.

          If you have
          Code:
          18166208 + 0 read1
          18166208 + 0 read2
          then a second fastq file was input, the aligner treated things as paired-end, and those alignments exist in the BAM file (though, I suppose the read1 fastq file could have also been specified as the read2 file, which might produce weird results like these).
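Both input-side possibilities raised in this thread (pairs out of sync, or the same FASTQ file given for read1 and read2) can be checked before alignment by walking the two files in parallel. This is a rough sketch assuming plain four-line FASTQ records and read names that differ at most by a trailing /1 or /2; the names `base_name`, `read_fastq` and `check_pairing` are hypothetical.

```python
from itertools import zip_longest


def base_name(header):
    """Strip the leading '@', any comment after whitespace, and a trailing /1 or /2."""
    name = header.strip()[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def read_fastq(lines):
    """Yield (name, sequence) for each complete four-line FASTQ record."""
    it = iter(lines)
    for header, seq, _plus, _qual in zip(it, it, it, it):
        yield base_name(header), seq.strip()


def check_pairing(fq1_lines, fq2_lines):
    """Compare two FASTQ inputs record by record and report the first problem found."""
    total = 0
    identical = 0
    pairs = zip_longest(read_fastq(fq1_lines), read_fastq(fq2_lines))
    for index, (rec1, rec2) in enumerate(pairs):
        if rec1 is None or rec2 is None:
            return {"status": "length_mismatch", "record": index}
        if rec1[0] != rec2[0]:
            return {"status": "out_of_sync", "record": index, "names": (rec1[0], rec2[0])}
        total += 1
        identical += rec1[1] == rec2[1]
    if total and identical == total:
        # Every mate has the same sequence: the same file was probably given twice.
        return {"status": "same_sequences", "records": total}
    return {"status": "ok", "records": total}
```

In practice the two arguments could be open file handles for the read1 and read2 files. A status of "ok" only rules out these two causes; it says nothing about insert size.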


          • jorhodes
            Junior Member
            • Nov 2012
            • 7

            #6
            Originally posted by lindenb
            "36332416 + 0 "

            There are no reads from the second FASTQ file in your BAM. Did you use single-end instead of paired-end mapping?
            Hi there, I definitely used PE mapping, and two FASTQ files were input.

            Originally posted by dpryan
            Edit: I should add that one likely cause of this is if the insert size is too big for the aligner to declare reads as having aligned properly paired. Have a look at some of them to see if this might be the case. Another possibility is that paired-reads became out of sync at some point (which is annoying as hell!).
            Thanks for this suggestion, I will look into this now.

