Announcement

Collapse
No announcement yet.

Issue with new samtools 1.0 index and idxstats

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Issue with new samtools 1.0 index and idxstats

    When using samtools 1.19, I would index my .bam, and use idxstatst to make a file of read counts per sequence in my reference, and the unmapped reads would be at the bottom of the list.

    The line would look like

    * 0 0 1027820

    To show 1027820 reads did not align

    When using samtools 1.0 to index, now, the idxstats file always claims there is exactly 1 read that mapped to nothing, which is clearly wrong.

    Can anyone else replicate this? This can't be the desired behavior, can it?

  • #2
    I just tested on an old BAM file (reindexed and ran idxstats) and got 0, as appropriate. Perhaps this only happens if there are unmapped reads in the file? I should note that you should ensure that both htslib and samtools are on the 1.0 tag release. When I just recloned them (just to ensure that any local changes weren't in there) I couldn't even compile samtools since the most current commit breaks compilation with the version of htslib I had from earlier in the day.

    Comment


    • #3
      I went into the htslib-1.0 folder of my samtools 1.0 folder, and "make" seems to finish without errors, but samtools 1.0 still does the same thing: idxstats run from the v 1.0 indexing shows only 1 unmapped read. Rebuilding the index with 0.1.19 and rerunning idxstats shows the correct number.
      Last edited by swbarnes2; 08-19-2014, 02:32 PM.

      Comment


      • #4
        Originally posted by swbarnes2 View Post
        Can anyone else replicate this? This can't be the desired behavior, can it?
        I just upgraded to samtools-1.0.

        Indeed the same happens to me:

        Code:
        samtools --version
        samtools 1.0
        Using htslib 1.0
        Copyright (C) 2014 Genome Research Ltd.
        
        samtools index fk042_F5_14_CHEM1.bam
        samtools idxstats fk042_F5_14_CHEM1.bam | tail -n1
        *	0	0	1
        
        ## Now with samtools 0.18.0:
        ~/applications/samtools/samtools-0.1.18/samtools index fk042_F5_14_CHEM1.bam
        samtools idxstats fk042_F5_14_CHEM1.bam | tail -n1
        *	0	0	8882

        Comment


        • #5
          @dariober: In fact this is the issue I alluded to over on biostars :P

          It turns out that this only happens if there are unmapped read in the BAM file (otherwise, it correctly reports 0).

          Comment


          • #6
            I've submitted a bug report.

            Comment


            • #7
              Originally posted by dpryan View Post
              I've submitted a bug report.
              Thank you. I didn't want to do that until I had some confirmation that other people were seeing the problem.

              Comment


              • #8
                I created a pull request to fix this.

                Comment


                • #9
                  Originally posted by dpryan View Post
                  I created a pull request to fix this.
                  Thanks a lot for taking care of this issue. Here's my comment to your post on github:

                  I'm quite keen on querying the index to get the number of unmapped reads, so I favour the second solution. (With the disclaimer the I'm not the one going to fix it though!)

                  Comment

                  Working...
                  X