  • bug in samtools view -s?

    Hi, I'm experiencing difficulties trying to downsample .bam files using samtools view -s. Specifically some of the commands fail while others work; this seems sometimes to be correlated with the -s float argument being > 0.5 (but not always). Here I'm c/p'ing some of the code that worked and some which failed.

    Thanks for any helpful suggestions!

    samtools view -b -s 0.271 1.bam > 1_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.5077 2.bam > 2_ds.bam # gives 0 reads unexpectedly
    samtools view -b -s 0.2113 3.bam > 3_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.3322 4.bam > 4_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.5306 5.bam > 5_ds.bam # gives 0 reads unexpectedly
    samtools view -b -s 0.204 6.bam > 6_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.3841 7.bam > 7_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.4691 8.bam > 8_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.6861 9.bam > 9_ds.bam # gives 0 reads unexpectedly
    samtools view -b -s 0.2261 10.bam > 10_ds.bam # gives 730697 reads as expected

    samtools view -b -s 0.6653 23.bam > 23_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.0444 24.bam > 24_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.0492 25.bam > 25_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.1648 26.bam > 26_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.0801 27.bam > 27_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.171 28.bam > 28_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.0979 29.bam > 29_ds.bam # gives 730697 reads as expected
    samtools view -b -s 0.0511 30.bam > 30_ds.bam # gives 730697 reads as expected

  • #2
    Not answering your question directly, but you could also use "reformat.sh" from the BBMap suite to do this. It lets you specify sampling parameters with more granularity (for example, as a specific number of reads).

    • #3
      Originally posted by jkzebrafish
      Hi, I'm experiencing difficulties trying to downsample .bam files using samtools view -s. Specifically some of the commands fail while others work; this seems sometimes to be correlated with the -s float argument being > 0.5 (but not always). Here I'm c/p'ing some of the code that worked and some which failed.
      It seems there are specific read alignments that are causing the failures. You could confirm this by taking one of the .bam files that failed and running it with different random seeds and a small sample fraction; you should see the failure some percentage of the time.

      Are these alignments of very long reads (> 65k bp)? Alignments with CIGAR strings that have more than 65,535 operations (the 16-bit integer limit) can behave strangely.
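
The seed test suggested above could be run along these lines. This is only a sketch: `subsample_arg` and `count_per_seed` are hypothetical helper names, and the 0.01 fraction and the seed range 1 to 10 are arbitrary choices. It relies on samtools treating the integer part of the -s value as the random seed and the fractional part as the fraction of reads to keep, and on `samtools view -c` to count reads instead of writing them out.

```shell
# Build the -s value: integer part = seed, fractional part = fraction kept.
# Usage: subsample_arg SEED FRACTION   (FRACTION written like 0.01)
subsample_arg() {
    local seed="$1"
    local frac="$2"
    # Strip the leading "0" from the fraction, so 0.01 becomes .01
    printf '%s%s\n' "$seed" "${frac#0}"
}

# Print "seed<TAB>read count" for seeds 1..10 at a fixed fraction.
# Any seed that reports 0 reads reproduces the failure.
# Usage: count_per_seed BAM FRACTION   (requires samtools on PATH)
count_per_seed() {
    local bam="$1"
    local frac="$2"
    local seed
    for seed in 1 2 3 4 5 6 7 8 9 10; do
        printf '%s\t' "$seed"
        samtools view -c -s "$(subsample_arg "$seed" "$frac")" "$bam"
    done
}

# Example call, using one of the files that failed in the first post:
# count_per_seed 9.bam 0.01
```

If only some seeds return 0 reads, that would point toward specific alignments; if every seed behaves the same way for a given fraction, the cause is more likely elsewhere.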

      • #4
        Thanks cstack for the response. These are paired end 75bp reads, nothing crazy.

        Here is a little more information:

        samtools view -b -s 0.6861 9.bam > 9_ds.bam # gives 0 reads
        samtools view -b -s 0.4861 9.bam > 9_ds.bam # gives ~50k reads
        samtools view -b -s 0.5 9.bam > 9_ds.bam # gives ~50k reads
        samtools view -b -s 0.5001 9.bam > 9_ds.bam # gives 0 reads
        samtools view -b -s 1.6861 9.bam > 9_ds.bam # gives 0 reads
        samtools view -b -s 5.6861 9.bam > 9_ds.bam # gives 0 reads
        samtools view -b -s 100.6861 9.bam > 9_ds.bam # gives 0 reads

        No errors or warnings are given, hence my confusion. Thanks for any insight.
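
For reading the commands above, it may help to spell out how each -s value breaks down, since samtools documents the integer part as the random seed and the part after the decimal point as the fraction to keep. The helper below, `split_subsample_arg`, is a hypothetical illustration of that convention and not samtools' own parsing code.

```shell
# Split an -s value such as 100.6861 into its seed and fraction parts.
split_subsample_arg() {
    local arg="$1"
    local seed
    local frac
    case "$arg" in
        *.*) seed="${arg%%.*}"; frac="0.${arg#*.}" ;;  # text before / after the dot
        *)   seed="$arg"; frac="0.0" ;;                 # no fractional part given
    esac
    printf 'seed=%s fraction=%s\n' "${seed:-0}" "$frac"
}

# The -s values tried on 9.bam in this post
for arg in 0.6861 0.4861 0.5 0.5001 1.6861 5.6861 100.6861; do
    printf '%s -> ' "$arg"
    split_subsample_arg "$arg"
done
```

Read this way, the last three commands keep the same 0.6861 fraction and only change the seed (1, 5 and 100), and all of them still return 0 reads.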
