  • Bowtie: more reference sequences, fewer aligned reads

    Hi All,

    I am using Bowtie 1 (version 1.0.0 for Mac OS X).

    To discard unwanted reads, I map my reads against several reference sequences and remove the ones that align.

    My problem is that Bowtie reports fewer aligned reads when I use more reference sequences.

    To be specific:
    I want to discard reads from 21 sequences in total. They fall into three groups of 7 sequences each.

    Group A: A1, A2, A3, A4, A5, A6, A7 -> similarity: 53% to 99%, seq length: 1550 nt
    Group B: B1, B2, B3, B4, B5, B6, B7 -> similarity: 49% to 99%, seq length: 2900 nt
    Group C: C1, C2, C3, C4, C5, C6, C7 -> similarity: 51% to 99%, seq length: 120 nt
    ====> The major targets are A1 and B1.

    Using only the two major sequences, A1 and B1, I built an index and ran Bowtie 1.
    The log file reports:
    10.00% reads were reported as aligned reads,
    00.01% reads were reported as suppressed reads, and
    89.99% reads were reported as failed reads.

    After that, I repeated the same process with all 21 sequences: I built an index and ran Bowtie 1.
    I expected this run to report more aligned reads than the first one. However, that was completely wrong!

    The log file for the 21-sequence run reports:
    00.20% reads were reported as aligned reads,
    11.00% reads were reported as suppressed reads, and
    88.80% reads were reported as failed reads.

    I cannot understand why more reference sequences give fewer aligned reads.
    The 21-sequence run should report at least as many aligned reads as the two-sequence run.
    At least the numbers of reads that failed to align are similar in the two runs.

    I used these options:
    bowtie `INDEX` -5 1 -n 0 -n 0 -k 1 -m 1 -l 20 --best --phred33-quals --un `UNMAPPED` -q `INPUT` -S `OUT` 2>> `LOG` -t

    Thank you!

    Jiyoung

  • #2
    I'm not an expert at Bowtie, but a couple of things stand out to me. First, you have -n 0 -n 0 (-n 0 repeated), so is there an option missing and you wrote -n 0 instead?

    But the main issue is tied to the -m 1 option. You are telling Bowtie to report only reads that have a single valid alignment and to suppress all others. So when the index includes all 21 sequences, and the sequences within each group are highly similar, it becomes very likely that Bowtie will find more than one valid alignment for a read and suppress it.
    Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
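
The effect of an `-m 1` style ceiling can be illustrated with a toy model. The sketch below is not Bowtie's implementation: it assumes each read already comes with a list of the references where it has a valid alignment, and it simply counts them. The read names (`r1` to `r4`) and the hit lists are hypothetical, chosen only to mirror the two indexes described in the question.

```python
from collections import Counter


def classify(hits_per_read, m=1):
    """Toy model of a reporting ceiling like `-m`.

    hits_per_read maps a read name to the list of references where that
    read has a valid alignment. A read with no hits counts as failed, a
    read with more than `m` hits is suppressed, anything else is aligned.
    """
    counts = Counter()
    for hits in hits_per_read.values():
        if not hits:
            counts["failed"] += 1
        elif len(hits) > m:
            counts["suppressed"] += 1
        else:
            counts["aligned"] += 1
    return counts


# Hypothetical reads against an index built from A1 and B1 only.
small_index = {
    "r1": ["A1"],
    "r2": ["B1"],
    "r3": [],
    "r4": ["A1"],
}

# The same hypothetical reads against an index that also holds close
# relatives of A1 and B1, plus the group C sequences.
large_index = {
    "r1": ["A1", "A2"],   # now ambiguous, so it is suppressed
    "r2": ["B1", "B3"],   # now ambiguous, so it is suppressed
    "r3": ["C5"],         # newly aligned
    "r4": ["A1"],         # still unique
}

print("A1 + B1 index:", dict(classify(small_index)))
print("21-sequence index:", dict(classify(large_index)))
```

In this toy data the larger index turns one failed read into an aligned read, but it also moves two previously aligned reads into the suppressed category, so the aligned count goes down even though more reads match something.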


    • #3
      Makes sense! But why were so many reads suppressed?

      SNPsaurus, thanks!

      Yes, your explanation makes sense. So the index with more reference sequences gave slightly fewer failed reads.

      But it is still unclear to me why so many reads were suppressed.
      Okay, it will be helpful to compare the two output files! Thank you!

      Jiyoung


      • #4
        No, the suppressed reads are the ones that are not reported because of your -m 1 option. In your first try (using A1 and B1) very few are suppressed because very few reads align to both A1 and B1. In the second try many more are suppressed because nearly every read that aligns does so to A1 and A2 and A3, A4, A5, A6, A7, or to B1 and B2 and B3, B4, B5, B6, B7. When a read aligns to multiple index sequences, it fails the -m 1 option and becomes suppressed.
        Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
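
The percentages quoted in the question are consistent with this explanation, which a little arithmetic makes visible. The sketch below uses only the numbers from the two log summaries. The helper name `with_valid_alignment` is made up for illustration, and treating "aligned + suppressed" as "reads with at least one valid alignment" is an assumption about how the log categories relate.

```python
# Percentages copied from the two log summaries in the question.
runs = {
    "A1 + B1": {"aligned": 10.00, "suppressed": 0.01, "failed": 89.99},
    "all 21": {"aligned": 0.20, "suppressed": 11.00, "failed": 88.80},
}


def with_valid_alignment(run):
    """Percent of reads that matched something: reported plus suppressed."""
    return round(run["aligned"] + run["suppressed"], 2)


for name, run in runs.items():
    print(f"{name}: {with_valid_alignment(run):.2f}% of reads had a valid alignment")

# The drop in failed reads equals the gain in reads that matched something.
failed_drop = round(runs["A1 + B1"]["failed"] - runs["all 21"]["failed"], 2)
matched_gain = round(
    with_valid_alignment(runs["all 21"]) - with_valid_alignment(runs["A1 + B1"]), 2
)
print(f"failed reads dropped by {failed_drop:.2f} points, "
      f"matched reads rose by {matched_gain:.2f} points")
```

Read this way, the 21-sequence index did match more reads overall (11.20% versus 10.01%). Most of those reads simply moved from the aligned category to the suppressed category.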
