  • Bowtie : More reference sequences Less aligned reads

    Hi All,

    I use Bowtie1 (version 1.0.0 for MacOSX)

    To discard unwanted reads, I mapped my reads against several reference sequences that I want to remove.

    My problem is that Bowtie reports fewer aligned reads when I use more reference sequences.

    To be specific:
    I want to discard 21 sequences in total. They fall into three groups of 7 sequences each.

    Group A: A1,A2,A3,A4,A5,A6,A7. -> similarity:53%~99%, seq length: 1550nt
    Group B: B1,B2,B3,B4,B5,B6,B7. -> similarity:49%~99%, seq length: 2900nt
    Group C: C1,C2,C3,C4,C5,C6,C7. -> similarity:51%~99%, seq length: 120nt
    ====> Major targets are A1 and B1

    Using only the two major sequences, A1 and B1, I built an index and then ran Bowtie1.
    The log file reports that:
    10.00% reads were reported as aligned reads,
    00.01% reads were reported as suppressed reads, and
    89.99% reads were reported as failed reads.

    After that, I repeated the same process with all 21 sequences: I built an index and ran Bowtie1.
    I expected this run to have more aligned reads than the first one. However, I was completely wrong!

    The second log file reports that:
    00.20% reads were reported as aligned reads,
    11.00% reads were reported as suppressed reads, and
    88.80% reads were reported as failed reads.

    I cannot understand why more reference sequences give fewer aligned reads.
    The second run should have at least as many aligned reads as the first.
    Thankfully, the numbers of reads that failed to align are similar in the two runs.

    I used these options:
    bowtie `INDEX` -5 1 -n 0 -n 0 -k 1 -m 1 -l 20 --best --phred33-quals --un `UNMAPPED` -q `INPUT` -S `OUT` 2>> `LOG` -t

    Thank you!

    Jiyoung

  • #2
    I'm not an expert at Bowtie, but a couple things stand out to me. First, you have -n 0 -n 0 (-n 0 repeated) so is there an option missing and you wrote -n 0 instead?

    But the main issue is tied to the -m 1 option. You are telling Bowtie to only report reads that have a single valid alignment, and to suppress all others. So when you include all the sequences in the index, and sequences within a group are highly similar, it becomes very likely that Bowtie will find more than one valid alignment for a read and suppress it.
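A toy model may make the -m 1 effect concrete. It is only a sketch, not Bowtie's actual logic: the read names, the hit lists and the helper functions `restrict` and `classify_reads` are all made up for illustration, and it assumes one valid alignment per reference that a read matches.

```python
from collections import Counter

# Hypothetical reads and the references each one would match (illustrative only).
TOY_HITS = {
    "read1": ["A1", "A2", "A3"],  # matches several near-identical group A members
    "read2": ["B1", "B2"],        # matches two near-identical group B members
    "read3": ["A1"],              # matches a region unique to A1
    "read4": ["C4"],              # matches only a group C sequence
    "read5": [],                  # matches nothing
}

def restrict(hits, index):
    """Keep only the hits against references that are present in the index."""
    return {read: [ref for ref in refs if ref in index] for read, refs in hits.items()}

def classify_reads(hits, m=1):
    """Label each read as failed (no hit), suppressed (more than m hits) or aligned."""
    outcome = {}
    for read, refs in hits.items():
        if not refs:
            outcome[read] = "failed"
        elif len(refs) > m:
            outcome[read] = "suppressed"
        else:
            outcome[read] = "aligned"
    return outcome

small_index = {"A1", "B1"}
full_index = {f"{group}{i}" for group in "ABC" for i in range(1, 8)}  # A1..C7, 21 names

small = Counter(classify_reads(restrict(TOY_HITS, small_index)).values())
full = Counter(classify_reads(restrict(TOY_HITS, full_index)).values())

print("A1 + B1 index:", dict(small))
print("all 21 sequences:", dict(full))
```

With the two-sequence index, three of the five toy reads are reported and two fail. With all 21 sequences, two are reported, two are suppressed and one fails: fewer reported reads, even though more reads found a match.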
    Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com



    • #3
      Makes sense! But why were so many reads suppressed?

      SNPsaurus, thanks!

      Yes, your explanation makes sense. So the second index, with more reference sequences, gave slightly fewer failed reads.

      But it is still unclear to me why so many reads were suppressed.
      Okay, it will be helpful to compare two output files! Thank you!

      Jiyoung



      • #4
        No, the suppressed reads are the ones that are not reported because of your -m 1 option. In your first try (using A1 and B1) very few are suppressed because very few reads align to both A1 and B1. In the second try many more are suppressed because nearly every read that aligns, aligns to A1 and A2 and A3,4,5,6,7, or B1 and B2 and B3,4,5,6,7. When the read aligns to multiple index sequences, then it fails the -m 1 option and becomes suppressed.
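The percentages from the two logs can be checked against this explanation. A small sketch, using only the numbers quoted in the first post (the dictionary and function names are just for illustration):

```python
# Percentages copied from the two Bowtie log summaries quoted in the first post.
run_two_refs = {"aligned": 10.00, "suppressed": 0.01, "failed": 89.99}
run_all_refs = {"aligned": 0.20, "suppressed": 11.00, "failed": 88.80}

def with_valid_alignment(run):
    """Percentage of reads that found at least one valid alignment.

    Suppressed reads did align; they were only withheld from the output by -m 1.
    """
    return round(run["aligned"] + run["suppressed"], 2)

two_refs_total = with_valid_alignment(run_two_refs)
all_refs_total = with_valid_alignment(run_all_refs)

print(f"A1 + B1 index:    {two_refs_total:.2f}% of reads had a valid alignment")
print(f"all 21 sequences: {all_refs_total:.2f}% of reads had a valid alignment")
```

Aligned plus suppressed comes to 10.01% in the first run and 11.20% in the second, so slightly more reads found a valid alignment with all 21 sequences. Most of them simply moved from the aligned column to the suppressed one.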
