  • Mask x number of bases WITHIN sequence prior to alignment

    Message moved to correct section
    http://seqanswers.com/forums/showthread.php?t=22014

    Hi all,

    As you can see from the picture, this is the QC for all R2 reads of my paired-end sequenced samples. Due to a technical error during sequencing, I have ended up with 30+ R2 reads with serious errors in the middle of the sequence.

    Do you know of any way to mask (or allow a mismatch at) a specific number of bases (2-3) at a specific position WITHIN the read prior to alignment? Biostrings is an option, but I would prefer not to use it for reasons of speed.

    Can what you propose be applied selectively to only one of the two reads in paired-end samples?

    It would be ideal if this could be applied directly in Bowtie, like the left/right trimming that already exists as a built-in option.



    Last edited by SEQond; 07-27-2012, 05:45 AM. Reason: moved to correct section

  • #2
    ELAND should be able to mask these bases during alignment, since the initial error-free segment is longer than the seed. Include USE_BASES Y41n3Y*n in the config file to mask bases 42-44 (plus the terminal base, which is masked by default). One caveat: if the error results from fluidics/chemistry issues, the phasing after the error may be incorrect.
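To make the mask string concrete, here is a minimal Python sketch that expands it into per-cycle flags. It assumes the reading of the syntax implied above (Y = use, n = mask, a trailing number = repeat count, * = fill the remaining cycles) and a 100-base read; the function names are hypothetical and this is not ELAND's own parser.

```python
import re


def expand_use_bases(mask: str, read_length: int) -> list[bool]:
    """Expand a USE_BASES-style mask into one flag per cycle.

    True means the base is used, False means it is masked.
    Assumed syntax: Y/n followed by an optional count or '*'.
    """
    tokens = re.findall(r"([YyNn])(\d+|\*)?", mask)
    if "".join(code + count for code, count in tokens) != mask:
        raise ValueError(f"unrecognised mask: {mask!r}")

    stars = sum(1 for _, count in tokens if count == "*")
    if stars > 1:
        raise ValueError("at most one '*' is supported in this sketch")

    # Cycles consumed by tokens with an explicit (or implicit single) count.
    fixed = sum(int(count) if count else 1
                for _, count in tokens if count != "*")
    if fixed > read_length or (stars == 0 and fixed != read_length):
        raise ValueError("mask does not fit the read length")

    flags: list[bool] = []
    for code, count in tokens:
        if count == "*":
            n = read_length - fixed  # '*' absorbs whatever is left
        elif count:
            n = int(count)
        else:
            n = 1
        flags.extend([code in "Yy"] * n)
    return flags


def masked_positions(mask: str, read_length: int) -> list[int]:
    """Return the 1-based positions that the mask excludes."""
    flags = expand_use_bases(mask, read_length)
    return [i + 1 for i, used in enumerate(flags) if not used]
```

For a 100-base read, masked_positions("Y41n3Y*n", 100) gives positions 42, 43, 44 and the terminal base 100.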

    If this doesn't work or you prefer an alternative aligner, you could convert the reads into pseudo-paired-end data. Use bases 1-41 as read one and bases 45-100 as read two. Then filter the aligned data on the expected criteria (i.e., both reads map to the same chromosome and orientation, and position of read 2 = read 1 + 44 [or + 41-44 if phasing is off]).
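The split-and-filter idea can be sketched in Python as below. This is only an illustration under stated assumptions: alignment positions are 1-based leftmost coordinates (as in SAM), alignments are ungapped, and the Alignment type and function names are hypothetical. For reverse-strand hits the second piece lies to the left of the first, so the offset is measured the other way round.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    chrom: str
    pos: int      # 1-based leftmost position on the reference
    strand: str   # "+" or "-"


def split_read(seq: str, qual: str, mask_start: int = 42, mask_end: int = 44):
    """Split one read into the pieces before and after the bad bases.

    mask_start and mask_end are 1-based and inclusive, so the defaults
    give bases 1-41 and bases 45-end.
    """
    left = (seq[:mask_start - 1], qual[:mask_start - 1])
    right = (seq[mask_end:], qual[mask_end:])
    return left, right


def is_consistent(a1: Alignment, a2: Alignment,
                  len1: int = 41, len2: int = 56, max_gap: int = 3) -> bool:
    """Check that two pieces align as expected for one original read.

    a1 is the alignment of bases 1-41, a2 that of bases 45-100.
    A gap of 0..max_gap bases is allowed between them, which gives
    the +41 to +44 window on the forward strand.
    """
    if a1.chrom != a2.chrom or a1.strand != a2.strand:
        return False
    if a1.strand == "+":
        gap = a2.pos - a1.pos - len1
    else:
        # Reverse strand: piece two maps upstream of piece one.
        gap = a1.pos - a2.pos - len2
    return 0 <= gap <= max_gap
```

With the defaults, a forward-strand pair at positions 1000 and 1044 passes, while 1000 and 1045 does not.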



    • #3
      Originally posted by HESmith
      ELAND should be able to mask these bases during alignment, since the initial error-free segment is longer than the seed. Include USE_BASES Y41n3Y*n in the config file to mask bases 42-44 (plus the terminal base, which is masked by default). One caveat: if the error results from fluidics/chemistry issues, the phasing after the error may be incorrect.

      If this doesn't work or you prefer an alternative aligner, you could convert the reads into pseudo-paired-end data. Use bases 1-41 as read one and bases 45-100 as read two. Then filter the aligned data on the expected criteria (i.e., both reads map to the same chromosome and orientation, and position of read 2 = read 1 + 44 [or + 41-44 if phasing is off]).
      Can ELAND align paired-end sequences in this way while selectively masking bases in only one of the two reads?

      To be honest, I would prefer a BWT-based aligner (Bowtie, BWA, or SOAP2).
      Thanks for your answer.



      • #4
        Bowtie 2 is possibly the answer to this issue.

        Also look here or here.
        Last edited by SEQond; 08-13-2012, 08:18 AM.
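For an aligner-independent route, one option is to overwrite the bad bases with N in the R2 file only and leave R1 untouched. The Python sketch below does that for plain 4-line FASTQ records; the function names and the default positions 42-44 are assumptions for illustration, and quality strings are left unchanged. Bowtie 2 scores positions where the read contains an N with its --np penalty rather than as an ordinary mismatch, so whether this is acceptable depends on your scoring settings.

```python
from typing import Iterable, Iterator


def mask_sequence(seq: str, start: int = 42, end: int = 44) -> str:
    """Replace the 1-based inclusive positions start..end with N."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lo = start - 1
    hi = min(end, len(seq))
    if lo >= len(seq):
        return seq  # read is shorter than the masked window
    return seq[:lo] + "N" * (hi - lo) + seq[hi:]


def mask_fastq(lines: Iterable[str], start: int = 42, end: int = 44) -> Iterator[str]:
    """Yield FASTQ lines with every sequence line masked.

    Assumes unwrapped records of exactly four lines each
    (header, sequence, '+', qualities).
    """
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of each record is the sequence
            newline = "\n" if line.endswith("\n") else ""
            yield mask_sequence(line.rstrip("\n"), start, end) + newline
        else:
            yield line
```

Because mask_fastq works on any iterable of lines, it can be streamed over an open R2 file handle and written straight to a new file, which keeps memory use flat for large runs.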
