  • #16
    I already said even with a sensitive setting, bwa-sw may not work well. But at least with a proper setting, it should map more than 1.5% of reads.

    SOAP2 would not work well, and neither would bowtie or bwa-short. Hash-table-based implementations would also need tuning. We have 1 error per 5bp on average, while these mappers typically use 11-14bp seeds, so we may not find even one correct seed hit.
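
A rough way to put numbers on the seed argument above is sketched here. It assumes errors are independent per base (a simplification, real divergence is clustered) and computes the chance that a read contains at least one error-free stretch as long as the seed. The function name and the model are illustrative assumptions, not how any of these mappers actually pick seeds; a mapper that samples only some seed positions would do worse than this upper bound.

```python
def prob_exact_seed(read_len, seed_len, error_rate):
    """Probability that a read contains at least one run of `seed_len`
    consecutive error-free bases, assuming independent per-base errors."""
    if seed_len > read_len:
        return 0.0
    match = 1.0 - error_rate
    # state[j]: probability that no full seed has been seen so far and the
    # read currently ends in exactly j consecutive error-free bases.
    state = [1.0] + [0.0] * (seed_len - 1)
    for _ in range(read_len):
        nxt = [0.0] * seed_len
        nxt[0] = sum(state) * error_rate      # an error resets the run
        for j in range(seed_len - 1):
            nxt[j + 1] = state[j] * match     # a match extends the run
        # state[seed_len - 1] * match completes a seed and leaves the table
        state = nxt
    return 1.0 - sum(state)


if __name__ == "__main__":
    # 101bp reads at 1 error per 5bp, seed lengths from the post above
    for k in (11, 12, 13, 14):
        print(k, round(prob_exact_seed(101, k, 0.20), 3))
```

Running it for other error rates shows how quickly exact seeds of this length stop being reliable as divergence grows.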


    • #17
      Originally posted by feederbing
      I have 101 base reads and expect up to 20 mismatches to reference. My reads are not pairs. I have tried bwa bwasw -a 1 -b 1 -T 60 but it only aligns 1.5% of the reads. And those have only a couple mismatches. I know from other tests ~ 30% should be aligned with 20 mismatches. Is this just something bwa is not designed for? What would be a better aligner? Or am I not using the right settings?
      You could try Stampy,
      http://www.well.ox.ac.uk/project-stampy

      From the Stampy webpage,
      Stampy excels in the mapping of reads that contain sequence variation relative to the reference, in particular for those containing insertions or deletions. It can map reads from a highly divergent species to a reference genome for instance.
      I have used it to align human resequencing data and found it very accurate, though I have not tried it with divergent species.
      Last edited by gprakhar; 09-14-2011, 01:45 AM. Reason: new info


      • #18
        As far as I know, Stampy does glocal alignment, but for cross-species alignment we usually prefer local alignment. In addition, Stampy uses 15-mer seeds with 1 mismatch allowed and a 5bp skip. I doubt this will work well for 20% divergence. I guess "highly divergent" refers to ~5-10% divergence (human-chimp has 1%).
        Last edited by lh3; 09-14-2011, 05:20 AM.


        • #19
          Originally posted by lh3
          BTW, to map high error rate with bwa-sw, you should decrease "-T" and increase "-z" to 10 or 100. ...
          Thanks, I'll try that (though, as you mention, it may still not work well).
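
For anyone wanting to script the suggestion quoted above, here is a minimal sketch that only builds the command line. The `-a 1 -b 1` values come from the original question, `-z 10` from the quoted advice, and `-T 30` is just a placeholder that is lower than the 60 used originally. The file names `ref.fa` and `reads.fq` and the helper name are hypothetical.

```python
import shlex


def bwasw_command(ref_prefix, reads, score_threshold, z_best, match=1, mismatch=1):
    """Build a `bwa bwasw` argument list; nothing is executed here."""
    return [
        "bwa", "bwasw",
        "-a", str(match),            # match score, as in the original post
        "-b", str(mismatch),         # mismatch penalty, as in the original post
        "-T", str(score_threshold),  # lower this to keep weaker alignments
        "-z", str(z_best),           # raise this (10 or 100) for sensitivity
        ref_prefix, reads,
    ]


if __name__ == "__main__":
    # Placeholder inputs; replace with a real index prefix and read file.
    cmd = bwasw_command("ref.fa", "reads.fq", score_threshold=30, z_best=10)
    print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run` with stdout redirected to a SAM file once the values have been tuned on a test sample.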


          • #20
            Originally posted by cdry7ue
            I think you should go with BFAST with the super small mask like (11111111) to find candidate local alignments.
            Thanks. I have been running some tests with BFAST. I had initially posted a question about generating masks, http://seqanswers.com/forums/showthr...0855#post50855 . I've made some progress after posting that.

            I think the right approach with BFAST is not a short mask with no zeros, but long masks that follow the BFAST guide's advice on the number of 1s while using only masks that contain spaces (zeros). Following the guide, my masks should have 21 1s. A mask of that weight with no zeros is not going to find much at 15 to 20% divergence; it might find a few very highly conserved regions, nothing more. The mask search procedure assumes a no-zero mask as the starting point. For this problem I think that is a poor assumption: that mask will find little and just slow things down. I've made some progress by using just one of the longer masks with many zeros.
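
The mask reasoning above can be explored with a small simulation. This is a sketch under stated assumptions, not BFAST's own procedure: errors are independent substitutions (no indels), a "hit" means every 1 position of the mask lands on an error-free base at some offset in the read, and the spaced mask used in the example is a made-up pattern with 21 ones, not one taken from the BFAST guide.

```python
import random


def mask_hit_rate(mask, read_len, error_rate, trials=2000, seed=0):
    """Estimate the fraction of reads with at least one mask hit.

    A hit is an offset where every '1' position of the mask falls on an
    error-free base; '0' positions are ignored. Substitutions only.
    """
    rng = random.Random(seed)
    ones = [i for i, c in enumerate(mask) if c == "1"]
    offsets = range(read_len - len(mask) + 1)  # empty if mask is too long
    hits = 0
    for _ in range(trials):
        # ok[i] is True when base i of the simulated read is error-free
        ok = [rng.random() >= error_rate for _ in range(read_len)]
        if any(all(ok[o + i] for i in ones) for o in offsets):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    contiguous = "1" * 21          # no-zero mask of weight 21
    spaced = "110" * 10 + "1"      # hypothetical spaced mask, also weight 21
    for name, mask in (("contiguous", contiguous), ("spaced", spaced)):
        print(name, mask_hit_rate(mask, 101, 0.20))
```

Swapping in the masks actually being tested, and error rates of 0.15 to 0.20, gives a quick feel for how much a given mask can be expected to find before committing to a full indexing run.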

            While BFAST is OK for speed on my test sample and gives me about the number of alignments I expect, the alignments are poor, with many gaps. I should be able to control this with the alignment scores, but I haven't tried that yet.


            • #21
              Thanks for all the suggestions. I apologize for not replying sooner. I missed the email notifications for the posts after my reply to zee. I will try all the suggested tools on my sample.


              • #22
                Yes, there is a way to set up a matrix of penalties for the Smith-Waterman step.
                Also, using a large mask with several zeros means that you are probably only dealing with substitution-type changes and not anticipating gaps.
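
To illustrate how a penalty matrix and gap penalty steer the Smith-Waterman step, here is a minimal score-only sketch. It is a generic textbook version with a linear gap penalty, not BFAST's implementation or its file format; the function names and the scores used are assumptions chosen for illustration.

```python
def simple_matrix(match, mismatch, alphabet="ACGT"):
    """Substitution scores: `match` on the diagonal, `mismatch` elsewhere."""
    return {(x, y): (match if x == y else mismatch)
            for x in alphabet for y in alphabet}


def sw_score(a, b, sub, gap):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    sub[(x, y)] is the score for aligning base x to base y;
    gap is the (negative) score charged for each gapped base.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            score = max(
                0,                          # start a new local alignment
                prev[j - 1] + sub[(x, y)],  # match or mismatch
                prev[j] + gap,              # x aligned to a gap in b
                cur[j - 1] + gap,           # y aligned to a gap in a
            )
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


if __name__ == "__main__":
    m = simple_matrix(1, -1)
    read, ref = "ACGTACGT", "ACGTTACGT"
    print(sw_score(read, ref, m, gap=-1))  # cheap gaps: the gapped alignment wins
    print(sw_score(read, ref, m, gap=-5))  # costly gaps: an ungapped stretch wins
```

Raising the gap penalty relative to the mismatch penalty pushes the aligner toward substitution-only alignments, which is the kind of control over gappy alignments discussed in this thread.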
