I have 101-base reads and expect up to 20 mismatches to the reference. My reads are not paired. I have tried bwa bwasw -a 1 -b 1 -T 60, but it aligns only 1.5% of the reads, and those have only a couple of mismatches. I know from other tests that ~30% should align with up to 20 mismatches. Is this just something bwa is not designed for? What would be a better aligner? Or am I not using the right settings?
-
Maybe try bfast or ssaha. They are not very fast but should perform much better on your data. Bfast seems to be faster (from what I have heard), but I think ssaha is a good starting point for a first estimate of the alignment rate, because it's very easy to use. Maybe try a subset at the beginning (100-1000k reads), because it's really not that fast.
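For the subset idea, a minimal Python sketch follows. It assumes a plain FASTQ file with exactly four lines per record (no wrapped sequences); the function name take_fastq_records is made up for illustration. Note that it takes the first records rather than a random sample, which is usually fine for a quick alignment-rate estimate but can be biased.

```python
from itertools import islice


def take_fastq_records(lines, n_records):
    """Return the first n_records FASTQ records from an iterable of lines.

    Assumes the simple 4-lines-per-record layout (header, sequence, '+', qualities).
    """
    lines = iter(lines)
    records = []
    for _ in range(n_records):
        record = list(islice(lines, 4))
        if len(record) < 4:
            # Ran out of input (or hit a truncated record): stop here.
            break
        records.append(record)
    return records


if __name__ == "__main__":
    example = [
        "@r1\n", "ACGT\n", "+\n", "IIII\n",
        "@r2\n", "TTTT\n", "+\n", "IIII\n",
        "@r3\n", "GGGG\n", "+\n", "IIII\n",
    ]
    for rec in take_fastq_records(example, 2):
        print("".join(rec), end="")
```

With a real file, pass an open file handle as lines and write the returned records to a new FASTQ file for the test run.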
-
Originally posted by szilva: For 20 mismatches per read I would prefer something that is not based on Burrows-Wheeler, especially if you are expecting indels. Even for 101 bp reads this number of mismatches is pretty high.
-
Feederbing, you can try novoalign. It will allow up to 10 high-quality mismatches. Also have a look at some of the trimming options that could improve your mapping rate.
Do you have a good idea of the quality profile, to see where quality starts dropping off? FastQC is a good tool for examining this.
-
Originally posted by feederbing: zee, just to be clear, the reason I expect so many mismatches is evolution, not sequencing quality.
Good luck
Dario
-
101bp, 20% mismatches. I believe you will get lots of misalignments if you are aligning against human (fine if against a small genome). If you want to do that anyway, I would vote for ssaha2.
BTW, to map reads with a high error rate with bwa-sw, you should decrease "-T" and increase "-z" to 10 or 100. Your settings may even make bwa-sw less sensitive than the default settings. Nonetheless, even with -z 100, bwa-sw would probably not work well for 100bp reads with 20% mismatches.
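As a rough illustration of how tight the -T cutoff is under the settings from the original question (-a 1 -b 1 -T 60), here is a small Python sketch. It treats the alignment as ungapped and end-to-end and ignores everything else bwa-sw does (seeding heuristics, any threshold adjustment), so it is only back-of-the-envelope arithmetic; the function name ungapped_score is made up for illustration.

```python
def ungapped_score(read_len, mismatches, match=1, mismatch_penalty=1):
    """Score of an ungapped, full-length alignment.

    Each matching base adds `match`; each mismatch subtracts `mismatch_penalty`.
    """
    matches = read_len - mismatches
    return matches * match - mismatches * mismatch_penalty


THRESHOLD = 60  # the -T value from the original question

# 101 bp read, 20 mismatches: 81 matches - 20 mismatches
score_20 = ungapped_score(101, 20)
# 101 bp read, 21 mismatches: 80 matches - 21 mismatches
score_21 = ungapped_score(101, 21)

if __name__ == "__main__":
    for n, score in ((20, score_20), (21, score_21)):
        verdict = "passes" if score >= THRESHOLD else "fails"
        print(f"{n} mismatches: score {score}, {verdict} -T {THRESHOLD}")
```

Under these simplifying assumptions, a read with 20 mismatches only just clears the cutoff, and a single extra mismatch or any gap penalty pushes it below, which fits the advice to lower -T.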
For mammalian genomes, another option is BWT-SW. If you have a short reference genome, you may try cross_match, fasta and SSE2-based Smith-Waterman.
If you have high coverage, you should assemble the reads first and then do alignment. That will be much better.
-
Maybe Heng can correct me, but isn't bwasw meant for longer reads, so that it should not be used for 100bp reads, especially with that expected error rate? (At least that's what I remember from his paper.)
As for increasing the -z value, I barely see improvements for values above 10 and the run time for higher values is not really worth it. It sometimes helps to rerun the program with the remaining reads to get more aligned.
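To collect the remaining reads for such a rerun, a minimal Python sketch is given below. It assumes the aligner writes unaligned reads to its SAM output with the "segment unmapped" bit (0x4) set in the FLAG column; the function name unmapped_read_names is made up for illustration.

```python
def unmapped_read_names(sam_lines):
    """Return names of reads whose SAM FLAG has the unmapped bit (0x4) set."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):
            # Header line, not an alignment record.
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            # Blank or malformed line: skip it.
            continue
        if int(fields[1]) & 0x4:
            names.append(fields[0])
    return names


if __name__ == "__main__":
    example = [
        "@SQ\tSN:chr1\tLN:1000\n",
        "r1\t0\tchr1\t10\t30\t4M\t*\t0\t0\tACGT\tIIII\n",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII\n",
    ]
    print(unmapped_read_names(example))
```

The resulting names can then be used to pull the matching records out of the original read file for a second pass.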
Does anybody have experience with SOAP2?