bwa: how to align color space reads

kbhit replied

01-08-2013, 06:27 AM
Hi Jeremy,
We normally get about 55% mappability on good-quality long RNA using Shrimp2. Before calling Shrimp2 I use the latest version of cutadapt to do quality trimming (normally with a quality cutoff of 15), which helps boost the quality of the reads going into the aligner. For us, when compared to Lifescope (with its rate calculated manually), Shrimp usually performs better with regard to the unique mapping percentage.

Also, for color space, be careful when using BWA and Bowtie: they don't handle color space correctly (which might be why their latest versions may be abandoning support for it). Color space is definitely trickier to handle and it takes more brainpower to get it right. For example, they aren't able to work with the first and last nt of the read, which lowers specificity. Crossover handling can also be problematic there. Shrimp and Lifescope don't have those problems.
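For anyone wondering why color space takes more care than base space, here is a minimal Python sketch of the encoding itself. It only illustrates the general SOLiD scheme (each color describes the transition between two adjacent bases, starting from a known primer base) and says nothing about how BWA, Bowtie, Shrimp or Lifescope work internally. The function name and the example reads are made up.

```python
# Illustration of SOLiD color-space decoding; not code from any aligner.
BASES = "ACGT"  # A=0, C=1, G=2, T=3


def decode_colors(primer_base: str, colors: str) -> str:
    """Translate a color string into bases, starting from the known primer base."""
    current = BASES.index(primer_base)
    decoded = []
    for color in colors:
        # With the numbering above, the next base is (current base XOR color).
        current ^= int(color)
        decoded.append(BASES[current])
    return "".join(decoded)


good = decode_colors("T", "0120")  # all four colors read correctly
bad = decode_colors("T", "0320")   # second color misread (1 -> 3)
print(good, bad)
```

With naive decoding, the single misread color changes every base after it, which is one reason aligners generally compare reads and reference in color space instead of translating the reads first.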
JeremyDay replied

01-07-2013, 02:55 PM
Thanks kbhit! I'll have to take a look at this.

Do you mind if I ask what you are getting for mappability using Shrimp2?

I appreciate your input. Finding a proper pipeline for SOLiD data is becoming a daunting task. If we use Lifescope (and we haven't looked thoroughly), our bioinformaticians' initial thoughts are similar to yours, and/or they believe that it is low-quality mapping. If we use something like Bowtie, it brings our mapping down to 65% and below. That's a lot of wasted reads that could potentially be meaningful data. With Wildfire data we are down in the 40s with Bowtie, while Lifescope still reports almost 90%.
kbhit replied

01-07-2013, 01:41 PM
Hi Jeremy,
I found that the stats that Lifescope reports can be misleading. Instead, when I compare it with other aligners like BWA and Shrimp (I like Shrimp2 a lot), I calculate the Lifescope mapping percentage manually. To do this I use (uniquely mapped reads / total starting reads). To get the numerator, I look at the raw SAM file that Lifescope produces (rather than at their automatic report).

Something like:

cat <Lifescope's output sam file> |
grep -v "^@" |                          # remove header lines
awk '{if (and($2, 4) == 0) print}' |    # keep mapped reads (FLAG bit 4 unset); and() needs gawk
wc -l                                   # get the total count

I can't remember offhand, but you may also want to remove the reads with a mapping quality of 0.

If you need more information please let me know and I'll dig a little more
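The manual calculation described above can also be sketched in Python. This is a minimal sketch and not what Lifescope itself does: the function name, the `min_mapq` cutoff and the tiny sample records are made up, and skipping secondary/supplementary records (so that each read is counted once) is an assumption added on top of the pipeline above.

```python
def mapping_rate(sam_lines, total_reads, min_mapq=1):
    """Count mapped reads in SAM text and return (mapped, percent of total_reads)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):      # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:           # not a complete alignment record
            continue
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4:                # read is unmapped
            continue
        if flag & 0x900:              # secondary or supplementary record
            continue
        if mapq < min_mapq:           # min_mapq=1 drops the MAPQ 0 reads
            continue
        mapped += 1
    return mapped, 100.0 * mapped / total_reads


# Hypothetical records: one good hit, one unmapped, one MAPQ 0, one secondary.
sample = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t37\t50M\t*\t0\t0\t*\t*",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r3\t16\tchr1\t200\t0\t50M\t*\t0\t0\t*\t*",
    "r4\t256\tchr2\t300\t20\t50M\t*\t0\t0\t*\t*",
]
print(mapping_rate(sample, total_reads=4))
```

For a real file, pass the open file handle instead of the list and take `total_reads` from the number of reads that went into the aligner, not from the SAM file.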
JeremyDay replied

01-07-2013, 12:04 PM
SOLiD mapping

Originally posted by kbhit:

Be careful, bioscope/lifescope can be misleading on the mapping rate if you're not careful what you look at. If you look at the main summary, it always seems really high. But look at the SAM file it generates and do your own calculation. Most of the time, it maps about half as much as BWA does. Lifescope does give the 'real' stats but you have to dig much deeper to get it - it's highly misleading.

kbhit, do you mind elaborating on this? I have searched and searched for a better way to map SOLiD data. When we compare Lifescope to something like Bowtie, it's a difference of 90% versus 60%. No one seems to be getting better than 60% mappability with SOLiD color space, and Lifescope always reports higher. Do you believe Lifescope is misrepresenting its metrics somehow?

Does anyone have suggestions for the best way to map SOLiD data without tossing tons of reads?
gigigou replied

09-04-2012, 06:28 PM
Originally posted by kexin:

Hi everyone. As we know, Bowtie is software in which we need "edit". I want to know if there is software that doesn't need "edit" to map billions of short reads onto genomes. Thank you.

What do you mean by "edit"?
In my experience, Bowtie is the easiest to use of all the alignment tools I have tried.
kbhit replied

06-08-2012, 08:41 AM
Be careful, bioscope/lifescope can be misleading on the mapping rate if you're not careful what you look at. If you look at the main summary, it always seems really high. But look at the SAM file it generates and do your own calculation. Most of the time, it maps about half as much as BWA does. Lifescope does give the 'real' stats but you have to dig much deeper to get it - it's highly misleading.
SOLiDance replied

01-11-2012, 05:14 PM
Originally posted by colindaven:

I'm not too sure bwa was working too well with colour space data.

Just a brief result from some trial 60 bp exome data alignments to hg19 with default settings:

bioscope: 85% of reads mapped (albeit with iterative read trimming)
bwa: ~40%
bowtie: ~33%
NovoalignCS: ~59%

Now, I know there is a lot of optimisation to be done, but the raw results are extremely diverse.

Hmm, same here. Bioscope always gets an obviously higher mapping rate; I suspect it includes more false-positive mapped reads.
colindaven replied

01-09-2012, 04:31 AM
I'm not too sure bwa was working too well with colour space data.

Just a brief result from some trial 60 bp exome data alignments to hg19 with default settings:

bioscope: 85% of reads mapped (albeit with iterative read trimming)
bwa: ~40%
bowtie: ~33%
NovoalignCS: ~59%

Now, I know there is a lot of optimisation to be done, but the raw results are extremely diverse.
SOLiDance replied

01-05-2012, 06:06 PM
Originally posted by Bukowski:

But this is mentioned in the NEWS file of the release.

Release 0.6.1 (28 November, 2011)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Notable changes to BWA-short:

* Bugfix: duplicated alternative hits in the XA tag.

* Bugfix: when trimming enabled, bwa-aln trims 1bp less.

* Disabled the color-space alignment. 0.6.x is not working with SOLiD reads at
present.

Which is a timely reminder to read all the documentation, and not just what is on potentially infrequently updated web pages

Thanks for the info. I really hadn't noticed there was a NEWS file. Shoot, my fault!
SOLiDance replied

01-05-2012, 05:53 PM
Originally posted by NestorNotabilis:

I see you used the -c flag to indicate color-space whilst generating the reference database but did you also use the -c flag with the bwa aln command?

e.g.

bwa aln -c -f <sai output> <ref> <fastq input>

Both the indexing and the aligning require the -c flag. bwa samse, in contrast, does not.

Incidentally, as of release 0.6, BWA has unfortunately dropped color-space support (although the online documentation makes no mention of this), so BWA may no longer be the best mapper to invest time in for the longer term. This is unfortunate given its usefulness.

Thanks for your help! Actually, I did use -c, and even tried -n 3 or -n 4 when running bwa aln. Sorry, I forgot to mention it.
I checked my bwa version and it's 0.6.1, so maybe that is the reason. What a shame.
kexin replied

01-05-2012, 04:46 AM
Hi everyone. As we know, Bowtie is software in which we need "edit". I want to know if there is software that doesn't need "edit" to map billions of short reads onto genomes. Thank you.
Bukowski replied

01-05-2012, 04:10 AM
Originally posted by NestorNotabilis:

Incidentally, as of release 0.6, BWA has unfortunately dropped color-space support (although the online documentation makes no mention of this), so BWA may no longer be the best mapper to invest time in for the longer term. This is unfortunate given its usefulness.

But this is mentioned in the NEWS file of the release.

Release 0.6.1 (28 November, 2011)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Notable changes to BWA-short:

* Bugfix: duplicated alternative hits in the XA tag.

* Bugfix: when trimming enabled, bwa-aln trims 1bp less.

* Disabled the color-space alignment. 0.6.x is not working with SOLiD reads at
present.

Which is a timely reminder to read all the documentation, and not just what is on potentially infrequently updated web pages
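Since the NEWS entry above ties the problem to the installed release, a quick check of the version string can save some debugging. The following is a minimal Python sketch and not part of bwa: the function names and the sample text are made up, it assumes the usage text that bwa prints contains a "Version:" line, and treating every release from 0.6 onward as lacking color-space support is an assumption that goes beyond the NEWS file, which only mentions 0.6.x.

```python
import re


def parse_bwa_version(usage_text: str) -> str:
    """Pull the version string out of bwa's usage text (assumes a 'Version:' line)."""
    match = re.search(r"^Version:\s*(\S+)", usage_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no 'Version:' line found")
    return match.group(1)


def colorspace_disabled(version: str) -> bool:
    """True if the release is 0.6 or later (assumption: support never returned)."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 6)


# Made-up example of the kind of text the check expects.
sample_usage = "Program: bwa\nVersion: 0.6.1-r104\n"
version = parse_bwa_version(sample_usage)
print(version, colorspace_disabled(version))
```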
NestorNotabilis replied

01-05-2012, 01:59 AM
I see you used the -c flag to indicate color-space whilst generating the reference database but did you also use the -c flag with the bwa aln command?

e.g.

bwa aln -c -f <sai output> <ref> <fastq input>

Both the indexing and the aligning require the -c flag. bwa samse, in contrast, does not.

Incidentally, as of release 0.6, BWA has unfortunately dropped color-space support (although the online documentation makes no mention of this), so BWA may no longer be the best mapper to invest time in for the longer term. This is unfortunate given its usefulness.
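To make the placement of the -c flag explicit, here is a minimal Python sketch that just assembles the three command lines discussed above. It assumes a bwa release older than 0.6, the file names are placeholders, and the helper function is hypothetical; nothing is executed.

```python
import shlex


def colorspace_bwa_commands(ref: str, fastq: str, sai: str):
    """Return the argv lists for a single-end color-space run (bwa < 0.6 assumed)."""
    return [
        ["bwa", "index", "-c", ref],                  # build a color-space index
        ["bwa", "aln", "-c", "-f", sai, ref, fastq],  # align in color space
        ["bwa", "samse", ref, sai, fastq],            # no -c here; SAM goes to stdout
    ]


commands = colorspace_bwa_commands("ref.fa", "reads.fastq", "reads.sai")
for argv in commands:
    print(shlex.join(argv))
```

Keeping the commands as argv lists means they could later be handed to `subprocess.run` one by one, with the samse output redirected to a file.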
SOLiDance started a topic bwa: how to align color space reads

01-05-2012, 12:14 AM
bwa: how to align color space reads

Hi everybody,
This has puzzled me for days: I tried to use bwa on SOLiD sequencing results, but after reading through the manual I couldn't find a detailed workflow for aligning color space reads. Following some posts, I took these steps:
1. solid2fastq: used the script in the bwa suite (color to double-encoded: ACGTN);
2. index the FASTA reference with -c;
3. bwa aln;
4. bwa samse (my SOLiD reads are from a fragment library);
5. parse the SAM file. I found that all the beads were unmapped. But when I used the same reads and reference with other tools, such as bioscope and BFAST, the results were just fine, with thousands of mapped reads.
Then I tried a color space FASTQ (meaning the sequence line consists of color codes, i.e. the digits 0-3 and "."), and all reads were unmapped too.
Maybe this workflow is not suitable? Could anyone please show me how to handle color space reads with bwa?
Many thanks!
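For readers unfamiliar with step 1, "double encoding" simply rewrites the color codes as letters so that tools expecting ACGTN can read them. The sketch below shows the idea in Python. It is not the solid2fastq script itself: the function name is made up, and the default of dropping the primer base plus the first color is an assumption about how the script behaves.

```python
# Colors 0, 1, 2, 3 and "." are rewritten as A, C, G, T and N.
DOUBLE_ENCODE = str.maketrans("0123.", "ACGTN")


def double_encode(csfasta_read: str, skip: int = 2) -> str:
    """Double-encode a csfasta read such as 'T0120.3'.

    skip=2 drops the primer base and the first color (assumed behaviour);
    skip=1 would drop only the primer base.
    """
    return csfasta_read[skip:].translate(DOUBLE_ENCODE)


print(double_encode("T0120.3"))
print(double_encode("T0120.3", skip=1))
```

The resulting letters are still colors, not bases, so they only make sense when aligned against a reference indexed in color space (the -c in step 2).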
Tags: bwa align color solid
