bwa:how to align color space reads

  • kbhit
    replied
    Hi Jeremy,
    We normally get about 55% mappability on good-quality long RNA using Shrimp2. Before calling Shrimp2 I use the latest version of cutadapt to do quality trimming (normally with a quality cutoff of 15), which helps boost the quality. For us, when compared with Lifescope (calculated manually), Shrimp usually performs better with regard to unique mapping percentage.

    Also, for COLORSPACE, be careful when using BWA and Bowtie: they don't handle color space correctly (which might be why their latest versions may be abandoning support for it). It's definitely trickier to handle color space, and it takes more brainpower to get it right. For example, they aren't able to work with the first and last nt of the read, which lowers specificity. Crossover handling can also be problematic there. Shrimp and Lifescope don't have those problems.
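
To make the "trickier" part concrete: a SOLiD read is a known primer base followed by colors 0-3, and each color encodes the transition between two adjacent bases. Below is a minimal sketch of decoding such a read, assuming the standard two-base code (A=0, C=1, G=2, T=3, color = XOR of the two neighbouring codes). The helper name cs_decode is made up for illustration and is not part of any aligner mentioned here.

```shell
# cs_decode READ: decode a color-space read (primer base followed by colors)
# into bases. Hypothetical helper for illustration only.
cs_decode() {
    printf '%s\n' "$1" | awk '
    # XOR of two 2-bit values, written with arithmetic so plain awk can run it
    function xor2(a, b) {
        return (a % 2 + b % 2) % 2 + 2 * ((int(a / 2) + int(b / 2)) % 2)
    }
    BEGIN {
        code["A"] = 0; code["C"] = 1; code["G"] = 2; code["T"] = 3
        base[0] = "A"; base[1] = "C"; base[2] = "G"; base[3] = "T"
    }
    {
        len = length($0)
        prev = code[substr($0, 1, 1)]   # the primer base anchors the decoding
        out = ""
        for (i = 2; i <= len; i++) {
            c = substr($0, i, 1)
            if (c !~ /^[0-3]$/) {
                # a missing color call (".") makes every later base unknown
                for (j = i; j <= len; j++) out = out "N"
                break
            }
            prev = xor2(prev, c + 0)    # next base = previous base XOR color
            out = out base[prev]
        }
        print out
    }'
}

cs_decode T0123   # prints TGAT
cs_decode T0023   # prints TTCG: one changed color alters every later base
```

Because a single miscalled color changes every base decoded after it, translating reads to bases before alignment is fragile, which is presumably part of why the first and last positions and crossovers need special care.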



  • JeremyDay
    replied
    Thanks kbhit! I'll have to take a look at this.

    Do you mind if I ask what you are getting for mappability using Shrimp2?

    I appreciate your input. Finding a proper pipeline for SOLiD data is becoming a daunting task. If we use Lifescope (and we haven't looked thoroughly), our bioinformaticians' initial thoughts are similar to yours, and/or they believe that it is low-quality mapping. If we use something like Bowtie, it brings our mapping to 65% and below. That's a lot of wasted reads that could potentially be meaningful data. With Wildfire data we are down in the 40s with Bowtie, and Lifescope is still almost 90%.



  • kbhit
    replied
    Hi Jeremy,
    I found that the stats that Lifescope reports can be misleading. Instead, when I compare it with other aligners like BWA and Shrimp (I like Shrimp2 a lot), I calculate the Lifescope mapping percentage manually as (uniquely mapped reads / total starting reads). To get the numerator, I look at the raw SAM file that Lifescope produces rather than at its automatic report.

    Something like:

    grep -v '^@' <Lifescope's output sam file> | # remove header lines
    awk 'and($2, 4) == 0' | # keep mapped reads (FLAG bit 0x4 unset); and() needs gawk
    wc -l # get the total count

    I can't remember offhand, but you may want to remove the ones with a mapping quality of 0.
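
A slightly fuller sketch of the same manual calculation, in case it helps. It assumes a standard SAM layout (FLAG in column 2, MAPQ in column 5) and treats "MAPQ at or above a cutoff" as a stand-in for "uniquely mapped", which is an approximation and not necessarily how Lifescope defines uniqueness. The function name mapping_rate and the default cutoff of 1 are choices made for this example.

```shell
# mapping_rate SAM TOTAL_READS [MIN_MAPQ]
# Print the percentage of TOTAL_READS that are mapped with MAPQ >= MIN_MAPQ.
# Hypothetical helper; the MAPQ cutoff is only a proxy for "uniquely mapped".
mapping_rate() {
    awk -F '\t' -v total="$2" -v q="${3:-1}" '
        /^@/                   { next }   # skip header lines
        int($2 / 4) % 2 == 1   { next }   # FLAG bit 0x4 set: read is unmapped
        int($2 / 256) % 2 == 1 { next }   # FLAG bit 0x100 set: secondary alignment
        $5 >= q                { n++ }    # count reads passing the MAPQ cutoff
        END {
            if (total > 0) printf "%.2f\n", 100 * n / total
            else print "NA"
        }
    ' "$1"
}
```

Called as `mapping_rate lifescope_output.sam 1000000`, where both the file name and the read count are placeholders. The denominator is the number of reads you started with, not the number of lines in the SAM file. The bit tests use plain arithmetic rather than gawk's and(), so this should also run under other awk implementations.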

    If you need more information please let me know and I'll dig a little more



  • JeremyDay
    replied
    Solid mapping

    Originally posted by kbhit View Post
    Be careful, bioscope/lifescope can be misleading on the mapping rate if you're not careful what you look at. If you look at the main summary, it always seems really high. But look at the SAM file it generates and do your own calculation. Most of the time, it maps about half as much as BWA does. Lifescope does give the 'real' stats but you have to dig much deeper to get it - it's highly misleading.
    kbhit, do you mind elaborating on this? I have searched and searched for a better way to map SOLiD data. When we use Lifescope compared to something like Bowtie, it's a difference between 90% and 60%. No one seems to be getting better than 60% mappability with SOLiD color space, and Lifescope always reports higher. Do you believe Lifescope is misrepresenting its metrics somehow?

    Does anyone have suggestions for the best way to map SOLiD data without tossing tons of reads?



  • gigigou
    replied
    Originally posted by kexin View Post
    Hi everyone. As we know, Bowtie is software in which we need edit. I want to know if there is software we don't need edit to map billions of short reads onto genomes. Thank you
    What do you mean by "edit"?
    In my opinion, Bowtie is the easiest to use of all the alignment tools I have used.



  • kbhit
    replied
    Be careful, bioscope/lifescope can be misleading on the mapping rate if you're not careful what you look at. If you look at the main summary, it always seems really high. But look at the SAM file it generates and do your own calculation. Most of the time, it maps about half as much as BWA does. Lifescope does give the 'real' stats but you have to dig much deeper to get it - it's highly misleading.



  • SOLiDance
    replied
    Originally posted by colindaven View Post
    I'm not too sure bwa was working too well with colour space data.

    Just a brief result from some trial 60bp exome data alignments to hg19 with default settings:

    bioscope - 85 % reads mapped (albeit with iterative read trimming)
    bwa ~ 40 %
    bowtie ~ 33%
    NovoalignCS ~59%

    Now I know there is a lot of optimisation to be done but the raw results are extremely diverse
    Hmm, same here. Bioscope always gets an obviously higher mapping rate; I suspect it includes more false-positive mapped reads.



  • colindaven
    replied
    I'm not too sure bwa was working too well with colour space data.

    Just a brief result from some trial 60bp exome data alignments to hg19 with default settings:

    bioscope - 85 % reads mapped (albeit with iterative read trimming)
    bwa ~ 40 %
    bowtie ~ 33%
    NovoalignCS ~59%

    Now I know there is a lot of optimisation to be done but the raw results are extremely diverse



  • SOLiDance
    replied
    Originally posted by Bukowski View Post
    But this is mentioned in the NEWS file of the release.

    Release 0.6.1 (28 November, 2011)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Notable changes to BWA-short:

    * Bugfix: duplicated alternative hits in the XA tag.

    * Bugfix: when trimming enabled, bwa-aln trims 1bp less.

    * Disabled the color-space alignment. 0.6.x is not working with SOLiD reads at
    present.


    Which is a timely reminder to read all the documentation, and not just what is on potentially infrequently updated web pages
    Thanks for the info. I hadn't noticed there was a NEWS file. Shoot, my fault!



  • SOLiDance
    replied
    Originally posted by NestorNotabilis View Post
    I see you used the -c flag to indicate color-space whilst generating the reference database but did you also use the -c flag with the bwa aln command?

    e.g.

    bwa aln -c -f <sai output> <ref> <fastq input>

    Both the indexing and the aligning require the -c flag. bwa samse, in contrast, does not.


    Incidentally, unfortunately as of release 0.6, BWA has dropped color-space support (although the online documentation makes no mention of this) so BWA may no longer be the best mapper to invest time in for the longer term. This is unfortunate given its usefulness
    Thanks for your help! Actually, I did use -c, and even tried -n 3 and -n 4 when running bwa aln. Sorry for forgetting to mention it.
    I checked my bwa version and it's 0.6.1, so maybe that is the reason. What a shame.



  • kexin
    replied
    Hi everyone. As we know, Bowtie is software in which we need edit. I want to know if there is software we don't need edit to map billions of short reads onto genomes. Thank you



  • Bukowski
    replied
    Originally posted by NestorNotabilis View Post
    Incidentally, unfortunately as of release 0.6, BWA has dropped color-space support (although the online documentation makes no mention of this) so BWA may no longer be the best mapper to invest time in for the longer term. This is unfortunate given its usefulness
    But this is mentioned in the NEWS file of the release.

    Release 0.6.1 (28 November, 2011)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    Notable changes to BWA-short:

    * Bugfix: duplicated alternative hits in the XA tag.

    * Bugfix: when trimming enabled, bwa-aln trims 1bp less.

    * Disabled the color-space alignment. 0.6.x is not working with SOLiD reads at
    present.


    Which is a timely reminder to read all the documentation, and not just what is on potentially infrequently updated web pages
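
For anyone scripting around this, here is a small sketch of a version guard. It only encodes what the NEWS excerpt above and this thread state (color space disabled from 0.6 onward) and does not look at anything else about the release. The function name bwa_has_colorspace is made up for the example.

```shell
# bwa_has_colorspace VERSION
# Succeed (exit 0) if VERSION predates 0.6, i.e. before color-space alignment
# was disabled according to the NEWS file quoted above. Hypothetical helper.
bwa_has_colorspace() {
    ver=${1%%-*}        # drop a "-r123"-style build suffix if present
    major=${ver%%.*}    # text before the first dot
    rest=${ver#*.}      # text after the first dot
    minor=${rest%%.*}   # second version component
    [ "$major" -eq 0 ] && [ "$minor" -lt 6 ]
}

if bwa_has_colorspace 0.6.1; then
    echo "color-space alignment should be available"
else
    echo "this BWA release has color-space alignment disabled"
fi
```

The version string is passed in by hand here. As far as I know, running bwa with no arguments prints a "Version:" line you can copy it from, but check that on your own build.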



  • NestorNotabilis
    replied
    I see you used the -c flag to indicate color-space whilst generating the reference database but did you also use the -c flag with the bwa aln command?

    e.g.

    bwa aln -c -f <sai output> <ref> <fastq input>

    Both the indexing and the aligning require the -c flag. bwa samse, in contrast, does not.
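
Putting the three steps together, a rough sketch that only prints the commands so they can be reviewed before anything is run. It assumes a pre-0.6 BWA, a fragment (single-end) library, and reads already converted with the solid2fastq script; the function name bwa_cs_plan and the file names in the usage note are placeholders.

```shell
# bwa_cs_plan REF FASTQ PREFIX
# Print, without running, the color-space commands for a pre-0.6 BWA.
# Hypothetical helper; arguments must not contain spaces.
bwa_cs_plan() {
    ref=$1 fq=$2 prefix=$3
    echo "bwa index -c $ref"                            # color-space index: needs -c
    echo "bwa aln -c -f $prefix.sai $ref $fq"           # alignment: needs -c as well
    echo "bwa samse $ref $prefix.sai $fq > $prefix.sam" # SAM conversion: no -c
}
```

For example, `bwa_cs_plan ref.fa reads.fastq run1` prints three lines, and piping that output to `sh` would execute them.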


    Incidentally, unfortunately as of release 0.6, BWA has dropped color-space support (although the online documentation makes no mention of this) so BWA may no longer be the best mapper to invest time in for the longer term. This is unfortunate given its usefulness



  • SOLiDance
    started a topic bwa:how to align color space reads

    bwa:how to align color space reads

    Hi everybody,
    This has puzzled me for days: I tried to use bwa on SOLiD sequencing results, but after reading through the manual I couldn't find a detailed workflow for aligning color space reads. Following some posts, I took the steps below:
    1 solid2fastq: used the script in the bwa suite (color to double-encoded: ACGTN);
    2 index the fasta reference, with -c;
    3 bwa aln;
    4 bwa samse (my SOLiD reads are from a fragment library);
    5 parse the SAM file. I found all the beads were unmapped. But when I used the same reads and reference with other tools, such as bioscope and bFast, the results were just fine: thousands of mapped reads.
    Then I tried with a color space fastq (meaning the sequence line consists of the digits 0-3 and "."), and all reads were unmapped too.
    Maybe this workflow is not suitable? Could anyone please show me how to deal with color space reads with bwa?
    Many thanks!