**dawe** · 11-09-2009, 08:36 AM

If you specify the '--phred64-quals' or '--solexa1.3-quals' option, you can use those Illumina reads without conversion.

d

**Layla** · 11-09-2009, 08:43 AM

Just seen it!

Thanks

L

**tujchl** · 11-11-2009, 12:23 AM

I tried data directly from Solexa without the '--phred64-quals' or '--solexa1.3-quals' option, and the output looks fine.

**dawe** · 11-11-2009, 01:02 AM

Originally posted by tujchl

I tried data directly from Solexa without the '--phred64-quals' or '--solexa1.3-quals' option, and the output looks fine.

Of course you can, but in that case you're probably estimating base qualities incorrectly... I guess the quality of low-quality bases is overestimated by a factor of ~1000...

**tujchl** · 11-11-2009, 01:52 AM

Hi dawe,
Thank you for replying. I just have two more questions:
1. What do you mean by "overestimated by a ~1000x factor"? Could you please explain in detail?
2. I just tested bowtie, and my feeling is that bowtie does NOT use qualities while running, so quality control could be done before bowtie.
Thank you in advance.

**dawe** · 11-11-2009, 02:03 AM

Originally posted by tujchl

Hi dawe,
Thank you for replying. I just have two more questions:
1. What do you mean by "overestimated by a ~1000x factor"? Could you please explain in detail?

Phred+33 and Phred+64 scores differ by an offset of 31 in ASCII code. Since the score is -10*log10(p) (plus the offset), a difference of 30 corresponds to a 1000x difference in probability. The worst Illumina score is "@" (ASCII 64), which means (and correct me if I'm wrong) p = 1. In a Sanger framework, ASCII 64 is read as Q31, i.e. p ~ 0.001, which is about 1000x smaller.
For qualities in the "mid-range" the difference is not relevant.
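To make the arithmetic concrete, here is a minimal sketch that decodes the same quality character under both offsets. The function name `error_probability` is made up for illustration and is not part of bowtie or any other tool.

```python
def error_probability(qual_char: str, offset: int) -> float:
    """Convert one FASTQ quality character to an error probability."""
    q = ord(qual_char) - offset   # Phred score Q
    return 10 ** (-q / 10)        # Q = -10*log10(p)  =>  p = 10^(-Q/10)

worst = "@"  # ASCII 64, the lowest quality in Phred+64 data

p_illumina = error_probability(worst, 64)  # read as Phred+64: Q = 0
p_sanger = error_probability(worst, 33)    # read as Phred+33: Q = 31
ratio = p_illumina / p_sanger

print(f"Phred+64: p = {p_illumina:.6f}")
print(f"Phred+33: p = {p_sanger:.6f}")
print(f"ratio   : {ratio:.0f}x")
```

The ratio comes out a little above 1000 because the offsets differ by 31 rather than exactly 30.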

Originally posted by tujchl

2. I just tested bowtie, and my feeling is that bowtie does NOT use qualities while running, so quality control could be done before bowtie.

That's probably because you have a lot of good-quality reads. AFAIK bowtie uses qualities (otherwise I wonder why Ben included the phred33/phred64 options at all).

**Layla** · 11-11-2009, 03:28 AM

From looking at Bowtie's defaults, --phred33-quals is "on", so Bowtie assumes you are providing reads in the standard Sanger format (phred33). If you are providing data with quality scores in phred64, you should specify --phred64-quals, which is "off" by default. --solexa1.3-quals is a good option, which assumes you are providing unconverted data from the Solexa GA pipeline version 1.3 or later.

Alternatively, you could use maq to convert the reads from phred64 to phred33 and simply run them through bowtie with bowtie's defaults!
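For anyone curious what such a conversion amounts to, below is a rough sketch of re-encoding a Phred+64 quality string as Phred+33. It is not maq's implementation, and the name `phred64_to_phred33` is made up here. It only covers Phred+64 data (pipeline 1.3 or later); the older Solexa-scaled scores use a different formula and are rejected rather than converted.

```python
def phred64_to_phred33(quals: str) -> str:
    """Re-encode a Phred+64 quality string as Phred+33 (shift each char by 31)."""
    converted = []
    for ch in quals:
        q = ord(ch) - 64  # Phred score under the +64 encoding
        if q < 0:
            # Characters below '@' cannot be Phred+64; the input may be
            # older Solexa-scaled data or already Phred+33.
            raise ValueError(f"character {ch!r} is not valid Phred+64")
        converted.append(chr(q + 33))
    return "".join(converted)

# Hypothetical quality string: four Q40 bases, then Q0 and Q2.
print(phred64_to_phred33("hhhh@B"))  # prints IIII!#
```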

Hope this helps

L

p.s A slight digression - I cannot unzip the hg18 version of the pre-built index h_sapiens_asm.ebwt.zip. I tried both part 1, part 2 and the entire genome, but I get an error saying
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive.

Any ideas?

**tujchl** · 11-11-2009, 05:53 PM

Thank you, dawe.
Two more questions:
1. According to what you said, can I assume that bowtie does indeed use the qualities and filters out reads that do not pass?
2. where can I get the ASCII code of phred64 and phred33?

And thanks to Layla for the suggestions and for posting this thread.

I built my human genome index myself. Since I don't have a very powerful computer, I built the index chr by chr and run bowtie chr by chr...

**dawe** · 11-12-2009, 02:02 AM

Originally posted by Layla

p.s A slight digression - I cannot unzip the hg18 version of the pre-built index h_sapiens_asm.ebwt.zip. I tried both part 1, part 2 and the entire genome, but I get an error saying
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive.

Any ideas?

Try indexing the genome yourself. I'm downloading the ebwt right now, but it will take longer than indexing (at least here...).
BTW, you should ask the bowtie webmaster for the md5sum of the zip files.

**dawe** · 11-12-2009, 02:08 AM

Originally posted by tujchl

1. According to what you said, can I assume that bowtie does indeed use the qualities and filters out reads that do not pass?

You should ask bowtie developers, but AFAIK bowtie doesn't apply quality filters *before* the alignment. Base quality is used at alignment time to score mismatches.
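As far as I understand the bowtie manual, in the default mode the sum of the quality values at all mismatched positions must not exceed the -e/--maqerr limit (default 70), with qualities rounded to the nearest 10 and capped at 30. The sketch below only illustrates that idea and is not bowtie's code: the function names, the read and the mismatch positions are made up, and the exact rounding rule is an assumption.

```python
MAQERR_DEFAULT = 70  # default of bowtie's -e/--maqerr, per the manual

def rounded_quality(q: int) -> int:
    """Round to the nearest 10 and cap at 30 (assumed Maq-like rounding)."""
    return min(30, 10 * int(q / 10 + 0.5))

def mismatch_cost(quals: str, mismatch_positions, offset: int) -> int:
    """Sum the rounded qualities at the mismatched read positions."""
    return sum(rounded_quality(ord(quals[i]) - offset) for i in mismatch_positions)

quals = "hhhhDDBB"       # hypothetical Phred+64 read with a low-quality tail
mismatches = [5, 6, 7]   # hypothetical 0-based mismatch positions in that tail

cost_64 = mismatch_cost(quals, mismatches, 64)  # correct offset
cost_33 = mismatch_cost(quals, mismatches, 33)  # wrong offset

print(cost_64, cost_64 <= MAQERR_DEFAULT)
print(cost_33, cost_33 <= MAQERR_DEFAULT)
```

With the correct offset the three tail mismatches cost almost nothing; read as Phred+33, each one is charged as a high-quality mismatch and the alignment would go over the limit.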

Originally posted by tujchl

2. where can I get the ASCII code of phred64 and phred33?

Code:

man ascii

look at the decimal set.

**svl** · 11-12-2009, 02:17 AM

perl script comparison table

Originally posted by tujchl

2. where can I get the ASCII code of phred64 and phred33?

If you run the perl code below, you'll see a table with a comparison.

Code:

#!/usr/bin/env perl
################################################
# prints a table with: phred score, ASCII code for phred+33,
# ASCII code for phred+64, the two characters, and p
################################################
use strict;
use warnings;

my @phreds = (0..62);
my $step = 2;    # print every second phred score

printf "%6s  %7s  %7s  %6s  %6s  %10s\n", 'phred', 'ASCII33', 'ASCII64', 'chr33', 'chr64', 'p';

for (my $i = 0; $i < @phreds; $i+=$step ){
   my $phred = $phreds[$i];
   printf "%6d  %7d  %7d  %6s  %6s  %10f\n", $phred, $phred+33, $phred+64, chr($phred+33), chr($phred+64), phred2p($phred);
}

# convert a phred score to an error probability: p = 10^(-Q/10)
sub phred2p{
   return 10 ** (-(shift) / 10.0 );
}

**tujchl** · 11-13-2009, 12:05 AM

Thank you all, I learned a lot from you.
And two more questions:
1. When I use data directly from Solexa as bowtie input, should I specify "--phred64" or "--solexa1.3" or both?
2. When I use the "--concise" option to save disk space, the output looks like this:
1-:<0,2852852,1>
and there is a 0 instead of my ref_index name! (I built my ref_index chr by chr and run bowtie chr by chr as well.) Could you please tell me how to get my ref_index name?
(The ref_index name, such as "chr1", comes back if I run bowtie without --concise.)

**dawe** · 11-13-2009, 12:34 AM

Originally posted by tujchl

Thank you all, I learned a lot from you.
And two more questions:
1. When I use data directly from Solexa as bowtie input, should I specify "--phred64" or "--solexa1.3" or both?

As stated in the bowtie help

Code:

--phred64-quals    input quals are Phred+64 (same as --solexa1.3-quals)

They are synonyms.
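If you are not sure which encoding a file uses, a common heuristic is to look at the lowest quality character in it. The sketch below is only that heuristic, not something bowtie does; the function name `guess_quality_offset` is made up, and the cut-offs assume that Phred+64 data never goes below "@" (ASCII 64) and that old Solexa-scaled data never goes below ";" (ASCII 59).

```python
def guess_quality_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from FASTQ quality strings.

    Returns None when the data is ambiguous. Raises ValueError on empty input.
    """
    lowest = min(ord(ch) for line in quality_lines for ch in line)
    if lowest < 59:
        return 33    # below ';' : cannot be any +64 encoding
    if lowest >= 64:
        return 64    # nothing below '@' : most likely Phred+64
    return None      # 59..63: old Solexa-scaled or unusual Phred+33 data

# Hypothetical quality strings.
print(guess_quality_offset(["IIII!#", "IIIIII"]))
print(guess_quality_offset(["hhhh@B", "hhhhhh"]))
```

A result of 64 is only a guess: a Phred+33 file made up entirely of very high qualities would look the same, so it is worth checking many reads.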

Originally posted by tujchl

2. When I use the "--concise" option to save disk space, the output looks like this:
1-:<0,2852852,1>

Sorry, I can't help. To save space and still get valuable information from my results, I keep everything in BAM format (directly from bowtie output).


BOWTIE input
