
**SillyPoint** · 04-17-2009, 06:30 AM

I didn't know Gerald could produce fastq files directly. We use a Perl script to extract information from the *_ub_custom_qseq.txt files produced by Gerald and convert it to fastq format (discarding the non-PF reads in the process). The ASCII quality characters in the qseq files are offset by 64, i.e. each character's ASCII code is the quality score plus 64.

Can you post the Gerald config file you used to create the fastq?

SillyPoint
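A minimal sketch of the kind of qseq-to-fastq conversion described above, written in Python rather than Perl. It assumes the usual 11-column, tab-separated qseq layout (machine, run, lane, tile, x, y, index, read number, sequence, quality, filter flag), that a filter flag of 1 marks a PF read, and that "." in the sequence stands for an uncalled base. The function name `qseq_to_fastq` and the read-name layout are illustrative choices, not taken from the script mentioned in the post.

```python
def qseq_to_fastq(lines, to_sanger=False):
    """Yield fastq records (as strings) for the PF reads found in qseq lines.

    lines     -- iterable of qseq text lines (assumed 11 tab-separated fields)
    to_sanger -- if True, shift qualities from offset 64 to offset 33
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 11:
            continue  # skip lines that do not match the assumed layout
        machine, run, lane, tile, x, y, index, read_no, seq, qual, pf = fields
        if pf != "1":
            continue  # discard reads that did not pass the filter (non-PF)
        seq = seq.replace(".", "N")  # qseq writes uncalled bases as '.'
        if to_sanger:
            # 64 - 33 = 31: move each character from offset 64 to offset 33
            qual = "".join(chr(ord(c) - 31) for c in qual)
        # Hypothetical read-name layout, chosen only for this example
        name = f"{machine}_{run}:{lane}:{tile}:{x}:{y}#{index}/{read_no}"
        yield f"@{name}\n{seq}\n+\n{qual}\n"
```

To use it on a file, open the qseq file, pass the file handle as `lines`, and write each yielded record to the output fastq. By default the qualities are left at offset 64, as in the qseq input.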

**cbrennan** · 04-17-2009, 10:45 AM

Gerald can generate fasta, fastq, or scarf (default) files.

For fastq files, put the line:

12345678:SEQUENCE_FORMAT --fastq

in your Gerald config file.

Christine

**Sylphide** · 02-25-2011, 02:54 AM

I looked for the meaning of Illumina quality scores and couldn't find any direct translation, so here it is (in case it is of any use to someone else).

Illumina quality score dictionary :

ASCII character / numeric score / probability that the base call is wrong
@ 0 1
A 1 0.7943282347
B 2 0.6309573445
C 3 0.5011872336
D 4 0.3981071706
E 5 0.316227766
F 6 0.2511886432
G 7 0.1995262315
H 8 0.1584893192
I 9 0.1258925412
J 10 0.1
K 11 0.0794328235
L 12 0.0630957344
M 13 0.0501187234
N 14 0.0398107171
O 15 0.0316227766
P 16 0.0251188643
Q 17 0.0199526231
R 18 0.0158489319
S 19 0.0125892541
T 20 0.01
U 21 0.0079432823
V 22 0.0063095734
W 23 0.0050118723
X 24 0.0039810717
Y 25 0.0031622777
Z 26 0.0025118864
[ 27 0.0019952623
\ 28 0.0015848932
] 29 0.0012589254
^ 30 0.001
_ 31 0.0007943282
` 32 0.0006309573
a 33 0.0005011872
b 34 0.0003981072
c 35 0.0003162278
d 36 0.0002511886
e 37 0.0001995262
f 38 0.0001584893
g 39 0.0001258925
h 40 0.0001
i 41 7.94328234724282E-005
j 42 6.30957344480193E-005
k 43 5.01187233627272E-005
l 44 3.98107170553497E-005
m 45 3.16227766016837E-005
n 46 2.51188643150957E-005
o 47 1.99526231496888E-005
p 48 1.58489319246111E-005
q 49 1.25892541179417E-005
r 50 0.00001
s 51 7.94328234724281E-006
t 52 6.30957344480192E-006
u 53 5.01187233627272E-006
v 54 3.98107170553497E-006
w 55 3.16227766016838E-006
x 56 2.51188643150958E-006
y 57 1.99526231496888E-006
z 58 1.58489319246111E-006
{ 59 1.25892541179417E-006
| 60 0.000001
} 61 7.9432823472428E-007
~ 62 6.30957344480193E-007
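
The table above follows from two relations, assuming the Illumina 1.3+ encoding, where the character's ASCII code is the score Q plus 64: Q = ASCII code - 64, and the error probability is 10^(-Q/10). A short Python sketch that regenerates the table (the function names are my own):

```python
OFFSET = 64  # ASCII offset assumed for Illumina 1.3+ quality characters


def char_to_score(ch, offset=OFFSET):
    """Numeric quality score encoded by one ASCII character."""
    return ord(ch) - offset


def error_probability(q):
    """Probability that a base call with Phred-scaled score q is wrong."""
    return 10 ** (-q / 10)


def quality_table(max_q=62, offset=OFFSET):
    """List of (character, score, error probability) for scores 0..max_q."""
    return [(chr(q + offset), q, error_probability(q)) for q in range(max_q + 1)]


if __name__ == "__main__":
    for ch, q, p in quality_table():
        print(f"{ch} {q} {p:.10g}")
```

For example, `char_to_score("h")` gives 40, which corresponds to an error probability of 0.0001.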

**amitm** · 02-25-2011, 01:18 PM

for converting SCARF format to fastq

Originally posted by Sylphide

I looked for the meaning of illumina quality scores and couldn't find any direct translation so here it is (in case it is of any use to someone else)

Illumina quality score dictionary :

text illumina_score
@ 0
A 1
B 2
.
.
.

Hello Sylphide,
Just to confirm: can I use this conversion table to convert the quality scores in a SCARF file from ASCII to numeric form, so that I can then use 'fq_all2std.pl' (from the Maq site) to generate standard fastq format? The script assumes the quality scores in the .scarf file are numeric, whereas my files have the scores in ASCII form.
I'm a beginner in sequencing data analysis. Kindly help out.

thanks
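
One way to do the conversion asked about here, as a hedged Python sketch. It assumes each SCARF line is colon-separated with the sequence and the ASCII quality string as the last two fields, that the qualities use an offset of 64, and that the numeric form wanted is space-separated integers. Check these assumptions against your own files and against what fq_all2std.pl expects before relying on it. The function name is hypothetical.

```python
def scarf_ascii_to_numeric(line, offset=64):
    """Rewrite one SCARF line so its quality field holds numeric scores.

    Assumed input layout:  name-fields:SEQUENCE:ASCII_QUALITIES
    Output layout:         name-fields:SEQUENCE:q1 q2 q3 ...
    """
    # Split from the right so colons inside the read name are left untouched.
    name, seq, qual = line.rstrip("\n").rsplit(":", 2)
    # Each character encodes one score as (ASCII code - offset).
    scores = " ".join(str(ord(c) - offset) for c in qual)
    return f"{name}:{seq}:{scores}"
```

Applied line by line to a .scarf file, this keeps the read name and sequence unchanged and only rewrites the last field.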

**Sylphide** · 02-28-2011, 12:54 AM

hello
I'm also a beginner but I'll try to help.
You can use the conversion table I wrote to convert ASCII to numeric if you want to program it yourself. There must be some tool to make the conversion automatically but I couldn't find any.

ps : I added the probability for a base to be wrong in my previous message.

**amitm** · 03-01-2011, 12:09 AM

hello Sylphide,
I cleared up my confusion from here. Basically, what I understood is that standard (Sanger) fastq quality in ASCII is encoded with an offset of 33, whereas Illumina 1.3+ quality has an offset of 64. Now I can parse the .scarf file if I have to.
There are many tools to convert between quality encodings, but I know of only one that is free and accepts .scarf input: "fq_all2std.pl" from the Maq site.
Thanks anyway! Your post got me started hunting around for information on quality encoding :-)
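
The offset difference can be illustrated with a small Python sketch. It assumes the input quality string is Phred-scaled with an offset of 64 (Illumina 1.3+) and that the target is the offset-33 encoding used by standard (Sanger) fastq. Older pipeline versions may use a different score scale, which this sketch does not handle. The function name is my own.

```python
def phred64_to_phred33(qual):
    """Re-encode a quality string from ASCII offset 64 to ASCII offset 33."""
    converted = []
    for ch in qual:
        q = ord(ch) - 64  # decode: numeric score under the offset-64 assumption
        if q < 0:
            # A character below '@' cannot be a Phred score at offset 64,
            # so the input is probably not in the assumed encoding.
            raise ValueError(f"character {ch!r} is not valid at offset 64")
        converted.append(chr(q + 33))  # encode the same score at offset 33
    return "".join(converted)
```

For example, the offset-64 string `hh@B` (scores 40, 40, 0, 2) becomes `II!#` at offset 33.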


Illumina quality scores
