**nicolallias** · 12-01-2010, 02:04 AM

Hi,
The only difference between those formats is the quality encoding. If you are familiar with Perl, try the following:

strange Illumina txt format - SEQanswers

http://seqanswers.com/forums/showthread.php?t=5192

$q_line =~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;

Which could be written in Python as:
q_line = "".join(chr(max(ord(c) - 31, 33)) for c in q_line)

Or do you prefer an awk line?
Or a full ready-to-use script?
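
For readers who want something closer to a ready-to-use script, here is a minimal Python sketch of the same idea. It assumes plain four-line FASTQ records (no wrapped sequence or quality lines) with Illumina 1.3+ qualities (ASCII offset 64). The names `illumina_to_sanger` and `convert_fastq` are illustrative, not from any library. Like the Perl `tr` above, it subtracts 31 from each quality character and clamps anything below `@` to `!`.

```python
from typing import Iterable, Iterator

SHIFT = 64 - 33  # Illumina 1.3+ offset minus Sanger offset = 31


def illumina_to_sanger(qual: str) -> str:
    """Re-encode one Phred+64 quality string as Phred+33."""
    # Characters below '@' are not valid Phred+64, so clamp them to '!' (Q0).
    return "".join(chr(max(ord(c) - SHIFT, 33)) for c in qual)


def convert_fastq(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTQ lines (without newlines), re-encoding every 4th line."""
    for i, line in enumerate(lines):
        line = line.rstrip("\r\n")
        # Lines 4, 8, 12, ... of a four-line FASTQ file hold the qualities.
        yield illumina_to_sanger(line) if i % 4 == 3 else line
```

For example, `illumina_to_sanger("hhB@")` gives `II#!`. To convert a file, looping over `convert_fastq(open("in.fastq"))` and printing each line should be enough, where `in.fastq` is a placeholder name.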

**maubp** · 12-01-2010, 02:17 AM

Originally posted by angelpie:

Could you please help me?

See http://en.wikipedia.org/wiki/FASTQ_format and:

Cock et al. (2009) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, http://dx.doi.org/10.1093/nar/gkp1137

From the point of view of conversion, FASTQ files from Illumina 1.5 are basically the same as those from Illumina 1.3 and 1.4, except for the meaning of some low quality scores; see:

Illumina FASTQ Quality Scores - Missing Value - SEQanswers

http://seqanswers.com/forums/showpost.php?p=17491&postcount=3

http://news.open-bio.org/news/2010/04/illumina-q2-trim-fastq/
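
To make the encodings concrete, here is a small sketch that decodes a quality string into numeric Phred scores. Illumina 1.3 to 1.5 files use ASCII offset 64 and Sanger files use offset 33. The function name `decode_phred` is illustrative only. In Illumina 1.5+, a score of 2 (the character `B`) is used as a read segment quality indicator rather than a real error estimate, which is the "meaning of some low qualities" mentioned above; the arithmetic of the conversion is unchanged.

```python
from typing import List


def decode_phred(qual: str, offset: int) -> List[int]:
    """Return the numeric Phred score for each quality character."""
    return [ord(c) - offset for c in qual]


# The same two scores (40 and 2) written in each encoding:
illumina_scores = decode_phred("hB", 64)  # Illumina 1.3+ (offset 64)
sanger_scores = decode_phred("I#", 33)    # Sanger (offset 33)
```

Both calls decode to the same scores, so `hB` in an Illumina 1.5 file and `I#` in a Sanger file carry identical information.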

**angelpie** · 12-01-2010, 03:46 AM

Thank you, nicolallias and maubp, for your quick replies.

Although I think I understand the formulae behind these formats,
I don't know how to convert between them,
because I am just a user of existing scripts/programs.

Can I use the procedures for Illumina v1.3 to convert Illumina v1.5+ files?

I tried to use the Perl script in the referenced thread.
However, I got errors.

"Or a full ready-to-use script?"

If possible, please show me how.

**maubp** · 12-01-2010, 03:58 AM

Originally posted by angelpie:

Can I use the procedures for Illumina v1.3 to convert Illumina v1.5+ files?

Yes.

Originally posted by angelpie:

I am just a user of existing scripts/programs.

Try EMBOSS seqret if you want a command line tool for converting file formats. Use fastq-illumina as the input format, fastq-sanger as the output format.

If you are happier with Python, Perl, Java, or Ruby, then try Biopython, BioPerl, BioJava or BioRuby, which provide existing libraries for reading, writing and converting FASTQ files (see the paper I linked to before).
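
As an illustration of the library route, here is a short sketch using Biopython's `Bio.SeqIO.convert` with the same format names as for seqret. Biopython is a third-party package that must be installed separately; the wrapper name `fastq_illumina_to_sanger` and the guarded import are assumptions made for this sketch, not part of Biopython.

```python
try:
    from Bio import SeqIO  # Biopython, a third-party package
except ImportError:  # keep the sketch importable without Biopython
    SeqIO = None


def fastq_illumina_to_sanger(in_handle, out_handle) -> int:
    """Convert Illumina 1.3+ FASTQ to Sanger FASTQ; return the record count."""
    if SeqIO is None:
        raise ImportError("Biopython is required for this sketch")
    # "fastq-illumina" is Phred+64 input, "fastq-sanger" is Phred+33 output.
    return SeqIO.convert(in_handle, "fastq-illumina", out_handle, "fastq-sanger")
```

`SeqIO.convert` also accepts file names in place of open handles, for example `SeqIO.convert("in.fastq", "fastq-illumina", "out.fastq", "fastq-sanger")`, where both file names are placeholders.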

Originally posted by angelpie:

I tried to use the Perl script in the referenced thread.
However, I got errors.

What errors?

**angelpie** · 12-01-2010, 04:42 AM

The error messages said:
"Use of uninitialized value $(variables) in concatenation (.) or string at .....".

**maubp** · 12-01-2010, 05:42 AM

I don't know enough Perl to help you, but I don't think nicolallias' example was standalone; it was more of a hint for someone familiar with Perl.

Do you have EMBOSS installed? The EMBOSS tool seqret is an easy way to do this at the command line.

**nicolallias** · 12-01-2010, 06:02 AM

Originally posted by maubp:

it was more of a hint for someone familiar with Perl

Exactly, and using existing tools is your best option.
angelpie: if you wish to learn more, you really should read the Wikipedia page on the FASTQ format.

**epigen** · 03-01-2011, 07:51 AM

convert Illumina scores to Phred in a BAM file

If you already have a BAM file, you can transform the scores in it as follows:

samtools view -h Illumina_score.bam | perl -lane '$"="\t"; if (/^@/) {print;} else {$F[10]=~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;print "@F"}' | samtools view -Sbh - > Phred_score.bam

Thanks, nicolallias, for providing this very efficient trick. It saved us a lot of FASTQ file conversions, and we did not have to rerun all the BWA alignments.
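
For anyone who prefers Python for the middle step of that pipeline, here is a rough equivalent working on SAM text lines. The function names are illustrative. One deliberate difference from the Perl one-liner, and an assumption of this sketch: a QUAL field of `*` (no qualities stored) is left untouched rather than re-encoded.

```python
from typing import Iterable, Iterator


def illumina_to_sanger(qual: str) -> str:
    """Shift Phred+64 characters to Phred+33, clamping low values to '!'."""
    return "".join(chr(max(ord(c) - 31, 33)) for c in qual)


def convert_sam(lines: Iterable[str]) -> Iterator[str]:
    """Yield SAM lines with the QUAL column (field 11) re-encoded."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.split("\t")
        # Assumption beyond the Perl one-liner: leave '*' (no stored
        # qualities) untouched instead of re-encoding it.
        if len(fields) > 10 and fields[10] != "*":
            fields[10] = illumina_to_sanger(fields[10])
        yield "\t".join(fields)
```

In a script, passing `sys.stdin` to `convert_sam` and printing each result would let it take the place of the `perl -lane` step between the two `samtools view` calls.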
