  • angelpie
    Junior Member
    • Nov 2010
    • 8

    converting_fastq_file

    I have Illumina v1.5+ FASTQ files.

    I have learned that there are four FASTQ variants: Sanger, Illumina v1.0, v1.3 and v1.5.
    I have also learned that many programs require Sanger-type FASTQ files.

    I can occasionally find methods for converting Illumina v1.0 (Solexa) to Sanger.
    However, I can't find an actual procedure for converting Illumina v1.5 to Sanger.

    Could you please help me?
  • nicolallias
    Member
    • Jan 2010
    • 23

    #2
    Hi,
    The only difference between those formats is the quality encoding. If you are familiar with Perl, try the following:
    Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc

    $q_line =~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;

    Which could be written in Python (for valid Phred+64 characters, with the trailing newline stripped first) as
    q_line = "".join([chr(ord(i)-31) for i in q_line])

    Or do you prefer an awk line ?
    Or a full script ready-to-use ?
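For readers who want the full script mentioned above, here is a minimal sketch in Python. It assumes plain four-line FASTQ records (no wrapped sequence or quality lines) and Phred+64 input, as in Illumina 1.3+ and 1.5+ files. The function names are illustrative and do not come from the thread. Characters below '@' are clamped to '!' to mirror the Perl tr/// above.

```python
#!/usr/bin/env python3
"""Sketch: convert Illumina 1.3+/1.5+ FASTQ (Phred+64) to Sanger (Phred+33)."""
import sys

ILLUMINA_OFFSET = 64  # ASCII offset used by Illumina 1.3+ and 1.5+
SANGER_OFFSET = 33    # ASCII offset used by Sanger FASTQ


def illumina_to_sanger(qual):
    """Re-encode one quality string from Phred+64 to Phred+33."""
    out = []
    for ch in qual:
        score = ord(ch) - ILLUMINA_OFFSET
        if score < 0:
            # Not valid Phred+64; clamp to Q0 like the Perl tr/// does.
            score = 0
        out.append(chr(score + SANGER_OFFSET))
    return "".join(out)


def convert_fastq(src, dst):
    """Copy FASTQ text from src to dst, re-encoding every fourth line.

    Assumes strict four-line records: header, sequence, '+', quality.
    """
    for line_number, line in enumerate(src):
        if line_number % 4 == 3:
            line = illumina_to_sanger(line.rstrip("\r\n")) + "\n"
        dst.write(line)


if __name__ == "__main__":
    convert_fastq(sys.stdin, sys.stdout)
```

Saved under a name of your choice (say `illumina_to_sanger.py`, a hypothetical name), it would be run as `python illumina_to_sanger.py < in.fastq > out.fastq`.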

    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #3
      Originally posted by angelpie
      Could you please help me?
      See http://en.wikipedia.org/wiki/FASTQ_format and:

      Cock et al (2009) The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, http://dx.doi.org/10.1093/nar/gkp1137

      From the point of view of conversion, FASTQ files from Illumina 1.5 are basically the same as those from Illumina 1.3 and 1.4, except for the meaning of some low quality scores; see:

      Discussion of next-gen sequencing related bioinformatics: resources, algorithms, open source efforts, etc
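Before converting, it can help to check which offset a file actually uses. The sketch below is a heuristic only, based on the character ranges described on the Wikipedia FASTQ page: any quality character below ';' (ASCII 59) cannot occur in a Solexa or Illumina 1.3 to 1.5 file, so its presence implies Phred+33. The function name is hypothetical, and a Sanger file containing only high qualities would be misreported as Phred+64.

```python
def guess_fastq_offset(quality_lines):
    """Guess the ASCII offset (33 or 64) from observed quality characters.

    quality_lines: an iterable of quality strings (the 4th line of each record).
    Returns 33 if any character rules out the +64 variants, otherwise 64.
    """
    lowest = min(
        (ord(ch) for line in quality_lines for ch in line.strip()),
        default=None,
    )
    if lowest is None:
        raise ValueError("no quality characters to inspect")
    if lowest < 59:  # below ';', the lowest character in any +64 variant
        return 33
    return 64  # consistent with Solexa / Illumina 1.3 to 1.5; not proof
```

If this returns 64 for an Illumina 1.5 file, the same shift of 31 used for Illumina 1.3 applies; only the interpretation of the lowest scores differs.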

      Last edited by maubp; 12-01-2010, 02:17 AM. Reason: making DOI into a link

      • angelpie
        Junior Member
        • Nov 2010
        • 8

        #4
        Thank you, nicolallias and maubp, for your quick replies.

        Although I think I understand the formulae for these formats,
        I don't know how to convert between them
        because I am just a user of existing scripts/programs.

        Can I use the procedures for Illumina v1.3 to convert Illumina v1.5+ files?

        I tried to use the Perl script in the referred thread.
        However, I got errors.

        Or a full script ready-to-use ?
        If possible, please teach me.
        Last edited by angelpie; 12-01-2010, 03:48 AM.

        • maubp
          Peter (Biopython etc)
          • Jul 2009
          • 1544

          #5
          Originally posted by angelpie
          Can I use the procedures for Illumina v1.3 to convert Illumina v1.5+ files?
          Yes.


          Originally posted by angelpie
          I am just a user of existing scripts/programs.
          Try EMBOSS seqret if you want a command line tool for converting file formats. Use fastq-illumina as the input format, fastq-sanger as the output format.

          If you are happier with Python, Perl, Java, or Ruby then try Biopython, BioPerl, BioJava or BioRuby for existing libraries for reading, writing and converting FASTQ files (see the paper I linked to before).
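As a minimal sketch of the Biopython route, assuming Biopython is installed: `Bio.SeqIO.convert` takes an input file, input format, output file and output format, and the format names are the same `fastq-illumina` and `fastq-sanger` used with seqret above. The wrapper function name and the file paths are placeholders, not from the thread.

```python
from Bio import SeqIO  # requires Biopython


def illumina_fastq_to_sanger(in_path, out_path):
    """Convert a Phred+64 FASTQ file to Sanger FASTQ.

    Returns the number of records written, as reported by SeqIO.convert.
    """
    return SeqIO.convert(in_path, "fastq-illumina", out_path, "fastq-sanger")
```

For example, `illumina_fastq_to_sanger("reads_illumina.fastq", "reads_sanger.fastq")` would write the converted file and return the record count.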

          Originally posted by angelpie
          I tried to use the Perl script in the referred thread.
          However, I got errors.
          What errors?

          • angelpie
            Junior Member
            • Nov 2010
            • 8

            #6
            The error messages said:
            "Use of uninitialized value $(variables) in concatenation (.) or string at .....".

            • maubp
              Peter (Biopython etc)
              • Jul 2009
              • 1544

              #7
              I don't know enough Perl to help you - but I don't think nicolallias' example was standalone, it was more of a hint for someone familiar with Perl.

              Do you have EMBOSS installed? The EMBOSS tool seqret is an easy way to do this at the command line.

              • nicolallias
                Member
                • Jan 2010
                • 23

                #8
                Originally posted by maubp
                it was more of a hint for someone familiar with Perl
                Exactly, and using existing tools is your best option.
                angelpie: if you wish to learn more, you really should visit the Wikipedia page on the FASTQ format.

                • epigen
                  Senior Member
                  • May 2010
                  • 101

                  #9
                  convert Illumina scores to Phred in a BAM file

                  If you already have a BAM file, you can transform the scores in it as follows:

                  samtools view -h Illumina_score.bam | perl -lane '$"="\t"; if (/^@/) {print;} else {$F[10]=~ tr/\x40-\xff\x00-\x3f/\x21-\xe0\x21/;print "@F"}' | samtools view -Sbh - > Phred_score.bam

                  Thanks nicolallias for providing the very efficient trick. It saved us a lot of fastq file transformations and we did not have to run all the BWA alignments again.
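The same filter can be sketched in Python as a drop-in replacement for the Perl step in the pipeline above. It assumes SAM text on standard input with tab-separated fields and Phred+64 qualities in the 11th column; the function names are illustrative. Unlike `perl -lane`, it splits on tabs only, so optional fields containing spaces are left untouched, and a QUAL of `*` (no qualities stored) is passed through.

```python
import sys


def shift_qual(qual):
    """Re-encode a Phred+64 quality string as Phred+33 ('*' means absent)."""
    if qual == "*":
        return qual
    # Clamp anything below '@' to Q0, like the Perl tr/// does.
    return "".join(chr(max(ord(ch) - 64, 0) + 33) for ch in qual)


def convert_sam_line(line):
    """Return a SAM line with its QUAL field (column 11) re-encoded."""
    if line.startswith("@"):
        return line  # header lines pass through unchanged
    fields = line.rstrip("\n").split("\t")
    fields[10] = shift_qual(fields[10])
    return "\t".join(fields) + "\n"


if __name__ == "__main__":
    for sam_line in sys.stdin:
        sys.stdout.write(convert_sam_line(sam_line))
```

With the script saved under a hypothetical name such as `sam_qual_to_sanger.py`, the pipeline would read: `samtools view -h Illumina_score.bam | python sam_qual_to_sanger.py | samtools view -Sbh - > Phred_score.bam`.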
