  • ALLPATHS-LG: Using different quality scorings

    I'm creating my .csv input files for a large assembly in ALLPATHS-LG, and I recall someone telling me that ALLPATHS needs special attention when the libraries use different Phred quality encodings. How do I incorporate that into my input files?

    Illumina 1.5 = phred 64?
    Illumina 1.9 = phred 33?

    I'm currently using a .ppt by Mike Schatz as a reference, as this is my first use of ALLPATHS and it is a bit different from other assemblers I have used in the past.

  • #2
    Wow! 140+ views thus far...my question must be pretty "off the beaten path".

    I found a reference to this on the Trinity mailing list of all places:


    Excerpt:
    The PHRED+33 vs +64 issue is known. Sanger uses +33, and Illumina uses +64
    (1.3+, 1.5+). However, Illumina 1.8+ uses phred+33 again
    (http://en.wikipedia.org/wiki/Fastq). Since that's the default in our
    environment here, it has become the default for FastqToFastbQualb. In most
    situations one can auto-detect which is which, but not in general. Therefore
    the user must specify PHRED_64=True if that is the case.
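
The excerpt's point that the offset can usually, but not always, be auto-detected can be illustrated with a minimal sketch. It looks only at the range of ASCII codes in the quality lines. The function name `guess_phred_offset` and the cutoffs (59 as the lowest character seen in +64 style files, 74 / 'J' as the usual top of Illumina 1.8+ Phred+33 data) are assumptions for illustration, not anything ALLPATHS-LG itself does.

```python
def guess_phred_offset(quality_lines):
    """Guess the quality offset from FASTQ quality strings.

    Returns 33, 64, or None when the observed range fits both encodings.
    Hypothetical helper: the cutoffs below are rules of thumb, not a standard.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None  # nothing to inspect
    lowest, highest = min(codes), max(codes)
    if lowest < 59:
        # Characters below ';' (59) cannot occur in +64 style files.
        return 33
    if highest > 74:
        # Above 'J' (74) is unusual for Illumina Phred+33 data (assumption).
        return 64
    # Everything falls in the overlap, so the data alone cannot decide.
    return None
```

The `None` case is the situation the excerpt describes: a sample whose characters all fall between ';' and 'J' is consistent with either encoding, so the user still has to say which one applies (for ALLPATHS-LG, via PHRED_64=True as quoted above).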


    • #3
      Looks like the best bet here is to simply convert the old Illumina 1.5 Phred 64 scored files to the newer Illumina 1.9 Phred 33 scoring.

      LOOKS...like bbmap will do this, but I'll have to try it in a bit.

      Maybe (using reformat.sh from the BBMap package): reformat.sh in=old-phred.fastq out=phred-new.fastq qin=64 qout=33
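
For reference, the conversion itself is just a fixed shift of every quality character by 31 (64 - 33). Below is a minimal sketch of that idea, assuming plain four-line FASTQ records and true Phred+64 input (not the older Solexa scale). The function name `convert_phred64_to_phred33` is hypothetical; for files of this size a dedicated tool will be far faster.

```python
def convert_phred64_to_phred33(src, dst):
    """Copy four-line FASTQ records from src to dst, re-encoding qualities.

    src and dst are text-mode file-like objects. Assumes every record is
    exactly four lines (no wrapped sequences) and Phred+64 input.
    """
    for line_number, line in enumerate(src):
        if line_number % 4 == 3:  # fourth line of each record holds qualities
            shifted = []
            for ch in line.rstrip("\n"):
                code = ord(ch) - 31  # 64 - 33 = 31
                if code < 33:
                    # Below '!' means the input was not Phred+64 after all.
                    raise ValueError(
                        f"quality character {ch!r} is too low for Phred+64"
                    )
                shifted.append(chr(code))
            line = "".join(shifted) + "\n"
        dst.write(line)
```

Usage would look like opening the old file for reading and a new file for writing, then calling `convert_phred64_to_phred33(old_handle, new_handle)`. The `ValueError` acts as a guard against running the shift on a file that is already Phred+33.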


      • #4
        So, I ended up trying seqtk as a colleague had suggested that first...

        seqtk seq -VQ64 s1_R1_PE.fastq > s1_R1_PE_Phred33.fastq

        A quick check of the resulting file with FastQC seems to confirm the change:

        ##FastQC 0.11.1
        >>Basic Statistics pass
        #Measure Value
        Filename s1_R1_PE_Phred33.fastq
        File type Conventional base calls
        Encoding Sanger / Illumina 1.9
        Total Sequences 98578990
        Sequences flagged as poor quality 0
        Sequence length 101
        %GC 38
        >>END_MODULE

        However, the Phred64 file must have contained a LOT more text, as the file size diminished significantly as a result of the conversion:

        -rw-r--r-- 1 jpummil jpummil 27G Mar 25 2014 s1_R1_PE.fastq
        -rw-rw-r-- 1 jpummil jpummil 23G Oct 8 14:12 s1_R1_PE_Phred33.fastq
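
A note on the size difference: both encodings store exactly one character per base, so the offset change alone should not shrink the file. With about 98.6 million reads, a 4G difference works out to a few tens of bytes per read. One way to see where those bytes went is to total the bytes by line type in each file, as in the sketch below. It assumes four-line records and ASCII content, and the function name `bytes_per_line_type` is made up for illustration.

```python
def bytes_per_line_type(handle, max_records=100000):
    """Total the characters in each of the four FASTQ line types.

    handle is a text-mode file-like object. Only the first max_records
    records are read, which is enough for a before/after comparison.
    """
    names = ("header", "sequence", "plus", "quality")
    totals = {name: 0 for name in names}
    for line_number, line in enumerate(handle):
        if line_number >= 4 * max_records:
            break
        totals[names[line_number % 4]] += len(line)
    return totals
```

Running this on the first records of s1_R1_PE.fastq and s1_R1_PE_Phred33.fastq and comparing the two dictionaries would show which line type accounts for the difference. If the "quality" totals match and the "plus" totals do not, the original file was repeating the read name on the "+" line and the converted one is not.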
