  • What type of Phred quality scores does samtools 1.3 expect?

    I am working with NGS data that was generated in 2012 on an Illumina HiSeq 1000. I have had some trouble getting samtools to accurately identify heterozygotes and to phase BAM files generated from these reads. I am wondering whether these problems are a result of samtools misreading the Phred scores in the FASTQ/BAM files. Does anyone know which Phred-score encoding samtools 1.3 expects by default?

    e.g.
    Sanger/Illumina 1.8 (ASCII 33 to 126)
    Solexa/Illumina 1.0 (ASCII 59 to 126)
    Illumina 1.3 (ASCII 64 to 126)
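
    For reference, the ranges above can be turned into a rough check on a sample of quality strings. This is only a sketch: the function name and the returned labels are made up for illustration, and a sample whose lowest character is 64 or higher could still be high-quality phred+33 data, so the last case is a guess rather than a certainty.

    ```python
    def guess_phred_encoding(quality_strings):
        """Guess the quality encoding from the lowest ASCII code seen.

        Labels are illustrative only:
          "phred+33"  - Sanger / Illumina 1.8 (characters below ASCII 59 occur)
          "solexa+64" - Solexa / Illumina 1.0 (lowest character in 59..63)
          "phred+64"  - Illumina 1.3 (lowest character is 64 or higher; a guess)
        """
        codes = [ord(ch) for qual in quality_strings for ch in qual]
        if not codes:
            raise ValueError("no quality characters to inspect")
        lowest = min(codes)
        if lowest < 59:
            return "phred+33"
        if lowest < 64:
            return "solexa+64"
        # Ambiguous in principle: phred+33 reads with every score >= 31
        # would also land here, so inspect plenty of reads before trusting it.
        return "phred+64"


    if __name__ == "__main__":
        print(guess_phred_encoding(["IIII5#", "IIIIII"]))
    ```

    Feeding it the fourth line of a few thousand FASTQ records is usually enough to see low-quality characters if they exist.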

  • #2
    All SAM/BAM/CRAM files use the Sanger/Illumina 1.8 encoding (a.k.a. "phred+33"). If you aligned phred+64 FASTQ files and didn't tell your aligner to fix the Phred scores, then you're going to have problems. I happen to have written a small conversion program in case you have done this (https://github.com/dpryan79/Answers/...iostars_133825).
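
    To make the offset issue concrete, here is a minimal sketch of re-encoding phred+64 quality strings as phred+33 at the FASTQ level, before alignment. It is not the linked program. The function names are hypothetical, it assumes plain four-line FASTQ records, and it assumes Illumina 1.3-style phred+64 input (Solexa scores use a different scale and would need a different conversion).

    ```python
    def phred64_to_phred33(quality):
        """Re-encode one phred+64 quality string as phred+33."""
        converted = []
        for ch in quality:
            score = ord(ch) - 64  # recover the numeric Phred score
            if score < 0:
                # A character below '@' cannot occur in phred+64 data.
                raise ValueError(f"{ch!r} is not a valid phred+64 character")
            converted.append(chr(score + 33))
        return "".join(converted)


    def convert_fastq_lines(lines):
        """Yield FASTQ lines with every 4th line (the qualities) re-encoded.

        Assumes strict four-line records: header, sequence, '+', qualities.
        """
        for index, line in enumerate(lines):
            text = line.rstrip("\n")
            if index % 4 == 3:
                text = phred64_to_phred33(text)
            yield text + "\n"


    if __name__ == "__main__":
        record = ["@read1\n", "ACGT\n", "+\n", "hhhB\n"]
        print("".join(convert_fastq_lines(record)), end="")
    ```

    Each character simply shifts down by 31, so a Q40 base goes from `h` to `I`. The ValueError acts as a guard: if it fires, the input was probably already phred+33 and should not be converted again.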