I am working with next-generation sequencing (NGS) data that was generated in 2012 on an Illumina HiSeq 1000. I have had trouble getting samtools to accurately identify heterozygotes and to phase BAM files generated from these reads. I am wondering whether these problems result from samtools misreading the Phred scores in the FASTQ/BAM files. Does anyone know which Phred-score encoding samtools 1.3 expects by default?
e.g.
Sanger/Illumina 1.8 (ASCII 33 to 126)
Solexa/Illumina 1.0 (ASCII 59 to 126)
Illumina 1.3 (ASCII 64 to 126)
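
To narrow down which of these encodings my FASTQ files actually use, the range of ASCII codes in the quality lines can be inspected. Below is a minimal sketch of that check, not a definitive tool: the function names, the `reads.fastq` path, and the returned labels are my own placeholders, and it assumes plain four-line FASTQ records. It is only a heuristic, since a file containing exclusively high-quality bases cannot be told apart by range alone.

```python
import gzip
from itertools import islice


def fastq_qualities(path, max_reads=10000):
    """Yield quality strings from a 4-line-per-record FASTQ file (hypothetical helper)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        # The quality string is the 4th line of each record.
        for line in islice(handle, 3, 4 * max_reads, 4):
            yield line.rstrip("\n")


def guess_phred_encoding(quality_lines):
    """Guess the encoding from the lowest ASCII code seen in the quality strings."""
    lowest = None
    for qual in quality_lines:
        qual = qual.strip()
        if not qual:
            continue
        line_min = min(ord(ch) for ch in qual)
        lowest = line_min if lowest is None else min(lowest, line_min)
    if lowest is None:
        return None  # no quality data seen
    if lowest < 59:
        return "phred33"   # Sanger / Illumina 1.8 (ASCII 33 to 126)
    if lowest < 64:
        return "solexa64"  # Solexa / Illumina 1.0 (ASCII 59 to 126)
    # Ambiguous: consistent with Illumina 1.3, but also with any
    # encoding if only high-quality bases were sampled.
    return "phred64"


# Example usage (assumes a file named reads.fastq exists):
# print(guess_phred_encoding(fastq_qualities("reads.fastq")))
```

If the lowest code observed is below 59, the file can only be Phred+33; anything else is a best guess and worth confirming against the sequencer's pipeline version.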