Hi everybody,
I have a question concerning the SAMtools pileup format. I performed SNP (and indel) calling on 75 bp paired-end Illumina data aligned to hg19, following the protocol explained here.
I noticed in the filtered output the following SNP call:
chr3 57627500 t C 241 241 60 71 C$c$c$c$ccccccccccccccCccccccCcccccccccccccccccccccccccccccccccccccccccccc^]c BCCCCCCCCCCCCCCCCCBCCCCCCBCCCCC@CCCCCCCCCC?C=CCCCCCCCC>CC?CBCCCBCC@CCC@
Why is the reference base (t) written in lower case? I read that in some of MAQ's tools (e.g. cns2fq) "bases in lower case are essentially repeats or do not have sufficient coverage; bases in upper case indicate regions where SNPs can be reliably called."
I doubt that this explanation applies here: the coverage looks fine (71), the SNP appears on both strands, the alignments are reliable (RMS MQ = 60), and, according to UCSC, the position where the SNP is called has quite good mappability.
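To make the numbers above easier to check, here is a minimal Python sketch of how I am reading the line. The column names in `FIELDS` are my assumption about the 10-column consensus pileup layout (they are not taken from the output itself), and `parse_pileup_line` and `count_strands` are just illustrative helper names. Upper-case letters and `.` are counted as forward-strand reads, lower-case letters and `,` as reverse-strand reads.

```python
import re

# Assumed column layout of a consensus pileup SNP line (my reading, not authoritative).
FIELDS = ("chrom", "pos", "ref", "consensus", "cons_qual",
          "snp_qual", "rms_mapq", "depth", "bases", "quals")

# The SNP line from the post, copied as-is.
EXAMPLE = ("chr3 57627500 t C 241 241 60 71 "
           "C$c$c$c$ccccccccccccccCccccccCcccccccccccccccccccccccccccccccccccccccccccc^]c "
           "BCCCCCCCCCCCCCCCCCBCCCCCCBCCCCC@CCCCCCCCCC?C=CCCCCCCCC>CC?CBCCCBCC@CCC@")


def parse_pileup_line(line):
    """Split one pileup line into named fields, converting numeric columns to int."""
    rec = dict(zip(FIELDS, line.split()))
    for key in ("pos", "cons_qual", "snp_qual", "rms_mapq", "depth"):
        rec[key] = int(rec[key])
    return rec


def count_strands(bases):
    """Return (forward, reverse) read counts from the read-bases column."""
    forward = reverse = 0
    i = 0
    while i < len(bases):
        ch = bases[i]
        if ch == "^":
            # Read-start marker; the next character encodes mapping quality.
            i += 2
            continue
        if ch in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            digits = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)
            continue
        if ch == "." or ch.isupper():
            forward += 1
        elif ch == "," or ch.islower():
            reverse += 1
        # '$' (read end) and '*' (deletion placeholder) are skipped.
        i += 1
    return forward, reverse


rec = parse_pileup_line(EXAMPLE)
fwd, rev = count_strands(rec["bases"])
print(rec["ref"], rec["consensus"], rec["depth"], rec["rms_mapq"], fwd, rev)
```

With this reading, a non-zero count in both `fwd` and `rev` is what I mean by the SNP appearing on both strands.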
Additionally, indel lines have more than 13 columns. Does anybody know what the additional 14th and 15th columns mean?
Any hint/help will be greatly appreciated!
Best regards