Hello,
Does anyone have tips for genotyping longer indels? I have a test BAM that contains a 316 base deletion. I've been unable to get this deletion called using UnifiedGenotyper, HaplotypeCaller or samtools / bcftools. I've tried a variety of options in each, but here are some examples:
java -Xmx4G -jar GenomeAnalysisTK.jar -R GRCh37-lite.fa -T UnifiedGenotyper -I small_test.bam -o raw.vcf --genotype_likelihoods_model INDEL -rf BadCigar -L 11:7714903-7718903 --max_deletion_fraction 2 -stand_emit_conf 1.0
java -Xmx4G -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R GRCh37-lite.fa -I small_test.bam -L 11:7714903-7718903 -o hc.abra.vcf
samtools mpileup -B -u -f GRCh37-lite.fa small_test.bam | bcftools view -vcg - > raw.samtools.vcf
Looking at the mpileup output, the deletion is clearly there and always appears at the same position. Base qualities surrounding the deletion are high as are the mapping qualities. An IGV screenshot is attached.
The Isaac Variant Caller (starling) does call this deletion. However, the reported allele depth is extremely low (AD is 0,13 in the record below, even though the indel is observed in 123 reads).
11 7716903 . TAAGAATGGATGAGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCCGGCTAAAACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGAGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAA T 624 PASS CIGAR=1M316D;RU=.;REFREP=1;IDREP=0 GT:GQ:GQX:DPI:AD 1/1:62:59:119:0,13
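For anyone who wants to check the supporting-read count independently of a caller, here is a minimal sketch that tallies long deletions straight from the CIGAR strings in SAM text. The function name `long_deletions` and the 300-base threshold are my own illustrative choices, not part of GATK, samtools or Isaac, and the sketch assumes plain SAM lines as input (for example, the output of `samtools view` on the region above).

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def long_deletions(sam_lines, min_len=300):
    """Count reads per (chrom, VCF-style POS, deletion length).

    POS is the 1-based coordinate of the base immediately before the
    deleted bases, which is how VCF anchors a deletion.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue                      # unaligned or malformed record
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        ref_pos = pos                     # next reference base to be consumed
        for length, op in CIGAR_OP.findall(cigar):
            length = int(length)
            if op == "D" and length >= min_len:
                counts[(chrom, ref_pos - 1, length)] += 1
            if op in "MDN=X":             # operations that consume reference
                ref_pos += length
    return counts
```

Calling `long_deletions(sys.stdin)` on the piped `samtools view` output gives one counter entry per distinct deletion, so a single dominant key would confirm that the deletion always appears at the same position.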
Is anyone aware of tweaks that can be made to the above callers to handle this case properly? Are there other good software alternatives out there?
Thanks!