  • mozack
    Junior Member
    • Jun 2013
    • 8

    Genotyping longer indels

    Hello,

    Does anyone have tips for genotyping longer indels? I have a test BAM that contains a 316-base deletion. I've been unable to get this deletion called using UnifiedGenotyper, HaplotypeCaller, or samtools / bcftools. I've tried a variety of options in each, but here are some examples:

    java -Xmx4G -jar GenomeAnalysisTK.jar -R GRCh37-lite.fa -T UnifiedGenotyper -I small_test.bam -o raw.vcf --genotype_likelihoods_model INDEL -rf BadCigar -L 11:7714903-7718903 --max_deletion_fraction 2 -stand_emit_conf 1.0

    java -Xmx4G -jar GenomeAnalysisTK.jar -T HaplotypeCaller -R GRCh37-lite.fa -I small_test.bam -L 11:7714903-7718903 -o hc.abra.vcf

    samtools mpileup -B -u -f GRCh37-lite.fa small_test.bam | bcftools view -vcg - > raw.samtools.vcf

    Looking at the mpileup output, the deletion is clearly there and always appears at the same position. Base qualities surrounding the deletion are high as are the mapping qualities. An IGV screenshot is attached.
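One way to put a number on "the deletion is clearly there" is to count the reads whose CIGAR carries a long deletion. This is only a minimal sketch: it assumes plain SAM text as produced by `samtools view`, and the function name `long_deletions` and the `min_len` threshold are made up for illustration.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def long_deletions(sam_lines, min_len=50):
    """Count reads per (first deleted reference base, deletion length).

    sam_lines is any iterable of SAM text lines; header lines and
    unaligned records are skipped.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[5] == "*":
            continue
        ref_pos = int(fields[3])  # 1-based leftmost mapped position
        for length, op in CIGAR_OP.findall(fields[5]):
            length = int(length)
            if op == "D" and length >= min_len:
                counts[(ref_pos, length)] += 1
            if op in "MDN=X":  # operations that consume the reference
                ref_pos += length
    return counts
```

Piping `samtools view small_test.bam 11:7714903-7718903` into a script that calls `long_deletions(sys.stdin)` would list each deletion start and length with its supporting read count. Note that the reported start is the first deleted base, which is one past the POS a VCF record would use.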

    The Isaac Variant Caller (starling) does call this deletion. However, the reported alternate allele depth is extremely low (AD=0,13), even though the indel is observed in 123 reads:

    11 7716903 . TAAGAATGGATGAGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCCGGCTAAAACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGAGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAA T 624 PASS CIGAR=1M316D;RU=.;REFREP=1;IDREP=0 GT:GQ:GQX:DPI:AD 1/1:62:59:119:0,13
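To read the allele depth off a record like this without counting colons by eye, the FORMAT keys can be paired with the sample values. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the example record is a shortened made-up one (not the real call), and it assumes a single-sample VCF line with the usual ten columns.

```python
def sample_fields(vcf_line):
    """Pair FORMAT keys with the first sample's values for one VCF data line."""
    cols = vcf_line.split()  # tolerates tabs or the spaces left by copy/paste
    keys = cols[8].split(":")
    values = cols[9].split(":")
    if len(keys) != len(values):
        # Guards against a FORMAT column damaged in transit.
        raise ValueError(f"{len(keys)} FORMAT keys but {len(values)} sample values")
    return dict(zip(keys, values))

def deletion_length(vcf_line):
    """Number of reference bases removed (REF length minus ALT length)."""
    cols = vcf_line.split()
    return len(cols[3]) - len(cols[4])

# Hypothetical shortened record, for illustration only.
record = "11 100 . TACGT T 624 PASS CIGAR=1M4D GT:GQ:GQX:DPI:AD 1/1:62:59:119:0,13"
fields = sample_fields(record)
ref_depth, alt_depth = (int(x) for x in fields["AD"].split(","))
```

Here `alt_depth` is the number to compare against the count of reads that visibly carry the deletion.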


    Is anyone aware of tweaks that can be made to the above callers to handle this case properly? Are there other good software alternatives out there?

    Thanks!
  • swbarnes2
    Senior Member
    • May 2008
    • 910

    #2
    Short read technology is always going to have a hard time with deletions like that.

    If you provide the aligner with both alleles, then the aligner will correctly assign the reads to the right alleles.
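A rough sketch of what "providing both alleles" could look like: build a second copy of the local reference with the deletion applied, then align against both sequences. The function name and the toy sequence are hypothetical, and it assumes VCF-style coordinates (1-based POS, REF and ALT sharing the leading anchor base).

```python
def apply_variant(ref_seq, pos, ref_allele, alt_allele):
    """Return ref_seq with ref_allele at 1-based pos replaced by alt_allele."""
    start = pos - 1
    end = start + len(ref_allele)
    if ref_seq[start:end] != ref_allele:
        raise ValueError("REF allele does not match the reference at this position")
    return ref_seq[:start] + alt_allele + ref_seq[end:]

# Toy example: delete ACGT after the anchor base T at position 3.
alt_contig = apply_variant("GGTACGTCC", 3, "TACGT", "T")
```

Writing `alt_contig` out as an extra FASTA record next to the original region gives the aligner an ungapped target for reads that span the deletion.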

    That looks like an intron, not a deletion. Is that RNAseq data?

    • mozack
      Junior Member
      • Jun 2013
      • 8

      #3
      Yes, I understand that longer deletions are difficult for short read technology. However, IMHO the heavy lifting has already been done for this case. The alignments clearly reflect the deletion.

      This test case already includes both alleles. Isaac correctly reports a genotype of 1/1. However, the alternate allele depth (AD) is a rather low 13. I would expect it to be closer to the number of reads containing the deletion.

      This is DNA data.

      Thanks...

      • mozack
        Junior Member
        • Jun 2013
        • 8

        #4
        Here's a little more info in case anyone else is interested:

        HaplotypeCaller does call this deletion (once activeRegionMaxSize is increased - thanks to the GATK team for the tip). However, it also calls 2 other large deletions that appear to be in conflict with the expected deletion.
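A quick way to flag calls like these is to compare the reference intervals each deletion removes. The sketch below takes "in conflict" to mean that the deleted intervals overlap, which is an assumption on my part; the function names and the toy calls are hypothetical and are not the actual HaplotypeCaller output.

```python
def deleted_interval(pos, ref, alt):
    """1-based inclusive interval of deleted bases, or None if not a simple deletion."""
    if len(ref) <= len(alt) or not ref.startswith(alt):
        return None
    return (pos + len(alt), pos + len(ref) - 1)

def conflicting_pairs(calls):
    """Return pairs of (pos, ref, alt) calls whose deleted intervals overlap."""
    spans = [(call, deleted_interval(*call)) for call in calls]
    spans = [(call, iv) for call, iv in spans if iv is not None]
    pairs = []
    for i in range(len(spans)):
        for j in range(i + 1, len(spans)):
            (a, (s1, e1)), (b, (s2, e2)) = spans[i], spans[j]
            if s1 <= e2 and s2 <= e1:  # closed intervals intersect
                pairs.append((a, b))
    return pairs

# Toy calls on one chromosome, for illustration only.
calls = [(100, "TACGT", "T"), (102, "CGTAA", "C"), (200, "GAT", "G")]
overlapping = conflicting_pairs(calls)
```

The pairwise loop is quadratic, which is fine for the handful of records in a small test region.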

        FreeBayes calls this deletion with accurate allele depth and no conflicting calls:

        11 7716903 . TAAGAATGGATGAGGCCGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGAGGCGGGCGGATCACGAGGTCAGGAGATCGAGACCATCCCGGCTAAAACGGTGAAACCCCGTCTCTACTAAAAATACAAAAAATTAGCCGGGCGTGGTGGCGGGCGCCTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGAGAGGCGGAGCTTGCAGTGAGCCGAGATCCCGCCACTGCACTCCAGCCTGGGCGACAGAGCGAGACTCCGTCTCAAAAAAAAAAAAAAAAAAAAAAA T 2113.61 . AB=0;ABP=0;AC=2;AF=1;AN=2;AO=122;CIGAR=1M316D;DP=122;DPB=5.9434;DPRA=0;EPP=3.29508;EPPR=0;GTI=0;HWE=-0;LEN=316;MEANALT=1;MQM=59.4262;MQMR=0;NS=1;NUMALT=1;ODDS=172.347;PAIRED=0.114754;PAIREDR=0;PAO=4.5;PQA=171.5;PQR=3404.5;PRO=90.5;QA=2780;QR=0;RO=0;RPP=267.93;RPPR=0;RUN=1;SAP=3.29508;SRP=0;TYPE=del;XAI=0.010374;XAM=0.0213936;XAS=0.0110196;XRI=0;XRM=0;XRS=0;technology.illumina=1;BVAR GT:DP:RO:QR:AO:QA:GL 1/1:122:0:0:122:2780:-5,-5,0
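For anyone checking the depth claim, the INFO column already carries the relevant counts (AO for alternate observations, DP for total depth). A minimal sketch for pulling them out follows; the helper names are hypothetical, the INFO string is a trimmed-down stand-in, and it assumes a single ALT allele so that AO is one integer.

```python
def parse_info(info):
    """Split a VCF INFO string into a dict; bare flags map to True."""
    out = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out

def alt_fraction(info):
    """Fraction of reads at the site that support the alternate allele."""
    fields = parse_info(info)
    return int(fields["AO"]) / int(fields["DP"])

# Trimmed-down INFO string, for illustration only.
info = "AC=2;AO=122;DP=122;LEN=316;TYPE=del;BVAR"
fraction = alt_fraction(info)
```

A fraction near 1 is what a homozygous deletion supported by essentially every spanning read should look like.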

        • m_two
          Member
          • Mar 2010
          • 50

          #5
          Use pindel: http://tvap.genome.wustl.edu/tools/pindel/

          Here is a related publication: "Detection of FLT3 internal tandem duplication in targeted, short-read-length, next-generation sequencing data."
