  • yvancouver
    Junior Member
    • Jan 2010
    • 4

    Allele frequency in sample below 10%

    Hello,

    In our pipeline we have a quality control step where 23 SNPs are called both with GATK and with real-time PCR (TaqMan). Lately, as the coverage depth increases, we have started to see cases where TaqMan calls a homozygous SNP and the high-throughput sequencing method calls a heterozygous SNP. This happens mostly where one allele is present at a frequency below 10%.

    I have difficulty imagining cases where the allele frequency within one individual is below 10%.

    Can someone help me to make sense of this?

    Thanks
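The intuition in the question can be given a rough number. Under an idealized model (an assumption, not something stated in the thread), reads at a true heterozygous site sample the two alleles independently with probability 0.5 each, so the alt read count is binomial. The sketch below computes how likely an alt fraction of 10% or less would be under that model; the function name and the depths used are illustrative only.

```python
from math import exp, lgamma, log


def prob_alt_fraction_at_most(depth: int, max_fraction: float, p_alt: float = 0.5) -> float:
    """P(alt_reads / depth <= max_fraction) when alt_reads ~ Binomial(depth, p_alt).

    Assumes independent, unbiased reads (0 < p_alt < 1). Real data violate this
    through capture bias, PCR duplicates, and mapping bias.
    """
    # Largest alt read count that still gives a fraction <= max_fraction.
    max_alt = int(max_fraction * depth + 1e-9)
    total = 0.0
    for k in range(max_alt + 1):
        # Log-space binomial pmf to avoid overflow at high depth.
        log_pmf = (
            lgamma(depth + 1) - lgamma(k + 1) - lgamma(depth - k + 1)
            + k * log(p_alt) + (depth - k) * log(1.0 - p_alt)
        )
        total += exp(log_pmf)
    return total


if __name__ == "__main__":
    for depth in (10, 30, 100):  # hypothetical coverage depths
        print(depth, prob_alt_fraction_at_most(depth, 0.10))
```

At low depth a skewed ratio can occur by chance, but the probability drops quickly as depth grows. Under this model, a stable sub-10% allele at high coverage is hard to explain as a simple germline heterozygote and points toward bias, artifacts, or a mixed sample.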
  • Brian Bushnell
    Super Moderator
    • Jan 2014
    • 2709

    #2
    What platform are you using, is it exome-capture or WGS, and how much was the DNA amplified? Biases can greatly reduce the rate of one allele.

    Of course, there are other possibilities like repeats, chimerism, and contamination that can cause odd allelic ratios.


    • yvancouver
      Junior Member
      • Jan 2010
      • 4

      #3
      Thanks for the reply.

      We are using an Illumina HiSeq with exome capture (Agilent SureSelect version 5). The lab technicians aim for 18 picomolar, and this particular run produced around 100 × 10^6 reads.

      I am not sure chimerism can explain this: only 1 or 3 SNPs out of 23 show this genotype, and this within a batch of 20 samples. I did not check whether the regions where the SNPs were called are repeat-rich. Will check.


      • Brian Bushnell
        Super Moderator
        • Jan 2014
        • 2709

        #4
        This can also be caused by errors in the reads. It's a good idea to remove duplicates if your data was PCR-amplified; this can reduce the rate of false-positive variants. Requiring a variant to be seen on both plus- and minus-mapped reads, and requiring the base to be called with some minimum average or maximum quality, are further filters that can reduce false positives caused by errors or biases. Called variants can also be mapping artifacts; you may want to try a different aligner and see whether you get the same results.
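The strand and quality filters described above can be sketched as a small check on per-site read counts. This is a minimal illustration, not the behavior of GATK or any other tool: the `SiteCounts` record, the field names, and the thresholds are all hypothetical, and the counts are assumed to have been extracted after duplicate removal.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SiteCounts:
    """Hypothetical per-site summary, assumed to be computed after duplicate removal."""
    ref_fwd: int  # reference-allele reads mapped to the plus strand
    ref_rev: int  # reference-allele reads mapped to the minus strand
    alt_fwd: int  # alternate-allele reads mapped to the plus strand
    alt_rev: int  # alternate-allele reads mapped to the minus strand
    mean_alt_base_quality: float  # mean Phred quality of the alt-supporting bases

    @property
    def depth(self) -> int:
        return self.ref_fwd + self.ref_rev + self.alt_fwd + self.alt_rev

    @property
    def alt_fraction(self) -> float:
        return (self.alt_fwd + self.alt_rev) / self.depth if self.depth else 0.0


def failed_filters(
    site: SiteCounts,
    min_alt_reads_per_strand: int = 1,   # illustrative threshold
    min_mean_alt_quality: float = 20.0,  # illustrative threshold
) -> List[str]:
    """Return the names of the filters this site fails (empty list means it passes)."""
    reasons = []
    if site.alt_fwd < min_alt_reads_per_strand or site.alt_rev < min_alt_reads_per_strand:
        reasons.append("alt_not_on_both_strands")
    if site.mean_alt_base_quality < min_mean_alt_quality:
        reasons.append("low_alt_base_quality")
    return reasons


if __name__ == "__main__":
    site = SiteCounts(ref_fwd=90, ref_rev=85, alt_fwd=12, alt_rev=0, mean_alt_base_quality=31.0)
    print(round(site.alt_fraction, 3), failed_filters(site))
```

Returning the list of failed filters, rather than a single pass/fail flag, makes it easier to see which kind of artifact a suspicious low-fraction call resembles.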


        • yvancouver
          Junior Member
          • Jan 2010
          • 4

          #5
          The reads went through the full GATK/Picard pipeline, from FixMate and MarkDuplicates to indel realignment and base recalibration. They passed all tests: no strand bias, for example, or anything else. They look valid... But as you suggest, I will try BWA as the aligner; currently we are using Novoalign.


          • yvancouver
            Junior Member
            • Jan 2010
            • 4

            #6
            We might have found the cause. All samples came from the same lane, and the lane was almost over-clustered. The genotype in the HTS pipeline is the same in all samples, which points to contamination. So to conclude, we suspect contamination, but we don't know exactly where it occurred.

            Thanks a lot Brian for your helpful comments.
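The observation in this post (the same sequencing genotype in every sample of the lane) can be turned into a simple screen. The sketch below is an illustration only: the data layout, sample names, and SNP identifiers are hypothetical, and genotype strings are assumed to be normalized (alleles in a fixed order, such as "A/G"). It flags sites where all samples share one sequencing genotype while at least one sample disagrees with its TaqMan call.

```python
from typing import Dict, List

# sample name -> site id -> genotype string (assumed normalized, e.g. "A/G")
Genotypes = Dict[str, Dict[str, str]]


def suspicious_sites(hts: Genotypes, taqman: Genotypes) -> List[str]:
    """Sites where every sample has the same HTS genotype and at least one
    sample's HTS genotype disagrees with its TaqMan genotype."""
    samples = sorted(hts)
    if not samples:
        return []
    # Only consider sites that were called in every sample.
    shared = set.intersection(*(set(hts[s]) for s in samples))
    flagged = []
    for site in sorted(shared):
        hts_calls = {hts[s][site] for s in samples}
        discordant = any(
            taqman.get(s, {}).get(site) not in (None, hts[s][site])
            for s in samples
        )
        if len(hts_calls) == 1 and discordant:
            flagged.append(site)
    return flagged


if __name__ == "__main__":
    hts = {
        "sample1": {"snpA": "A/G", "snpB": "C/C"},
        "sample2": {"snpA": "A/G", "snpB": "C/T"},
    }
    taqman = {
        "sample1": {"snpA": "A/A", "snpB": "C/C"},
        "sample2": {"snpA": "A/G", "snpB": "C/T"},
    }
    print(suspicious_sites(hts, taqman))
```

A flagged site is only a hint: a common variant can be identical across unrelated samples by chance, so the check is more informative with many samples per lane than with a handful.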
