
  • wariobrega
    replied
    Originally posted by Zaag
    GQ does not tell you anything about the variant quality. GQ tells you about how certain the HC is about the zygosity.

    About the deletions: do they disappear if you use this option in the variant calling step?

    --dontUseSoftClippedBases
    Hi Zaag, and apologies for the late reply (I happened to be on holiday these last couple of weeks!).

    Some of the deletions disappeared after I used your option, although new ones appeared. Cool thing though: many of the FP InDels were among the ones that were removed.

    I am now also trying to use Picard CleanSam to filter them BEFORE the GATK pipeline and compare the differences (also to see why these new InDels appear). Thanks a lot for your reply, it was very helpful!

    Originally posted by Linnea
    No sorry, I can't explain that part, maybe someone else has an idea?

    But actually, why can't they just be real indels with a very high quality? (99 seems to be the highest quality you can get: "Because the most likely PL is always 0, GQ = second highest PL - 0. If the second most likely PL is greater than 99, we still assign a GQ of 99, so the highest value of GQ is 99." -from the GATK webpage). Maybe it will be clear after the realignment? (And sorry if I misunderstood something, I am really no indel expert..)
    Hi Linnea! Again, apologies for the late reply.

    I am quite confident these InDels were not real because the same regions were validated with Sanger before the experiment. Also, seeing a lot of InDels very close to each other (3-4 bp apart at the most), and considering the nature of the disease as well as the conservation of these regions, makes me think these are FPs. Again, I'm not an InDel expert either, so we're in the same boat! Thanks a lot for your contribution though, it was really helpful!
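    The "InDels very close to each other" check can be scripted. Below is a minimal sketch, assuming the calls have already been read into (chrom, pos, ref, alt) tuples; the function names, the 4 bp default and the example records are hypothetical, and symbolic ALT alleles such as <DEL> or * are not handled.

```python
def is_indel(ref, alt):
    """True if any ALT allele differs in length from REF."""
    return any(len(a) != len(ref) for a in alt.split(","))


def clustered_indels(records, max_gap=4):
    """Return indels lying within max_gap bp of another indel on the same contig.

    records: iterable of (chrom, pos, ref, alt) tuples (VCF columns 1, 2, 4, 5).
    """
    indels = sorted((c, p, r, a) for c, p, r, a in records if is_indel(r, a))
    flagged = set()
    # After sorting, only neighbouring records need to be compared.
    for prev, cur in zip(indels, indels[1:]):
        if prev[0] == cur[0] and cur[1] - prev[1] <= max_gap:
            flagged.update([prev, cur])
    return sorted(flagged)


# Hypothetical calls: two indels 3 bp apart, one SNV, one isolated indel
records = [
    ("chr1", 100, "AT", "A"),
    ("chr1", 103, "G", "GCC"),
    ("chr1", 500, "C", "T"),
    ("chr1", 900, "TA", "T"),
]
print(clustered_indels(records))
```

    Calls flagged this way are only candidates for a closer look; the distance alone does not prove they are false positives.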

    Daniele
    Last edited by wariobrega; 08-12-2015, 06:50 AM. Reason: Forgot to reply to Linnea!


  • Zaag
    replied
    GQ does not tell you anything about the variant quality. GQ tells you about how certain the HC is about the zygosity.

    About the deletions: do they disappear if you use this option in the variant calling step?

    --dontUseSoftClippedBases
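    To get a feel for how much soft clipping the reads carry before trying that option, the CIGAR strings in the BAM can be inspected. This is only a sketch: it assumes the CIGAR strings have already been extracted (column 6 of the SAM records), and the function name is made up for illustration.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_fraction(cigar):
    """Fraction of a read's sequence bases that are soft-clipped (CIGAR 'S')."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # M, I, S, = and X are the operations that consume bases of the read.
    read_len = sum(n for n, op in ops if op in "MIS=X")
    clipped = sum(n for n, op in ops if op == "S")
    return clipped / read_len if read_len else 0.0


print(soft_clipped_fraction("5S40M5S"))  # 10 of 50 bases are clipped
print(soft_clipped_fraction("50M"))      # no clipping
```

    If many reads around the suspicious deletions show a high fraction, that would be consistent with the soft-clipped bases feeding into those calls.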


  • Linnea
    replied
    No sorry, I can't explain that part, maybe someone else has an idea?

    But actually, why can't they just be real indels with a very high quality? (99 seems to be the highest quality you can get: "Because the most likely PL is always 0, GQ = second highest PL - 0. If the second most likely PL is greater than 99, we still assign a GQ of 99, so the highest value of GQ is 99." -from the GATK webpage). Maybe it will be clear after the realignment? (And sorry if I misunderstood something, I am really no indel expert..)
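    The quoted rule can be written out in a few lines. This is a sketch of the rule as described in the quote, not GATK's actual code; the PL values are made up.

```python
def genotype_quality(pls, cap=99):
    """Return GQ from a list of Phred-scaled genotype likelihoods (PL)."""
    if len(pls) < 2:
        raise ValueError("need at least two PL values")
    ordered = sorted(pls)
    # The most likely genotype has the lowest PL (normalised to 0 in the VCF),
    # so GQ is the gap to the second most likely genotype, capped at 99.
    return min(ordered[1] - ordered[0], cap)


# Hypothetical PL triples for the genotypes 0/0, 0/1, 1/1
print(genotype_quality([1200, 0, 350]))  # gap of 350, reported as 99
print(genotype_quality([40, 0, 310]))    # gap of 40, reported as 40
```

    So a GQ of 99 only says that the second-best genotype is much less likely than the best one.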


  • wariobrega
    replied
    Hi Linnea, and thanks for the very quick reply!

    I actually found out a similar answer on the GATK forum after I posted it, but yours was concise and very explanatory, so thank you again!

    I'm now running the --bamOutput option on my samples in order to check how HaplotypeCaller realigned the reads.

    However, something still does not add up: specifically, why is the Genotype Quality (GQ) of these InDels always 99 (checked multiple times on multiple samples)?


    Thanks in advance!

    Daniele
    Last edited by wariobrega; 07-01-2015, 01:30 AM.


  • Linnea
    replied
    Hello Daniele,

    To at least partly answer your question: HaplotypeCaller performs local realignment within the run, so your "final bam file" is actually not the one you input to HaplotypeCaller but something you don't see. So HC might realign a region and find an indel, while in your input bam you see nothing, or one or more SNPs (which are likely caused by mismatches in an alignment that is wrong anyway).

    If you want to see what it looks like AFTER HaplotypeCaller has realigned the reads, you can rerun it using the flag:
    --bamOutput newbamfile.bam
    (takes quite a while to run).

    You can also make it print out all possible haplotypes to the bam with:
    --bamWriterType ALL_POSSIBLE_HAPLOTYPES
    and then see them in IGV (choose "color alignment by: tag" and then write "HC" in the box).
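    Put together, a full call with these two flags could be assembled as below. This is a sketch assuming the GATK 3-style command line (java -jar ... -T HaplotypeCaller); the jar path and all file names are placeholders, and the code only builds and prints the command without running anything.

```python
import shlex


def haplotypecaller_command(reference, input_bam, output_vcf, bam_output,
                            all_haplotypes=False,
                            gatk_jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3-style HaplotypeCaller argument list (nothing is run)."""
    cmd = [
        "java", "-jar", gatk_jar,
        "-T", "HaplotypeCaller",
        "-R", reference,
        "-I", input_bam,
        "-o", output_vcf,
        "--bamOutput", bam_output,  # write the realigned reads to this BAM
    ]
    if all_haplotypes:
        # Also write every candidate haplotype, not only the called ones.
        cmd += ["--bamWriterType", "ALL_POSSIBLE_HAPLOTYPES"]
    return cmd


# Placeholder file names
cmd = haplotypecaller_command("ref.fasta", "sample.bam", "sample.vcf",
                              "newbamfile.bam", all_haplotypes=True)
print(" ".join(shlex.quote(part) for part in cmd))
```

    The returned list can be passed to subprocess.run once the paths point at real files.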

    Hope this helps at least a bit,
    Linnéa


  • GATK Haplotype Caller calls Indels in SOLID reads that IGV does not Display

    Hello everyone,

    I am Daniele and I'm a Junior Researcher in a private foundation in Rome.

    Before explaining my problem, let me say that I am pretty new to the SOLID technology and to variant calling in general, so forgive me if the question sounds dumb, but I couldn't find an answer anywhere!

    That said, I am analyzing SOLID reads for a targeted resequencing experiment. The files were given to me as BAM, already aligned to my reference genome using the LifeScope suite provided by AB.

    I used a classical approach for variant calling: I preprocessed the reads, marked duplicates with Picard, and ran the GATK pipeline following the Best Practices for variant calling (so I recalibrated the base quality scores and realigned around InDels).

    I then used HaplotypeCaller for the variant calling and output the VCF files for my experiments.

    The thing is that HaplotypeCaller calls several InDels that are not present when I check my final BAM file (the one I give to HaplotypeCaller for calling variants).

    Specifically, none of the InDels in my VCF is visible in IGV, but some of them appear as single nucleotide variants when I uncheck "Quality weight allele fraction" in the Alignment panel of the IGV preferences. I thought this was an IGV issue and played a little with the options, but I found no solution. Notably, the Genotype Quality of these positions is always around 99. I checked around the web, but I cannot find any explanation for this behavior.
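    For reference, the GQ values of the InDel records can be pulled straight out of the VCF text. The snippet is a minimal sketch for a single-sample VCF with tab-separated columns; the function name and the example record are invented for illustration.

```python
def indel_gq(vcf_line, sample_index=0):
    """Return the GQ of one sample if the record is an indel, else None."""
    fields = vcf_line.rstrip("\n").split("\t")
    ref, alts = fields[3], fields[4].split(",")
    if all(len(a) == len(ref) for a in alts):
        return None  # same-length alleles: SNV or MNP, not an indel
    # FORMAT (column 9) names the keys; the sample columns hold the values.
    keys = fields[8].split(":")
    values = fields[9 + sample_index].split(":")
    gq = dict(zip(keys, values)).get("GQ")
    return int(gq) if gq not in (None, ".") else None


# Invented single-sample record describing a 2 bp deletion
line = ("chr1\t12345\t.\tGTC\tG\t812.7\t.\tDP=40\t"
        "GT:AD:DP:GQ:PL\t0/1:18,22:40:99:850,0,700")
print(indel_gq(line))
```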

    Can someone provide some help?


    Thanks in advance!

    Daniele
