  • SNV calling using GATK with data from multiple lanes

    Hi,

    I am using exome sequencing data to call SNVs with GATK's UnifiedGenotyper. I have two lanes for each sample, so I merged the two BAM files into one with two read groups. But in the VCF file I got two genotype columns with the format GT:AD:DP:GQ:PL, e.g. 0/1:20,3:23:56:56,0,576 and 0/1:23,9:32:99:153,0,676.
    My questions are:
    (1) Did GATK treat these as two samples because there are two read groups?
    (2) Did GATK call SNVs in the two lanes separately, or did it merge their reads?
    (3) When I calculate the minor allele frequency, should I use both GT:AD:DP:GQ:PL columns?

    Eager to know the answer.
    Thank you in advance.
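For readers following question (3), here is a minimal Python sketch of how the two genotype columns quoted above can be taken apart. It assumes the FORMAT string is GT:AD:DP:GQ:PL (an assumption that matches the five colon-separated values in each column) and that the site is biallelic, so AD holds exactly two counts. The function names are hypothetical. Note that it computes a per-column alternate read fraction, which is not the same thing as a population minor allele frequency.

```python
def parse_sample_field(format_keys, sample_value):
    """Map FORMAT keys (e.g. GT, AD, DP) to the values of one sample column."""
    return dict(zip(format_keys.split(":"), sample_value.split(":")))


def alt_read_fraction(fields):
    """Fraction of reads supporting the alternate allele, taken from AD.

    Assumes a biallelic site, i.e. AD is "ref_count,alt_count".
    """
    ref_depth, alt_depth = (int(x) for x in fields["AD"].split(","))
    total = ref_depth + alt_depth
    return alt_depth / total if total else 0.0


# FORMAT string is an assumption; the column values are those quoted in the post.
format_keys = "GT:AD:DP:GQ:PL"
columns = ["0/1:20,3:23:56:56,0,576", "0/1:23,9:32:99:153,0,676"]

for column in columns:
    fields = parse_sample_field(format_keys, column)
    print(fields["GT"], fields["DP"], round(alt_read_fraction(fields), 3))
```

If both columns turn out to belong to the same biological sample, their AD counts could be summed before taking the fraction; whether that is appropriate depends on how the read groups were defined.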

  • #2
    Regarding your first question, I do think GATK UnifiedGenotyper would have treated each different read group as a different sample (http://gatkforums.broadinstitute.org...bout-bam-files).

    Is there a particular reason you're using the UnifiedGenotyper? HaplotypeCaller is its successor (http://www.broadinstitute.org/gatk/g...-discovery-ovw).

    • #3
      Hi N311V,

      Thank you very much. If GATK treats different read groups as different samples, then the read groups of each lane are supposed to be the same, right? But this is not mentioned at all on the GATK website.

      I only called SNPs, not indels, so UnifiedGenotyper seems to be faster. Did HaplotypeCaller run better than UnifiedGenotyper in your project?

      • #4
        From the GATK web page:

        The HaplotypeCaller is a more recent and sophisticated tool than the UnifiedGenotyper. Its ability to call SNPs is equivalent to that of the UnifiedGenotyper, and its ability to call indels is far superior. We recommend using HaplotypeCaller in all cases, with only a few exceptions:

        If you want to analyze more than 100 samples at a time (for performance reasons)
        If you are working with non-diploid organisms (UG can handle different levels of ploidy while HC cannot)
        If you are working with pooled samples (also due to the HC’s limitation regarding ploidy)
        In those cases, we recommend using UnifiedGenotyper instead of HaplotypeCaller.
        Personally, I am not sure which is better. Getting different results bioinformatically is not proof of correctness.

        • #5
          Originally posted by N311V
          Regarding your first question, I do think GATK UnifiedGenotyper would have treated each different read group as a different sample (http://gatkforums.broadinstitute.org...bout-bam-files).
          If you look at the description of the SM tag on that page, it seems GATK would treat all read groups with the same SM as coming from the same sample:

          GATK tools treat all read groups with the same SM value as containing sequencing data for the same sample. Therefore it's critical that the SM field be correctly specified, especially when using multi-sample tools like the Unified Genotyper.
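
To check how the read groups in a merged BAM map onto samples, one option is to look at the @RG lines of its header (for example as printed by `samtools view -H`). Below is a minimal Python sketch that groups read-group IDs by their SM value. The function name and the header lines are hypothetical and only illustrate the idea. By the rule quoted above, two read groups that share one SM value would be treated as one sample, while two different SM values would show up as two sample columns.

```python
from collections import defaultdict


def read_groups_by_sample(header_lines):
    """Group read-group IDs by their SM value, given SAM header lines."""
    samples = defaultdict(list)
    for line in header_lines:
        if not line.startswith("@RG"):
            continue
        # Each tab-separated field after "@RG" has the form TAG:value.
        fields = line.rstrip("\n").split("\t")[1:]
        tags = dict(field.split(":", 1) for field in fields)
        samples[tags.get("SM", "<missing SM>")].append(tags.get("ID"))
    return dict(samples)


# Hypothetical header lines; real ones would come from the merged BAM.
header = [
    "@HD\tVN:1.4\tSO:coordinate",
    "@RG\tID:lane1\tSM:sampleA\tLB:lib1\tPL:ILLUMINA",
    "@RG\tID:lane2\tSM:sampleA\tLB:lib1\tPL:ILLUMINA",
]

print(read_groups_by_sample(header))
```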

          • #6
            Originally posted by Jolin
            If GATK treats different read groups as different samples, then the read groups of each lane are supposed to be the same, right? But this is not mentioned at all on the GATK website.
            I did read somewhere on the GATK website that each sample needs a unique read group; sorry, I don't have a link right now. To keep track of the lane, perhaps you could use Picard's AddOrReplaceReadGroups.jar and specify the library name as the lane.
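
As a rough illustration of that suggestion, the Python sketch below only builds (it does not run) one AddOrReplaceReadGroups command line per lane, using Picard's legacy KEY=value arguments (I, O, RGID, RGSM, RGLB, RGPL, RGPU). The helper name, file names, sample name, and platform value are all hypothetical. Following the post, the lane is passed as the library name; the same lane label is also reused for RGID and RGPU purely for illustration.

```python
import shlex


def add_read_group_command(in_bam, out_bam, rg_id, sample, library,
                           platform_unit, platform="ILLUMINA",
                           jar="AddOrReplaceReadGroups.jar"):
    """Build the argument list for a Picard AddOrReplaceReadGroups call."""
    return [
        "java", "-jar", jar,
        f"I={in_bam}",
        f"O={out_bam}",
        f"RGID={rg_id}",
        f"RGSM={sample}",
        f"RGLB={library}",
        f"RGPL={platform}",
        f"RGPU={platform_unit}",
    ]


# Hypothetical per-lane BAM files that share one sample name.
for lane in ("lane1", "lane2"):
    cmd = add_read_group_command(
        in_bam=f"sampleA.{lane}.bam",
        out_bam=f"sampleA.{lane}.rg.bam",
        rg_id=lane,
        sample="sampleA",
        library=lane,
        platform_unit=lane,
    )
    print(shlex.join(cmd))
```

Both lanes get the same RGSM here, so a tool that groups read groups by SM would see them as one sample.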

            Originally posted by Jolin
            I only called SNPs, not indels, so UnifiedGenotyper seems to be faster. Did HaplotypeCaller run better than UnifiedGenotyper in your project?
            I was interested in both SNPs and indels, which made HaplotypeCaller a great all-in-one solution. Also, I was only interested in a couple of genes, so speed was not a concern. I haven't compared the SNP results from HaplotypeCaller with those from UnifiedGenotyper, so I can't say if they're the same. I assume so, but it is better to check.

            • #7
              Hi Westerman, thank you. Actually, our lab has used UnifiedGenotyper all along and did some PCR validation of the predicted SNVs. It seems that UG works well for SNV detection.

              • #8
                Thanks a lot, N311V
