  • SNPs at Interval Boundaries in GATK HaplotypeCaller

    I'm using HaplotypeCaller with the -L option to specify the interval explicitly. I am trying to break the task into pieces and assign them to separate jobs so that it completes faster. In one run, I call over a single interval, say,

    Chr1:1-1000000

    and in another run, I call over the same range as two jobs:

    Chr1:1-500000 on one job and
    Chr1:500001-1000000 on another

    essentially splitting the interval into two jobs. In this second run, I see SNPs called within +/- 100 bases of position 500000 that are not found in the first run.
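
    The split and the boundary comparison described above can be sketched as follows. This is only an illustration: the helper names (`split_interval`, `as_gatk_interval`, `boundary_only_calls`) are hypothetical, the coordinates are treated as 1-based and inclusive, and the SNP positions are assumed to have already been extracted from each run's VCF by some other means.

```python
def split_interval(contig, start, end, n_parts):
    """Split a 1-based inclusive interval into n_parts contiguous pieces."""
    length = end - start + 1
    base, extra = divmod(length, n_parts)
    pieces = []
    cursor = start
    for i in range(n_parts):
        # Spread any remainder over the first `extra` pieces.
        size = base + (1 if i < extra else 0)
        pieces.append((contig, cursor, cursor + size - 1))
        cursor += size
    return pieces


def as_gatk_interval(piece):
    """Format a (contig, start, end) tuple as a string usable with -L."""
    contig, start, end = piece
    return f"{contig}:{start}-{end}"


def boundary_only_calls(single_run, split_run, boundary, window=100):
    """Positions called only in the split run, within `window` of the boundary."""
    only_in_split = set(split_run) - set(single_run)
    return sorted(p for p in only_in_split if abs(p - boundary) <= window)


if __name__ == "__main__":
    for piece in split_interval("Chr1", 1, 1000000, 2):
        print(as_gatk_interval(piece))
```

    Feeding `boundary_only_calls` the position lists from the single run and the merged split run, with `boundary=500000`, lists exactly the calls in question.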

    My guess is that restricting HC to a region (i.e. -L 1-500000) keeps the local aligner from properly reassembling the reads near the boundary, and hence results in spurious reads and SNPs. I was hoping that, with -L specified, it would run the local aligner over a larger region and report only the SNPs inside the -L region.
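
    The behaviour hoped for above (work on a larger region, report only what falls inside the requested one) can be approximated by hand: run each job on a padded interval, then keep only the calls inside the original, unpadded interval. The sketch below shows just the coordinate arithmetic. It does not describe what HaplotypeCaller does internally, and the helper names, the padding of 200 bases, and the contig length are all assumptions made for illustration.

```python
def pad_interval(start, end, padding, contig_length):
    """Extend a 1-based inclusive interval by `padding`, clamped to the contig."""
    return max(1, start - padding), min(contig_length, end + padding)


def trim_to_core(positions, core_start, core_end):
    """Keep only positions inside the original (unpadded) interval."""
    return [p for p in positions if core_start <= p <= core_end]


if __name__ == "__main__":
    # Hypothetical values: 200-base padding on a contig assumed to be 1,000,000 long.
    print(pad_interval(1, 500000, 200, 1000000))
    print(pad_interval(500001, 1000000, 200, 1000000))
```

    Because the two core intervals do not overlap, trimming each job's calls back to its core interval before merging means no position is reported twice.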

    Has anybody seen this, or does anyone know of a way around it?
