  • gavin.oliver
    replied
    Originally posted by Jon_Keats
    Do you have any paired-end data, as opposed to the single-end data your methods suggest? Indel alignment should be better with paired-end reads than with single-end reads.
    This particular dataset is all single-end. I am pretty certain the larger indels can still be detected, though...



  • Jon_Keats
    replied
    Do you have any paired-end data, as opposed to the single-end data your methods suggest? Indel alignment should be better with paired-end reads than with single-end reads.



  • gavin.oliver
    replied
    Originally posted by oiiio
    I have been trying to call indels with GATK UnifiedGenotyper from BWA-mapped BAMs for some time now, but with no success.

    Did you have to use anything outside of the default parameters with UnifiedGenotyper or CountCovariates/TableRecalibration? Others with this problem have found that the cause could be sequencing error rates in the sample that were too high.

    If you don't mind, could you post a couple of command lines from your pipeline? I'm particularly interested in your UnifiedGenotyper and base recalibration commands. It would be an immense help.
    I am pretty sure my commands are very standard. Nonetheless, you are welcome to have a look!

    for file in *fastq; do bwa aln -e 50 -f ${file%%.fastq}.sai chr17hg19 ${file}; done

    for file in *sai; do bwa samse chr17hg19 ${file} ${file%%.sai}.fastq > ${file%%.sai}.sam; done

    for file in *sam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/SortSam.jar I=${file} O=${file%%.sam}_sorted.bam SO=coordinate; done

    for file in *_sorted.bam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/MarkDuplicates.jar I=${file} O=${file%%.bam}_ndup.bam M=metric TMP_DIR=./tmp REMOVE_DUPLICATES=TRUE VALIDATION_STRINGENCY=LENIENT; done

    for file in *ndup.bam; do java -jar /home/goliver/ngs_software/picard-tools-1.53/AddOrReplaceReadGroups.jar I=${file} O=${file%%.bam}_rg.bam SO=coordinate ID=1 LB=Z PL=illumina PU=Z SM=Z; done

    for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/BuildBamIndex.jar I=${file} O=${file}.bai; done

    for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -T RealignerTargetCreator -R ../ref_chr17.hg19.fa -o ${file%%.bam}.intervals -I ${file}; done

    for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -I ${file} -R ../ref_chr17.hg19.fa -T IndelRealigner -o ${file%%.bam}_2.bam -targetIntervals ${file%%.bam}.intervals --known ../GATK/dbsnp_132.b37.vcf; done

    for file in *_2.bam; do java -Xmx20g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -R ../ref_chr17.hg19.fa -knownSites ../GATK/dbsnp_132.b37.vcf -I ${file} -T CountCovariates -cov QualityScoreCovariate -cov DinucCovariate -cov ReadGroupCovariate -cov CycleCovariate -recalFile ${file%%.bam}.recal.csv --default_read_group 1 --default_platform illumina -nt 4; done

    for file in *_2.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -l INFO -R ../ref_chr17.hg19.fa -T TableRecalibration -I ${file} -o ${file%%.bam}.final.bam -recalFile ${file%%.bam}.recal.csv --default_read_group 1 --default_platform illumina; done

    for file in *final.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -T UnifiedGenotyper -glm BOTH -I ${file} -R ../ref_chr17.hg19.fa -o ${file%%.bam}.vcf; done
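
The loops above name each output by stripping a suffix with `${file%%.ext}` and appending a new one. As a rough sketch of how those names chain together, the following traces one hypothetical sorted file, `sample1_sorted.bam` (not a name from the thread), through the same expansions without running any tool:

```shell
#!/bin/sh
# Hypothetical starting point: the coordinate-sorted BAM for one sample.
sorted="sample1_sorted.bam"

ndup="${sorted%%.bam}_ndup.bam"        # MarkDuplicates output
rg="${ndup%%.bam}_rg.bam"              # AddOrReplaceReadGroups output
intervals="${rg%%.bam}.intervals"      # RealignerTargetCreator output
realigned="${rg%%.bam}_2.bam"          # IndelRealigner output
recal="${realigned%%.bam}.recal.csv"   # CountCovariates output
final="${realigned%%.bam}.final.bam"   # TableRecalibration output
vcf="${final%%.bam}.vcf"               # UnifiedGenotyper output

# Print the chain so each stage's expected file name can be checked.
printf '%s\n' "$ndup" "$rg" "$intervals" "$realigned" "$recal" "$final" "$vcf"
```

Because each stage's glob (`*_sorted.bam`, `*ndup.bam`, `*rg.bam`, `*_2.bam`, `*final.bam`) matches only the previous stage's suffix, a stage that silently produced no output shows up as a missing name in this chain.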



  • oiiio
    replied
    I have been trying to call indels with GATK UnifiedGenotyper from BWA-mapped BAMs for some time now, but with no success.

    Did you have to use anything outside of the default parameters with UnifiedGenotyper or CountCovariates/TableRecalibration? Others with this problem have found that the cause could be sequencing error rates in the sample that were too high.

    If you don't mind, could you post a couple of command lines from your pipeline? I'm particularly interested in your UnifiedGenotyper and base recalibration commands. It would be an immense help.



  • gavin.oliver
    replied
    Originally posted by genericforms
    Do you perform a base recalibration step with GATK before calling indels?
    Indeed I do.



  • adaptivegenome
    replied
    Do you perform a base recalibration step with GATK before calling indels?



  • gavin.oliver
    replied
    I am now detecting all indels up to 29 bp in length. I achieved this by raising the maximum number of permitted gap extensions in the alignment step with bwa aln -e 50.

    I will continue to experiment in order to get the larger indels.
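
One way to organize that kind of experiment is to generate the alignment commands for several `-e` values and inspect them before running anything. The sketch below only prints the commands; the helper name `aln_cmds`, the file `sample1.fastq`, and the values 75 and 100 are assumptions for illustration, while `-e 50` and the `chr17hg19` index prefix come from the thread:

```shell
#!/bin/sh
# Print (do not run) one bwa aln command per -e value.
# -e sets bwa aln's maximum number of gap extensions.
aln_cmds() {
    fastq=$1
    shift
    for e in "$@"; do
        # Tag the output with the -e value so runs do not overwrite each other.
        echo "bwa aln -e $e -f ${fastq%%.fastq}_e$e.sai chr17hg19 $fastq"
    done
}

aln_cmds sample1.fastq 50 75 100
```

Keeping the `-e` value in the `.sai` name makes it straightforward to compare which indel sizes each setting recovers.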



  • gavin.oliver
    started a topic 6-99bp indels with BWA/GATK

    6-99bp indels with BWA/GATK

    Hi,

    I am using BWA and GATK to detect mutations in BRCA1. The BRCA1 sequences have been Sanger validated and contain known mutations. I am achieving a fair degree of accuracy so far, successfully detecting 99% of SNPs and over 90% of indels. The majority of false negatives are indels over 5 bp in size, ranging from 6 to 99 bp in length. Can anyone recommend command-line parameters or values that would get the aligner to pick up some of the larger indels?

    Thanks in advance.
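
To see which size classes are being missed, it can help to tabulate the length of every indel in a call set and compare it against the validated list. A minimal sketch, assuming a VCF with a single ALT allele per record; the function name `indel_lengths` and the three demo records are made up for illustration and are not data from the thread:

```shell
#!/bin/sh
# Print chromosome, position, type and length for each indel in a VCF.
# Length is |len(ALT) - len(REF)|; records of equal length (SNPs) are skipped.
# Assumes no multi-allelic (comma-separated) ALT fields.
indel_lengths() {
    awk -F'\t' '!/^#/ {
        d = length($5) - length($4)
        if (d != 0) print $1, $2, (d > 0 ? "INS" : "DEL"), (d < 0 ? -d : d)
    }' "$1"
}

# Hypothetical records: a 6 bp insertion, a 3 bp deletion, and a SNP.
demo=$(mktemp)
printf 'chr17\t100\t.\tA\tATTTTTT\t50\t.\t.\n' >  "$demo"
printf 'chr17\t200\t.\tGCCC\tG\t50\t.\t.\n'    >> "$demo"
printf 'chr17\t300\t.\tC\tT\t50\t.\t.\n'       >> "$demo"

indel_lengths "$demo"
```

Sorting the fourth column of the output gives a quick view of the largest indel the pipeline currently reports.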
