**gavin.oliver** · 12-05-2011, 03:48 AM

I am now getting all indels up to 29bp in length. I achieved this by increasing the maximum number of permitted gap extensions with bwa aln -e 50.

I will continue to experiment in order to get the larger indels.
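For anyone repeating this kind of experiment, here is a minimal sketch of a sweep over `-e` values. It only prints the `bwa aln` command lines so they can be inspected before anything is run. The function name `print_aln_sweep`, the sample file name, and the `.e<N>.sai` output naming are assumptions made for illustration; `chr17hg19` is the index prefix used in the commands posted later in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) one bwa aln command per -e value.
# Usage: print_aln_sweep <index_prefix> <reads.fastq> <e1> [<e2> ...]
print_aln_sweep() {
    local ref_prefix="$1"
    local fastq="$2"
    shift 2
    local e
    for e in "$@"; do
        # -e is bwa aln's maximum number of gap extensions.
        # Each run writes to its own .sai file so results can be compared.
        printf 'bwa aln -e %s -f %s.e%s.sai %s %s\n' \
            "$e" "${fastq%.fastq}" "$e" "$ref_prefix" "$fastq"
    done
}

# Example: compare the value used above (50) with a larger one.
print_aln_sweep chr17hg19 sample1.fastq 50 100
```

Piping the printed lines into `sh` would execute them; each resulting `.sai` file can then be taken through the same downstream steps to see which indel lengths survive.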

**adaptivegenome** · 12-05-2011, 06:30 AM

Do you perform a base recalibration step with GATK before calling indels?

**gavin.oliver** · 12-05-2011, 06:40 AM

Originally posted by genericforms:

> Do you perform a base recalibration step with GATK before calling indels?

Indeed I do.

**oiiio** · 12-05-2011, 08:10 AM

I have been trying to call indels with GATK UnifiedGenotyper from BWA-mapped BAMs for some time now, but with no success.

Did you have to use anything outside of the default parameters with UnifiedGenotyper or CountCovariates/TableRecalibration? Others with this problem have found that the cause could be sequencing error rates in the sample that were too high.

If you don't mind, could you post a couple of command lines from your pipeline? I'm particularly interested in your UnifiedGenotyper and base recalibration commands. It would be an immense help.

**gavin.oliver** · 12-05-2011, 08:23 AM

Originally posted by oiiio:

> I have been trying to call indels with GATK UnifiedGenotyper from BWA-mapped BAMs for some time now, but with no success.
>
> Did you have to use anything outside of the default parameters with UnifiedGenotyper or CountCovariates/TableRecalibration? Others with this problem have found that the cause could be sequencing error rates in the sample that were too high.
>
> If you don't mind, could you post a couple of command lines from your pipeline? I'm particularly interested in your UnifiedGenotyper and base recalibration commands. It would be an immense help.

I am pretty sure my commands are very standard. Nonetheless, you are welcome to have a look!

for file in *fastq; do bwa aln -e 50 -f ${file%%.fastq}.sai chr17hg19 ${file}; done

for file in *sai; do bwa samse chr17hg19 ${file} ${file%%.sai}.fastq > ${file%%.sai}.sam; done

for file in *sam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/SortSam.jar I=${file} O=${file%%.sam}_sorted.bam SO=coordinate; done

for file in *_sorted.bam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/MarkDuplicates.jar I=${file} O=${file%%.bam}_ndup.bam M=metric TMP_DIR=./tmp REMOVE_DUPLICATES=TRUE VALIDATION_STRINGENCY=LENIENT; done

for file in *ndup.bam; do java -jar /home/goliver/ngs_software/picard-tools-1.53/AddOrReplaceReadGroups.jar I=${file} O=${file%%.bam}_rg.bam SO=coordinate ID=1 LB=Z PL=illumina PU=Z SM=Z; done

for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/picard-tools-1.53/BuildBamIndex.jar I=${file} O=${file}.bai; done

for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -T RealignerTargetCreator -R ../ref_chr17.hg19.fa -o ${file%%.bam}.intervals -I ${file}; done

for file in *rg.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -I ${file} -R ../ref_chr17.hg19.fa -T IndelRealigner -o ${file%%.bam}_2.bam -targetIntervals ${file%%.bam}.intervals --known ../GATK/dbsnp_132.b37.vcf; done

for file in *_2.bam; do java -Xmx20g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -R ../ref_chr17.hg19.fa -knownSites ../GATK/dbsnp_132.b37.vcf -I ${file} -T CountCovariates -cov QualityScoreCovariate -cov DinucCovariate -cov ReadGroupCovariate -cov CycleCovariate -recalFile ${file%%.bam}.recal.csv --default_read_group 1 --default_platform illumina -nt 4; done

for file in *_2.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -l INFO -R ../ref_chr17.hg19.fa -T TableRecalibration -I ${file} -o ${file%%.bam}.final.bam -recalFile ${file%%.bam}.recal.csv --default_read_group 1 --default_platform illumina; done

for file in *final.bam; do java -Xmx3g -jar /home/goliver/ngs_software/GenomeAnalysisTK-1.2-24-g6478681/GenomeAnalysisTK.jar -T UnifiedGenotyper -glm BOTH -I ${file} -R ../ref_chr17.hg19.fa -o ${file%%.bam}.vcf; done
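To check which indel sizes actually come out of a pipeline like this one, the VCF from the last step can be tallied by indel length. The following is a minimal sketch, not part of the original poster's commands: the function name `indel_length_tally` is hypothetical, and it assumes a plain-text VCF with explicit REF and ALT sequences (no symbolic alleles such as `<DEL>`), counting only the first ALT allele of each record.

```shell
#!/usr/bin/env bash
# Hypothetical helper: tally indel lengths in an uncompressed VCF.
# Indel length is taken as |len(ALT) - len(REF)| for the first ALT allele.
# Usage: indel_length_tally <calls.vcf>
indel_length_tally() {
    awk -F '\t' '
        /^#/ { next }                          # skip header lines
        {
            split($5, alts, ",")               # ALT may list several alleles
            d = length(alts[1]) - length($4)   # >0 insertion, <0 deletion
            if (d < 0) d = -d
            if (d > 0) count[d]++              # d == 0 is a SNP/MNP, not an indel
        }
        END { for (n in count) print n "\t" count[n] }
    ' "$1" | sort -n
}
```

Running `indel_length_tally sample.vcf` prints one tab-separated line per observed length (length, then count), sorted by length, so the largest called indel is the last line.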

**Jon_Keats** · 12-05-2011, 09:00 PM

Do you have any paired-end data, as opposed to the single-end data your methods suggest? Indel alignment should be better with paired-end reads than with single-end reads.
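For comparison with the single-end `bwa samse` commands posted above, here is a rough sketch of what the paired-end equivalent could look like: each mate is aligned separately with `bwa aln`, and the two `.sai` files are combined with `bwa sampe`. The helper only prints the commands. The function name `print_pe_cmds` and the `_1.fastq` / `_2.fastq` naming convention are assumptions, and the `-e 50` setting is simply carried over from earlier in the thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) paired-end bwa commands for one sample.
# Assumes mates are named <sample>_1.fastq and <sample>_2.fastq.
# Usage: print_pe_cmds <index_prefix> <sample>
print_pe_cmds() {
    local ref_prefix="$1"
    local sample="$2"
    local mate
    for mate in 1 2; do
        # Align each mate on its own, as in the single-end case.
        printf 'bwa aln -e 50 -f %s_%s.sai %s %s_%s.fastq\n' \
            "$sample" "$mate" "$ref_prefix" "$sample" "$mate"
    done
    # sampe pairs the two alignments and writes SAM to stdout.
    printf 'bwa sampe %s %s_1.sai %s_2.sai %s_1.fastq %s_2.fastq > %s.sam\n' \
        "$ref_prefix" "$sample" "$sample" "$sample" "$sample" "$sample"
}

print_pe_cmds chr17hg19 sampleA
```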

**gavin.oliver** · 12-06-2011, 12:52 AM

Originally posted by Jon_Keats:

> Do you have any paired-end data, as opposed to the single-end data your methods suggest? Indel alignment should be better with paired-end reads than with single-end reads.

This particular dataset is all single end. I am pretty certain the larger indels can still be detected though...

6-99bp indels with BWA/GATK