I am not affiliated with VAAST in any way, but I have used it extensively and absolutely love it. I don't want to derail this thread in any way but I can certainly answer questions about it.
-
Originally posted by mirabilia:
Thanks a lot, ulz_peter!
Could you please clarify which steps of your pipeline are specific to diploid genomes, so that I can customize them for my purposes?
-
Problem with SNP-calling
Following ulz_peter's original doc, I ran into a problem during the SNP-calling step.
java -Xmx4g -jar /path/GenomeAnalysisTK-1.1-35-ge253f6f/GenomeAnalysisTK.jar \
-glm BOTH \
-R hg18.fa \
-T UnifiedGenotyper \
-I myinput.marked.realigned.fixed.recal.bam \
-D dbsnp132_hg18.txt \
-o myoutput.snps.vcf \
-metrics snps.metrics \
-stand_call_conf 50.0 \
-stand_emit_conf 10.0 \
-dcov 1000 \
-A DepthOfCoverage \
-A AlleleBalance \
-L hg18_exonIntervals.bed
The "-L" option does not work.
I got hg18_exonIntervals.bed from UCSC, as ulz_peter's original doc shows.
I ran the SNP-calling without the "-L" line.
Then the variant quality score recalibration step did not work; it generated an empty output.tranches file.
Can somebody help me out? Thanks a lot.
Last edited by liu_xt005; 10-17-2011, 10:19 AM.
-
What is the error message when you specify the -L argument?
I actually stopped using the Variant Quality Score Recalibration, as it often did not work out for me (I never work on more than 2 exomes at a time).
I put the version without the recalibration in the SEQanswers Wiki/How-To section. You may have a look there, as I will update that in the future and stop uploading newer versions of the PDF file...
-
Originally posted by liu_xt005:
Following ulz_peter's original doc, I ran into a problem during the SNP-calling step.
java -Xmx4g -jar /path/GenomeAnalysisTK-1.1-35-ge253f6f/GenomeAnalysisTK.jar \
-glm BOTH \
-R hg18.fa \
-T UnifiedGenotyper \
-I myinput.marked.realigned.fixed.recal.bam \
-D dbsnp132_hg18.txt \
-o myoutput.snps.vcf \
-metrics snps.metrics \
-stand_call_conf 50.0 \
-stand_emit_conf 10.0 \
-dcov 1000 \
-A DepthOfCoverage \
-A AlleleBalance \
-L hg18_exonIntervals.bed
The "-L" option does not work.
I got hg18_exonIntervals.bed from UCSC, as ulz_peter's original doc shows.
I ran the SNP-calling without the "-L" line.
Then the variant quality score recalibration step did not work; it generated an empty output.tranches file.
Can somebody help me out? Thanks a lot.
By the way, I couldn't figure out how to use this on version 1.2.
-
Originally posted by ulz_peter:
I haven't found it yet, but there was a statement on the GATK homepage that the options described there (which are basically pretty much the same as the ones I use) only work for diploid genomes, and that expected shifts of allele frequency must be addressed. So the question is: what are you planning to do? Find rare alleles within some strains, sequence a genetically homogeneous strain...
Obviously, any kind of suggestion is really appreciated!
-
-L option problem solved
Originally posted by ulz_peter:
What is the error message when you specify the -L argument?
I actually stopped using the Variant Quality Score Recalibration, as it often did not work out for me (I never work on more than 2 exomes at a time).
I put the version without the recalibration in the SEQanswers Wiki/How-To section. You may have a look there, as I will update that in the future and stop uploading newer versions of the PDF file...
Thanks VERY MUCH to both of you.
The problem seems to be solved by removing the random and hap intervals from the .bed file.
ulz_peter,
raonyguimaraes posted a similar pipeline by Gayle Philip.
In that pipeline, SNPs and indels are recalibrated/filtered separately and combined afterwards.
I am trying that, and think that it is a good idea to exclude Indels from Gaussian models.
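For anyone hitting the same "-L" problem, the fix described above (dropping the random and hap intervals from the .bed file) could be sketched with awk roughly as follows. This is only a sketch: the function name filter_bed and the output file name are made up for illustration, and it assumes UCSC-style contig names such as chr1_random or chr6_cox_hap1 in the first BED column.

```shell
# Hypothetical helper: print a BED file without records on
# *_random or *_hap* contigs (matched on the first column only).
filter_bed() {
    # $1 = input BED file; filtered records go to stdout
    awk -F'\t' '$1 !~ /_random$/ && $1 !~ /_hap[0-9]+$/' "$1"
}

# Example use (output file name is an assumption):
#   filter_bed hg18_exonIntervals.bed > hg18_exonIntervals.filtered.bed
```

The filtered file would then be passed to -L instead of the original one. Whether other contigs also need to be dropped depends on which sequences are actually present in the reference FASTA.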
-
Great resource!!!
Just a few comments:
For picard/MarkDuplicates.jar, the option 'CREATE_INDEX=true' should be added.
With respect to adding read group information, instead of using the bwa sampe -r option, picard AddOrReplaceReadGroups.jar is an easier way to go as it tells you which options are required. Thanks for sharing!
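A rough sketch of what those two Picard calls might look like is below. The function name picard_cmds, the PICARD_DIR placeholder, the output file names and the read group values (RGPL=illumina, RGPU=unit1) are all assumptions for illustration and not taken from ulz_peter's doc. The function only prints the command lines, so they can be checked before running anything.

```shell
# Placeholder location of the Picard jars (assumption; adjust to your setup).
PICARD_DIR=${PICARD_DIR:-/path/to/picard}

# Hypothetical helper: print (not run) the two Picard command lines.
#   $1 = input BAM, $2 = sample name used for the read group fields
picard_cmds() {
    in_bam=$1
    sample=$2
    base=${in_bam%.bam}
    # Add read group information; Picard reports any required RG* option that is missing.
    echo "java -Xmx4g -jar $PICARD_DIR/AddOrReplaceReadGroups.jar INPUT=$in_bam OUTPUT=$base.rg.bam RGID=$sample RGLB=$sample RGPL=illumina RGPU=unit1 RGSM=$sample"
    # Mark duplicates and write a BAM index alongside the output.
    echo "java -Xmx4g -jar $PICARD_DIR/MarkDuplicates.jar INPUT=$base.rg.bam OUTPUT=$base.rg.marked.bam METRICS_FILE=$base.dup_metrics.txt CREATE_INDEX=true"
}

# Example: picard_cmds myinput.bam sample1
```

MarkDuplicates expects coordinate-sorted input, so the BAM would need to be sorted before the second command.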
-
reference dictionary
When I tried to use GATK to do the local realignment according to ulz_peter's instructions, I got this error message: Invalid command line: Failed to load reference dictionary. Could anybody let me know where to get this reference dictionary, and how to use it on the command line?
Thanks in advance.
-
Originally posted by emilyjia2000:
When I tried to use GATK to do the local realignment according to ulz_peter's instructions, I got this error message: Invalid command line: Failed to load reference dictionary. Could anybody let me know where to get this reference dictionary, and how to use it on the command line?
Thanks in advance.
-
Thanks for all of your quick responses. I used this command line:
java -Xmx4g -jar /path/to/GenomeAnalysisTK.jar -T RealignerTargetCreator -R hg19.fa -o output.interval -I /path/to/reorder_dedup.bam
I had already copied ucsc.hg19.dict into the same directory.
When I run this command, I get the following error message:
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 1.2-26-g43b0c98):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR Visit our wiki for extensive documentation http://www.broadinstitute.org/gsa/wiki
##### ERROR Visit our forum to view answers to commonly asked questions http://getsatisfaction.com/gsa
##### ERROR
##### ERROR MESSAGE: Invalid command line: Failed to load reference dictionary
##### ERROR ------------------------------------------------------------------------------------------
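One possible cause, assuming GATK looks up the dictionary by the base name of the reference: with -R hg19.fa it would expect hg19.dict (plus the hg19.fa.fai index) next to the FASTA, so a file named ucsc.hg19.dict would not be picked up. The helper below (check_ref_companions is a made-up name) only reports which of those two files are missing. The Picard and samtools commands in the comments are the usual way to create them, with /path/to/picard as a placeholder.

```shell
# Hypothetical helper: check for the .dict and .fai files expected
# next to a reference FASTA. Returns 0 if both exist, 1 otherwise.
check_ref_companions() {
    ref=$1
    dict="${ref%.*}.dict"   # hg19.fa -> hg19.dict
    fai="$ref.fai"          # hg19.fa -> hg19.fa.fai
    rc=0
    [ -f "$dict" ] || { echo "missing: $dict"; rc=1; }
    [ -f "$fai" ]  || { echo "missing: $fai";  rc=1; }
    return $rc
}

# If something is reported missing, these commands would create the files:
#   java -jar /path/to/picard/CreateSequenceDictionary.jar R=hg19.fa O=hg19.dict
#   samtools faidx hg19.fa
```

A dictionary built from a different FASTA (for example one with another contig order) may still be rejected, so generating it from the exact hg19.fa passed to -R is the safer route.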