
  • wzhangvv
    replied
    Hi,
    This is whole-genome data and I don't know the exact targeted regions, so I can't start with a known VCF file.
    Thanks for your reply!



    Originally posted by Zaag View Post
    Use targeted regions so it doesn't walk over the entire genome (unless it's whole-genome data).


  • Zaag
    replied
    Use targeted regions so it doesn't walk over the entire genome (unless it's whole-genome data).
    Last edited by Zaag; 10-14-2012, 06:10 AM.


  • wzhangvv
    replied
    Hi,
    Do you think there is anything wrong with my workflow? Can you give me some advice? I really don't know how to improve or debug it, because the previous steps finished smoothly.
    Really appreciate your reply!

    Originally posted by adaptivegenome View Post
    I'm surprised that TargetCreator is the limiting step. It typically runs quite fast.


  • adaptivegenome
    replied
    I'm surprised that TargetCreator is the limiting step. It typically runs quite fast.


  • wzhangvv
    replied
    Hi,
    Thanks for your reply!
    I just did a de novo assembly and got millions of contigs without any chromosome info... The frog species I work on doesn't have an assembled genome...
    I tried to get more scaffolds, but that was difficult with my RAD data.

    Originally posted by qtrinh View Post
    Hi,
    What about running RealignerTargetCreator in parallel on each of the chromosomes? This should speed things up for you.

    Q


  • qtrinh
    replied
    Hi,
    What about running RealignerTargetCreator in parallel on each of the chromosomes? This should speed things up for you.

    Q
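When the reference is a set of de novo contigs rather than chromosomes, one way to apply this suggestion is to group the contigs into a few interval lists and run RealignerTargetCreator once per list. Below is a minimal sketch of that grouping, not a tested pipeline. It assumes a `samtools faidx` index (`contigs.fa.fai`) is available and that GATK's `-L` argument accepts a file with one contig name per line. The function names and the `chunk_NNN.intervals` naming are hypothetical.

```python
import heapq


def read_fai(lines):
    """Parse samtools .fai lines into (contig_name, length) pairs."""
    contigs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        # Column 1 is the contig name, column 2 its length in bases.
        contigs.append((fields[0], int(fields[1])))
    return contigs


def partition_contigs(contigs, n_bins):
    """Greedy split: assign contigs, longest first, to the lightest bin so far."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    heap = [(0, i) for i in range(n_bins)]  # (total bases, bin index)
    heapq.heapify(heap)
    bins = [[] for _ in range(n_bins)]
    for name, length in sorted(contigs, key=lambda c: c[1], reverse=True):
        total, idx = heapq.heappop(heap)
        bins[idx].append(name)
        heapq.heappush(heap, (total + length, idx))
    return bins


def write_interval_lists(bins, prefix="chunk"):
    """Write one contig name per line; the file naming is a hypothetical choice."""
    paths = []
    for i, names in enumerate(bins):
        if not names:
            continue  # skip empty bins instead of writing empty files
        path = "%s_%03d.intervals" % (prefix, i)
        with open(path, "w") as handle:
            handle.write("\n".join(names) + "\n")
        paths.append(path)
    return paths
```

A typical use would be `write_interval_lists(partition_contigs(read_fai(open("contigs.fa.fai")), 8))`, followed by one RealignerTargetCreator run per resulting file. Balancing by total bases is only a rough proxy for runtime.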


  • wzhangvv
    replied
    Hi,

    Thanks for your reply!

    I just read the OpenGE manual. It can't help me, because its localrealign step requires the intervals file, which needs to be generated by GATK RealignerTargetCreator, the very slow step I mentioned before.

    Do you think my RealignerTargetCreator speed is normal (~10+ days per sample)? If it really is, I have to change my strategy.




    Originally posted by adaptivegenome View Post
    Try OpenGE for realignment:

    www.github.com/adaptivegenome/OpenGE


  • adaptivegenome
    replied
    Try OpenGE for realignment:

    www.github.com/adaptivegenome/OpenGE


  • wzhangvv
    started a topic Help! about GATK realignment speed

    Hi All,
    I am trying to do SNP calling using GATK. This is my first time doing this, and I set up the workflow shown below. Everything went well until I got stuck at RealignerTargetCreator, which seems to take 15 days per sample! I don't know whether that is a normal speed for a 300 MB reference and a 1 GB BAM file. Could anybody help me figure it out? I have 80 samples, and obviously I don't have enough time to run this step at that rate.
    Thanks for your time~!
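To put those numbers in perspective, here is a small back-of-the-envelope sketch. The 15 days per sample and the 80 samples come from the post. The number of concurrent jobs is a hypothetical parameter, and the estimate assumes concurrent jobs do not slow each other down.

```python
import math


def total_days(n_samples, days_per_sample, concurrent_jobs=1):
    """Wall-clock estimate when samples are processed in equal-sized batches."""
    batches = math.ceil(n_samples / concurrent_jobs)
    return batches * days_per_sample


# One sample at a time: 80 * 15 days.
serial = total_days(80, 15)
# Hypothetical: 8 samples running side by side without slowing each other down.
parallel = total_days(80, 15, concurrent_jobs=8)
print(serial, parallel)
```

Even with eight jobs side by side, the step would still take months, so the per-sample runtime itself has to come down.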

    My workflow (up to RealignerTargetCreator):
    I used sga for de novo assembly and used its output file, contigs.fa, as the reference.

    bwa index -p contigs.fa -a bwtsw contigs.fa

    bwa aln -t 4 contigs.fa R1.fq > R1.sai

    bwa aln -t 4 contigs.fa R2.fq > R2.sai

    bwa sampe contigs.fa R1.sai R2.sai R1.fq R2.fq > A.sam

    samtools view -bST contigs.fa -o A_noRG.bam A.sam

    java -Xmx20g -XX:PermSize=10g -XX:MaxPermSize=10g -jar /usr/share/picard/lib/AddOrReplaceReadGroups.jar INPUT=A_noRG.bam OUTPUT=A_std.bam SORT_ORDER=coordinate RGID=lib1_A RGLB=AA RGPL=illumina RGSM=lib1_A RGPU=none VALIDATION_STRINGENCY=LENIENT

    java -Xmx20g -XX:PermSize=10g -XX:MaxPermSize=10g -jar /usr/share/picard/lib/MarkDuplicates.jar INPUT=A_std.bam OUTPUT=A_std_noduplicates.bam METRICS_FILE=A_std.duplicate_matrics REMOVE_DUPLICATES=true ASSUME_SORTED=true VALIDATION_STRINGENCY=LENIENT

    java -Xmx20g -XX:PermSize=10g -XX:MaxPermSize=10g -jar /usr/share/picard/lib/BuildBamIndex.jar INPUT=A_std_noduplicates.bam VALIDATION_STRINGENCY=LENIENT

    java -Xmx20g -XX:PermSize=10g -XX:MaxPermSize=10g -jar /usr/share/GenomeAnalysisTK-2.1-10-gdbc86ec/GenomeAnalysisTK.jar -T RealignerTargetCreator -nt 8 -I A_std_noduplicates.bam -R contigs.fa -o A_forIndelAligner.intervals
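If the contigs are split into several interval lists, the last command above can be issued once per list. The sketch below only builds the command lines; it does not run them. It reuses the `-T`, `-nt`, `-I`, `-R` and `-o` arguments from the post and adds GATK's `-L` argument. The jar path, memory setting, thread count, chunk file names and output naming are hypothetical defaults, and whether the per-chunk outputs can simply be concatenated afterwards is an assumption that should be checked.

```python
import os
import shlex


def build_rtc_command(bam, reference, intervals_in, intervals_out,
                      gatk_jar="GenomeAnalysisTK.jar", java_mem="4g", threads=1):
    """Argument list for one RealignerTargetCreator run restricted by -L."""
    return [
        "java", "-Xmx" + java_mem, "-jar", gatk_jar,
        "-T", "RealignerTargetCreator",
        "-nt", str(threads),
        "-I", bam,
        "-R", reference,
        "-L", intervals_in,   # restrict this run to one group of contigs
        "-o", intervals_out,
    ]


def commands_for_chunks(bam, reference, chunk_files, out_prefix, **kwargs):
    """One command per interval list; output names are a hypothetical scheme."""
    commands = []
    for chunk in chunk_files:
        out = "%s.%s" % (out_prefix, os.path.basename(chunk))
        commands.append(build_rtc_command(bam, reference, chunk, out, **kwargs))
    return commands


def format_command(args):
    """Render an argument list as a shell-safe string for a job script."""
    return " ".join(shlex.quote(a) for a in args)
```

For example, `commands_for_chunks("A_std_noduplicates.bam", "contigs.fa", ["chunk_000.intervals", "chunk_001.intervals"], "A_forIndelAligner")` yields two commands that can be written to a job script with `format_command`. Memory and thread settings then need to be chosen so that the concurrent runs fit on the machine.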
