  • Allpaths-LG stuck on CloseUnipathGaps?

    Hello all,

    I am working on a de novo assembly with Allpaths-LG. I have a 2x300 bp paired-end Illumina MiSeq run with a ~400-500 bp fragment size for the overlapping fragment library (~50X), and several 100 bp Illumina HiSeq jumping libraries (250 bp, 500 bp, 2 kb, 5 kb, 10 kb). Allpaths overlaps the MiSeq fragment library well, with 99% overlap (I also verified the overlaps with PEAR).

    I am running Allpaths with 12 threads on a machine with two Intel Xeon X5660 CPUs and 192 GB of RAM, so the machine is decent, and I am its only user.

    When Allpaths gets to the CloseUnipathGaps module, it seems to hang at the "Building those extenders" step. The first run sat on this step for 6 days before I stopped it to try again. The second run has now been stuck at the same step for a day. In other people's Allpaths logs that I have seen, this step takes only a couple of minutes.

    Code:
    --------------------------------------------------------------------------------
    Mon Nov 24 18:20:08 2014 run on d017, pid=25701 [Jun  5 2014 15:37:46 R49856 ]
    CloseUnipathGaps NUM_THREADS=12                                                \
                     DIR=/data/FG-183/pulicaria/allpaths-assembly2/run1            \
                     IN_HEAD=frag_reads_corr_cpd                                   \
                     UNIBASES=filled_reads.unibases.k96 UNIBASES_K=96              \
                     USE_LINKS=False OUT_HEAD=filled_reads.gap_closed WORKDIR=tmp  \
                     MM=True _MM_INTERVAL=10 _MM_SUMMARY=False                     \
                     _MM_OUT=/data/FG-183/pulicaria/allpaths-assembly2/run1/makein \
                     fo/filled_reads.gap_closed.CloseUnipathGaps.log.mm.CloseUnipa \
                     thGaps
    --------------------------------------------------------------------------------
    Mon Nov 24 18:20:23 2014: Finding seeds for unipath gaps
    Mon Nov 24 18:20:23 2014: Creating KmerParcel files for FindUnipathGapSeeds
    Mon Nov 24 18:20:23 2014: n_reads = 22919962
    Mon Nov 24 18:21:56 2014: Creating seeds...
    Mon Nov 24 18:21:56 2014: compute seeds[query_ID].size() for each query_ID.
    Mon Nov 24 18:22:03 2014: reserve space for seeds[query_ID].
    Mon Nov 24 18:22:04 2014: store seeds in seeds[query_ID].
    Mon Nov 24 18:22:12 2014: 22919962 number of reads processed.
    Mon Nov 24 18:22:12 2014: 46415402 seeds created.
    Mon Nov 24 18:22:33 2014: Creating KmerParcel files for FindUnipathGapSeeds
    Mon Nov 24 18:22:33 2014: n_reads = 22919962
    Mon Nov 24 18:24:00 2014: Creating seeds...
    Mon Nov 24 18:24:00 2014: compute seeds[query_ID].size() for each query_ID.
    Mon Nov 24 18:24:06 2014: reserve space for seeds[query_ID].
    Mon Nov 24 18:24:07 2014: store seeds in seeds[query_ID].
    Mon Nov 24 18:24:15 2014: 22919962 number of reads processed.
    Mon Nov 24 18:24:15 2014: 46685568 seeds created.
    Mon Nov 24 18:24:37 2014: Loading read quality scores so we can build extenders
    Mon Nov 24 18:24:45 2014: Building those extenders
    
    Sun Nov 30 15:11:45 2014.  Interrupt received (perhaps a ctrl-c).  Stopping.

    I had previously run Allpaths on the Illumina HiSeq jumping library data only, without the MiSeq library, and that assembly did not hit this problem. The issue appeared only after adding the MiSeq library.
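
For reference, the length of the stall in the log above can be worked out from its two timestamps: the "Building those extenders" line and the "Interrupt received" line. This is only a sketch. It assumes GNU `date` (the `-d` option), and it treats both timestamps as UTC so that no daylight-saving shift enters the arithmetic; the variable names are illustrative.

```shell
# Timestamps copied verbatim from the log above.
# Assumption: GNU date; BSD/macOS date needs different flags.
start=$(date -u -d "Mon Nov 24 18:24:45 2014" +%s)   # "Building those extenders"
stop=$(date -u -d "Sun Nov 30 15:11:45 2014" +%s)    # "Interrupt received"

# Seconds spent on the step before the run was interrupted.
elapsed=$((stop - start))

# Break the total into days, hours, minutes and seconds.
printf '%d days %02d:%02d:%02d\n' \
    $((elapsed / 86400)) \
    $((elapsed % 86400 / 3600)) \
    $((elapsed % 3600 / 60)) \
    $((elapsed % 60))
```

That comes to just under six days on a step that finishes within minutes in the other logs mentioned above.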

    Does anyone with Allpaths experience have any suggestions? Do you think the 2x300 MiSeq library is too much for Allpaths to handle? (I know the fragment library code is designed for 2x100 bp reads from 180 bp fragments.)

    I'd like to be able to try out Allpaths for this assembly. But if it can't complete, I do have the merged overlapping MiSeq reads from PEAR, so I can try other assemblers using the merged MiSeq reads as single-end reads. Any recommendations on this front are also appreciated.
    Last edited by CraigJ; 12-01-2014, 11:31 AM.

  • #2
    Try 'CLOSE_UNIPATH_GAPS=False'.
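
A minimal sketch of where that setting would go: it is passed to RunAllPathsLG alongside the usual directory arguments. The PRE, REFERENCE_NAME, DATA_SUBDIR and RUN values below are assumptions inferred from the DIR= path in the log in the first post, and THREADS=12 mirrors the NUM_THREADS shown there. The poster's actual command line was not given, so adjust these to match it.

```shell
# Assumed layout: DIR = PRE/REFERENCE_NAME/DATA_SUBDIR/RUN, guessed from the log.
# Only CLOSE_UNIPATH_GAPS=False comes from the suggestion above.
allpaths_args="PRE=/data/FG-183 REFERENCE_NAME=pulicaria DATA_SUBDIR=allpaths-assembly2 RUN=run1 THREADS=12 CLOSE_UNIPATH_GAPS=False"

# Dry run: print the command for checking. Drop the leading "echo" to launch it.
echo RunAllPathsLG $allpaths_args
```

With gap closing switched off, the pipeline should skip the CloseUnipathGaps module where the run was hanging.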
