Assembling SOLiD data in parallel

  • Assembling SOLiD data in parallel

    I've got 120 million 50 bp SOLiD reads with a 2,100 bp insert size. Velvet is off the table for assembly, as it eats up the 48 GB of RAM on my workstation pretty quickly. I've got a 160-CPU MPI cluster that should do the trick, but the parallel assemblers I'm aware of (ABySS, Forge) seem to be mostly untested with ABI data. I pruned the data down to ~60 million reads and recompiled Velvet with a max k of 21 to use less RAM; that runs a little farther but still exhausts system memory. Does anyone have other suggestions for assembling this data in parallel, or is it time to invest in a giant 512 GB machine?

    Thanks!

  • #2
    Remove duplicate reads (including reverse-complement duplicates). Then break the data into 1,000 subsets, assemble each one, and use the resulting contigs as Sanger reads together with the remaining reads (those not placed in any contig)?
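A minimal sketch of the de-duplication and subsetting step suggested above, assuming the reads are plain base-space strings (ACGT) already loaded in memory. The function names and the random partitioning are illustrative assumptions, not part of any assembler. SOLiD output is normally color-space, where this base-space reverse complement would not apply directly, and paired reads would need to be de-duplicated and partitioned as pairs rather than one read at a time.

```python
import random

# Translation table for base-space reads (assumption: ACGT plus N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case base-space read."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(seq):
    """Pick one representative for a read and its reverse complement."""
    seq = seq.upper()
    return min(seq, reverse_complement(seq))


def dedupe_reads(reads):
    """Keep the first occurrence of each read, treating a read and its
    reverse complement as the same entry."""
    seen = set()
    unique = []
    for read in reads:
        key = canonical(read)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique


def split_into_subsets(reads, n_subsets, seed=0):
    """Shuffle the reads and deal them into n_subsets roughly equal groups.
    (Hypothetical helper; the seed only makes the split reproducible.)"""
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


if __name__ == "__main__":
    reads = ["ACGT", "ACGT", "AAAC", "GTTT", "CCCA"]
    unique = dedupe_reads(reads)
    subsets = split_into_subsets(unique, n_subsets=2)
    print(unique)
    print([len(s) for s in subsets])
```

For 120 million reads an in-memory set like this may itself be heavy, so a sort-based or on-disk approach could be a better fit; the sketch only shows the logic.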



    • #3
      I thought about that, but I'd be losing a fair amount of my paired-end data that way, correct? Pairs that aren't placed in a larger contig would just end up as singletons once each subset is assembled and its contigs are treated as Sanger reads?



      • #4
        Velvet, and possibly MIRA, will allow you to keep the paired-end information. One problem may be contigs that are too long to be used as Sanger reads, but those can be shredded into overlapping artificial reads in such a way that the assembler will reconstruct them.
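One way the shredding idea above might look, as a rough sketch: cut each long contig into fixed-length pseudo-reads that overlap their neighbours, so an assembler can stitch them back together. The `shred_contig` name and the 800 bp length / 400 bp overlap defaults are assumptions chosen for illustration, not values recommended by Velvet or MIRA.

```python
def shred_contig(contig, read_len=800, overlap=400):
    """Cut a contig into overlapping pseudo-reads of length read_len.

    Consecutive reads share at least `overlap` bases. Contigs no longer
    than read_len are returned unchanged as a single read.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be >= 0 and smaller than read_len")
    if len(contig) <= read_len:
        return [contig]

    step = read_len - overlap
    reads = []
    start = 0
    # Emit full-length windows until the next one would reach the end.
    while start + read_len < len(contig):
        reads.append(contig[start:start + read_len])
        start += step
    # Anchor the last read at the contig end so no bases are dropped.
    reads.append(contig[-read_len:])
    return reads


if __name__ == "__main__":
    contig = "ACGT" * 525  # 2,100 bp toy contig
    pseudo_reads = shred_contig(contig)
    print(len(pseudo_reads), [len(r) for r in pseudo_reads])
```

Anchoring the final window at the contig end means the last two pseudo-reads can overlap by more than the nominal amount, which seems preferable to emitting a short trailing fragment.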
