  • Newbler2.3 & cDNA assembly

    Hi all,

    I have been running a Newbler (v2.3) cDNA one-step assembly for about a week, using 4 shotgun files of ~2.4 GB each. It seems to have been hanging at the "Detangling alignments" phase for over 3 days. The status output is shown below:
    ---------------------------------------------------------------------------------------
    Setting up long overlap detection...
    -> 3142131 of 3142131, 3130211 reads to align
    Building a tree for 15374046 seeds...
    Computing long overlap alignments...
    -> 3130211 of 3130211
    Setting up overlap detection...
    -> 3142131 of 3142131, 1032064 reads to align
    Building a tree for 29430104 seeds...
    Computing alignments...
    -> 3130211 of 3130211
    Checkpointing...
    Detangling alignments...
    -> Level 2, Phase 8, Round 1...
    ---------------------------------------------------------------------------------------
    I requested 8 CPUs; at present the job is using 1 CPU and 49% of memory. Can anyone tell me whether it will complete at all, or what I should do now?

    Moinul

  • #2
    2.3 is "old". You should upgrade to Newbler 2.5, as it is known to work a lot better for cDNA assemblies. Version 2.6 will follow in the summer.

    hth, Sven


    • #3
      Thanks, Sven. I also tried the Newbler 2.5 cDNA assembly, but couldn't resolve the problem that arose there. If you look at my older thread "Newbler de novo assembly", posted on 28-05-2011, you may be able to help me with this.

      Moinul


      • #4
        Oh, well, OK, I see.
        I got this error as well and reported it to Roche; it is still not resolved.
        I wanted to assemble a non-normalised cDNA library in which some transcripts had really high coverage. I just "reduced" the dataset using 'cd-hit-est', a kind of "in-silico normalisation" ;-) Reducing the coverage of certain transcripts helped, though this approach is not applicable to all kinds of experiments (it depends on the questions you want to answer).

        Sven
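
A minimal sketch of how the reduction step described above might be scripted. Sven does not state which parameters he used, so the identity threshold, word length and thread count below are placeholder assumptions, as are the file names and the helper name `build_cd_hit_est_command`. It also assumes the reads have already been exported to FASTA. Only the `cd-hit-est` options `-i`, `-o`, `-c`, `-n` and `-T` are used.

```python
import shlex


def build_cd_hit_est_command(reads_fasta, reduced_fasta,
                             identity=0.95, word_length=10, threads=1):
    """Build the argument list for a cd-hit-est run.

    cd-hit-est clusters nucleotide sequences and writes one representative
    per cluster, which lowers the coverage of highly expressed transcripts.
    All default values here are illustrative assumptions.
    """
    return [
        "cd-hit-est",
        "-i", reads_fasta,        # input FASTA
        "-o", reduced_fasta,      # output FASTA with cluster representatives
        "-c", str(identity),      # sequence identity threshold
        "-n", str(word_length),   # word length (should suit the chosen -c)
        "-T", str(threads),       # number of threads
    ]


# Hypothetical file names, for illustration only.
cmd = build_cd_hit_est_command("reads.fasta", "reads_reduced.fasta")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once cd-hit-est is installed. As noted in the post, collapsing reads this way discards abundance information, so it may not suit expression-level questions.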


        • #5
          Have you removed ribosomal RNA sequences present in the data (e.g. by adding them via the -vs option)? That might also help.
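
As a rough illustration of the suggestion above, the following sketch assembles a `runAssembly` command line that passes an rRNA FASTA file through the `-vs` screening option. The file names, the output directory and the helper name `build_run_assembly_command` are hypothetical, and the exact option set should be checked against the manual for the installed Newbler version.

```python
import shlex


def build_run_assembly_command(read_files, output_dir,
                               screen_fasta=None, cpus=1):
    """Build a Newbler runAssembly argument list for a cDNA assembly.

    If screen_fasta is given, it is passed via -vs so that reads matching
    those sequences (here: rRNA) are screened out of the assembly.
    """
    cmd = ["runAssembly", "-cdna", "-cpu", str(cpus), "-o", output_dir]
    if screen_fasta is not None:
        cmd += ["-vs", screen_fasta]
    return cmd + list(read_files)


# Hypothetical inputs, for illustration only.
cmd = build_run_assembly_command(
    ["run1.sff", "run2.sff"], "cdna_assembly",
    screen_fasta="rRNA.fasta", cpus=8,
)
print(shlex.join(cmd))
```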


          • #6
            Thanks flxlex, but how do I get the rRNA sequences? Do I simply download the FASTA databases and pass them with the -vs flag?

            And why should I remove them? They are also part of my transcriptome, i.e. total RNA. Would the result still reflect the actual transcriptome size if I exclude them?

            Moinul


            • #7
              Originally posted by moinul:
              "Thanks flxlex, but how do I get the rRNA sequences? Do I simply download the FASTA databases and pass them with the -vs flag?"
              I would try to get some rRNA sequences specific to your organism, or from closely related ones. Don't use the entire database; that would slow things down considerably...

              "And why should I remove them? They are also part of my transcriptome, i.e. total RNA. Would the result still reflect the actual transcriptome size if I exclude them?"
              They are represented at such high coverage that they cause problems for the whole assembly. Are you using total RNA? Then they make up the majority of your sample, I guess... Doing an assembly with a fraction of the reads will get you the rRNA sequences of your organism (these can also be used for the filtering, BTW), and you can add their lengths to the remainder to get the total transcriptome size.
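
The "assembly with a fraction of the reads" idea could be prepared with a small subsampling script along these lines. This is only a sketch: it assumes the reads are available as FASTA, the function names are hypothetical, and the 50% fraction and the seed are arbitrary. The `total_length` helper mirrors the arithmetic in the reply, where the rRNA contig lengths are added to the length of the remaining assembly.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def subsample(records, fraction, seed=0):
    """Keep each record independently with probability `fraction`."""
    rng = random.Random(seed)
    return [rec for rec in records if rng.random() < fraction]


def total_length(records):
    """Sum of sequence lengths, e.g. of rRNA contigs or assembled isotigs."""
    return sum(len(seq) for _, seq in records)


# Tiny in-memory example; real input would be an open FASTA file handle.
demo = [">read1\n", "ACGT\n", "AC\n", ">read2\n", "GGG\n"]
records = list(read_fasta(demo))
subset = subsample(records, fraction=0.5, seed=42)
print(len(records), "reads parsed,", len(subset), "kept")
```

The subsampled reads can then be assembled on their own to recover the abundant rRNA contigs, and `total_length` applied to those contigs plus the rRNA-filtered assembly gives the combined size estimate.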
