  • GS De Novo Assembler (Newbler) -large option for transcriptomes

    Hi all,

    Has anyone ever tried the -large option for de novo assembly of 454 transcriptome data?
    The reason I ask is that the -large flag (flag for large or complex genomes) has no documentation apart from the phrase I just wrote in the parentheses.
    I understand that this option is meant for genome assemblies (mostly? only?), but what is the influence of this flag if one uses it for transcriptomes?

    The question came up when I (out of curiosity) tried the -large flag for a transcriptome assembly (together with the -cdna flag, of course) and observed a significant difference in the size and composition of the isotigs generated. The number of isotigs did not change significantly, but their lengths and the way they were put together did.

    Has anybody gotten to the bottom of how this flag works?

    Many thanks

  • #2
    Originally posted by cbouyio
    The question came up when I (out of curiosity) tried the -large flag for a transcriptome assembly (together with the -cdna flag, of course) and observed a significant difference in the size and composition of the isotigs generated. The number of isotigs did not change significantly, but their lengths and the way they were put together did.
    I don't have an answer for you, but I do have a question: do you think your assembly was made better or worse by using the -large option?



    • #3
      -large is supposed to be used for large genome assemblies, which won't 'ever' finish without the -large option set. On occasion, I have needed it for transcriptome assemblies, as they would otherwise take far too long.

      Generally, one wants to avoid -large, as it shortcuts some steps and can thereby lead to worse results (shorter contigs or more reads marked as repeat, for instance).
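
One way to check the "more reads marked as repeat" effect on your own data would be to tally the per-read status in each run. Below is a minimal sketch, not an official tool. It assumes a tab-separated read status file (Newbler writes one named 454ReadStatus.txt) with the read accession in the first column and the status in the second, and it assumes "Repeat" is the label used for repeat reads. Please check both assumptions against your own output files; the function names are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def tally_read_status(lines: Iterable[str], status_column: int = 1) -> Counter:
    """Count how many reads fall into each status category.

    Assumption: tab-separated lines, accession first, status in
    `status_column` (0-based). A header line starting with "Accno"
    is skipped if present.
    """
    counts: Counter = Counter()
    for index, line in enumerate(lines):
        fields = line.rstrip("\n").split("\t")
        if index == 0 and fields[0].lower().startswith("accno"):
            continue  # skip the header row
        if len(fields) <= status_column:
            continue  # skip blank or malformed lines
        counts[fields[status_column]] += 1
    return counts


def status_fraction(counts: Counter, label: str = "Repeat") -> float:
    """Fraction of all tallied reads carrying the given status label."""
    total = sum(counts.values())
    return counts[label] / total if total else 0.0


if __name__ == "__main__":
    import sys

    # Usage: python read_status.py run_default/454ReadStatus.txt run_large/454ReadStatus.txt
    for path in sys.argv[1:]:
        with open(path) as handle:
            counts = tally_read_status(handle)
        print(path, dict(counts), f"repeat fraction = {status_fraction(counts):.4f}")
```

Running it on the status files of a run with and a run without -large gives two repeat fractions that can be compared directly.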



      • #4
        Guys, thanks for the replies.

        @kmcarr I cannot give a straight answer to your question, because I cannot tell from the numbers alone which transcriptome assembly was "better". The number, the N50, and the length distribution of the *isotigs* were marginally "better" without the -large option; however, the -large option gave me better resolution for an individual multi-copy gene family that we are after. I need to wait for the PCR amplicons from the wet lab guys to corroborate that, but the indications so far are that for this particular family (which, BTW, contains several repeats) the -large option might give us better resolution.
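
For readers who want to run the same kind of comparison (isotig count, N50, longest isotig), here is a minimal sketch that computes these numbers from a FASTA file. It is only an illustration under stated assumptions, not the script used for the numbers above. The function names are hypothetical, and it assumes the isotigs of each run are available as plain FASTA (Newbler cDNA assemblies write them to 454Isotigs.fna).

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the sequence length of every record in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Largest length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Collect the summary numbers usually compared between assemblies."""
    return {
        "count": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    import sys

    # Usage: python isotig_stats.py run_default/454Isotigs.fna run_large/454Isotigs.fna
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, summarize(read_fasta_lengths(handle)))
```

Summary numbers like these only describe the length distribution; they say nothing about whether a particular gene family is resolved correctly, which is why a manual look at the relevant contigs is still needed.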

        @flxlex Both with and without -large, the assemblies run relatively fine (about a couple of hours each on a 4-core, 32 GB RAM machine), so finishing the assembly is not an issue for us. However, I take seriously your comment that -large "shortcuts some steps" and marks more reads as repeats, and I'll have a manual look at the .ace files of the protein family we are after. The contig number and length distributions, as I mentioned, are not significantly different. So, lacking any other formal way, I'll go with an empirical assessment here and manually check (together with some wet lab confirmation) which option gives us better resolution for the family we are after.

        Thanks again for your replies.
