  • Assembly and sequencing errors

    Dear all,

    I have a small question regarding the assembly of transcriptomes or genomes.

    I understand that assembly definitely benefits from error-free reads.
    But let's assume some Illumina-sequenced data. What would be the outcome of the assembly if ...

    ... I had more perfect reads, but overall more substitutions remaining in the other reads?
    ... I had fewer perfect reads than in the above scenario, but also fewer substitutions remaining in the other reads?

    How do assemblers react in these two scenarios?

    Assuming a k-mer graph assembly, from what I understand the first scenario favours the general graph structure, possibly speeding up the assembly and creating fewer contigs (or, in general, longer ones)?
    The second scenario could be easier for the assembler to correct, leading to the same results?

    This question really puzzles me, and I'd appreciate your comments and experience. I couldn't find any paper that answers it directly, but maybe you know one where the answer is hidden?

    Thanks

  • #2
    This greatly depends on the assembler, the specific depth and error rate, what you are assembling (single-cell, metagenome, isolate, transcriptome, etc.), repeat content, and more.

    Metagenomes, transcriptomes, single-cells, and highly-repetitive isolates tend to be the most difficult (possibly in that order). The more highly variable your coverage is - whether due to community composition, amplification, gene expression, or repeats - the harder it is to tell the difference between low-coverage genomic sequence and error kmers. Some assemblers are better at this than others.
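
    To make the "error kmer versus low-coverage kmer" problem concrete, here is a minimal Python sketch. It is a toy illustration, not how any particular assembler works: the sequence, the read set, and the function name kmer_counts are all made up for the example. A single substitution in one read produces up to k kmers that are seen only once, which is exactly what genuine sequence covered by a single read would look like.

```python
from collections import Counter

def kmer_counts(reads, k):
    """Count every k-mer occurring in a list of reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Hypothetical toy data: three error-free copies of one read,
# plus a fourth copy carrying a single substitution (G -> A at index 10).
k = 5
template = "ACGTTGCATGGCTAAGCTTAGC"
erroneous = template[:10] + "A" + template[11:]
reads = [template] * 3 + [erroneous]

counts = kmer_counts(reads, k)

# K-mers seen exactly once: here they all come from the one substitution.
singletons = sorted(km for km, c in counts.items() if c == 1)
print(singletons)
```

    In this toy case a count threshold separates the error kmers cleanly, because the true sequence is covered evenly. With highly variable coverage, real kmers from low-abundance regions also end up with counts of 1 or 2, and a fixed threshold can no longer tell them apart from errors.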

    Informatically, the signal-to-noise ratio is more important than raw coverage. However, coverage is discrete so if you have 2X coverage with some errors, you will probably get a better assembly than with 1X coverage and no errors, since that has no overlaps and cannot possibly assemble, even though it has a better SNR.
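
    As a rough back-of-the-envelope version of this trade-off, the sketch below estimates how much error-free kmer coverage is left for a given base coverage and error rate. It relies on simplifying assumptions that are mine, not part of the discussion above: all reads have the same length, errors are independent substitutions at a uniform per-base rate, and kmer coverage is base coverage scaled by (L - k + 1) / L. The function name and the example parameters are hypothetical.

```python
def error_free_kmer_coverage(base_coverage, read_length, k, error_rate):
    """Expected coverage by k-mers that contain no sequencing error.

    Assumes fixed-length reads and independent, uniform per-base errors.
    """
    # A read of length L contributes L - k + 1 k-mers, so k-mer coverage
    # is lower than base coverage by that factor.
    kmer_coverage = base_coverage * (read_length - k + 1) / read_length
    # A k-mer is error-free only if all k of its bases are correct.
    return kmer_coverage * (1.0 - error_rate) ** k

# Hypothetical comparison: 2X coverage with 1% errors vs 1X without errors.
noisy_2x = error_free_kmer_coverage(2, read_length=100, k=31, error_rate=0.01)
clean_1x = error_free_kmer_coverage(1, read_length=100, k=31, error_rate=0.0)
print(noisy_2x, clean_1x)
```

    Under these assumptions the noisier 2X dataset still retains more error-free kmer coverage than the perfect 1X dataset, but the balance shifts quickly as k or the error rate grows, which is one reason the answer depends so much on the specifics.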

    In other words, there are no strict rules about whether it is good to increase coverage at the expense of accepting reads with higher error rates; you can find scenarios with directly contradictory best practices. Only once you decide on a specific sequencing platform, experiment type, organism, assembler, and sequencing volume, is it possible to objectively answer the question.
