  • Brian Bushnell
    replied
    This greatly depends on the assembler, and the specific depth and error rate, and what you are assembling (single-cell, metagenome, isolate, transcriptome, etc), repeat content, and more.

    Metagenomes, transcriptomes, single-cells, and highly-repetitive isolates tend to be the most difficult (possibly in that order). The more highly variable your coverage is - whether due to community composition, amplification, gene expression, or repeats - the harder it is to tell the difference between low-coverage genomic sequence and error kmers. Some assemblers are better at this than others.
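    To make that concrete, here is a toy simulation (all numbers are made-up assumptions, not from any real dataset): with uniform coverage, a simple abundance cutoff removes error k-mers without losing genomic ones, but with highly variable coverage the same cutoff also discards genuine low-coverage k-mers, which is exactly the error/signal ambiguity described above.

    ```python
    import random

    random.seed(2)

    # Simulated per-k-mer counts for 1000 genomic k-mers under two coverage regimes.
    uniform = [random.randint(28, 32) for _ in range(1000)]                    # ~30X isolate
    variable = [max(1, int(random.expovariate(1 / 30))) for _ in range(1000)]  # metagenome-like, 1X-100X+

    # A common error filter: discard k-mers seen no more than `cutoff` times,
    # since sequencing-error k-mers are typically seen only once or twice.
    cutoff = 2
    lost_uniform = sum(c <= cutoff for c in uniform)
    lost_variable = sum(c <= cutoff for c in variable)
    print(lost_uniform, lost_variable)  # the variable regime loses real genomic k-mers
    ```

    In the uniform regime the cutoff costs nothing; in the variable regime a noticeable fraction of genuine genomic k-mers falls below it and gets thrown away along with the errors.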

    Informatically, the signal-to-noise ratio is more important than raw coverage. However, coverage is discrete, so if you have 2X coverage with some errors, you will probably get a better assembly than with 1X coverage and no errors: the 1X data has no overlaps and cannot possibly assemble, even though it has a better SNR.
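    A minimal sketch of the 1X-vs-2X point, using a hypothetical random genome and perfectly tiled, error-free reads: at 1X, adjacent reads abut without overlapping and share no k-mers, so even a perfect de Bruijn graph falls apart at every read boundary, while 2X tiling leaves shared k-mers that keep the graph connected.

    ```python
    import random

    def kmers(seq, k):
        """The set of k-length substrings of seq."""
        return {seq[i:i + k] for i in range(len(seq) - k + 1)}

    random.seed(1)
    genome = "".join(random.choice("ACGT") for _ in range(1000))
    k, read_len = 21, 100

    def min_shared_kmers(step):
        """Tile the genome with error-free reads every `step` bp and return
        the smallest number of k-mers shared by any pair of adjacent reads."""
        reads = [genome[i:i + read_len] for i in range(0, len(genome) - read_len + 1, step)]
        return min(len(kmers(reads[i], k) & kmers(reads[i + 1], k))
                   for i in range(len(reads) - 1))

    print(min_shared_kmers(100))  # 1X tiling: abutting reads, no shared k-mers
    print(min_shared_kmers(50))   # 2X tiling: each 50 bp overlap yields shared k-mers
    ```

    With a 50 bp overlap and k=21, adjacent reads are guaranteed 50 − 21 + 1 = 30 shared k-mers; with no overlap they share none, regardless of how error-free the reads are.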

    In other words, there are no strict rules about whether it is good to increase coverage at the expense of accepting reads with higher error rates; you can find scenarios with directly contradictory best practices. Only once you decide on a specific sequencing platform, experiment type, organism, assembler, and sequencing volume, is it possible to objectively answer the question.



  • mjoppich
    started a topic Assembly and sequencing errors

    Assembly and sequencing errors

    Dear all,

    I have a small question regarding the assembly of transcriptomes or genomes.

    I understand that assembly definitely favours the absence of errors in the reads.
    But let's assume some Illumina-sequenced data. What would be the outcome of the assembly if ...

    ... I had more perfect reads, but overall more substitutions remaining in the other reads?
    ... I had fewer perfect reads than in the above scenario, but also fewer substitutions remaining in the other reads?

    How do assemblers react in these two scenarios?

    Assuming a k-mer graph assembly, from what I understand the first scenario favours the overall graph structure, possibly speeding up the assembly and producing fewer (or in general longer?) contigs.
    Could the assembler correct the errors in the second scenario more easily, leading to the same results?
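    One way to see why substitutions hurt a k-mer graph so much (a toy sketch with a hypothetical random read, not tied to any particular assembler): a single substitution corrupts every k-mer window that covers it, so each error can inject up to k spurious k-mers into the graph, far more than the per-base error rate alone suggests.

    ```python
    import random

    def kmers(seq, k):
        """All k-length substrings of seq."""
        return [seq[i:i + k] for i in range(len(seq) - k + 1)]

    random.seed(0)
    k = 21
    true_read = "".join(random.choice("ACGT") for _ in range(100))

    # Introduce a single substitution in the middle of the read.
    pos = 50
    alt = {"A": "C", "C": "G", "G": "T", "T": "A"}[true_read[pos]]
    err_read = true_read[:pos] + alt + true_read[pos + 1:]

    # Spurious k-mers: present in the erroneous read but not in the true one.
    novel = set(kmers(err_read, k)) - set(kmers(true_read, k))
    print(len(novel))  # up to k (= 21) spurious k-mers from one substitution
    ```

    Each of those spurious k-mers shows up as a low-coverage tip or bubble that the assembler must prune, which is why per-read error rate matters well beyond its face value in a de Bruijn graph.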

    This question really puzzles me, and I'd be happy about your comments/experience. I couldn't find any paper that answers this question directly, but maybe you know one where the answer is hidden?

    Thanks
