  • Brian Bushnell
    replied
    This depends greatly on the assembler, the specific depth and error rate, what you are assembling (single-cell, metagenome, isolate, transcriptome, etc.), the repeat content, and more.

    Metagenomes, transcriptomes, single cells, and highly repetitive isolates tend to be the most difficult (possibly in that order). The more variable your coverage is - whether due to community composition, amplification, gene expression, or repeats - the harder it is to tell the difference between low-coverage genomic sequence and error k-mers. Some assemblers are better at this than others.

    Informatically, the signal-to-noise ratio is more important than raw coverage. However, coverage is discrete, so 2X coverage with some errors will probably give a better assembly than 1X coverage with no errors: the latter has no overlaps and cannot possibly assemble, even though it has a better SNR.

    In other words, there are no strict rules about whether it is good to increase coverage at the expense of accepting reads with higher error rates; you can find scenarios with directly contradictory best practices. Only once you decide on a specific sequencing platform, experiment type, organism, assembler, and sequencing volume is it possible to answer the question objectively.
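
    As a toy illustration of the k-mer counting side of this (a random 50 kb "genome", 100 bp single-end reads, k = 21, no reverse complements or quality values - all arbitrary simplifications, not what any real assembler does):

```python
import random
from collections import Counter

K = 21
READ_LEN = 100

def kmers(seq):
    return [seq[i:i + K] for i in range(len(seq) - K + 1)]

def simulate_reads(genome, depth, error_rate):
    """Uniformly sample reads to the requested depth, adding random
    substitution errors at 'error_rate' per base."""
    reads = []
    for _ in range(int(depth * len(genome) / READ_LEN)):
        start = random.randrange(len(genome) - READ_LEN + 1)
        read = list(genome[start:start + READ_LEN])
        for i, base in enumerate(read):
            if random.random() < error_rate:
                read[i] = random.choice([b for b in "ACGT" if b != base])
        reads.append("".join(read))
    return reads

random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(50_000))
true_kmers = set(kmers(genome))

for depth, err in [(30, 0.01), (30, 0.001), (2, 0.0), (1, 0.0)]:
    counts = Counter()
    for read in simulate_reads(genome, depth, err):
        counts.update(kmers(read))
    error_kmers = set(counts) - true_kmers                   # k-mers created by errors
    error_singletons = sum(1 for km in error_kmers if counts[km] == 1)
    genomic_missing = len(true_kmers - set(counts))          # genomic k-mers never seen
    genomic_weak = sum(1 for km in true_kmers if counts.get(km, 0) == 1)
    print(f"depth={depth:>2}x error_rate={err}: "
          f"{len(error_kmers)} error k-mers ({error_singletons} singletons), "
          f"{genomic_missing} genomic k-mers absent, "
          f"{genomic_weak} genomic k-mers seen only once")
```

    At 30X the error k-mers are numerous but almost all singletons, so a simple count threshold separates them from genomic k-mers; at 1-2X even perfect reads leave many genomic k-mers absent or seen only once, which no amount of read accuracy can fix.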

  • mjoppich
    started a topic Assembly and sequencing errors

    Dear all,

    I have a small question regarding the assembly of transcriptomes or genomes.

    I understand that assembly definitely benefits from the absence of errors in the reads.
    But let's assume some Illumina-sequenced data. What would be the outcome of the assembly if ...

    ... I had more perfect reads, but overall more substitutions remaining in the other reads?
    ... I had fewer perfect reads than in the above scenario, but also fewer substitutions remaining in the other reads?

    How do assemblers react in these two scenarios?

    Assuming a k-mer graph assembly, from what I understand the first scenario favours the overall graph structure, possibly speeding up the assembly and producing fewer (or, in general, longer?) contigs.
    In the second scenario, the errors might be easier for the assembler to correct, leading to the same result?
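
    To make that concrete for myself, here is a tiny sketch (k = 21 and a made-up 51 bp read, both arbitrary choices): a single substitution corrupts every k-mer window that overlaps it, i.e. up to k k-mers, each of which becomes a potential spurious node or branch in the graph.

```python
K = 21

def kmer_set(seq):
    return {seq[i:i + K] for i in range(len(seq) - K + 1)}

# A made-up 51 bp toy read (any sequence longer than k works).
read = "ACGTTGCATGTCGCATGATGCATGAGAGCTACGTACGATCGCATGACGTTA"
clean = kmer_set(read)

# Introduce a single substitution in the middle of the read.
pos = len(read) // 2
mutated = read[:pos] + ("A" if read[pos] != "A" else "C") + read[pos + 1:]
spurious = kmer_set(mutated) - clean

print(f"{len(read)} bp read -> {len(clean)} distinct k-mers (k = {K})")
print(f"one substitution created {len(spurious)} k-mers absent from the "
      f"error-free read (at most k = {K})")
```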

    This question really puzzles me, and I'd be happy to hear your comments/experience. I couldn't find any paper that answers it directly, but maybe you know one where the answer is hidden?

    Thanks
