Dear all,
I have a small question regarding the assembly of transcriptomes or genomes.
I understand that assembly generally benefits from reads that are free of errors.
But let's assume some Illumina-sequenced data. What would be the outcome of the assembly if ...
... I had more perfect reads, but overall more substitutions remaining in the other reads?
... I had fewer perfect reads than in the above scenario, but also fewer substitutions remaining in the other reads?
How do assemblers react in both scenarios?
Assuming a k-mer graph assembly, my understanding is that the first scenario favours the general graph structure, possibly speeding up the assembly and producing fewer (or generally longer?) contigs.
The second scenario might be easier for the assembler to correct. Would that lead to the same result?
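To make the two scenarios a bit more concrete, here is a rough Python sketch of how I picture them. It is only a toy model under my own assumptions: reads of 100 bp, k = 31, substitutions spaced evenly along each imperfect read, and made-up read counts and fractions. The function names are hypothetical, and it does not reflect how any particular assembler handles errors. It simply counts how many k-mers per read overlap a substitution, since each of those would typically show up as an erroneous k-mer in the graph.

```python
def affected_kmers(read_len, k, sub_positions):
    """Count k-mers of one read that overlap at least one substituted base."""
    n_kmers = read_len - k + 1
    hit = set()
    for pos in sub_positions:
        # A k-mer starting at s covers bases s .. s+k-1, so a substitution
        # at pos touches every k-mer whose start lies in pos-k+1 .. pos.
        first = max(0, pos - k + 1)
        last = min(n_kmers - 1, pos)
        hit.update(range(first, last + 1))
    return len(hit)


def evenly_spaced(read_len, n_subs):
    """Assumed placement: n_subs substitutions spread evenly over the read."""
    return [(i + 1) * read_len // (n_subs + 1) for i in range(n_subs)]


def scenario_summary(n_reads, perfect_fraction, subs_per_bad_read,
                     read_len=100, k=31):
    """Return (clean_kmers, erroneous_kmers) summed over all reads."""
    n_perfect = round(n_reads * perfect_fraction)
    n_bad = n_reads - n_perfect
    per_read = read_len - k + 1
    bad_per_read = affected_kmers(
        read_len, k, evenly_spaced(read_len, subs_per_bad_read))
    erroneous = n_bad * bad_per_read
    clean = n_perfect * per_read + n_bad * (per_read - bad_per_read)
    return clean, erroneous


if __name__ == "__main__":
    # Scenario 1: more perfect reads, more substitutions in the rest (assumed numbers).
    print("scenario 1:", scenario_summary(1000, 0.8, 3))
    # Scenario 2: fewer perfect reads, fewer substitutions in the rest (assumed numbers).
    print("scenario 2:", scenario_summary(1000, 0.6, 1))
```

With these made-up numbers the second scenario ends up with fewer erroneous k-mers overall, despite having fewer perfect reads, because a single substitution spoils at most k k-mers while three spread-out substitutions spoil every k-mer in the read. That depends entirely on the fractions, read length and k chosen here, so it does not answer how a real assembler would behave.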
This question really puzzles me, and I'd be glad to hear your comments or experience. I couldn't find any paper that answers it directly, but maybe you know one where the answer is hidden?
Thanks