I've been trying to figure this one out for a little while. Most genomes assembled exclusively from Illumina short paired-end (PE) reads (such as the giant panda) include an error-correction step (e.g. Quake, Reptile, SOAPec, or ECHO) before assembly, because Illumina reads can contain base substitution errors. Error correction reduces the total number of unique k-mers, which helps assembly.
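To make the k-mer point concrete, here is a minimal Python sketch of how a single substitution inflates the set of distinct k-mers. The toy sequence, the read set, and k = 5 are all made up for illustration, and the helper names are hypothetical; real assemblers also collapse reverse complements, which this ignores.

```python
def kmers(seq, k):
    """Return the set of all k-mers (length-k substrings) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_distinct_kmers(reads, k):
    """Count distinct k-mers across a collection of reads."""
    distinct = set()
    for read in reads:
        distinct.update(kmers(read, k))
    return len(distinct)


# Toy data (assumption): two reads covering the same 16 bp region.
true_read = "ACGTACCGGTTAGCAT"
# Same read with one substitution (G -> A at 0-based position 8).
error_read = "ACGTACCGATTAGCAT"

K = 5  # illustrative k-mer size, far smaller than assemblers typically use

clean_count = count_distinct_kmers([true_read, true_read], K)
noisy_count = count_distinct_kmers([true_read, error_read], K)

print(clean_count)  # 12 distinct k-mers without the error
print(noisy_count)  # 17: the single substitution adds 5 spurious k-mers
```

In this toy case one wrong base adds k new k-mers, each of which shows up only once in the data; as I understand it, those low-frequency k-mers are what the correction tools try to remove.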
But if I'm already trimming reads before assembly so that only bases with Phred quality scores > 20 (in other words, > 99% base-call accuracy) are kept, is error-correction software really necessary? Thanks for any thoughts.
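For reference, the Phred arithmetic behind that threshold can be sketched as follows. This assumes the standard definition Q = -10 * log10(P_error), independent errors between bases, and every base sitting exactly at the threshold (a worst case for trimmed reads); k = 31 is only an illustrative k-mer size and the function names are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a per-base error probability."""
    return 10 ** (-q / 10)


def prob_kmer_error_free(q, k):
    """Probability that k consecutive bases, all at quality q, are all correct.

    Assumes errors at different positions are independent.
    """
    return (1 - phred_to_error_prob(q)) ** k


q_threshold = 20  # trimming threshold from the question
k = 31            # illustrative k-mer size (assumption)

print(phred_to_error_prob(q_threshold))      # per-base error probability at Q20
print(prob_kmer_error_free(q_threshold, k))  # chance a 31-mer at Q20 has no error
```

Under those assumptions, a 31-mer made entirely of Q20 bases is error-free only about 73% of the time, which is part of why I'm unsure whether trimming alone is enough.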