Hello all,
I am getting some strange results from the SOAPdenovo scaffolding process. I discovered that if I use the output of the prepare (http://soap.genomics.org.cn/down/prepare.tgz) tool, I get different, arguably better, results.
After error correction (http://soap.genomics.org.cn/down/correction.tar.gz), I ran pregraph and contig with a k-mer size of 49. This generates “original.contig”. Following the normal steps, map and scaff, results in “original.scafSeq”. However, after remapping the reads with BWA, I noticed what seemed to be strong evidence that many of the contigs should have been scaffolded together.
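For anyone who wants to check the same kind of evidence, here is a rough sketch of how read pairs linking two different contigs could be counted from BWA's SAM output. It assumes paired-end reads mapped against the contigs; the function name `count_contig_links` and the mapping-quality cutoff of 20 are only illustrative choices, not part of SOAPdenovo or BWA.

```python
from collections import Counter


def count_contig_links(sam_lines, min_mapq=20):
    """Count read pairs whose two mates map to different contigs.

    Hypothetical helper: each pair is counted once, from the first-in-pair
    record, and only that record's mapping quality is checked.
    """
    links = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname, mapq, rnext = fields[2], int(fields[4]), fields[6]
        if not flag & 0x1:
            continue  # read is not paired
        if flag & (0x4 | 0x8 | 0x100 | 0x800):
            continue  # unmapped read or mate, secondary or supplementary
        if not flag & 0x40:
            continue  # only look at the first read of each pair
        if rnext in ("=", "*") or rnext == rname:
            continue  # mate is on the same contig, or unknown
        if mapq < min_mapq:
            continue  # assumed cutoff to drop ambiguous placements
        links[tuple(sorted((rname, rnext)))] += 1
    return links


if __name__ == "__main__":
    import sys

    # Usage: python count_links.py remapped.sam
    with open(sys.argv[1]) as handle:
        for (a, b), n in count_contig_links(handle).most_common(20):
            print(a, b, n)
```

Contig pairs supported by many such read pairs but left unjoined in the scaffold file are the ones worth a closer look.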
So I copied “original.contig” to a new directory and ran “prepare -g rescaf -K 49 -c original.contig”. Running map and scaff on rescaf.contig results in the file rescaf.scafSeq. A summary of the assemblies is given in the table at the end of this post.
Does anyone know the explanation for this difference? I assume there must be some information generated by pregraph and contig that prevents the scaffolding.
Code:
File              Contigs  Total_length  Longest  N50
original.contig   2459270  144845786     5254     54
rescaf.contig     2459270  144845786     5254     54
original.scafSeq  15289    4964403       5254     532
rescaf.scafSeq    9875     5416552       36403    2579
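For reference, summary columns like these can be recomputed from any FASTA-style file with a short script. The following is a minimal sketch, not the tool used to produce the table above: the helper names are made up for illustration, and it assumes that every base (including any N gap characters in scaffolds) counts toward the length, which may differ from how other tools report these numbers.

```python
def fasta_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)  # assumption: N gap characters are counted
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


def summarize(lines):
    """Compute the four summary columns for one assembly file."""
    lengths = fasta_lengths(lines)
    return {
        "count": len(lengths),
        "total_length": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py original.scafSeq rescaf.scafSeq
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, summarize(handle))
```

Running the same script over both scafSeq files keeps the comparison consistent, since each file is measured with the same definition of N50.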
Regards, Keith