I recently paired-end sequenced a ~8 Mbp bacterial genome on an Illumina HiSeq. FastQC quality boxplots are attached for both R1 (forward) and R2 (reverse).
I used A5 to assemble the genome because of its automated data cleaning, error correction, contig assembly, scaffolding and quality control. A5 accepts the raw reads as input, which I find a big help, and its quality control supports the Nextera prep protocol.
My question is: what is an acceptable N50 value? I know that we should aim for high values, which would indicate a good assembly. But what is the range of acceptable N50 values, and what is considered high? Is there a minimum that we should aim for?
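To be explicit about what I mean by N50, here is a minimal sketch of the usual definition as I understand it: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are only illustrative; this is not A5's code and the numbers are not my data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy example with made-up lengths (total = 280, half = 140).
toy_lengths = [100, 80, 50, 30, 20]
print(n50(toy_lengths))  # prints 80, since 100 + 80 = 180 >= 140
```

With real output, the lengths would come from the scaffold FASTA file rather than a hard-coded list.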
My A5 summary file says:
After scaffolding raw1:
Total number of scaffolds = 5782
Sum (bp) = 7350401
Max scaffold size = 54663
Min scaffold size = 254
Average scaffold size = 1271
N50 = 1820
Any help and feedback is greatly appreciated! Thanks!