I have de novo assembled pre-processed Illumina paired-end reads (interleaved into a single file) from a plant genome. The read data is 108 Gbp in file size, and the estimated genome size is around 2 Gbp. I used Minia for the assembly with a k-mer size of 47 and a minimum abundance of 3 (estimated with KmerGenie). Minia outputs contigs. Before I do scaffolding, I would like your suggestions. I evaluated the assembly with QUAST, and I also mapped the paired-end reads back to the assembled contigs and summarized the mapping with Qualimap. I understand that the coverage is low. Is it possible to publish with this genome assembly? Are there any recommendations for improving the assembly with the available data?
QUAST results:
All statistics are based on contigs of size >= 500 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).
Assembly output_prefix.contigs
# contigs (>= 0 bp) 9285711
# contigs (>= 1000 bp) 88260
Total length (>= 0 bp) 1590477304
Total length (>= 1000 bp) 146312325
# contigs 316519
Largest contig 12582
Total length 300873518
GC (%) 34.45
N50 977
N75 677
L50 92434
L75 186047
# N's per 100 kbp 0.00
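For reference, here is a minimal sketch of how I understand N50 and L50 to be defined, plus the share of the full assembly that sits in contigs of at least 500 bp and 1000 bp. The function name `nx_lx` and the toy contig lengths are my own illustration and not QUAST code; the totals are copied from the report above.

```python
def nx_lx(lengths, fraction=0.5):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running total, taken
    longest first, reaches `fraction` of the summed length. Lx is the
    number of contigs needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("no contig lengths given")


# Toy contig lengths (hypothetical, not taken from my assembly).
toy_lengths = [100, 80, 60, 40, 20]
toy_n50, toy_l50 = nx_lx(toy_lengths)
toy_n75, toy_l75 = nx_lx(toy_lengths, fraction=0.75)

# Totals copied from the QUAST report above.
total_all = 1_590_477_304   # Total length (>= 0 bp)
total_500 = 300_873_518     # Total length (contigs >= 500 bp)
total_1000 = 146_312_325    # Total length (>= 1000 bp)

frac_500 = total_500 / total_all
frac_1000 = total_1000 / total_all

print(f"toy N50={toy_n50}, L50={toy_l50}, N75={toy_n75}, L75={toy_l75}")
print(f"share of assembly in contigs >= 500 bp:  {frac_500:.1%}")
print(f"share of assembly in contigs >= 1000 bp: {frac_1000:.1%}")
```

By this arithmetic, less than a fifth of the assembled bases are in contigs of 500 bp or longer, which is why the N50 above (computed on that subset only) still looks so small.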
Qualimap mapping results:
>>>>>>> Reference
number of bases = 1,590,477,304 bp
number of contigs = 9285711
>>>>>>> Globals
number of windows = 9286108
number of reads = 102,927,571
number of mapped reads = 102,654,076 (99.73%)
>>>>>>> Mapping quality
mean mapping quality = 37.18
>>>>>>> ACTG content
number of A's = 2,874,099,790 bp (31.76%)
number of C's = 1,783,726,295 bp (19.71%)
number of T's = 2,718,363,373 bp (30.04%)
number of G's = 1,672,166,807 bp (18.48%)
number of N's = 0 bp (0%)
>>>>>>> Coverage
mean coverageData = 5.69X
std coverageData = 57.95X
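As a sanity check on the coverage figure, here is a small sketch that recomputes it from the base counts above. It assumes that Qualimap's mean coverage is simply the total aligned bases divided by the reference length, and that my 2 Gbp genome size estimate is roughly right; both are my assumptions, not something stated in the Qualimap output.

```python
# Base counts copied from the Qualimap "ACTG content" section above.
base_counts = {
    "A": 2_874_099_790,
    "C": 1_783_726_295,
    "T": 2_718_363_373,
    "G": 1_672_166_807,
}

assembly_length = 1_590_477_304        # Qualimap "number of bases"
estimated_genome_size = 2_000_000_000  # my rough estimate (assumption)

# Total number of aligned bases reported by Qualimap.
mapped_bases = sum(base_counts.values())

# Coverage relative to the assembled contigs (what Qualimap reports).
coverage_on_assembly = mapped_bases / assembly_length

# Coverage relative to the estimated genome size (assumption above).
coverage_on_genome = mapped_bases / estimated_genome_size

print(f"aligned bases:           {mapped_bases:,}")
print(f"coverage on assembly:    {coverage_on_assembly:.2f}X")
print(f"coverage on 2 Gbp genome: {coverage_on_genome:.2f}X")
```

Under these assumptions the aligned bases reproduce the reported mean coverage on the contigs, and the depth relative to the estimated genome size comes out lower still.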