
  • Weird stats output of SOAP de Novo

    Dear all,

    I have run SOAP de Novo to assemble a nematode genome.
    It wrote a stats file, .scafStatistics, which I do not understand. I am especially confused about the N50 values. Why are there 2 values?

    Here is the output:

    <-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->

    Size_includeN 76698637
    Size_withoutN 65796073
    Scaffold_Num 15259
    Mean_Size 5026
    Median_Size 160
    Longest_Seq 1079942
    Shortest_Seq 100
    Singleton_Num 11712
    Average_length_of_break(N)_in_scaffold 714

    Known_genome_size NaN
    Total_scaffold_length_as_percentage_of_known_genome_size NaN

    scaffolds>100 15047 98.61%
    scaffolds>500 4059 26.60%
    scaffolds>1K 3004 19.69%
    scaffolds>10K 688 4.51%
    scaffolds>100K 216 1.42%
    scaffolds>1M 1 0.01%

    Nucleotide_A 18790176 24.50%
    Nucleotide_C 14159122 18.46%
    Nucleotide_G 14204857 18.52%
    Nucleotide_T 18641918 24.31%
    GapContent_N 10902564 14.21%
    Non_ACGTN 0 0.00%
    GC_Content 43.11% (G+C)/(A+C+G+T)

    N10 488263 12
    N20 319382 32
    N30 235181 60
    N40 182496 96
    N50 138908 144
    N60 102282 210
    N70 73846 298
    N80 43901 429
    N90 5795 899

    NG50 NaN NaN
    N50_scaffold-NG50_scaffold_length_difference NaN

    <-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->

    Size_includeN 66764916
    Size_withoutN 66764916
    Contig_Num 69780
    Mean_Size 956
    Median_Size 458
    Longest_Seq 33978
    Shortest_Seq 100

    Contig>100 69392 99.44%
    Contig>500 33098 47.43%
    Contig>1K 20004 28.67%
    Contig>10K 138 0.20%
    Contig>100K 0 0.00%
    Contig>1M 0 0.00%

    Nucleotide_A 19146203 28.68%
    Nucleotide_C 14420728 21.60%
    Nucleotide_G 14387230 21.55%
    Nucleotide_T 18810755 28.17%
    GapContent_N 0 0.00%
    Non_ACGTN 0 0.00%
    GC_Content 43.15% (G+C)/(A+C+G+T)

    N10 6141 779
    N20 4338 2094
    N30 3326 3858
    N40 2586 6144
    N50 2011 9076
    N60 1536 12880
    N70 1122 17959
    N80 755 25179
    N90 410 37034

    NG50 NaN NaN
    N50_contig-NG50_contig_length_difference NaN

    Number_of_contigs_in_scaffolds 58068
    Number_of_contigs_not_in_scaffolds(Singleton) 11712
    Average_number_of_contigs_per_scaffold 16.4

    I have looked all over for the answer but didn't manage to find it.

    All the best,
    Sophie

  • #2
    The first one is the scaffold N50, the second is the contig N50. Look for the header lines
    <-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->
    and
    <-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->
    respectively, to see which type of sequence each block of statistics refers to.
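
    In case it helps to see how an N50 line comes about, here is a minimal Python sketch of the usual definition: sort the sequence lengths from longest to shortest and add them up until the running total reaches 50% of the assembly size. The length at which that happens is the N50. The sketch also returns how many sequences were needed to get there, which I assume is what the second number on each Nxx line (e.g. the 144 in "N50 138908 144") stands for; I have not checked that against the SOAPdenovo source. The function name `nx_stats` and the toy lengths are made up for illustration.

```python
def nx_stats(lengths, fraction=0.5):
    """Return (Nx length, number of sequences needed to reach `fraction`).

    lengths  -- iterable of sequence lengths (scaffolds or contigs)
    fraction -- 0.5 for N50, 0.9 for N90, and so on
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("no sequence lengths given")

    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            # `length` is the Nx value; `count` sequences cover the fraction
            return length, count


# Toy example: total length 300, so the N50 threshold is 150.
toy_lengths = [100, 80, 60, 40, 20]
print(nx_stats(toy_lengths))        # N50 and count
print(nx_stats(toy_lengths, 0.9))   # N90 and count
```

    Running the same function on the scaffold lengths and then on the contig lengths gives two different N50 values, which is why the stats file reports one in each block.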
