  • Weird stats output of SOAP de Novo

    Dear all,

    I have run SOAP de Novo to assemble a nematode genome.
    SOAP de Novo wrote a stats file, .scafStatistics, which I do not understand. I am especially confused about the N50 values. Why are there 2 values?

    Here is the output:

    <-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->

    Size_includeN 76698637
    Size_withoutN 65796073
    Scaffold_Num 15259
    Mean_Size 5026
    Median_Size 160
    Longest_Seq 1079942
    Shortest_Seq 100
    Singleton_Num 11712
    Average_length_of_break(N)_in_scaffold 714

    Known_genome_size NaN
    Total_scaffold_length_as_percentage_of_known_genome_size NaN

    scaffolds>100 15047 98.61%
    scaffolds>500 4059 26.60%
    scaffolds>1K 3004 19.69%
    scaffolds>10K 688 4.51%
    scaffolds>100K 216 1.42%
    scaffolds>1M 1 0.01%

    Nucleotide_A 18790176 24.50%
    Nucleotide_C 14159122 18.46%
    Nucleotide_G 14204857 18.52%
    Nucleotide_T 18641918 24.31%
    GapContent_N 10902564 14.21%
    Non_ACGTN 0 0.00%
    GC_Content 43.11% (G+C)/(A+C+G+T)

    N10 488263 12
    N20 319382 32
    N30 235181 60
    N40 182496 96
    N50 138908 144
    N60 102282 210
    N70 73846 298
    N80 43901 429
    N90 5795 899

    NG50 NaN NaN
    N50_scaffold-NG50_scaffold_length_difference NaN

    <-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->

    Size_includeN 66764916
    Size_withoutN 66764916
    Contig_Num 69780
    Mean_Size 956
    Median_Size 458
    Longest_Seq 33978
    Shortest_Seq 100

    Contig>100 69392 99.44%
    Contig>500 33098 47.43%
    Contig>1K 20004 28.67%
    Contig>10K 138 0.20%
    Contig>100K 0 0.00%
    Contig>1M 0 0.00%

    Nucleotide_A 19146203 28.68%
    Nucleotide_C 14420728 21.60%
    Nucleotide_G 14387230 21.55%
    Nucleotide_T 18810755 28.17%
    GapContent_N 0 0.00%
    Non_ACGTN 0 0.00%
    GC_Content 43.15% (G+C)/(A+C+G+T)

    N10 6141 779
    N20 4338 2094
    N30 3326 3858
    N40 2586 6144
    N50 2011 9076
    N60 1536 12880
    N70 1122 17959
    N80 755 25179
    N90 410 37034

    NG50 NaN NaN
    N50_contig-NG50_contig_length_difference NaN

    Number_of_contigs_in_scaffolds 58068
    Number_of_contigs_not_in_scaffolds(Singleton) 11712
    Average_number_of_contigs_per_scaffold 16.4

    I have looked all over for the answer but didn't manage to find it.

    All the best,
    Sophie

  • #2
    The first one is the scaffold N50, the second is the contig N50. Look for
    <-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->
    and
    <-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->
    to see which type of sequence each block of statistics refers to.
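
To make the distinction concrete, here is a minimal Python sketch, not SOAP de Novo's own code. It does two things: it computes an Nx value from a list of sequence lengths, and it pulls the N50 line out of each section of a .scafStatistics-style text by watching for the two header lines quoted above. The function names `nx` and `n50_by_section` are made up for this illustration. Reading the second column as the number of sequences needed to reach the Nx length is an assumption, although it agrees with the size bins in the post (210 scaffolds at N60 = 102282 versus 216 scaffolds > 100K). Whether the tool counts N bases in the total is not something the thread confirms.

```python
import re

# Matches the two section headers shown in the stats file.
HEADER = re.compile(r"Information for assembly (Scaffold|Contig)")


def nx(lengths, fraction=0.5):
    """Return (Nx length, number of sequences needed to reach it).

    Sequences are sorted longest first and summed until the running
    total reaches `fraction` of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    target = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be non-empty")


def n50_by_section(text):
    """Map 'Scaffold' / 'Contig' to the two numbers on that section's N50 line."""
    section = None
    result = {}
    for line in text.splitlines():
        match = HEADER.search(line)
        if match:
            section = match.group(1)
            continue
        parts = line.split()
        if section and len(parts) == 3 and parts[0] == "N50":
            result[section] = (int(parts[1]), int(parts[2]))
    return result


# Toy lengths, not taken from the assembly in this thread.
toy = [100, 80, 60, 40, 20]
print(nx(toy))       # (80, 2)
print(nx(toy, 0.9))  # (40, 4)

# Lines copied from the stats file posted above.
stats = """
<-- Information for assembly Scaffold 'SoapOutput-SB372.scafSeq'.(cut_off_length < 100bp) -->
N50 138908 144
NG50 NaN NaN
<-- Information for assembly Contig 'SoapOutput-SB372.contig'.(cut_off_length < 100bp) -->
N50 2011 9076
"""
print(n50_by_section(stats))
```

Under that reading, the scaffold N50 of 138908 is reached with 144 scaffolds, and the contig N50 of 2011 with 9076 contigs.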
