
  • arundurvasula
    replied
    All statistics are based on contigs of size >= 100 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

    Assembly contig
    # contigs (>= 0 bp) 295
    # contigs (>= 1000 bp) 37
    Total length (>= 0 bp) 199556
    Total length (>= 1000 bp) 86293
    # contigs 295
    Largest contig 15124
    Total length 199556
    GC (%) 46.27
    N50 759
    N75 426
    L50 53
    L75 141
    # N's per 100 kbp 0.00
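
    For anyone unsure how the N50/N75 and L50/L75 rows in a report like this are derived, here is a minimal sketch. The function name `nx_lx` and the contig lengths are made up for illustration, and the `min_length` filter only mimics the ">= 100 bp" note at the top of the report; it is not QUAST's own code.

```python
def nx_lx(contig_lengths, fraction=0.5, min_length=0):
    """Return (Nx, Lx) for a list of contig lengths.

    Nx is the length of the contig at which the running total, taken from
    longest to shortest, first reaches `fraction` of the total length.
    Lx is the number of contigs needed to get there.
    """
    lengths = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not lengths:
        raise ValueError("no contigs left after filtering")
    threshold = fraction * sum(lengths)
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= threshold:
            return length, count


# Hypothetical contig lengths, not taken from the assembly above.
example = [800, 600, 400, 300, 200, 100]
n50, l50 = nx_lx(example, fraction=0.5)
n75, l75 = nx_lx(example, fraction=0.75)
print(n50, l50, n75, l75)
```

    A larger N50 together with a smaller L50 means more of the assembly sits in fewer, longer contigs.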


  • jpummil
    replied
    Could you post the QUAST quality stats for the assembly?


  • arundurvasula
    replied
    I was able to assemble my data using IDBA-UD. I set it to cycle through k-mer sizes shorter than my read length, and it produced a contig of about 15,000 bp:

    idba_ud -r ../data/trimmed-reads/LV89-02.fa -o ../results/contigs/008 --mink 19 --maxk 49 --step 2
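
    As a quick sanity check on those settings, the sketch below lists the k values such a run would step through and confirms that each is shorter than the reads. It assumes the range includes the --maxk value and uses the 50 bp read length reported in this thread; `kmer_schedule` is a made-up helper, not part of IDBA-UD.

```python
def kmer_schedule(mink, maxk, step):
    """List the k values from mink up to and including maxk, in steps of `step`."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(mink, maxk + 1, step))


READ_LENGTH = 50  # single-end read length reported in this thread

ks = kmer_schedule(19, 49, 2)
# A k-mer cannot be longer than the read it is cut from, and k equal to the
# read length yields a single k-mer per read with no overlap information.
too_long = [k for k in ks if k >= READ_LENGTH]

print(ks)
print("k values not shorter than the reads:", too_long)
```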


  • SNPsaurus
    replied
    Do you think the viral genome will be divergent within a sample from replication errors? That could cause issues for assembly if there are lots of related kmers at a location instead of just one or two alleles and a low level of sequencing error.
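
    To make the k-mer concern concrete, here is a toy sketch: one substitution in a sequence creates up to k k-mers that are absent from the original, so every extra variant in the viral population adds another branch to the assembly graph at that position. Both sequences and the choice of k = 7 are invented for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


K = 7
# Made-up 30 bp sequence standing in for one viral haplotype.
reference = "ACGTTGCAAGCTTAGGCTAACGGATCCGTA"
# Second haplotype differing by a single G->T substitution at index 15.
variant = reference[:15] + "T" + reference[16:]

# k-mers that only reads from the variant haplotype would contribute.
novel = kmers(variant, K) - kmers(reference, K)
print(len(novel), "k-mers are unique to the variant")
```

    With only two haplotypes this is a simple bubble that most assemblers can resolve; with many low-frequency variants at the same site, the extra k-mers start to look like sequencing errors or tangle the graph.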


  • mastal
    replied
    Are you trying to assemble genomic data or transcriptomic data?

    What is the expected genome size of the virus genome you are trying to assemble?

    What kmer length have you used?

    As Brian already mentioned, I would experiment with the k-mer length
    when using Velvet to see which value gives you the best N50.

    Have you done any QC, adapter trimming or quality trimming on your reads?


  • arundurvasula
    replied
    Thanks for the replies.

    I used VelvetOptimiser to determine optimal k-mer length. Our data contains a mixture of grape and virus reads, but we removed the reads that aligned to the grape reference genome. Our read length is 50 bp and we have 7,764,190 reads after filtering out the grape reads.
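
    For scale, the arithmetic behind those numbers can be sketched as follows. The read count and read length are the ones given above; the 10 kb target size is purely a placeholder, since the size of the viral genome(s) has not been stated, and the result is only an upper bound because not every remaining read is viral.

```python
READS = 7_764_190   # reads left after removing grape-aligned reads
READ_LENGTH = 50    # bp, single end

total_bases = READS * READ_LENGTH


def mean_depth(total_bases, target_size_bp):
    """Average depth if every base came from a target of the given size."""
    return total_bases / target_size_bp


# Placeholder genome size; replace with the expected size of the virus.
HYPOTHETICAL_GENOME_BP = 10_000

print(f"total bases: {total_bases:,}")
print(f"upper-bound mean depth: {mean_depth(total_bases, HYPOTHETICAL_GENOME_BP):,.0f}x")
```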

    Here is the quast output from the optimal velvet run:

    All statistics are based on contigs of size >= 100 bp, unless otherwise noted (e.g., "# contigs (>= 0 bp)" and "Total length (>= 0 bp)" include all contigs).

    Assembly contigs
    # contigs (>= 0 bp) 3547
    # contigs (>= 1000 bp) 1
    Total length (>= 0 bp) 326445
    Total length (>= 1000 bp) 1073
    # contigs 941
    Largest contig 1073
    Total length 156559
    GC (%) 46.84
    N50 162
    N75 122
    L50 305
    L75 584
    # N's per 100 kbp 0.00


  • luc
    replied
    ... and the amount of contaminating sequences?

    Originally posted by Brian Bushnell
    Have you tried varying the kmer length when assembling? Also, it would be helpful to know more about your data, like the read length and total amount, and quality metrics.

    I encourage you to read this thread:
    http://seqanswers.com/forums/showthread.php?t=42555


  • Brian Bushnell
    replied
    Have you tried varying the kmer length when assembling? Also, it would be helpful to know more about your data, like the read length and total amount, and quality metrics.

    I encourage you to read this thread:
    http://seqanswers.com/forums/showthread.php?t=42555


  • arundurvasula
    started a topic Increasing contig lengths

    Increasing contig lengths

    I'm working on a project to identify viruses infecting grapevines by sequencing.

    I have single-end Illumina reads (50 bp) and have been trying to assemble them using a combination of Velvet and PRICE. With Velvet I get a maximum contig length of around 1500 bp and an N50 of 46. After putting this output through PRICE, the N50 increases to 195, but I am having trouble extending my contigs any further. Do you have any advice on contig extension with single-end reads?
