
  • dan
    replied
    Sorry, I'm still confused about these parameters.

    Here is some information I got by asking the same question to Dr. Michael Stiens, Manager Customer Support Genome Sequencing, Roche Diagnostics GmbH...

    Seed step: the number of bases after which the next seed begins on the same read. Each seed is 16 bp in length (default) and the seed step is 12 bp, so consecutive seeds on a read overlap by 4 bp.
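    To make the arithmetic in that reply concrete, here is a minimal sketch of where the seeds would fall on one read. It assumes seeds start at base 0 and that a seed must fit entirely inside the read; the function names and the 48 bp read are made up for illustration and are not taken from the gsAssembler manual.

```python
def seed_starts(read_length, seed_length=16, seed_step=12):
    """Return 0-based start positions of seeds that fit entirely in the read."""
    last_start = read_length - seed_length
    return list(range(0, last_start + 1, seed_step))


def seed_overlap(seed_length=16, seed_step=12):
    """Bases shared by two consecutive seeds (0 if the step exceeds the length)."""
    return max(0, seed_length - seed_step)


def seeds(read, seed_length=16, seed_step=12):
    """Return (start, k-mer) pairs, one seed every seed_step bases."""
    return [(s, read[s:s + seed_length])
            for s in seed_starts(len(read), seed_length, seed_step)]


read = "ACGT" * 12                # hypothetical 48 bp read
print(seed_starts(len(read)))     # [0, 12, 24]
print(seed_overlap())             # 4
```

    With the defaults quoted above, each seed starts 12 bases after the previous one and shares its first 4 bases with the tail of that previous seed.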

    One question remains: does the seed step parameter define an upper or a lower limit? I found dePhi's answer very interesting (I had never thought about how mapping quality varies with seed length), but I don't see how it relates to the parameters as they are used. That answer talks about a "distribution of seed step", so is the parameter the upper limit of that distribution?

    Cheers,



  • drgoettel
    replied
    Very useful.
    Thank you!



  • dePhi
    replied
    I'm not an expert on assembly, but I'll try to help.

    When doing an overlap analysis you want some measure of how good your overlap is. Is it nice and uniform, or are some parts represented by only 2 or 3 seeds while others are covered by 100? But that is only coverage, which is too much of a simplification of assembly quality: 30x coverage with 500-mers is not the same as 30x coverage with 8-mers. That is where these two parameters come in.

    Seed step is the distance between the start of one overlapping segment and the start of the next. Say you find sequence #1 (a 12-mer, for example) starting at base number 1 and sequence #2 (also a 12-mer) starting at base number 6; then your seed step would be 5. The distribution of seed step gives you an idea of how uniformly that part of your assembly is represented by actual reads. Ideally you would have a new read start at each new base for the best alignment quality.

    Seed length is the k-mer length you are using. If your assembly consisted of uniform reads, all of the same length, your seed length wouldn't vary across your assembly. But reporting the seed length gives you an idea of the quality of the reads used in that part of your assembly. For instance, if one part of your assembly is made up of seeds that are much shorter than those in another part that is just as well covered, you can say that the quality of your assembly is better at the site with the larger seed length. That's because quality is usually better in longer reads, or else they would have been trimmed.

    But the power of these parameters is, I think, in their combination. Having large seed steps is okay as long as your k-mer length is also large. If your k-mers are small you want small seed steps; otherwise the total alignment quality is lower.
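    One way to picture that combination is to count how many bases of a read are not touched by any seed for a given step and length. This is only a sketch under the assumption that seeds are laid down at fixed intervals from base 0 and must fit inside the read; `uncovered_bases` is a hypothetical helper, not something the assembler reports.

```python
def uncovered_bases(read_length, seed_length, seed_step):
    """Count bases of a read that fall inside no seed at all."""
    covered = [False] * read_length
    # Seeds start at 0, seed_step, 2*seed_step, ... and must fit in the read.
    for start in range(0, read_length - seed_length + 1, seed_step):
        for pos in range(start, start + seed_length):
            covered[pos] = True
    return covered.count(False)


# Hypothetical 100 bp read, three parameter combinations.
print(uncovered_bases(100, seed_length=16, seed_step=12))  # 0
print(uncovered_bases(100, seed_length=8, seed_step=12))   # 36
print(uncovered_bases(100, seed_length=8, seed_step=4))    # 0
```

    In this toy setting a step of 12 leaves no gaps with 16-mers but leaves 36 of 100 bases unsampled with 8-mers, and shrinking the step to 4 closes the gaps again.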

    I hope my rambling was useful.
    Cheers



  • drgoettel
    started a topic GS FLX data analysis software manual

    GS FLX data analysis software manual

    Hello,

    Could anybody explain in a little more detail the following overlap detection parameters available in the GS de novo assembler application gsAssembler?

    Seed step – The number of bases between seed generation locations used in the exact k-mer matching part of the overlap detection
    Seed length – The number of bases used for each seed in the exact k-mer matching part of the overlap detection (i.e. the “k” value of the k-mer matching)
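    A possible way to read the two definitions together, sketched below: if seeds of length k are generated every s bases on one read and are looked up as exact k-mers in the other read, then any error-free shared stretch of at least k + s - 1 bases must contain a complete seed. That search scheme is an assumption for illustration (the manual text above does not spell it out), read ends are ignored, and the function names are made up.

```python
def min_guaranteed_match(seed_length, seed_step):
    """Shortest exact match that always contains a whole seed: k + s - 1."""
    return seed_length + seed_step - 1


def window_contains_seed(window_start, window_length, seed_length, seed_step):
    """True if a seed starting at a multiple of seed_step lies fully in the window."""
    window_end = window_start + window_length        # exclusive end
    # First seed start at or after window_start (ceiling division).
    first = -(-window_start // seed_step) * seed_step
    return first + seed_length <= window_end


k, s = 16, 12                      # defaults quoted elsewhere in this thread
limit = min_guaranteed_match(k, s)
print(limit)                       # 27
# Every 27-base window holds a seed, wherever the window starts...
print(all(window_contains_seed(p, limit, k, s) for p in range(200)))
# ...but a 26-base window can fall between seed starts and miss.
print(all(window_contains_seed(p, limit - 1, k, s) for p in range(200)))
```

    Under these assumptions, a larger seed step means fewer seeds to look up but a longer error-free stretch before an overlap is certain to be seeded.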

    Thank you very much!
