
  • Pindel: minimum fold coverage needed

    Hello, I am using Pindel to find deletions. First, I ran it on real data (75 bp reads, 10-fold coverage) from a genome in which 85% of the sequence is repetitive, and I found some deletions. Then, to evaluate accuracy, I generated simulated data based on 200 kb of real data. I simulated 75 bp reads at 10-, 20-, and 50-fold coverage and found 1, 13 (of which 1 is not on the right chromosome), and 47 (of which 5 are not on the right chromosome) deletions, respectively. My question is: what do you think is the minimum coverage needed when using Pindel to find deletions?
    I have seen published papers using 20X and 40X coverage with 35 bp reads. Is it OK to use 10X data with 75 bp reads for a genome in which 85% of the sequence is repetitive?
    I look forward to your reply!
    Thanks!

    jane
    Last edited by jane_orderly; 12-31-2013, 12:08 AM.

  • #2
    reply from the author of pindel

    Hi Jane,

    Please post your questions to SEQanswers or Biostar in the future so that the discussion can be seen by others.

    For Pindel, the more coverage, the better the sensitivity. Pindel will find variants at coverage as low as 4x, as in 1000G, but then we need to pool samples together. It all depends on the percentage of variants you want to discover: 10x for a single sample will give you a reasonable result, and 20x or above will be much better.

    As for repetitive sequences, the reads have to be longer than the repetitive sequence to allow sensitive discovery. So again, the longer the better.

    Let me know if you have any questions. I cannot give precise numbers, as we normally work on high-coverage data in cancer samples for somatic variants, or on low-coverage pooled samples.

    Kai
    Last edited by jane_orderly; 12-31-2013, 12:39 AM.
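
For readers who want to put rough numbers on the coverage levels discussed in this thread, here is a minimal sketch. It is not from Pindel or from either poster. It assumes the usual idealized relation coverage = reads × read length / region length, and it assumes per-base depth is roughly Poisson distributed, which ignores repeats, mapping bias, and the fact that Pindel relies on split reads across breakpoints. The function names and the depth threshold of 5 are hypothetical, chosen only for illustration.

```python
import math


def reads_needed(region_bp, read_len_bp, fold):
    """Reads required so that reads * read_len / region is at least `fold`."""
    return math.ceil(region_bp * fold / read_len_bp)


def prob_depth_at_least(mean_depth, k):
    """P(depth >= k) if per-base depth ~ Poisson(mean_depth) (idealized)."""
    below = sum(
        math.exp(-mean_depth) * mean_depth**i / math.factorial(i)
        for i in range(k)
    )
    return 1.0 - below


if __name__ == "__main__":
    # 200 kb region and 75 bp reads, as in the simulation described above.
    for fold in (10, 20, 50):
        n = reads_needed(200_000, 75, fold)
        # Threshold of 5 reads is an arbitrary illustrative cutoff.
        p = prob_depth_at_least(fold, 5)
        print(f"{fold}x: {n} reads, P(depth >= 5) = {p:.4f}")
```

This only describes raw read depth under a simple model. The actual number of deletions recovered at each coverage, especially in a highly repetitive genome, still has to be measured empirically, as was done with the simulations in the first post.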
