  • Heisman
    replied
    Originally posted by rama
    Thanks a bunch for the pointer.
    Once we identify the "high quality set", is there a way to compute metrics at different coverage thresholds? I am not sure how to do it: do I have to randomly subset sequence reads and check the variant calls, or just compare with the consensus?
    I was thinking you could just separate calls by coverage, i.e., make a set of calls at >100x coverage, a set at 90-100x, a set at 80-90x, etc., and compare them. Or use quality score instead of coverage if you like that metric better. Your idea is interesting though: you could take a set of high-quality calls, randomly take smaller and smaller sets of reads for the same positions, redo the calling, and see how low the coverage can get before your "subset calls" deviate too much from the legitimate set. The problem is that if your high-quality calls are in "easy" sites, this strategy won't necessarily apply to the rest of the genome.
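
To make the subsampling idea concrete, here is a minimal Python sketch of how the comparison could be scored. It assumes you have already re-called variants on down-sampled reads and have the full-depth calls at the same sites to act as the trusted set. The tuple layout, the function name `concordance_by_depth`, and the toy sites are all hypothetical, not output from any real caller.

```python
from collections import defaultdict


def concordance_by_depth(calls, truth, bin_width=10, max_depth=100):
    """Group calls into depth bins and report the fraction matching `truth`.

    calls: iterable of (site, depth, genotype) tuples (hypothetical layout)
    truth: dict mapping site -> trusted genotype from the full-depth data
    """
    matched = defaultdict(int)
    total = defaultdict(int)
    for site, depth, genotype in calls:
        if site not in truth:
            continue  # no trusted genotype to compare against
        # Everything at or above max_depth shares one open-ended top bin.
        lower = min(depth, max_depth) // bin_width * bin_width
        total[lower] += 1
        if genotype == truth[site]:
            matched[lower] += 1
    return {lower: matched[lower] / total[lower] for lower in sorted(total)}


# Toy data, invented purely for illustration.
truth = {"chr1:100": "A/G", "chr1:200": "C/C", "chr1:300": "C/T", "chr1:400": "G/G"}
calls = [
    ("chr1:100", 120, "A/G"),
    ("chr1:200", 95, "C/C"),
    ("chr1:300", 92, "T/T"),  # disagrees with the trusted genotype
    ("chr1:400", 85, "G/G"),
]
print(concordance_by_depth(calls, truth))
```

On the toy data this prints `{80: 1.0, 90: 0.5, 100: 1.0}`, where each key is the lower edge of a depth bin. The caveat about "easy" sites still applies: the fractions only describe the sites that made it into the trusted set.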


  • Joann
    replied
    Global Alliance White Paper on Clinical Data

    There is a consortium on clinical data as described in the White Paper linked here:



    Page 30 lists the names of the organizers and their institutions, from whom you may be able to obtain follow-up information on "standards" questions about clinical data at this time.

    Please post any standards statements you obtain from them here in this forum and/or in the Wiki, so that others are kept informed and consensus parameters can be disseminated more rapidly.


  • rama
    replied
    Thanks a bunch for the pointer.
    Once we identify the "high quality set", is there a way to compute metrics at different coverage thresholds? I am not sure how to do it: do I have to randomly subset sequence reads and check the variant calls, or just compare with the consensus?


  • Heisman
    replied
    Originally posted by rama
    This is an extension to the original question on this post. I was wondering if anybody knows how I can calculate sequencing accuracy at various depths of coverage. Based on this, I want to choose the coverage with more confidence. Thanks in advance to all.
    A couple ideas: http://genome.sph.umich.edu/wiki/SNP...Set_Properties

    Also, for any metric, you can tentatively assume your higher coverage/higher quality score calls will be more "correct" than the lower coverage/lower quality score calls. Thus, for any metric, compare different coverage thresholds to your highest quality sets. One caveat: mapping artifacts or other problems can produce very high coverage, so make sure your "high quality set" looks real.
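
One metric that can be compared across coverage thresholds without a truth set is the transition/transversion (Ti/Tv) ratio of the SNP calls. The thread does not name a specific metric, so treating Ti/Tv as the metric is an assumption here, as are the function name `titv_by_depth`, the `(ref, alt, depth)` layout, and the toy SNPs. A minimal Python sketch:

```python
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_by_depth(snps, thresholds):
    """Return {threshold: Ti/Tv ratio} over SNPs with depth >= threshold.

    snps: list of (ref, alt, depth) tuples (hypothetical layout)
    """
    result = {}
    for threshold in thresholds:
        ti = tv = 0
        for ref, alt, depth in snps:
            if depth < threshold:
                continue
            if (ref, alt) in TRANSITIONS:
                ti += 1
            else:
                tv += 1
        # The ratio is undefined without transversions; report None then.
        result[threshold] = ti / tv if tv else None
    return result


# Toy SNPs, invented purely for illustration.
snps = [
    ("A", "G", 120), ("C", "T", 110), ("A", "C", 105),
    ("G", "A", 40), ("A", "T", 35), ("C", "G", 30),
]
print(titv_by_depth(snps, [100, 30]))
```

The ratio at the strictest threshold plays the role of the "highest quality set"; the point at which looser thresholds drift away from it suggests where call quality starts to drop.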


  • rama
    replied
    finding the depth of coverage with more confidence

    This is an extension to the original question on this post. I was wondering if anybody knows how I can calculate sequencing accuracy at various depths of coverage. Based on this, I want to choose the coverage with more confidence. Thanks in advance to all.


  • giorgifm
    replied
    Thank you for your answer, Bukowski. So far we are aiming at around 40x coverage. That seems to be the minimum coverage needed to stabilize the significance of the somatic mutations found.


  • Bukowski
    replied
    Originally posted by giorgifm
    Dear all,

    I was wondering if there is a standard "coverage" for exome SNP calling in tumor vs. healthy samples (same patient). As we know, tumor samples have an intrinsically higher mutability (Parsons et al., 1993). I was thinking of applying a threshold of at least 20X for the healthy one, and 50X for the tumor one. Do these look sufficient to you?

    Also, there appears to be no standard definition of coverage: by "50X" I mean exome-wide coverage of 100bp uniquely mapping Illumina paired reads, after duplicate removal.

    Thanks!

    Federico
    No, there is no standard. It depends on how many calls you want to make accurately. Something like SomaticSniper will happily call things in low-coverage areas, but you will have little confidence in the genotypes, even with 40x coverage for an exome sample.

    I'm doing some development work on cancer panels, and we've been advised (this is not exome sequencing, but targeted resequencing) to aim for 500x to 1000x coverage. I was a little iffy about these figures until I started actually doing the analysis on exomes myself just to test things out.

    This is prohibitively expensive for exomes, I imagine, so in terms of depth I think 'as much as you can afford'. Remember you will also want to be confident about the genotype calls in your normal samples.
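
A rough back-of-the-envelope check on why low-frequency tumor variants push depth requirements up: if variant reads are modelled as a binomial draw, the chance of seeing enough supporting reads can be computed directly. This is a simplified Python sketch that assumes independent reads and ignores sequencing error, mapping bias, and the caller's own model. The function name `prob_at_least`, the 5% allele fraction, and the minimum of 5 supporting reads are illustrative choices, not recommendations from this thread.

```python
from math import comb


def prob_at_least(depth, allele_fraction, min_alt_reads):
    """P(at least `min_alt_reads` variant reads) under Binomial(depth, allele_fraction)."""
    p_fewer = sum(
        comb(depth, k) * allele_fraction**k * (1 - allele_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_fewer


# Compare the depths mentioned in the thread for a hypothetical
# subclonal variant present in 5% of reads.
for depth in (20, 40, 50, 500, 1000):
    p = prob_at_least(depth, allele_fraction=0.05, min_alt_reads=5)
    print(f"{depth:>5}x: P(>=5 variant reads) = {p:.3f}")
```

Changing `allele_fraction` to 0.5 gives the corresponding picture for a germline heterozygous site in the normal sample, which is why the two samples can tolerate different depths.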


  • Coverage "standards" for SNP detection in tumor samples

    Dear all,

    I was wondering if there is a standard "coverage" for exome SNP calling in tumor vs. healthy samples (same patient). As we know, tumor samples have an intrinsically higher mutability (Parsons et al., 1993). I was thinking of applying a threshold of at least 20X for the healthy one, and 50X for the tumor one. Do these look sufficient to you?

    Also, there appears to be no standard definition of coverage: by "50X" I mean exome-wide coverage of 100bp uniquely mapping Illumina paired reads, after duplicate removal.

    Thanks!

    Federico
