Seqanswers Leaderboard Ad

**TiborNagy** · 01-24-2014, 04:33 AM

No, because if you call variants in a single read, you cannot distinguish read errors from real variants.
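To put rough numbers on that point, here is a minimal sketch of how likely it is that sequencing errors alone produce the same wrong base at one site. The 1% per-base error rate, the assumption that a third of errors land on any one specific alternate base, the independence of errors between reads, and the read counts are all illustrative assumptions, not figures from this thread.

```python
from math import comb

def prob_at_least_k_errors(n_reads, k, per_base_error=0.01, alt_fraction=1 / 3):
    """P(at least k of n_reads show the same specific wrong base at one site).

    Assumes errors are independent between reads (hypothetical; PCR errors
    copied into many reads would violate this).
    """
    p = per_base_error * alt_fraction  # chance one read shows that specific wrong base
    return sum(
        comb(n_reads, i) * p**i * (1 - p) ** (n_reads - i)
        for i in range(k, n_reads + 1)
    )

single = prob_at_least_k_errors(1, 1)        # one read, one mismatch
stacked = prob_at_least_k_errors(200, 100)   # hypothetical: 100 of 200 reads agree

print(f"1 of 1 read:      {single:.4f}")
print(f"100 of 200 reads: {stacked:.3e}")
```

Under these assumptions a mismatch in one read is unremarkable, while the same mismatch in half of a deep stack is essentially never a random read error. The independence assumption is the weak spot: an early PCR error propagates to many reads.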

**SNPsaurus** · 01-24-2014, 08:46 AM

I did see a talk at PAG XXII where the person called variants from a pileup and then went back to the individual reads to fit the variants into haplotypes enforced by the reads. Of course, I can't recall the talk, or if it was even a new thing! But that would give you the results you want.

**obk** · 01-24-2014, 10:10 AM

Dear TiborNagy
I understand that I wouldn't want to make a call based on a single read. But in the simple example, you'd call each of the two mutations 100 times, which I think would give me some confidence that they are not read errors. What I'm wondering is this: if you have enough confidence in the base-calling technology, or enough coverage per unique read (as in the example), what is the difference between:
1) pileup reads to get consensus read -> variant call -> filter: is it real? -> real SNVs
2) variant call individual reads -> 'pileup' variant calls -> filter: is it real? -> real SNVs
(this question may be specific to amplicon sequencing...)
Thanks.
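The two workflows in the post above can be compared on a toy stack of reads. This is only a sketch: the reference, the reads, and the `min_count` threshold are made up, all reads are assumed to be already aligned to the same start with no indels, and the filter is a plain count cutoff rather than a real variant caller.

```python
from collections import Counter

REF = "ACGTACGTAC"

# Hypothetical stack: two single-mutation haplotypes plus unmutated reads
reads = ["ACGAACGTAC"] * 100 + ["ACGTACCTAC"] * 100 + ["ACGTACGTAC"] * 50

def call_from_pileup(reads, ref, min_count):
    """Workflow 1: count bases per column, then keep non-reference bases."""
    calls = {}
    for pos, ref_base in enumerate(ref):
        column = Counter(read[pos] for read in reads)
        for base, n in column.items():
            if base != ref_base and n >= min_count:
                calls[(pos, base)] = n
    return calls

def variants_in_read(read, ref):
    """All mismatches of one read against the reference, as (position, base)."""
    return tuple((pos, b) for pos, (b, r) in enumerate(zip(read, ref)) if b != r)

def call_per_read(reads, ref, min_count):
    """Workflow 2: call each read, then tally the calls and filter."""
    per_read = [variants_in_read(read, ref) for read in reads]
    site_counts = Counter(v for variants in per_read for v in variants)
    sites = {v: n for v, n in site_counts.items() if n >= min_count}
    haplotypes = Counter(per_read)  # which variants occur together on a read
    return sites, haplotypes

pileup_calls = call_from_pileup(reads, REF, min_count=10)
read_calls, haplotypes = call_per_read(reads, REF, min_count=10)

print(pileup_calls == read_calls)
print(haplotypes)
```

With a simple count filter, the site-level calls come out identical either way. The practical difference in this sketch is that the per-read route also keeps the linkage between variants on the same read, which the pileup discards.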

**SNPsaurus** · 01-24-2014, 11:34 AM

You can do that. When we genotype populations, we get reads along the lines of what you describe (mixed haplotypes). So one way we look at it is to align reads, track the variants of each read, then filter. The one difference is that our reads are all in sync (a stack of 100 reads at position 100,000, then a stack of 100 reads at position 200,000, etc.). You would have some reads that end in between variants, which would take a little more work to interpret.
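A rough sketch of the "align, track the variants of each read, then filter" idea, including reads that stop between variant sites. Everything here is hypothetical: the reference, the `(start, sequence)` read tuples, the `MIN_READS` cutoff, and the `?` / `.` labels are illustrative choices, and indels are ignored.

```python
from collections import Counter

REF = "ACGTACGTACGT"
MIN_READS = 20  # hypothetical support threshold

# Hypothetical aligned reads as (start, sequence) pairs
reads = (
    [(0, "ACGAACGTACTT")] * 40    # full length, carries both variants
    + [(0, "ACGAACG")] * 10       # ends before the second variant site
    + [(5, "CGTACTT")] * 10       # starts after the first variant site
    + [(0, "ACGTACGTACGT")] * 30  # full length, no variants
)

def read_variants(start, seq, ref):
    """Mismatches of one read, in reference coordinates."""
    return [(start + i, b) for i, b in enumerate(seq) if b != ref[start + i]]

# Track variants per read, then filter sites by read support
site_counts = Counter(v for start, seq in reads for v in read_variants(start, seq, REF))
kept = sorted(v for v, n in site_counts.items() if n >= MIN_READS)

def haplotype(start, seq, sites):
    """Label a read at each kept site: alt base, '.' if not alt, '?' if uncovered."""
    label = []
    for pos, alt in sites:
        if not (start <= pos < start + len(seq)):
            label.append("?")
        elif seq[pos - start] == alt:
            label.append(alt)
        else:
            label.append(".")
    return "".join(label)

hap_counts = Counter(haplotype(start, seq, kept) for start, seq in reads)
print(kept)
print(hap_counts)
```

The `?` entries are the extra interpretation work: a read labelled `A?` is compatible with more than one full haplotype, so it can support a site call but not a phasing decision on its own.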

**obk** · 01-24-2014, 01:27 PM

Thanks for your comments SNPsaurus.
Do you have strategies to do any quantitative analysis based on the stack of reads? If it's amplicon sequencing that you're doing, then I imagine it is difficult to account for PCR duplicates.

**SNPsaurus** · 01-24-2014, 02:13 PM

PCR duplicates are an issue, since we can't use different start and stop locations as a way to distinguish independent events. We were mostly looking for the presence of haplotypes in the populations, so the precise level wasn't a concern. I was impressed by the "call from pileup then use the reads for phasing" approach I saw at the meeting because it did allow the use of common pipelines up until the last step, and I think using common tools is increasingly important.
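The start/stop point can be illustrated with a small sketch of coordinate-based duplicate grouping. The coordinates, read length, and read counts are invented, and `count_coordinate_groups` is a hypothetical helper that only mimics the general idea of grouping reads by alignment coordinates, not any particular tool.

```python
import random

def count_coordinate_groups(reads):
    """Number of distinct (start, end) pairs.

    Coordinate-based deduplication would keep roughly one read per group.
    """
    return len(set(reads))

random.seed(0)
READ_LEN = 100

# Hypothetical shotgun library: fragments start at random positions
shotgun = [(s, s + READ_LEN) for s in (random.randrange(0, 10_000) for _ in range(200))]

# Hypothetical amplicon library: every read shares the primer-defined coordinates
amplicon = [(1_000, 1_000 + READ_LEN)] * 200

print("shotgun groups: ", count_coordinate_groups(shotgun))
print("amplicon groups:", count_coordinate_groups(amplicon))
```

In the amplicon case every read falls into a single group, so coordinates cannot separate independent molecules from PCR copies, and read counts are hard to interpret quantitatively.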

Why not variant call before pileup?