**ECO** · 09-29-2008, 11:57 AM

As far as I understand it...your script calculates single colorspace errors, right? Rather than "miscalls" in true basespace?

**new300** · 09-30-2008, 06:23 AM

I guess it's application dependent. What do you intend to do with SOLiD reads without a reference? I would have thought that with short color space reads there's little you can do but SNP calling against a reference, but I could be wrong.

If you're aligning to a reference (any reference), I would have thought it would make sense to calculate the error rate against it.

**snetmcom** · 10-02-2008, 02:29 PM

You may find this useful, but it's falling into many of the pitfalls of non-SOLiD informatics people.

**pmiguel** · 10-04-2008, 08:33 AM

Originally posted by ECO:

> As far as I understand it...your script calculates single colorspace errors, right? Rather than "miscalls" in true basespace?

Yes. Except it estimates the number of errors per read based on the quality values assigned by the SOLiD base (color) caller.

For example, if a read is 35 bases and each base has a quality value of 10, then that is a 10% chance of error per base. So the estimated number of miscalls would be 3.5 (= 0.1 * 35). But if each base had a quality value of 20, that is a 1% chance of error per base, and the estimated number of miscalls for that read would be 0.35 (= 0.01 * 35).

Of course normal reads will have different quality values for each base. To estimate the number of miscalls, the script just adds up the estimated chance of a miscall for each base.
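The calculation described above can be sketched roughly as follows. This is not the actual script, just a minimal illustration that assumes the quality values are on the usual Phred scale (Q = -10 * log10(p)), which is what the Q10 = 10% and Q20 = 1% examples imply. The function names are made up for the example.

```python
def phred_to_error_prob(q):
    """Convert a Phred-scaled quality value to an error probability.

    Assumes Q = -10 * log10(p), so Q10 -> 0.1 and Q20 -> 0.01.
    """
    return 10 ** (-q / 10)


def expected_miscalls(quals):
    """Estimate the miscalls in one read by summing the per-call error probabilities."""
    return sum(phred_to_error_prob(q) for q in quals)


# The two uniform-quality reads from the example above
print(expected_miscalls([10] * 35))  # about 3.5
print(expected_miscalls([20] * 35))  # about 0.35

# A hypothetical read with mixed quality values
print(expected_miscalls([25, 25, 20, 15, 10, 8]))
```

Note that for SOLiD data each quality value belongs to a color call, so the sum estimates colorspace errors, not basespace miscalls.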

The major pitfall here is that I have no idea whether the SOLiD base caller accurately predicts its own error rate. I gather that the SOLiD base caller is tuned on mappable reads (those with 3 or fewer errors). It should be possible to check how it does on reads mapped with up to 6 errors against a reference sequence without a lot of redundant/low-complexity segments. But I have not done this.

--
Phillip

**ECO** · 10-13-2008, 10:41 PM

Originally posted by snetmcom:

> You may find this useful, but it's falling into many of the pitfalls of non-SOLiD informatics people.

I'd love to hear more on that line of thinking....

Metrics for usability of a SOLiD dataset
