Originally posted by pmiguel
Hard to estimate the total number of miscalls in a read from a mean quality value. But Q30 is one error per 1000 bases. So, as long as you don't have crazy high quality values offsetting really low values, I would not expect several errors in a 200-400bp read.
Also, to the extent that the quality values are accurate, software could use them to weight the likelihood of a given base being a true variation or not. Or, trivially, you could mask out bases that had quality values lower than 30.
Let's not ignore the elephant here: Illumina is producing hundreds of gigabases of sequence per flow cell, whereas a 454 run produces hundreds of megabases. Illumina chemistry has a higher per-run cost than 454, but we are still looking at something approaching a 100x price-per-base differential.
But the same logic applies to Sanger sequencing, which is at least 100x more expensive per base.
--
Phillip
It's true that 454 is less cost effective than Illumina. Most applications can use the shorter read lengths obtained from Illumina, SOLiD, etc., and for those applications it makes a lot more sense to use those technologies. One thing to keep in mind, however, when comparing the amount of data produced: 454 doesn't produce as much data, but in many cases you don't need as much data with 454, either. Simply comparing numbers doesn't tell the whole story. RNA-seq provides an excellent example of where one technology might be better than the other, depending on your experiment. If you're trying to quantify gene expression, Illumina is definitely the way to go. In that case, you're just trying to identify transcripts and count them, and the high number of reads is a boon to your experiment. However, if you're looking for splice variants and don't care so much about quantifying expression, 454 is probably the better technology. There will always be a niche for 454, although it's never going to be large.
-
Originally posted by rskr
Well, except 454 would still fail to find linkage over 400bp consistently, since the median read is much shorter, and with sufficient coverage paired-end data is likely to find linkage up to 800bp, which is the maximum length of the PCR product.
If all goes well, a 454 run will have median read lengths >400 bases.
--
Phillip
-
Originally posted by rskr
And you can do that with the 454 error model? Last I checked, a mean quality of 30 would guarantee several errors in a 200-400bp read? Might be better off with a 250-base insert size and 150bp paired-end reads, with an overlapper that finds the intersection.
Also, to the extent that the quality values are accurate, software could use them to weight the likelihood of a given base being a true variation or not. Or, trivially, you could mask out bases that had quality values lower than 30.
Let's not ignore the elephant here: Illumina is producing hundreds of gigabases of sequence per flow cell, whereas a 454 run produces hundreds of megabases. Illumina chemistry has a higher per-run cost than 454, but we are still looking at something approaching a 100x price-per-base differential.
But the same logic applies to Sanger sequencing, which is at least 100x more expensive per base.
--
Phillip
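To put numbers on the quality-value discussion in this exchange, here is a minimal sketch in Python. It assumes the standard Phred definition (error probability p = 10^(-Q/10)) and uses two made-up 400-base quality profiles that share a mean of Q30. The function names and the example reads are hypothetical and do not come from any 454 or Illumina software.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def expected_errors(quals):
    """Expected number of miscalls in a read: sum of per-base error probabilities."""
    return sum(phred_to_error_prob(q) for q in quals)


def mask_low_quality(seq, quals, min_q=30, mask_char="N"):
    """Replace every base whose quality is below min_q with mask_char."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have the same length")
    return "".join(b if q >= min_q else mask_char for b, q in zip(seq, quals))


# Two hypothetical 400-base reads, both with a mean quality of exactly 30.
uniform = [30] * 400               # every base is Q30
mixed = [40] * 200 + [20] * 200    # high values offsetting low values

print(expected_errors(uniform))    # about 0.4 expected errors
print(expected_errors(mixed))      # about 2.02 expected errors
print(mask_low_quality("ACGT", [40, 29, 30, 10]))
```

Under these assumptions the uniform Q30 read is expected to carry well under one error, while the mixed read with the same mean is expected to carry about two, which is the "high values offsetting low values" case. The masking helper is the trivial filter mentioned in the post, nothing more.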
-
Originally posted by rskr
So, what is the difference between that and looking at a pileup of paired-end Illumina reads? They will get the linkage just as well.
-
Originally posted by ajthomas
Does the fact that I use the 454 offend you or something? I explained why I use it and why other technologies aren't appropriate for my work, and you seem to think I'm an idiot for using it. I'm a little confused at your derision.
By the way, I don't look at individual reads, I look at consensus reads (usually 10-100X coverage per variant).
-
Originally posted by rskr
Certainly you have some will to analyze erroneous data. If you cared, you would have seen that 454 is only 99.99% accurate with deep coverage, but you are talking about analyzing individual reads looking for variants, which suggests a different type of "work for you" than I would find acceptable. But hey, you are probably an MD analyzing a major histocompatibility complex, so you can get away with saying anything you want, because you hate statistics.
-
Originally posted by ajthomas
I'm not trying to argue with you, but I've looked at the options and the 454 is the only one that works for my application. I can't get 400bp reads any other way, and one of my amplicons is nearly that long.
-
It works just fine. Perhaps the accuracy is better than you think. I don't see nearly as many errors as you imply here. And no, I can't work with shorter reads that must be overlapped. I had to do that before switching from the older standard chemistry to the Titanium chemistry, and I ended up with a number of allele misidentifications because of it. Some of the alleles are just too similar and can't be reliably identified without a full-length sequence.
Say you have four different alleles: A and B differ from C and D by one base near the 5' end. A and C differ from B and D by one base near the 3' end. Any given sample may have any combination of the four (there are ~10 loci in the genome, so ~20 alleles present in a heterozygote). You can't differentiate these four alleles without having both ends of the amplicon on the same read.
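The four-allele argument can be checked with a small sketch. The Python below is a toy model, not anyone's genotyping pipeline: the base letters assigned to each allele are invented, each sample is assumed to carry exactly two of the four alleles, and "end-only" means the 5' and 3' sites are observed on separate reads with no information tying them to the same molecule.

```python
from itertools import combinations

# Hypothetical alleles as (base at the 5' site, base at the 3' site).
# A and B differ from C and D at the 5' site; A and C differ from B and D at the 3' site.
ALLELES = {
    "A": ("G", "T"),
    "B": ("G", "C"),
    "C": ("A", "T"),
    "D": ("A", "C"),
}


def end_only_view(genotype):
    """Bases seen at each site when the two ends are observed without linkage."""
    five_prime = frozenset(ALLELES[a][0] for a in genotype)
    three_prime = frozenset(ALLELES[a][1] for a in genotype)
    return five_prime, three_prime


def full_length_view(genotype):
    """5'/3' base combinations seen when both sites sit on the same read."""
    return frozenset(ALLELES[a] for a in genotype)


def ambiguous_pairs(view):
    """Return pairs of two-allele genotypes that look identical under the given view."""
    genotypes = list(combinations(sorted(ALLELES), 2))
    return [(g1, g2) for g1, g2 in combinations(genotypes, 2) if view(g1) == view(g2)]


print(ambiguous_pairs(end_only_view))     # [(('A', 'D'), ('B', 'C'))]
print(ambiguous_pairs(full_length_view))  # []
```

In this toy model a sample carrying A and D shows the same bases at each end as a sample carrying B and C, so the two genotypes cannot be told apart unless both sites are seen on one read. Whether paired-end reads from a single fragment supply that linkage well enough is the point under dispute in this thread, and the sketch does not settle it.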
-
Originally posted by ajthomas
I'm using it primarily for genotyping highly polymorphic genes (MHC, if you must know), sequencing amplicons 200-400bp long. Because some alleles differ by only one or two bases that may be at one end or the other of the amplicon, reads that are not full length cannot always differentiate some closely related alleles. I must have full-length reads of my amplicons, which I cannot get from any NGS technology except 454.
-
Originally posted by ajthomas
In spite of the short-read technologies getting longer, their read lengths still can't compare to those achieved on the 454. Of course, those longer "short" reads also mean the number of applications where the 454 is required is shrinking. In my own case, I must have at least 400bp reads, and I'm excited about the longer reads of the FLX+ because that opens up the way for some other experiments I couldn't do before. It will be a while (maybe a long while) before the short-read technologies can do what I need.
I find that 100-base paired-end data subsume just about any benefit gained from error-prone reads with a mean length of 400 but a median of 200.