  • Brian Bushnell
    replied
    Originally posted by cement_head
    So I took another look at this and it strikes me that the whole problem is the use of only four fluors for 16 combinations. (Seems odd that this wasn't the primary issue attempted to be solved; i.e., generating 16 distinct fluors.) Once I got that part, it became obvious why there's an issue with colourspace. Curiously, I just found out that MiniSeq and NextSeq from Illumina use only two fluors - seems like a huge potential issue if one isn't resequencing a human genome...
    Yes, if SOLiD had used 16 colors it might have been substantially better, though that would have added its own unique issues (like potentially taking 4x as long to sequence).

    Illumina's 2-color chemistry is not like SOLiD colorspace, though. It's just a binary encoding of bases -> colors; no information is lost (since no two bases share the same pair of color polarities), except that you can no longer distinguish between no signal and one of the bases. It works fairly well in practice (for de novo sequencing) and you don't need to align sequences to determine what they are. The 2-color platforms have weaknesses, but it is not clear that the weaknesses are linked to the number of dyes.
    Last edited by Brian Bushnell; 07-05-2016, 12:22 PM.
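
    To make the "binary encoding" point above concrete, here is a minimal sketch. The specific dye assignment (A = both channels, C = red only, T = green only, G = no signal) is an assumption based on how the 2-channel chemistry is commonly described, and the names TWO_COLOUR, encode and decode are made up for illustration.

```python
# Hypothetical (red, green) signal per base; the exact assignment is an assumption.
TWO_COLOUR = {"A": (1, 1), "C": (1, 0), "T": (0, 1), "G": (0, 0)}
# Every base has a distinct signal pair, so the mapping can be inverted.
DECODE = {signal: base for base, signal in TWO_COLOUR.items()}


def encode(seq):
    """Turn a base sequence into a list of (red, green) on/off pairs."""
    return [TWO_COLOUR[base] for base in seq]


def decode(signals):
    """Recover bases directly from the signal pairs, no reference needed."""
    return "".join(DECODE[signal] for signal in signals)


read = "GATTACA"
assert decode(encode(read)) == read  # lossless round trip

# The one ambiguity: a cycle with no signal at all is indistinguishable from G.
print(decode([(0, 0), (0, 0), (0, 0)]))
```

    Each cycle decodes on its own, which is why this behaves like ordinary base-space and not like SOLiD colorspace, where each colour depends on two adjacent bases.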


  • cement_head
    replied
    Originally posted by Chipper
    The quoted error rate (<0.1%) must be after reference-based correction. The problem with SOLiD was the high raw error rate of the ligation-based chemistry (compared to Illumina) and the short read lengths, which make it essentially useless for de novo assembly.

    I think the best option today for a large genome and a low budget would be to use the 10x Chromium with HiSeq X (~$2000 for one lane PE150 linked reads from long fragments).
    So I took another look at this and it strikes me that the whole problem is the use of only four fluors for 16 combinations. (Seems odd that this wasn't the primary issue attempted to be solved; i.e., generating 16 distinct fluors.) Once I got that part, it became obvious why there's an issue with colourspace. Curiously, I just found out that MiniSeq and NextSeq from Illumina use only two fluors - seems like a huge potential issue if one isn't resequencing a human genome...


  • Chipper
    replied
    Originally posted by cement_head
    I guess I still don't understand the "issues" with deconvoluting colour-space. It seems as though it would be much more accurate than sequencing in basespace (e.g. Illumina). That's if I'm reading this paper correctly (attached).
    The quoted error rate (<0.1%) must be after reference-based correction. The problem with SOLiD was the high raw error rate of the ligation-based chemistry (compared to Illumina) and the short read lengths, which make it essentially useless for de novo assembly.

    I think the best option today for a large genome and a low budget would be to use the 10x Chromium with HiSeq X (~$2000 for one lane PE150 linked reads from long fragments).


  • gringer
    replied
    Originally posted by cement_head
    It seems as though it would be much more accurate than sequencing in basespace (e.g. Illumina). That's if I'm reading this paper correctly.
    If our preferred model of DNA were colour-space, then it might have been more accurate with sufficient technology development. As it is, Illumina has had plenty of opportunity to improve the accuracy of their technology, and benefits from their chemical model being almost a direct representation of the DNA model that we use for sequencing.


  • cement_head
    replied
    Originally posted by gringer
    I suspect I've discussed this with you previously, but I might as well say things I haven't said before:

    Homopolymers look identical in colour-space, which causes havoc for transcriptome assemblies (e.g. distinguishing between poly-T and poly-A sequences). Other simple repeats would also cause issues for genomic assembly (e.g. ACACACACAC and GTGTGTGTGT are identical, despite having both a base shift and a complementation). The assemblies are only likely to be useful in colour-space, because colour-space errors propagate through as very different sequences in base-space. Also, every contig has four possible base-space representations, which among other things makes it quite difficult to use other genome assemblies as scaffolds for a colour-space assembly.
    I guess I still don't understand the "issues" with deconvoluting colour-space. It seems as though it would be much more accurate than sequencing in basespace (e.g. Illumina). That's if I'm reading this paper correctly (attached).

  • gringer
    replied
    Originally posted by westerman
    Going off the topic here (which is that the SOLiD is not good for de novo work) I wonder where you get that statement. It seems to me that 60 quality bases would be enough to place accurately except for long repeat regions (e.g., LTRs).
    I suspect I've discussed this with you previously, but I might as well say things I haven't said before:

    Homopolymers look identical in colour-space, which causes havoc for transcriptome assemblies (e.g. distinguishing between poly-T and poly-A sequences). Other simple repeats would also cause issues for genomic assembly (e.g. ACACACACAC and GTGTGTGTGT are identical, despite having both a base shift and a complementation). The assemblies are only likely to be useful in colour-space, because colour-space errors propagate through as very different sequences in base-space. Also, every contig has four possible base-space representations, which among other things makes it quite difficult to use other genome assemblies as scaffolds for a colour-space assembly.
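
    The ambiguities listed above can be reproduced with a short sketch. It assumes the commonly described SOLiD two-base encoding, where each colour is the XOR of the 2-bit codes of adjacent bases (A=0, C=1, G=2, T=3); the function names are illustrative and not taken from any SOLiD tool.

```python
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def to_colours(seq):
    """Encode each adjacent base pair as one of four colours (0-3)."""
    codes = [BASE_TO_BITS[base] for base in seq]
    return "".join(str(a ^ b) for a, b in zip(codes, codes[1:]))


def from_colours(first_base, colours):
    """Decode a colour string, given a guess for the first base."""
    code = BASE_TO_BITS[first_base]
    bases = [first_base]
    for colour in colours:
        code ^= int(colour)
        bases.append(BITS_TO_BASE[code])
    return "".join(bases)


# Homopolymers: poly-A and poly-T give the same colour string.
print(to_colours("AAAAAA"), to_colours("TTTTTT"))

# Simple repeats: the AC repeat and the GT repeat are indistinguishable.
colours = to_colours("ACACACACAC")
print(colours == to_colours("GTGTGTGTGT"))

# Without a known first base, one colour string has four base-space readings.
for first in "ACGT":
    print(first, from_colours(first, colours))
```

    The last loop is the "four possible base-space representations" problem: the colours alone cannot say which of the four decodings is the real contig.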


  • cement_head
    replied
    Ok, thanks


  • RickC7
    replied
    Reagent support for SOLiD continues until May 2017, or ends sooner depending on demand.

    We use/used SOLiD for SAGE; great for short reads, but more expensive than Illumina runs. Converting everything over to Illumina adapters now...

    The couple of times we did targeted reseq or whole transcriptome, reverse read quality was bad.


  • colindaven
    replied
    @westerman

    It wasn't clear from the start whether the topic was de novo or reference based assembly.

    Have a look at the genome mappability score which came out of Mike Schatz's lab as one example (http://bioinformatics.oxfordjournals...8/16/2097.full).

    Even with 100bp perfect simulated single reads there are regions which cannot be mapped to reliably. Therefore, 60 bp reads containing errors won't be so nice to deal with. I remember working on human twin genomes and getting ~40-50,000 differences in VCF despite various SNP callers and stringent mapping quality filters.
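
    A toy illustration of the point above. This is not the genome mappability score from the linked paper, just a simplified stand-in: it counts the fraction of read-length windows in a sequence that occur exactly once (forward strand only, error-free reads). The function name and the example sequence are made up.

```python
from collections import Counter


def unique_kmer_fraction(genome, k):
    """Fraction of start positions whose k-mer occurs exactly once in genome."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers)


# A made-up sequence with a tandem-duplicated block followed by unique sequence.
toy_genome = "ACGTTGCA" * 2 + "GATTACAGGCTTAACC"
for read_length in (4, 8, 12):
    print(read_length, round(unique_kmer_fraction(toy_genome, read_length), 2))
```

    Reads that fall entirely inside a repeat cannot be placed uniquely whatever the aligner does, and real reads with errors make it worse.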



    By the way, I work on plant genomes, and repetitive regions can be > 80%, so I thought the original poster might have similar issues.


  • westerman
    replied
    Originally posted by colindaven
    A 60bp SE read is too short to place accurately in many/most genomes.
    Going off the topic here (which is that the SOLiD is not good for de novo work) I wonder where you get that statement. It seems to me that 60 quality bases would be enough to place accurately except for long repeat regions (e.g., LTRs).


  • Brian Bushnell
    replied
    My experience with SOLiD 4 was that it had terrible accuracy... on both read 1 and read 2.


  • colindaven
    replied
    There are still quite a few SOLiDs out there, see for example this data just into the SRA:

    http://www.ncbi.nlm.nih.gov/sra/ERX1488475[accn]

    Raw read accuracy is excellent, but keep in mind that paired-end reads do not really work at all (R1 was ~75 bp, 60 bp after trimming, and R2 was just pure rubbish).

    A 60bp SE read is too short to place accurately in many/most genomes. Also, de novo assembly simply does not work, which rules out everything other than resequencing applications (and you need a very good reference genome too).


  • cmbetts
    replied
    They may both use sequencing by ligation, but SOLiD and Complete Genomics are different technologies. As far as I can tell, SOLiD has been discontinued, having been beaten by Illumina and replaced by Ion Torrent long ago.
    Either would still be inappropriate for de novo genome sequencing. Complete has always been exclusively for human genome resequencing, and the colorspace reads of SOLiD were best when a reference was available because sequencing errors introduced frameshifts in the base encoding.
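
    A minimal sketch of why a colour miscall is so damaging without a reference, assuming the commonly described two-base encoding (colour = XOR of the 2-bit codes of adjacent bases, A=0, C=1, G=2, T=3). The helper names and the example read are hypothetical.

```python
BASES = "ACGT"  # the index doubles as the 2-bit code: A=0, C=1, G=2, T=3


def encode(seq):
    """Colour for each adjacent base pair, as a list of ints 0-3."""
    codes = [BASES.index(base) for base in seq]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def decode(first_base, colours):
    """Walk the colour calls from a known first base back to bases."""
    code = BASES.index(first_base)
    out = [first_base]
    for colour in colours:
        code ^= colour
        out.append(BASES[code])
    return "".join(out)


def decode_with_miscall(seq, index, wrong_colour):
    """Encode seq, replace one colour call with a wrong one, decode again."""
    colours = encode(seq)
    colours[index] = wrong_colour
    return decode(seq[0], colours)


true_seq = "ATGGCTAGCT"
bad_seq = decode_with_miscall(true_seq, index=3, wrong_colour=0)
print(true_seq)
print(bad_seq)  # identical up to the miscall, different at every base after it
```

    One wrong colour corrupts every base downstream of it in the naive decoding, so the damage looks more like a frameshift than a single substitution. With a reference, an isolated colour mismatch can be recognised and corrected, which is why these reads were most useful for resequencing.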


  • cement_head
    replied
    Hello,

    It is not obsolete, is it? Complete Genomics (BGI) uses sequencing-by-ligation.

    URL: http://bgi-international.com/service...her-platforms/

    -Andor


  • Chipper
    replied
    No. Besides being obsolete, it gave reads that were far too short.
