  • dan
    wiki wiki
    • Jul 2008
    • 194

    Sequencing technology database?

    Hi all,

    I'm planning to build a database of sequencing technologies, companies, instruments (platforms), and capabilities (i.e. run-types), not necessarily limited to NGS, but starting there initially.

    An example data-point could be:
    * Pyrosequencing
    ** Roche
    *** GS-FLX
    **** PE
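
A minimal sketch of how the four-level example above (technology, company, instrument, capability) might be modeled. The class and function names here are hypothetical and only meant to illustrate the nesting, not a settled data model.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Capability:
    name: str  # a run type, e.g. "PE"


@dataclass
class Instrument:
    name: str
    capabilities: List[Capability] = field(default_factory=list)


@dataclass
class Company:
    name: str
    instruments: List[Instrument] = field(default_factory=list)


@dataclass
class Technology:
    name: str
    companies: List[Company] = field(default_factory=list)


def flatten(tech: Technology) -> Iterator[Tuple[str, str, str, str]]:
    """Yield one (technology, company, instrument, capability) row per run type."""
    for company in tech.companies:
        for instrument in company.instruments:
            for capability in instrument.capabilities:
                yield (tech.name, company.name, instrument.name, capability.name)


# The example data-point from the post.
example = Technology(
    "Pyrosequencing",
    [Company("Roche", [Instrument("GS-FLX", [Capability("PE")])])],
)
rows = list(flatten(example))
```

Flattening to one row per capability is one possible way to get a table that is easy to edit and compare in a wiki.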

    Before we get bogged down adding data to the database, I'd like to discuss what types of data you think should be collected, i.e. what fields to include in the database.

    Some ideas include: machine cost, reads per run, read length (statistics), run time, errors (error types and statistics), ...

    This will be a database in the wiki, so once we've settled on a data-model, everyone can contribute to the data, allowing the community to more easily compare platforms and keep track of trends and updates.

    I'm thinking simple is best, but please let me know what you think should be included in such a database :-)
    Homepage: Dan Bolser
    MetaBase the database of biological databases.
  • marcowanger
    Senior Member
    • Dec 2008
    • 273

    #2
    Hi Dan,

    This reminds me of http://www.molecularecologist.com/ne...eldguide-2012/

    Marco


    • marcowanger
      Senior Member
      • Dec 2008
      • 273

      #3
      Another thought is,

      Machine and reagent costs vary from country to country, and I think they also depend on who you are (big institution or small, etc.). And throughput depends on how much you can squeeze out.

      Put another way, I do think such a database may uncover these hidden marketing strategies and reflect the true costs.
      Last edited by marcowanger; 01-29-2013, 06:21 AM. Reason: remove the quote
      Marco


      • dan
        wiki wiki
        • Jul 2008
        • 194

        #4
        Oooh! Nice! What license is that data under? :-D
        Homepage: Dan Bolser
        MetaBase the database of biological databases.


        • dan
          wiki wiki
          • Jul 2008
          • 194

          #5
          Originally posted by marcowanger:
          Another thought is,

          Machine and reagent costs vary from country to country, and I think they also depend on who you are (big institution or small, etc.). And throughput depends on how much you can squeeze out.

          Put another way, I do think such a database may uncover these hidden marketing strategies and reflect the true costs.
          One way round this would be to allow people to submit estimates. Then we could provide upper and lower bounds, median etc... However, the idea of the wiki is to let that kind of consensus emerge through discussion... Not sure what's best here... I guess we could just have one value, and if people complain bitterly, provide for multiple values to be added?
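
The "submit estimates, then report bounds and a median" idea can be sketched roughly as follows. The function name and the sample numbers are made up for illustration; they are not real prices.

```python
from statistics import median
from typing import Dict, Sequence


def summarize(estimates: Sequence[float]) -> Dict[str, float]:
    """Collapse several submitted estimates into lower bound, median, and upper bound."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return {
        "low": min(estimates),
        "median": median(estimates),
        "high": max(estimates),
        "n": len(estimates),
    }


# Three hypothetical cost-per-run estimates from different contributors.
summary = summarize([1200.0, 950.0, 1500.0])
```

Keeping every submitted value and computing the summary on demand would let the wiki start with a single value per field and grow into multiple values later without changing the stored data.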
          Homepage: Dan Bolser
          MetaBase the database of biological databases.


          • marcowanger
            Senior Member
            • Dec 2008
            • 273

            #6
            I think multiple values may be better. If only a single value is allowed, it might become a manufacturer's reference price list (which, IMHO, is useless and misses the point of a community-oriented database).
            Marco


            • krobison
              Senior Member
              • Nov 2007
              • 734

              #7
              I think a key bit for such a database is to timestamp all the entries on price, performance, etc. That way you could pull out trends.
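
As a rough sketch of what timestamping buys you, assume each entry is stored as an observation with a date attached. The `Observation` type, the field names, and the values below are hypothetical placeholders, not real specifications.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Observation:
    instrument: str
    field_name: str    # e.g. "reads_per_run_millions"
    value: float
    recorded_on: date  # when this value was entered or was valid


def trend(observations: List[Observation], instrument: str,
          field_name: str) -> List[Tuple[date, float]]:
    """Return (date, value) pairs for one instrument and field, oldest first."""
    selected = [o for o in observations
                if o.instrument == instrument and o.field_name == field_name]
    return [(o.recorded_on, o.value)
            for o in sorted(selected, key=lambda o: o.recorded_on)]


# Placeholder data: entries are appended over time, never overwritten.
observations = [
    Observation("Instrument A", "reads_per_run_millions", 1.2, date(2013, 1, 29)),
    Observation("Instrument A", "reads_per_run_millions", 1.0, date(2012, 6, 1)),
    Observation("Instrument B", "reads_per_run_millions", 30.0, date(2012, 6, 1)),
]
series = trend(observations, "Instrument A", "reads_per_run_millions")
```

The point of the design is that an update adds a new dated row instead of replacing the old one, so the history needed for trends is never lost.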

              The mix-and-match nature of the technologies can make life more complicated and may be fun to model. For example, with PacBio there are separate loading and running chemistries, and right now there are two choices for each (sadly, with the same names in both) -- C3 and XL. So you can load C3 and run XL (but I think nobody does; it puts each in its weak spot), load XL and run C3 (longer reads at higher quality), load XL and run XL (longest reads but a drop in quality) or load C3 and run C3 (if you don't have any XL kits).
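
The four load/run combinations described above are simply the Cartesian product of the two chemistry names with themselves. A small sketch, with the label format chosen arbitrarily for illustration:

```python
from itertools import product

# The two chemistry names mentioned in the post; each can be used for loading and for running.
CHEMISTRIES = ("C3", "XL")

# One label per (loading chemistry, running chemistry) pair.
combinations = [
    f"load {load} / run {run}"
    for load, run in product(CHEMISTRIES, repeat=2)
]
```

Each resulting label could be stored as its own run type on the instrument, so adding a third chemistry later would only mean extending `CHEMISTRIES`.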


              • dan
                wiki wiki
                • Jul 2008
                • 194

                #8
                Originally posted by krobison:
                I think a key bit for such a database is to timestamp all the entries on price, performance, etc. That way you could pull out trends.

                The mix-and-match nature of the technologies can make life more complicated and may be fun to model. For example, with PacBio there are separate loading and running chemistries, and right now there are two choices for each (sadly, with the same names in both) -- C3 and XL. So you can load C3 and run XL (but I think nobody does; it puts each in its weak spot), load XL and run C3 (longer reads at higher quality), load XL and run XL (longest reads but a drop in quality) or load C3 and run C3 (if you don't have any XL kits).
                Cry...

                So ... I guess... I'd call each one of these combinations a different 'capability' of the one machine?
                Homepage: Dan Bolser
                MetaBase the database of biological databases.


                • marcowanger
                  Senior Member
                  • Dec 2008
                  • 273

                  #9
                  Originally posted by dan:
                  Cry...

                  So ... I guess... I'd call each one of these combinations a different 'capability' of the one machine?
                  I think that is why we need a database for this ..
                  Marco


                  • dan
                    wiki wiki
                    • Jul 2008
                    • 194

                    #10
                    Any more thoughts on fields to collect? Based on the link from Marco, I currently have this list of fields:
                    * Instrument
                    * Run time
                    * Millions of Reads/run
                    * Bases / read
                    * Yield (MB/run)
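
For what it's worth, the last field in the list above can be derived from two of the others: millions of reads per run times bases per read gives millions of bases per run. A hedged sketch, assuming every read reaches its nominal length (real yields will be lower); the class name and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunSpec:
    instrument: str
    run_time_hours: float
    reads_millions: float   # millions of reads per run
    bases_per_read: float   # nominal read length

    @property
    def yield_mb(self) -> float:
        """Nominal yield in megabases per run (millions of reads x bases per read)."""
        return self.reads_millions * self.bases_per_read


# Placeholder values, not a real instrument specification.
spec = RunSpec("Example instrument", run_time_hours=10.0,
               reads_millions=1.0, bases_per_read=400.0)
nominal_yield = spec.yield_mb
```

Storing the reported yield as well as the derived one could make it possible to spot entries where the three numbers disagree.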

                    What about costs? Cost per run, cost per box? Cost of sample prep? Kits?
                    Homepage: Dan Bolser
                    MetaBase the database of biological databases.


                    • marcowanger
                      Senior Member
                      • Dec 2008
                      • 273

                      #11
                      Originally posted by dan:
                      Any more thoughts on fields to collect? Based on the link from Marco, I currently have this list of fields:
                      * Instrument
                      * Run time
                      * Millions of Reads/run
                      * Bases / read
                      * Yield (MB/run)

                      What about costs? Cost per run, cost per box? Cost of sample prep? Kits?
                      I think all make sense.
                      Marco


                      • dan
                        wiki wiki
                        • Jul 2008
                        • 194

                        #12
                        Thing is, I don't want to implement a LIMS (I don't mind doing it, but I don't have time to do it!)
                        Homepage: Dan Bolser
                        MetaBase the database of biological databases.
