  • Sequencing technology database?

    Hi all,

    I'm planning to build a database of sequencing technologies, companies, instruments (platforms), and capabilities (i.e. run-types), not necessarily limited to NGS, but starting there initially.

    An example data-point could be:
    * Pyrosequencing
    ** Roche
    *** GS-FLX
    **** PE

    Before we get bogged down adding data to the database, I'd like to discuss what types of data you think should be collected, i.e. what fields to include in the database.

    Some ideas include: machine cost, reads per run, read length (statistics), run time, errors (error types and statistics), ...

    This will be a database in the wiki, so once we've settled on a data-model, everyone can contribute to the data, allowing the community to more easily compare platforms and keep track of trends and updates.

    I'm thinking simple is best, but please let me know what you think should be included in such a database :-)
    Homepage: Dan Bolser
    MetaBase the database of biological databases.
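
A minimal sketch of the four-level hierarchy from the example data-point above (technology, company, instrument, capability), assuming plain nested records. The class names and the `paths` helper are hypothetical and only illustrate one possible data-model; they are not a settled schema from this thread.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Capability:
    name: str  # run type, e.g. "PE"


@dataclass
class Instrument:
    name: str  # platform, e.g. "GS-FLX"
    capabilities: List[Capability] = field(default_factory=list)


@dataclass
class Company:
    name: str
    instruments: List[Instrument] = field(default_factory=list)


@dataclass
class Technology:
    name: str
    companies: List[Company] = field(default_factory=list)


def paths(tech: Technology) -> Iterator[Tuple[str, str, str, str]]:
    """Yield one (technology, company, instrument, capability) tuple per leaf."""
    for company in tech.companies:
        for instrument in company.instruments:
            for capability in instrument.capabilities:
                yield (tech.name, company.name, instrument.name, capability.name)


# The example data-point from the post, expressed in this sketch.
example = Technology(
    "Pyrosequencing",
    [Company("Roche", [Instrument("GS-FLX", [Capability("PE")])])],
)
```

Fields such as machine cost or read length would then hang off `Instrument` or `Capability`, depending on which level they describe.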

  • #2
    Hi Dan,

    You make me think of this http://www.molecularecologist.com/ne...eldguide-2012/

    Marco


    • #3
      Another thought is,

      Machine and reagent costs vary from country to country, and I think they also depend on who you are (big institution or small, etc.). And throughput depends on how much you can squeeze out.

      Put another way, I do think a database like this may uncover hidden marketing strategies and reflect the true costs.
      Last edited by marcowanger; 01-29-2013, 06:21 AM. Reason: remove the quote
      Marco


      • #4
        Oooh! Nice! What license is that data under? :-D
        Homepage: Dan Bolser
        MetaBase the database of biological databases.


        • #5
          Originally posted by marcowanger View Post
          Another thought is,

          Machine and reagent costs vary from country to country, and I think they also depend on who you are (big institution or small, etc.). And throughput depends on how much you can squeeze out.

          Put another way, I do think a database like this may uncover hidden marketing strategies and reflect the true costs.
          One way round this would be to allow people to submit estimates. Then we could provide upper and lower bounds, median etc... However, the idea of the wiki is to let that kind of consensus emerge through discussion... Not sure what's best here... I guess we could just have one value, and if people complain bitterly, provide for multiple values to be added?
          Homepage: Dan Bolser
          MetaBase the database of biological databases.
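
A rough sketch of the "submit estimates, then report bounds and median" idea, assuming each field simply keeps a list of submitted numbers. The `summarize` function and the cost figures are made-up placeholders for illustration, not real prices.

```python
from statistics import median
from typing import Dict, List


def summarize(estimates: List[float]) -> Dict[str, float]:
    """Collapse several submitted estimates into lower bound, median and upper bound."""
    if not estimates:
        raise ValueError("need at least one estimate")
    return {
        "n": len(estimates),
        "low": min(estimates),
        "median": median(estimates),
        "high": max(estimates),
    }


# Placeholder submissions for one hypothetical "cost per run" field.
cost_per_run = [900.0, 1200.0, 1500.0, 1100.0]
summary = summarize(cost_per_run)
```

Keeping the raw submissions and deriving the summary on display would let a single-value view and a multiple-value view coexist.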


          • #6
            I think multiple values may be better. If only a single value is allowed, it might become a manufacturer's reference price list (which, IMHO, is useless and misses the point of being community-oriented).
            Marco


            • #7
              I think a key bit for such a database is to timestamp all the entries on price, performance, etc. That way you could pull out trends.

              The mix-and-match nature of the technologies can make life more complicated and could be fun to model. For example, with PacBio there are separate loading and running chemistries, and right now there are two choices for each (sadly, with the same names in each one): C3 and XL. So you can load C3 and run XL (but I think nobody does; it puts each in its weak spot), load XL and run C3 (longer reads at higher quality), load XL and run XL (longest reads but a drop in quality), or load C3 and run C3 (if you don't have any XL kits).
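
One way the timestamping suggestion could look, as a sketch only: every value is stored as a dated observation, and a trend is just the observations for one instrument and field sorted by date. The `Observation` record, the `trend` helper, the instrument names and the numbers are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Observation:
    instrument: str
    field_name: str  # e.g. "cost_per_run"
    value: float
    recorded: date   # when the value was reported


def trend(observations: List[Observation], instrument: str,
          field_name: str) -> List[Tuple[date, float]]:
    """Return (date, value) pairs for one instrument and field, oldest first."""
    rows = [o for o in observations
            if o.instrument == instrument and o.field_name == field_name]
    return [(o.recorded, o.value) for o in sorted(rows, key=lambda o: o.recorded)]


# Placeholder entries, deliberately out of date order.
observations = [
    Observation("Instrument A", "cost_per_run", 1000.0, date(2013, 1, 29)),
    Observation("Instrument A", "cost_per_run", 1200.0, date(2012, 6, 1)),
    Observation("Instrument B", "cost_per_run", 500.0, date(2012, 9, 1)),
]
series = trend(observations, "Instrument A", "cost_per_run")
```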


              • #8
                Cry...

                So ... I guess... I'd call each one of these combinations a different 'capability' of the one machine?
                Homepage: Dan Bolser
                MetaBase the database of biological databases.
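
Treating each load/run combination as its own capability can be sketched by enumerating the pairs. This assumes only the two chemistry names mentioned in the thread (C3 and XL); the label format is an arbitrary choice for illustration.

```python
from itertools import product

# The two chemistry names from the thread, usable for both loading and running.
chemistries = ["C3", "XL"]

# One capability label per (loading, running) pair: 2 x 2 = 4 combinations.
capabilities = [
    f"load {load} / run {run}"
    for load, run in product(chemistries, repeat=2)
]

for label in capabilities:
    print(label)
```

The count grows multiplicatively with each new chemistry, which may be an argument for storing loading and running chemistry as two separate fields on a capability rather than as one combined name.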


                • #9
                  Originally posted by dan View Post
                  Cry...

                  So ... I guess... I'd call each one of these combinations a different 'capability' of the one machine?
                  I think that is why we need a database for this ..
                  Marco


                  • #10
                    Any more thoughts on fields to collect? Based on the link from Marco, I currently have this list of fields:
                    * Instrument
                    * Run time
                    * Millions of Reads/run
                    * Bases / read
                    * Yield (MB/run)

                    What about costs? Cost per run, cost per box? Cost of sample prep? Kits?
                    Homepage: Dan Bolser
                    MetaBase the database of biological databases.
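
A sketch of the field list above as a record, assuming that yield is read as megabases per run and that reads are single-ended, so that yield can be derived as millions of reads times bases per read instead of being entered separately. The `RunSpec` name, the units chosen for run time, and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunSpec:
    instrument: str
    run_time_hours: float         # assumed unit: hours
    million_reads_per_run: float  # "Millions of Reads/run"
    bases_per_read: float         # "Bases / read"

    @property
    def yield_mb_per_run(self) -> float:
        """Derived yield: (millions of reads) x (bases per read) = megabases."""
        return self.million_reads_per_run * self.bases_per_read


# Placeholder values, not real instrument specifications.
spec = RunSpec("Instrument A", run_time_hours=10.0,
               million_reads_per_run=1.0, bases_per_read=400.0)
```

If contributors also enter a yield figure by hand, comparing it against the derived value would be a cheap consistency check on wiki edits.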


                    • #11
                      Originally posted by dan View Post
                      Any more thoughts on fields to collect? Based on the link from Marco, I currently have this list of fields:
                      * Instrument
                      * Run time
                      * Millions of Reads/run
                      * Bases / read
                      * Yield (MB/run)

                      What about costs? Cost per run, cost per box? Cost of sample prep? Kits?
                      I think all make sense.
                      Marco


                      • #12
                        Thing is, I don't want to implement a LIMS (I don't mind doing it, but I don't have time to do it!)
                        Homepage: Dan Bolser
                        MetaBase the database of biological databases.
