  • What do you do with your database?

    Dear All,
    In an earlier thread of mine (Core cluster setup...), westerman suggested I start a new thread asking what people do with their databases. I am indeed curious: what do you do with your database? I suspect that trying to store NGS data in something like a SQL database is a lost cause. So my questions are:

    1) What do you do with your Database?

    2) How do you store your NGS data?

    3) Do you have any troubles with accessing your data on a repeated basis?

    4) What are the biggest bottlenecks you commonly encounter with regard to data management?

    5) Do you have a commercial solution or a home-grown one?

    Thank you for your time; I look forward to your replies.
    Regards
    Quantrix

  • #2
    At the moment we don't use a database. As you say, the files are huge. Storing variants etc. for comparison would matter if you were always working on one large project, but here we have many small and medium projects that are not relevant to compare with each other.
    Also keep in mind that your users might not be trained in database-based analysis, so a good front end will be important.

    • #4
      We had an Oracle DB housing our SRF + fastq files when we were still generating those. It was huge, but it worked well and had a FUSE layer that made it transparently visible to the users. The DB was in two halves: a large set of partitions holding blobs (actually Oracle "secure files", I think) and a far smaller metadata component that tracked where things were. It would have worked OK using a filesystem instead of the binary blobs, though; there are pros and cons to each method.
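      The split between bulk data and a small metadata component can be sketched roughly as below, for the filesystem variant where the catalog only records where each file lives. This is an illustration only, not the Oracle setup described above: SQLite stands in for the database, and the `run_files` table and the `open_catalog`, `register_file` and `locate` names are hypothetical.

```python
import sqlite3


def open_catalog(path=":memory:"):
    """Open (or create) a small metadata catalog. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS run_files (
            run_id     TEXT NOT NULL,
            file_type  TEXT NOT NULL,
            location   TEXT NOT NULL,
            size_bytes INTEGER,
            PRIMARY KEY (run_id, file_type)
        )
        """
    )
    return conn


def register_file(conn, run_id, file_type, location, size_bytes=None):
    """Record where the bulk data for one run lives; the data itself stays on disk."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO run_files VALUES (?, ?, ?, ?)",
            (run_id, file_type, location, size_bytes),
        )


def locate(conn, run_id, file_type):
    """Return the stored location for a run's file, or None if it is unknown."""
    row = conn.execute(
        "SELECT location FROM run_files WHERE run_id = ? AND file_type = ?",
        (run_id, file_type),
    ).fetchone()
    return row[0] if row else None
```

      The catalog stays small because it holds only identifiers and locations, so it can be queried often without touching the large files.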

      We've since switched both format and DB mechanism for raw data: we store BAM files in an iRODS system.

      The analysis BAMs & co (i.e. mapped or assembled data, VCF files, etc.) are less clearly divided: they are stored in various project/group directories across a variety of file system types, including slow and fast NFS storage, Lustre, etc.

      The only real bottleneck arises when someone tries to access a single DB layer (like the FUSE layer) from 1000+ cores on our CPU farm. We require that people first copy the data to something more scalable, for which we use Lustre.
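      The "copy first, then compute" rule could look something like the sketch below: stage a file once onto scalable scratch storage and point the farm jobs at the copy. The `stage_to_scratch` helper and the paths in the usage note are hypothetical, and the size comparison is only a cheap heuristic for "already staged", not an integrity check.

```python
import shutil
from pathlib import Path


def stage_to_scratch(source, scratch_dir):
    """Copy `source` into `scratch_dir` once and return the path of the copy."""
    source = Path(source)
    scratch_dir = Path(scratch_dir)
    scratch_dir.mkdir(parents=True, exist_ok=True)
    target = scratch_dir / source.name

    # Skip the copy if a file of the same size is already staged.
    if target.exists() and target.stat().st_size == source.stat().st_size:
        return target

    # Copy to a temporary name, then rename, so jobs never see a partial file.
    partial = target.with_name(target.name + ".part")
    shutil.copy2(source, partial)
    partial.replace(target)
    return target
```

      A job wrapper would call something like `stage_to_scratch("/db_mount/run001.bam", "/lustre/scratch/myproject")` once before submitting, so that the thousand-plus workers read from the scratch copy rather than from the single DB layer.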

      • #5
        Hi Mapper and jkbonfield,
        That is very helpful indeed!
        Thanks
        Quantrix
