
  • How much data should I generate for metagenomics/metatranscriptomics?

    Hi, my lab is planning to do shotgun metagenomics and metatranscriptomics on Illumina platforms. We want to use commercial sequencing since we don't have access to a HiSeq machine. I'm a beginner in the NGS field and I'm confused about how much data I should get.
    The samples come from a built environment. From metagenomics I want taxonomy and functional gene information, and from metatranscriptomics I want the 'active' taxonomy and functional gene information. I've done 16S rDNA amplicon sequencing on an Illumina MiSeq using some of my DNA samples, and they generally contain ~500 different bacteria at the genus level.
    The sequencing company delivers 'qualified clean data', which seems to be data with the index primers trimmed off and low-quality reads filtered out. The company suggests 3 GB of clean data per sample for shotgun metagenomics and 2-4 GB of clean data per sample for metatranscriptomics RNA-seq. I checked a few metagenomics papers, which vary in the amount of data they acquire (from ~2GB to ~30GB, and they do not say whether that is raw or clean data).
    Given this, could anybody offer suggestions or experience on approximately how much data (if possible the minimum), or 'clean data', might be enough for the above metagenomics and metatranscriptomics analysis, considering this is a built environment that might not be of high complexity?
    Thank you very much!
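A rough way to put the numbers in this question side by side is a back-of-envelope mean coverage estimate. The sketch below is only illustrative and rests on assumptions that are not in the post: that "GB" means gigabases, that each of the ~500 genera contributes one genome, that the average genome is 4 Mb (a hypothetical placeholder), and that all genomes are equally abundant. Real communities are uneven and usually have several species per genus, so treat the result as an optimistic upper bound for the rare members.

```python
def mean_coverage(yield_gb, n_genomes, avg_genome_mb):
    """Mean per-genome coverage if reads were spread evenly over all genomes.

    yield_gb      -- sequencing yield in gigabases (assumed meaning of "GB")
    n_genomes     -- number of distinct genomes in the sample (assumption)
    avg_genome_mb -- average genome size in megabases (hypothetical value)
    """
    total_genome_bases = n_genomes * avg_genome_mb * 1e6
    return yield_gb * 1e9 / total_genome_bases


# 3 Gb is the company's suggestion; 30 Gb is the upper end seen in papers.
for gb in (3, 30):
    cov = mean_coverage(gb, n_genomes=500, avg_genome_mb=4)
    print(f"{gb} Gb over 500 genomes of 4 Mb: ~{cov:.1f}x mean coverage")
```

Under these assumptions the low end gives only about 1.5x average coverage, which is usually fine for read-based taxonomic and functional profiling but thin for assembling anything except the dominant organisms.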

  • #2
    Dear annaatobe,

    Here is some information that might help you. I am facing the same kind of problem: no one says how much data is generated per sample. This information was given to me by a service provider:

    ***300bp 515F bacterial, archaeal or fungal diversity PGM assays with nominal 15-20,000 reads/assay for $60/assay (any size project)
    ***2x300bp PE Illumina 20,000-sequence diversity assays starting at $60/assay for larger projects ($70/assay for medium-sized projects and $80/sample for small ones; for projects < 10 assays per library, a $100 library fee is added), for nominal 20,000 reads/assay (we do accept any size project)
    ***454 pyrosequencing 1x400bp diversity assays any size project starting at $90/assay for 3,000 nominal reads (any sized project)
    ***454 pyrosequencing 1x400bp diversity assays any size project starting at $160/assay for 10,000 nominal reads
    Shotgun services
    ***Communal Genomes or Metagenome: Our popular program provides $400 bacterial genomes and metagenomes, data only. This program gives roughly 10-20 million 2x150bp paired sequences per sample. Note that the turnaround for these varies from a few weeks to a few months, so do not put samples on this program that you are in a hurry to have sequenced ;-)
    ***Rapid bacterial Genomes starting $700 including assembly and annotation >40x coverage, $500 without assembly annotation 2x300bp 1-2 million sequences.


    This article may help you
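Since providers quote output sometimes in reads and sometimes in "GB", it helps to convert between the two before comparing offers. A minimal sketch, assuming "GB" means gigabases (10^9 bases) and that every read is full length; the function names are made up for illustration:

```python
def yield_gigabases(read_pairs, read_len):
    """Gigabases produced by `read_pairs` paired-end reads of `read_len` bp each."""
    return read_pairs * 2 * read_len / 1e9


def read_pairs_needed(gigabases, read_len):
    """Paired-end read pairs needed to reach a target yield in gigabases."""
    return round(gigabases * 1e9 / (2 * read_len))


# The shotgun program quoted above: 10-20 million pairs at 2x150 bp.
low = yield_gigabases(10_000_000, 150)
high = yield_gigabases(20_000_000, 150)
print(f"10-20 million 2x150 bp pairs is about {low:.0f}-{high:.0f} Gb")

# The 3 GB per sample suggested in the first post, expressed as 2x150 bp pairs.
print(f"3 Gb at 2x150 bp is about {read_pairs_needed(3, 150):,} read pairs")
```

So the shotgun program quoted above lands in roughly the same range as the 3 GB per sample mentioned in the first post, before any quality trimming.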



    • #3
      ~2.5Gb/sample in this paper:
      Colonization of the fetal and infant gut microbiome results in dynamic changes in diversity, which can impact disease susceptibility. To examine the relationship between human gut microbiome dynamics throughout infancy and type 1 diabetes (T1D), we examined a cohort of 33 infants genetically predisp …


      This Illumina app note did 5-15Gb/sample:


      This paper did 4Gb/sample:


      Human Microbiome Project:
      A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project


      My intuition is that it varies pretty drastically depending on your environment, and you might end up resequencing if you don't hit saturation the first time around.
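One way to check for saturation after a first run is a rarefaction curve: subsample the reads at increasing depths and count how many distinct taxa show up. The sketch below is a toy version, assuming you already have one taxon label per classified read from whatever classifier you use; the genus names and abundances are invented for illustration.

```python
import random


def rarefaction_curve(assignments, steps=10, seed=0):
    """Return (depth, distinct_taxa) pairs for nested random subsamples.

    The reads are shuffled once and growing prefixes are taken, so each
    subsample contains the previous one and the curve never decreases.
    """
    rng = random.Random(seed)
    shuffled = list(assignments)
    rng.shuffle(shuffled)
    n = len(shuffled)
    seen = set()
    curve = []
    start = 0
    for k in range(1, steps + 1):
        depth = n * k // steps
        seen.update(shuffled[start:depth])
        start = depth
        curve.append((depth, len(seen)))
    return curve


# Toy community: 50 hypothetical genera with skewed abundances (50 reads
# for the most common genus down to 1 read for the rarest).
toy_reads = [f"genus_{i}" for i in range(50) for _ in range(50 - i)]
curve = rarefaction_curve(toy_reads)
for depth, taxa in curve:
    print(f"{depth:5d} reads -> {taxa:3d} genera")
```

If the number of taxa is still climbing at full depth, more sequencing is likely to reveal more; if it has flattened well before the end, the sample is probably close to saturation for that level of resolution.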
