  • RNA-Seq throughput

    Hello everyone! I have a basic question, but I couldn't find anything about it.
    I want to do paired-end RNA-Seq on a HiSeq 2000 with grape mRNA (to look at differential expression), but I don't know how many samples I can put per lane and still get good coverage.

    The whole grape (Vitis vinifera) genome: 485,000,000 bp

    I asked someone and he told me: "if sequencing the 8 samples in 1 lane, I can take 3~3.5 Gb throughput"

    1.- What does throughput mean here?
    2.- Is that enough to see differential expression?
    3.- How can I calculate a good coverage?

    Thank you very much!!

  • #2
    1) ~3-3.5 Gb (gigabases) of sequence output per sample
    2) You are on the low end. If your goal is differential expression, then Gb is not really that informative, because what you are interested in is the number of reads/fragments. For differential expression, a single accurately mapped 50 bp read gives about as much information as a 100 bp paired-end read. Now, if you are also looking for alternative splicing, then the added bases help. I am assuming these are paired-end, 100 bp data? If so, then the number of paired-end reads you will have is ~3 Gb / 200 bp per pair = ~15,000,000 read pairs per sample. Yes, it's possible to detect differential expression at that level. But to give you an idea, I would consider ~10,000,000 to be the lower limit for Arabidopsis. Also, not all reads will map, or map correctly, so you should expect to lose some data. If you can get ~15,000,000 reads for each of 8 samples on a single lane, then I would go with two lanes of data and get ~30,000,000 per sample, which will give you a much better representation. Your ability to detect differential expression accurately depends very much on read counts, and if you take the minimal number of reads, then lowly expressed genes will be a problem.
    3) I don't like calculating coverage for RNA-seq. Each gene is expressed at a different level, so how does one make sense of coverage from such data? For coverage to make sense, you would need to know a priori how many copies of each RNA you have, in which case there would be no need to do the experiment.

    Also, make sure you have biological replicates. It is pointless if you do not have biological replicates.
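
A minimal sketch of the arithmetic in point 2, assuming 100 bp paired-end reads so that each read pair consumes 200 bases. The function names and the 80% mapping rate are illustrative assumptions, not values from this thread; the real mapping rate depends on the library and the aligner.

```python
def read_pairs_per_sample(throughput_gb, read_length_bp=100, paired=True):
    """Convert per-sample throughput (gigabases) into a number of fragments.

    For paired-end data one fragment is one read pair, so it uses
    2 * read_length_bp bases of the throughput.
    """
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return int(throughput_gb * 1e9 // bases_per_fragment)


def usable_pairs(pairs, mapping_rate):
    """Read pairs left after discarding those that fail to map.

    mapping_rate is a fraction between 0 and 1 (an assumed value here).
    """
    return round(pairs * mapping_rate)


# 8 samples on one lane, 3-3.5 Gb per sample as quoted in the question
low = read_pairs_per_sample(3.0)
high = read_pairs_per_sample(3.5)

# Same 8 samples spread over two lanes doubles the depth per sample
two_lanes = 2 * low

# Hypothetical 80% mapping rate, for illustration only
mapped_low = usable_pairs(low, 0.8)

print(f"one lane:  {low:,} to {high:,} read pairs per sample")
print(f"two lanes: {two_lanes:,} read pairs per sample (low estimate)")
print(f"mapped at an assumed 80%: {mapped_low:,} read pairs per sample")
```

With the 3 Gb figure this reproduces the ~15,000,000 read pairs per sample from one lane and ~30,000,000 from two lanes.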
    Last edited by chadn737; 03-08-2013, 07:17 AM.


    • #3
      Thanks for such a quick response!!

      Originally posted by chadn737
      I am assuming these are paired-end, 100 bp data? If so, then the number of paired-end reads you will have is ~3 Gb / 200 bp per pair = ~15,000,000 read pairs per sample. Yes, it's possible to detect differential expression at that level. But to give you an idea, I would consider ~10,000,000 to be the lower limit for Arabidopsis. Also, not all reads will map, or map correctly, so you should expect to lose some data. If you can get ~15,000,000 reads for each of 8 samples on a single lane, then I would go with two lanes of data and get ~30,000,000 per sample, which will give you a much better representation.
      Yes, they are paired-end 100 bp data. There are really 4 samples, each with a biological replicate.
      The scheme is:
      RNA from:
      plant A: 2 grape clusters at 2 different times
      plant B: 2 grape clusters at 2 different times
      each sample sequenced separately, as follows:

      PLANT-TIME-CLUSTER
      A-T1-C1
      A-T1-C2
      A-T2-C1
      A-T2-C2
      B-T1-C1
      B-T1-C2
      B-T2-C1
      B-T2-C2

      So all the "C2" samples are the biological replicates. Thanks again!
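
The sample scheme above can be turned into a small design table by splitting each name on the PLANT-TIME-CLUSTER pattern. This is only a sketch: the dictionary keys and the grouping by (plant, time) are assumptions about how the samples would be compared, not something stated in the thread.

```python
from collections import defaultdict

# Sample names exactly as listed in the post (PLANT-TIME-CLUSTER)
samples = [
    "A-T1-C1", "A-T1-C2", "A-T2-C1", "A-T2-C2",
    "B-T1-C1", "B-T1-C2", "B-T2-C1", "B-T2-C2",
]


def parse_sample(name):
    """Split a PLANT-TIME-CLUSTER name into its three factors."""
    plant, time, cluster = name.split("-")
    return {"sample": name, "plant": plant, "time": time, "cluster": cluster}


design = [parse_sample(s) for s in samples]

# Group samples by condition (plant, time); the clusters are the replicates
groups = defaultdict(list)
for row in design:
    groups[(row["plant"], row["time"])].append(row["sample"])

for condition, members in sorted(groups.items()):
    print(condition, members)
```

Each of the 4 plant/time conditions ends up with 2 samples, matching the "4 samples with a biological replicate each" description.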


      • #4
        It's good that you have biological reps. I just checked, and the number of grape genes (~30,000) is not much more than Arabidopsis and a lot less than some of the other species I have worked with. You could get away with ~15,000,000 reads, but from experience, getting that extra lane of data and the increased sequencing depth makes a huge difference. So I would really encourage you to use at least 2 lanes of data.
