
  • RNA-Seq throughput

    Hello everyone! I have a basic question, but I couldn't find anything about it.
    I want to do paired-end RNA-Seq on a HiSeq 2000 with grape mRNA (to look at differential expression), but I don't know how many samples I can put per lane and still get good coverage.

    The whole grape (Vitis vinifera) genome: 485,000,000 bp

    I asked someone and he told me: "if sequencing the 8 samples in 1 lane, I can take 3~3.5 Gb throughput"

    1.- What does throughput mean here?
    2.- Is that enough to see differential expression?
    3.- How can I calculate a good coverage?

    Thank you very much!!

  • #2
    1) Throughput is the amount of sequence output, here ~3-3.5 Gb (gigabases) per sample.
    2) You are on the low end. If your goal is differential expression, then Gb is not really that informative, because what you are interested in is the number of reads/fragments. For differential expression, a single accurately mapped 50 bp read gives about as much information as a 100 bp paired-end read. Now if you are also looking for alternative splicing, then the added bases help. I am assuming these are paired-end, 100 bp data? If so, then the number of paired-end reads you will have is ~3 Gb / 200 bp = ~15,000,000 reads per sample. Yes, it's possible to detect differential expression at that level. But to give you an idea, I would say ~10,000,000 is the lower limit for Arabidopsis. Also, not all reads will map or map correctly, so you should expect to lose some data. If you can get ~15,000,000 reads per sample with 8 samples on a single lane, then I would go with two lanes of data and get ~30,000,000 per sample, which will give you a much better representation. Your ability to detect differential expression accurately is very much dependent upon read counts, and if you take the minimal number of reads, then lowly expressed genes will be a problem.
    3) I don't like calculating coverage for RNA-seq. Each gene is expressed at a different level, so how does one make sense of coverage from such data? For coverage to make sense, you would need to know a priori how many copies of each RNA you have... in which case there is no need to do an experiment.

    Also, make sure you have biological replicates. It is pointless if you do not have biological replicates.
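
The read-count estimate in point 2 can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the post; the function name and its parameters are made up for this example, and it ignores reads lost to failed or incorrect mapping.

```python
def read_pairs_per_sample(throughput_bp, read_length_bp=100, lanes=1):
    """Estimate read pairs (fragments) per sample.

    throughput_bp  -- bases obtained per sample from ONE lane
                      (e.g. 3 Gb when 8 samples share a lane)
    read_length_bp -- length of each mate; a pair contributes 2x this
    lanes          -- number of lanes the same 8-sample pool is run on
    """
    bases_per_pair = 2 * read_length_bp
    return throughput_bp * lanes // bases_per_pair


# 3 Gb per sample, 2 x 100 bp reads, one lane
one_lane = read_pairs_per_sample(3_000_000_000)
# same pool sequenced on two lanes
two_lanes = read_pairs_per_sample(3_000_000_000, lanes=2)

print(one_lane)   # 15000000
print(two_lanes)  # 30000000
```

The key point is that the divisor is 200 bp, not 100 bp, because each paired-end fragment uses up two 100 bp reads of throughput.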
    Last edited by chadn737; 03-08-2013, 07:17 AM.



    • #3
      Thanks for such a quick response!!

      Originally posted by chadn737
      I am assuming these are paired-end, 100 bp data? If so, then the number of paired-end reads you will have is ~3 Gb / 200 bp = ~15,000,000 reads per sample. Yes, it's possible to detect differential expression at that level. But to give you an idea, I would say ~10,000,000 is the lower limit for Arabidopsis. Also, not all reads will map or map correctly, so you should expect to lose some data. If you can get ~15,000,000 reads per sample with 8 samples on a single lane, then I would go with two lanes of data and get ~30,000,000 per sample, which will give you a much better representation.
      Yes, they are paired-end 100 bp data. Really there are 4 samples, each with a biological replicate.
      The scheme is:
      RNA from:
      plant A: 2 grape clusters at each of 2 different times
      plant B: 2 grape clusters at each of 2 different times
      Each sample is sequenced separately, as follows:

      PLANT-TIME-CLUSTER
      A-T1-C1
      A-T1-C2
      A-T2-C1
      A-T2-C2
      B-T1-C1
      B-T1-C2
      B-T2-C1
      B-T2-C2

      So all the "C2" samples are the biological replicates. Thanks again!
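
As a rough sketch, the PLANT-TIME-CLUSTER layout above can be generated and grouped into conditions like this. The variable names are illustrative only; the grouping simply treats the two clusters from the same plant and time point as replicates of one condition, as described in the post.

```python
from collections import defaultdict
from itertools import product

plants = ["A", "B"]
times = ["T1", "T2"]
clusters = ["C1", "C2"]

# Build the 8 sample labels in PLANT-TIME-CLUSTER form
samples = [f"{p}-{t}-{c}" for p, t, c in product(plants, times, clusters)]

# Group samples by condition (plant, time); the clusters are the replicates
conditions = defaultdict(list)
for label in samples:
    plant, time, cluster = label.split("-")
    conditions[(plant, time)].append(label)

for condition, replicates in conditions.items():
    print(condition, replicates)
```

This gives 4 conditions with 2 replicates each, which matches the 8 libraries to be pooled on a lane.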



      • #4
        It's good that you have biological reps. I just checked, and the number of grape genes (~30,000) is not much more than Arabidopsis and a lot less than some of the other species I have worked with. You could get away with ~15,000,000 reads, but from experience, getting that extra lane of data and the increased sequencing depth makes a huge difference. So I would really encourage you to use at least 2 lanes of data.
