  • pkstarstorm05
    Member
    • Jun 2014
    • 14

    RNA-seq Exfold Tutorial

    Hi everyone,

    This is a beefy one... sort of.

    I've set up a big RNA-seq experiment where I'm comparing pooled mouse samples. I clipped off a bit of tissue from each individual and extracted the RNA, then pooled a few individuals for each time point, since the tissue I'm using is very limited and I can't get much RNA from it. There are 4 samples per pool. After pooling, I ran Ribo-Zero to get rid of the rRNA, and during the process I spiked the samples with the ERCC ExFold spike-in mix (0.5 µl, an amount that seemed appropriate for my experiment).

    This is the set up:

    Mouse E11.5 Control (4 individuals pooled into the same tube)
    Mouse E11.5 TEST (4 individuals pooled into the same tube)
    Mouse E12.5 Control (individuals...etc)
    Mouse E12.5 TEST... etc etc
    All the way up to
    Mouse E17.5 Control (4 individuals pooled)
    Mouse E17.5 TEST (4 individuals pooled)

    Each pool was sequenced on the Illumina HiSeq using v3 chemistry, and I have the data. My problem is analyzing the pools for differential expression and using the ERCC spike-ins for normalization.

    So just to clear a couple things up first

    --The point of this experiment is not to generate an end-all serial transcriptome data set for the tissue I'm studying. We were willing to spend the money to do this as an exploratory experiment to highlight specific genes that we would follow up on later. So it's just exploratory and not necessarily for publishing.

    --We are aware of the alternatives for the approach to this experiment, but decided that based on our goals and our budget that this would be the best approach.

    Okay - so considering all of these details, I was hoping I might get some feedback on the following questions:

    1. Was it necessary for us to use the ERCC ExFold spike-ins for this experimental setup? We went back and forth about this a little bit, and decided it would be best to use them. But I wanted to get a feel from the community on this. I know the ERCC spikes are supposed to help control for platform variation, but since we multiplexed all of the pools during the run (across several lanes), does this even matter?

    2. How on earth do I actually normalize the data using the ERCC spike-ins? I mean step by step. I have run CuffDiff, and it seems to apply its own normalization when performing the analysis, which did produce some very interesting results... but surely it doesn't take the ERCC spike-ins into account automatically? I've also come across forum threads where people reference random functions with no context, like "loess.normalization()". What on earth is that supposed to mean? Sounds like Excel! haha. I haven't been able to find a single how-to or tutorial on how to actually run the ERCC normalization. Maybe I'm not looking in the right place? I'm not hugely familiar with the bioinformatics skills necessary for doing this, and there is no guidance or expertise on this at my institution/dept. But we also don't want to outsource. Can anyone give me a step-by-step or link to a guide for normalizing my RNA-seq data using the ERCC spike-ins? I don't have an intuitive knowledge of which programs I am supposed to use, and I don't know what some random function is supposed to represent or where I'm supposed to use it... but I do have the skills to learn how to use the tools with a little guidance.

    Thanks so much for any help and please let me know if you need any more information!

    Cheers!

    Paul
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    1. No, they weren't needed. ERCC spike-ins are mostly useful for single-cell sequencing. I wouldn't bother with them here unless the library normalization goes weird.
    2. I seriously doubt that you can use spike-ins with cuffdiff. When you see people mentioning loess normalization, they're talking about doing things in R, which is pretty much what you'll have to do as well. The general idea is to align to a reference that includes the ERCC sequences (just concatenate them onto your reference genome) and then get count information for the spike-ins as well as the real genes. You then import that into R using whatever method you prefer and use the ERCC subset of it for library normalization, i.e. to compute one size factor per sample. You then apply the computed size factors to the dataset (after removing the ERCC probes) and continue with the analysis. If you have no clue what that means then either don't bother with the ERCC spike-ins (a good idea anyway since they're likely to produce crappier results) or collaborate with a local bioinformatician.
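
    To make the size-factor step above concrete, here is a minimal sketch in plain Python (rather than R), not a definitive implementation. It assumes a median-of-ratios estimator, which the post above does not specify, and that the same spike-in mix and volume went into every pool; if different mixes went into different samples, only the spike-ins with a 1:1 expected ratio would be suitable. The function names and the example counts are hypothetical.

```python
import math
from statistics import median


def size_factors(ercc_counts):
    """Median-of-ratios size factors from spike-in counts only.

    ercc_counts: dict mapping sample name -> list of raw counts,
    one entry per ERCC transcript, same order in every sample.
    (Hypothetical helper; the estimator choice is an assumption.)
    """
    samples = list(ercc_counts)
    n_features = len(ercc_counts[samples[0]])
    # Keep only spike-ins detected (count > 0) in every sample.
    usable = [i for i in range(n_features)
              if all(ercc_counts[s][i] > 0 for s in samples)]
    if not usable:
        raise ValueError("no spike-in has non-zero counts in all samples")
    # Log of the per-spike-in geometric mean across samples.
    log_geo_mean = {
        i: sum(math.log(ercc_counts[s][i]) for s in samples) / len(samples)
        for i in usable
    }
    # Size factor = median ratio of a sample's counts to the geometric mean.
    return {
        s: math.exp(median([math.log(ercc_counts[s][i]) - log_geo_mean[i]
                            for i in usable]))
        for s in samples
    }


def normalize(gene_counts, factors):
    """Divide each sample's endogenous gene counts by its size factor."""
    return {s: [c / factors[s] for c in counts]
            for s, counts in gene_counts.items()}


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    ercc = {"E11.5_control": [100, 200, 400], "E11.5_test": [200, 400, 800]}
    genes = {"E11.5_control": [10, 20], "E11.5_test": [20, 40]}
    factors = size_factors(ercc)
    print(factors)
    print(normalize(genes, factors))
```

    The size factors are computed from the ERCC rows alone and then applied to the gene rows, so the ERCC rows never enter the differential expression test itself.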


    • munrosa
      Junior Member
      • Oct 2014
      • 1

      #3
      Hi Paul,

      You might be interested in the new erccdashboard R package for analyzing your data. The package is available on Bioconductor: http://bioconductor.jp/packages/3.0/...dashboard.html

      The publication describing the erccdashboard, "Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures", is here: http://www.nature.com/ncomms/2014/14...comms6125.html

      These resources provide details and empirical evidence that should answer your questions about the utility of the ERCC spike-ins more thoroughly than a forum reply can. The ERCC spike-ins can be used for more than single-cell sequencing and normalization, although those are the areas where they've seen a lot of use.

      I'd be happy to work with you on your analysis of the ERCC spike-ins in your experiments and your use of the erccdashboard -- you can feel free to contact me directly.

      Cheers,
      Sarah
