
  • Analysing RNA seq data

    Hi,
    I am new to the field of RNA sequencing and would like to know the best way to analyse Illumina RNA-seq data without replicates. I describe my experiment briefly below and would be grateful if someone could suggest ways to analyse the data.

    I work on forest trees, which are outbreeding like humans. I used 3 populations with 15 seedlings from each population. I grew the 15 seedlings from each population in a glasshouse for 4 months and then imposed water stress by giving them a limited amount of water for 2 months. I took leaf samples for RNA just before imposing stress. Ten seedlings received the stress treatment and the other five were well watered. I took further leaf samples one month and two months after treatment. From the initial sampling, I bulked the RNA from the 10 seedlings that received the stress treatment and, separately, the RNA from the other five seedlings. I did the same with the stress-treated seedlings and sequenced all five bulks (3 bulks from the stressed seedlings, including one taken before imposing stress, and two from the controls, C0). The five RNA bulks were sequenced using Illumina.

    Is it valid to compare expression from the ten seedlings before and after the stress treatment? I don't have any replicates. Could I use DESeq to analyse these data even though I don't have replicates?

  • #2
    Hi Balat,

    DESeq supports testing without replicates. See the linked thread and the documentation at http://bioconductor.org/packages/rel...tml/DESeq.html



    • #3
      Hi Balat

      The Bioconductor package edgeR also supports DE analysis without replication - see the discussion at the link posted by joro. Find out more about edgeR here - I recommend having a look at the User's Guide to get a feel for what the package does and how to use it. The section on Poisson analysis is most relevant if you want to analyse data without replication.

      Best regards
      Davis



      • #4
        Hi,

        DESeq, when given data without replicates, will switch to a conservative mode that overestimates the variance, as I described in the post that joro cited. edgeR can do the same, but you have to tell it what dispersion estimate to use. Be sure to read Davis's post in the same thread. We both stress that setting the dispersion to 0 (Poisson test) will never give reliable results.
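
To see why a dispersion of 0 is risky, here is a minimal Python sketch of the negative binomial variance model, variance = mean + dispersion * mean^2, where a dispersion of 0 reduces to the Poisson case. The function names, the example mean counts and the dispersion value of 0.06 are assumptions chosen only for illustration; this is not code from DESeq or edgeR.

```python
import math


def nb_variance(mean: float, dispersion: float) -> float:
    """Negative binomial variance: mean + dispersion * mean^2.

    With dispersion == 0 this is the Poisson variance (equal to the mean).
    """
    return mean + dispersion * mean ** 2


def count_cv(mean: float, dispersion: float) -> float:
    """Coefficient of variation (standard deviation / mean) of a count."""
    return math.sqrt(nb_variance(mean, dispersion)) / mean


# Compare the Poisson assumption with an illustrative dispersion of 0.06
# across a range of hypothetical mean counts.
for mean in (10, 100, 1000, 10000):
    poisson_cv = count_cv(mean, 0.0)
    nb_cv = count_cv(mean, 0.06)
    print(f"mean={mean:>6}  Poisson CV={poisson_cv:.3f}  NB CV={nb_cv:.3f}")
```

Under these assumptions, a gene with a mean count of 1000 has a CV of roughly 3% in the Poisson model but roughly 25% with a dispersion of 0.06, so a Poisson test would treat quite ordinary sample-to-sample differences at high counts as significant.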

        Why did you pool your data into bulk samples? I understand that sequencing each sample individually would have been too expensive, but you could have pooled the 10 stress samples into two pools of five each. Then you would have had sequences for three pools, two of which would have been biological replicates, which is fully sufficient to get a good noise estimate. If you had used barcoded adapters, it might not even have cost more.

        Simon



        • #5
          Hi Simon,
          Thanks for the reply. Yes, I could have used two bulks of five each, but unfortunately I didn't. However, I have analysed my data using the sample clustering feature of DESeq. I have sequence data from two populations. I have denoted the control samples from each population as S0-P and S0-K, and similarly the stress2 samples as S1-P and S1-K and the stress1 samples as S2-P and S2-K (there was an error in my labelling). The heat map clearly separates the treatments. Moreover, expression in the two populations within a treatment looks very similar, as one would expect of biological replicates. In that case, can I treat the two populations as biological replicates? I have attached the heat map here.
          Thank you very much.



          • #6
            Hi Davis,
            I just had a look at your reply to Sergio. I have used edgeR, treating S0-P and S0-K, and S1-P and S1-K, as biological replicates. I got a common dispersion estimate of 0.06. Based on this result and the heat map figure, is it OK to treat my two populations (P and K) as biological replicates?

            Thank you.



            • #7
              Originally posted by Balat
              In that case, can I treat the two populations as biological replicates?
              If they are two independently grown populations, you don't just treat them as biological replicates, they are biological replicates. So go ahead and use them that way.

              Simon



              • #8
                I concur with Simon - you either have biological replicates or you do not, based on the origin of your samples. It is not something that is determined in the analysis of the data.

                I have used edgeR by treating S0-P and S0-K and S1-P and S1-K as biological replicates. I got a common dispersion estimate of 0.06.
                By way of interpretation, the common dispersion estimate is the "squared coefficient of variation", which is a measure of the inter-library variability, distinct from the technical variability. Here the coefficient of variation is therefore approximately 0.24, which we would interpret as indicating that the true concentration of each gene (an unobservable quantity) varies up and down by 24% between libraries.

                The assumption here is that the coefficient of variation is more or less constant across all genes. Now, Simon would rightly point out that this assumption does not hold for all RNA-seq datasets, but it does give you some idea of the variability you see between sample replicates in your data.
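
The arithmetic behind the 24% figure can be sketched in a few lines of Python. The function name `biological_cv` is hypothetical; the only value taken from the thread is the common dispersion of 0.06, and the sketch simply takes the square root described above.

```python
import math


def biological_cv(common_dispersion: float) -> float:
    """Coefficient of variation implied by a common dispersion.

    The common dispersion is interpreted as the squared coefficient of
    variation, so the CV is its square root.
    """
    if common_dispersion < 0:
        raise ValueError("dispersion must be non-negative")
    return math.sqrt(common_dispersion)


# Common dispersion reported in the thread.
cv = biological_cv(0.06)
print(f"coefficient of variation: {cv:.3f}")  # prints 0.245
```

The result, about 0.245, is the value quoted above as approximately 0.24, i.e. roughly 24% variation between libraries.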

                Cheers
                Davis



                • #9
                  Thanks Simon and Davis.
                  The populations in my study are separately grown populations of the same species under common garden conditions. I was expecting the gene expression patterns to be very different between the populations, but the clustering analysis in DESeq clearly shows that the expression patterns within each treatment are similar between the two populations.
