  • tujchl
    Member
    • Sep 2009
    • 74

    How to compare RNA-seq with microarray

    Hi,
    I want to compare my RNA-seq data with microarray data from GEO to show that the two are highly correlated. I searched many papers but found little detail on how to compare RNA-seq and microarray data.
    Here is my workflow:
    1. Map the probe sequences to RefSeq so that each probeset is assigned a cluster of RefSeq transcripts. For example, probeset A can map to RefSeq transcripts A and B.
    2. Count the reads mapping to these RefSeq transcripts; if one read maps to both A and B, it is counted only once.
    3. Calculate the Pearson correlation between microarray probeset intensity and RNA-seq read counts.

    Can you tell me where I can improve? The correlation between my RNA-seq and microarray data is not very good (Pearson correlation = 0.589).
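
Steps 1 and 2 of the workflow above could be sketched roughly as follows. This is only an illustration in plain Python: the probeset names, RefSeq IDs, read IDs and the function name are all made up, and real alignments would come from an aligner's output rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical inputs, invented for illustration only.
# Step 1 result: probeset -> RefSeq transcripts its probes map to.
probeset_to_refseq = {
    "probeset_A": {"NM_A", "NM_B"},
    "probeset_C": {"NM_C"},
}
# Alignments: read ID -> RefSeq transcripts the read maps to.
read_hits = {
    "read1": {"NM_A"},
    "read2": {"NM_A", "NM_B"},  # maps to both transcripts of probeset_A
    "read3": {"NM_C"},
    "read4": {"NM_X"},          # maps to a transcript not on the array
}


def count_reads_per_probeset(probeset_to_refseq, read_hits):
    """Step 2: count each read at most once per probeset."""
    # Invert the map so each transcript points to its probesets.
    refseq_to_probesets = defaultdict(set)
    for probeset, transcripts in probeset_to_refseq.items():
        for transcript in transcripts:
            refseq_to_probesets[transcript].add(probeset)

    counts = {probeset: 0 for probeset in probeset_to_refseq}
    for transcripts in read_hits.values():
        # Collect probesets in a set so a read hitting A and B adds 1, not 2.
        hit_probesets = set()
        for transcript in transcripts:
            hit_probesets |= refseq_to_probesets.get(transcript, set())
        for probeset in hit_probesets:
            counts[probeset] += 1
    return counts


counts = count_reads_per_probeset(probeset_to_refseq, read_hits)
print(counts)
```

Here read2 contributes a single count to probeset_A even though it aligns to both of its transcripts, which is the "counted only once" rule from step 2.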

    Thanks
  • kopi-o
    Senior Member
    • Feb 2008
    • 319

    #2
    Actually your correlation is not that bad, I would say. What do you get if you try Spearman correlation? That might be more appropriate here, as you probably have your RNA-seq and microarray expression values on different scales.
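
To illustrate why the scale matters, here is a small sketch in plain Python (the six expression values are invented, and the helper names are my own). Spearman works on ranks, so it does not change when the read counts are log-transformed, whereas Pearson does.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values: log2 array intensities and raw read counts for six genes.
array_log2 = [6.1, 7.4, 8.0, 9.2, 10.5, 12.3]
read_counts = [3, 10, 40, 35, 400, 5000]
log_counts = [math.log2(c + 1) for c in read_counts]

print("Pearson  raw/log:", pearson(array_log2, read_counts), pearson(array_log2, log_counts))
print("Spearman raw/log:", spearman(array_log2, read_counts), spearman(array_log2, log_counts))
```

With these toy numbers the Pearson value is noticeably lower on raw counts than on log counts, while the Spearman value is identical in both cases.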


    • tujchl
      Member
      • Sep 2009
      • 74

      #3
      Thank you for your reply, kopi-o.
      The Spearman correlation is almost the same as the Pearson, about 0.57.
      I used cor.test() in R to calculate both the Spearman and Pearson correlations.
      I also think my correlation is not so bad, but I am following a paper in which the correlation is above 0.7 (see below). I could not reach such a high value even using the same data, so I think there is something I missed.
      I really need help. Thanks.
      Attached Files


      • staylor
        Member
        • Feb 2009
        • 17

        #4
        Hi,

        I am trying to do something similar.

        I am plotting normalised, log-transformed expression from the array (x) against log2(FPKM) from RNA-Seq (y) but, as you would expect, the scales are very different. I calculated the Spearman correlation and get around 0.64.

        I echo tujchl's comments and wonder whether there is a better method I should be using.
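
For reference, log2(FPKM) can be computed along these lines. This is a sketch only: the function names and example numbers are invented, and the pseudocount of 1 is an arbitrary assumption to keep genes with zero reads finite, not something taken from this thread.

```python
import math


def fpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def log2_fpkm(read_count, transcript_length_bp, total_mapped_reads, pseudocount=1.0):
    """log2 of FPKM plus a pseudocount (assumed value, see lead-in)."""
    value = fpkm(read_count, transcript_length_bp, total_mapped_reads)
    return math.log2(value + pseudocount)


# Made-up example: 500 reads on a 2 kb transcript, 20 million mapped reads.
example_fpkm = fpkm(500, 2000, 20_000_000)
print(example_fpkm)
print(log2_fpkm(500, 2000, 20_000_000))
```

The example transcript has an FPKM of 12.5. The choice of pseudocount changes the low end of the scale, so it is worth stating whichever value is used.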

        • Blahah404
          Member
          • Dec 2011
          • 48

          #5
          The microarray will detect many fewer genes, so perhaps the original authors took the correlation between the microarray data and a subset of the RNAseq data the same size as the microarray set.
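
Restricting both datasets to a shared gene set before correlating might look like the sketch below. The gene names, values and detection thresholds are all placeholders, and whether the original authors did anything like this is only a guess.

```python
def shared_detected_genes(array_values, rnaseq_values, array_min, rnaseq_min):
    """Genes present in both tables and above both detection thresholds."""
    shared = sorted(set(array_values) & set(rnaseq_values))
    return [
        gene
        for gene in shared
        if array_values[gene] >= array_min and rnaseq_values[gene] >= rnaseq_min
    ]


# Made-up per-gene values; the thresholds below are placeholders as well.
array_values = {"geneA": 8.2, "geneB": 4.1, "geneC": 10.7, "geneD": 9.0}
rnaseq_values = {"geneA": 15.0, "geneB": 0.0, "geneC": 120.0, "geneE": 33.0}

kept = shared_detected_genes(array_values, rnaseq_values, array_min=5.0, rnaseq_min=1.0)
# Paired vectors in the same gene order, ready for a correlation function.
x = [array_values[gene] for gene in kept]
y = [rnaseq_values[gene] for gene in kept]
print(kept)
```

Only geneA and geneC survive here: geneB is below both thresholds, and geneD and geneE are each missing from one platform.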


          • mbblack
            Senior Member
            • Aug 2009
            • 245

            #6
            Not necessarily - it will depend entirely on the read depth of the RNAseq dataset. Many times, in terms of genes actually detected, you will have far more genes with valid array signal than you will have with valid RNAseq counts.

            In terms of equal coverage by genes detected, you may need far more sequence than most experiments come even close to collecting.

            Equivalence to my mind actually has two meanings in this context. One is what proportion of the genome is detected by signal. In that regard, most genomic microarrays will detect a considerably greater proportion of genes than most RNAseq experiments will, unless the RNAseq experiment involved collecting extraordinary numbers of reads.

            The second part of equivalence is statistical significance in genes detected. In that regard, even moderate read depth seems to be enough to either equal or exceed microarrays, for the genes detected in common.
            Last edited by mbblack; 09-18-2013, 10:56 AM.
            Michael Black, Ph.D.
            ScitoVation LLC. RTP, N.C.


            • mbblack
              Senior Member
              • Aug 2009
              • 245

              #7
              tujchl, are you aligning the two datasets by annotation in the same manner as the authors of the paper you mentioned? Are you including promiscuous probes (probes that match more than one gene) in your microarray data?
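
One simple way to exclude promiscuous probesets is to keep only those annotated to exactly one gene. The sketch below assumes a probeset-to-gene mapping is already available; the identifiers and the function name are invented.

```python
def specific_probesets(probeset_to_genes):
    """Keep probesets whose probes map to exactly one gene."""
    return {
        probeset: next(iter(genes))
        for probeset, genes in probeset_to_genes.items()
        if len(genes) == 1
    }


# Hypothetical annotation: probeset -> set of genes its probes match.
probeset_to_genes = {
    "ps1": {"GENE1"},
    "ps2": {"GENE2", "GENE3"},  # promiscuous: matches two genes
    "ps3": {"GENE4"},
    "ps4": set(),               # no annotated gene
}

kept_probesets = specific_probesets(probeset_to_genes)
print(kept_probesets)
```

Running the correlation with and without this filter would show how much the promiscuous probesets contribute to the disagreement.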


              • tujchl
                Member
                • Sep 2009
                • 74

                #8
                Hi everyone,
                I'm back. I think there are two approaches that may work:
                1. Align the RNA-seq reads to mRNA sequences (e.g. RefSeq), then merge all isoforms to calculate gene-level expression and compare it with the microarray at the gene level.

                2. In one paper (http://lib.tkk.fi/Dipl/2012/urn100639.pdf), Karolis Uziela introduces a new method that may help.

                I tried both and there is indeed some improvement, but I partially agree with Blahah404: I think some filtering was involved.
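
The isoform merge in approach 1 could be sketched as below. Summing isoform read counts per gene is just one possible way to merge them, and the transcript IDs, gene names and function name are made up.

```python
from collections import defaultdict


def gene_level_counts(transcript_counts, transcript_to_gene):
    """Sum the read counts of all isoforms belonging to the same gene."""
    totals = defaultdict(int)
    for transcript, count in transcript_counts.items():
        gene = transcript_to_gene.get(transcript)
        if gene is not None:  # skip transcripts with no gene annotation
            totals[gene] += count
    return dict(totals)


# Hypothetical per-transcript counts and transcript -> gene annotation.
transcript_counts = {"NM_1a": 120, "NM_1b": 30, "NM_2": 75, "NM_9": 4}
transcript_to_gene = {"NM_1a": "GENE1", "NM_1b": "GENE1", "NM_2": "GENE2"}

gene_counts = gene_level_counts(transcript_counts, transcript_to_gene)
print(gene_counts)
```

The same grouping idea applies on the array side if several probesets target one gene, although how to combine them (mean, median, best probeset) is a separate choice.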


                • Gonza
                  Member
                  • Mar 2013
                  • 78

                  #9
                  Hi all,
                  I am trying to figure out how to compare arrays with RNA-seq. I am not really familiar with arrays, and I am a little concerned that we are comparing apples with pears. I think RNA-seq and microarrays measure slightly different things, although both start from mRNA.

                  I have calculated the Spearman correlation for a data frame containing both array and RNA-seq samples and built a clustering tree from it. Biologically, the tree makes a lot of sense. I just do not know how to justify doing this.

                  Any thoughts are welcome.

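                  ### R script, final step. all7 is my data.frame with 244 obs. of 20 variables ###

                  # Spearman correlation between the columns of all7 (first column excluded)
                  sample_cor = cor(all7[,-1], method="spearman")
                  # Euclidean distance between the rows of the correlation matrix
                  test = dist(sample_cor)
                  # Hierarchical clustering (hclust defaults to complete linkage), then plot the tree
                  a = hclust(test)
                  plot(a)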


                  • NextGenSeq
                    Senior Member
                    • Apr 2009
                    • 482

                    #10
                    How old is the microarray data?
                    If it comes from the old Affy chips, those are 3' biased.

                    We've found that comparing RNA-Seq results from libraries made in two different ways can give horrible correlation. We never use oligo-dT for RT priming if we can avoid it.
