
  • DESeq2 with no replicates - strange results

    Hi all,

    I am using the DESeq2 package to analyse my RNA-Seq data set from the fruit fly (D. melanogaster). Unfortunately there are no replicates.

    I know this is not optimal and one can't really rely on the statistical strength of the results, but we can still look into the data and rely on the fold-change differences between the samples.

    This is also the reason for my question.
    I know the variance might be over-estimated, but what I do not understand is why I get strange baseMean and log2FoldChange results.

    This is how I run DESeq2:
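    library(DESeq2)

    # Build the data set from the raw count matrix and the sample table
    cds <- DESeqDataSetFromMatrix(
      countData = Comp,
      colData   = colData,
      design    = ~condition
    )
    # Run the full pipeline (size factors, dispersions, Wald test)
    fit <- DESeq(cds)
    res <- results(fit)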
    But when I look at the results, the numbers look wrong.
    The raw values from my samples:
                sample1  sample2
    FBgn0085379       1        4
    FBgn0085380     104      117
    FBgn0085382     101      137
    FBgn0085383      88      187
    FBgn0085384      90      275
    FBgn0085385      18       55
    FBgn0085386      40       40
    FBgn0085387      16      310
    FBgn0085388     910     3333
    FBgn0085390     192      179
    FBgn0085391      96      359
    And this is a snippet of the results from the "differential expression" analysis:
    log2 fold change (MAP): condition sample2 vs sample1 
    Wald test p-value: condition sample2 vs sample1 
    DataFrame with 11 rows and 6 columns
                  baseMean log2FoldChange     lfcSE       stat    pvalue      padj
                 <numeric>      <numeric> <numeric>  <numeric> <numeric> <numeric>
    FBgn0085379   2.047768      0.1776917  1.656357  0.1072786 0.9145679  0.999346
    FBgn0085380 119.997967     -1.0010375  1.365438 -0.7331255 0.4634819  0.999346
    FBgn0085382 123.804622     -0.7832908  1.339541 -0.5847457 0.5587187  0.999346
    FBgn0085383 128.899132     -0.2415351  1.299186 -0.1859127 0.8525132  0.999346
    FBgn0085384 157.869569      0.2069421  1.275569  0.1622352 0.8711206  0.999346
    ...                ...            ...       ...        ...       ...       ...
    FBgn0085386   44.59838     -1.0634868  1.528435 -0.6958011 0.4865534  0.999346
    FBgn0085387  109.25461      2.2342308  1.536826  1.4537959 0.1460029  0.999346
    FBgn0085388 1768.01176      0.4434720  1.179468  0.3759932 0.7069220  0.999346
    FBgn0085390  210.03007     -1.2372886  1.341767 -0.9221335 0.3564590  0.999346
    FBgn0085391  188.81235      0.4581024  1.270345  0.3606124 0.7183892  0.999346
    My question concerns the rows for "FBgn0085380" and "FBgn0085386", just as examples.

    In the raw data, the first gene has a slightly higher read count in sample2, while the counts are equal for the second gene. But the differential expression results give a different picture.
    For the first gene I get a baseMean of ~119, even though both read counts are lower than that; the second gene shows a similar picture. The log2FoldChange values are off in the same way.
    In both cases I get a negative fold change (sample2 vs sample1), although the number of reads is higher in sample2 or equal in the two samples, respectively.

    Is there an explanation for this behaviour? Are the numbers off because I have no replicates and all the samples are treated as replicates (although this still wouldn't explain the baseMean values)?

    Thanks in advance


  • #2
    That's because the results are computed from the counts after they have been corrected for library size. E.g. if one sample had 20M reads and the other had 16M reads, then the raw counts can't be compared directly. DESeq2 corrects for this, which is why the values are no longer integers and have so many decimal places.

    Try this to compare the total read count of each sample:
    apply(Comp, 2, sum)
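
A minimal base-R sketch of the library-size correction described above, assuming the median-of-ratios approach that DESeq2 uses by default for its size factors. It uses only the 11 genes posted in the first message (the name `Comp_excerpt` is made up for this example), so the size factors it produces will differ from those estimated on the full count matrix.

```r
# Raw counts for the 11 genes posted above (a small excerpt, not the full matrix).
genes <- c("FBgn0085379", "FBgn0085380", "FBgn0085382", "FBgn0085383",
           "FBgn0085384", "FBgn0085385", "FBgn0085386", "FBgn0085387",
           "FBgn0085388", "FBgn0085390", "FBgn0085391")
Comp_excerpt <- matrix(
  c(1, 104, 101,  88,  90, 18, 40,  16,  910, 192,  96,   # sample1
    4, 117, 137, 187, 275, 55, 40, 310, 3333, 179, 359),  # sample2
  ncol = 2,
  dimnames = list(genes, c("sample1", "sample2"))
)

# Total reads per sample (same idea as apply(Comp, 2, sum)).
library_sizes <- colSums(Comp_excerpt)

# Median-of-ratios size factors: compare each sample to the per-gene
# geometric mean and take the median ratio across genes.
# (All counts here are > 0, so no genes need to be dropped before log().)
log_geo_mean <- rowMeans(log(Comp_excerpt))
size_factors <- apply(Comp_excerpt, 2, function(col) {
  exp(median(log(col) - log_geo_mean))
})

# Normalized counts: divide each column by its size factor.
normalized <- sweep(Comp_excerpt, 2, size_factors, "/")

print(library_sizes)
print(size_factors)
print(round(normalized, 2))
```

With DESeq2 itself, `sizeFactors(fit)` and `counts(fit, normalized = TRUE)` return the size factors and normalized counts for the full data set.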
    Last edited by Jeremy; 07-23-2014, 01:53 AM.


    • #3

      Oh yes, you're right.

      I had totally forgotten about that.

      I checked the normalized values and they look better:
                     sample1     sample2
      FBgn0085379    1.608047    2.48749
      FBgn0085380  167.236850   72.75908
      FBgn0085382  162.412710   85.19653
      FBgn0085383  141.508104  116.29016
      FBgn0085384  144.724197  171.01494
      FBgn0085385   28.944839   34.20299
      FBgn0085386   64.321865   24.87490
      FBgn0085387   25.728746  192.78048
      FBgn0085388 1463.322439 2072.70108
      FBgn0085390  308.744954  111.31518
      FBgn0085391  154.372477  223.25223
      how embarrassing
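
As a quick sanity check, here is a small sketch that ties the normalized table back to the results table for the two example genes. All numbers are copied from the posts above; the variable names are made up for this example. The plain log2 ratio computed here is not expected to match the reported log2FoldChange exactly, because the results header labels that column as a MAP estimate, which presumably is shrunken towards zero.

```r
example_genes <- c("FBgn0085380", "FBgn0085386")
samples <- c("sample1", "sample2")

# Raw and normalized counts as posted in this thread.
raw <- matrix(c(104, 117,
                 40,  40),
              ncol = 2, byrow = TRUE,
              dimnames = list(example_genes, samples))
norm <- matrix(c(167.236850, 72.75908,
                  64.321865, 24.87490),
               ncol = 2, byrow = TRUE,
               dimnames = list(example_genes, samples))

# Size factors implied by the two tables: raw count / normalized count.
# Each column should be (nearly) constant across genes.
implied_sf <- raw / norm

# baseMean is the mean of the normalized counts across samples.
base_mean <- rowMeans(norm)

# Plain (unshrunken) log2 ratio of sample2 vs sample1 on normalized counts.
unshrunken_lfc <- log2(norm[, "sample2"] / norm[, "sample1"])

print(implied_sf)
print(base_mean)        # compare with the baseMean column in the results
print(unshrunken_lfc)   # negative for both genes, like the reported values
```

Since sample2 has the larger implied size factor, its counts are scaled down relative to sample1, which is why equal raw counts (40 and 40) end up as a negative fold change.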


