  • DESeq2 collinearity issue

    Hi everybody,

    I have been trying to use DESeq2 for analyzing RNA-seq data, and ran into a problem.

    The conditions table looks like this:

    Code:
    sample	donor	virus	vpu
    DonorA1_01	A1	none	mock
    DonorA1_02	A1	CH293	wt
    DonorA1_03	A1	CH293	stop
    DonorA1_04	A1	CH293	R50K
    DonorA1_05	A1	CH293	teth_count
    DonorA1_06	A1	CH077	wt
    DonorA1_07	A1	CH077	stop
    DonorA1_08	A1	CH077	R50K
    DonorA1_09	A1	CH077	teth_count
    DonorA1_10	A1	STC01	wt
    DonorA1_11	A1	STC01	stop
    DonorA1_12	A1	STC01	R50K
    DonorA1_13	A1	STC01	teth_count
    DonorX_01	X	none	mock
    DonorX_02	X	CH293	wt
    DonorX_03	X	CH293	stop
    DonorX_04	X	CH293	R50K
    DonorX_05	X	CH293	teth_count
    DonorX_06	X	CH077	wt
    DonorX_07	X	CH077	stop
    DonorX_08	X	CH077	R50K
    DonorX_09	X	CH077	teth_count
    DonorX_10	X	STC01	wt
    DonorX_11	X	STC01	stop
    DonorX_12	X	STC01	R50K
    DonorX_13	X	STC01	teth_count
    DonorY_01	Y	none	mock
    DonorY_02	Y	CH293	wt
    DonorY_03	Y	CH293	stop
    DonorY_04	Y	CH293	R50K
    DonorY_05	Y	CH293	teth_count
    DonorY_06	Y	CH077	wt
    DonorY_07	Y	CH077	stop
    DonorY_08	Y	CH077	R50K
    DonorY_09	Y	CH077	teth_count
    DonorY_10	Y	STC01	wt
    DonorY_11	Y	STC01	stop
    DonorY_12	Y	STC01	R50K
    DonorY_13	Y	STC01	teth_count
    DonorZ_01	Z	none	mock
    DonorZ_02	Z	CH293	wt
    DonorZ_03	Z	CH293	stop
    DonorZ_04	Z	CH293	R50K
    DonorZ_05	Z	CH293	teth_count
    DonorZ_06	Z	CH077	wt
    DonorZ_07	Z	CH077	stop
    DonorZ_08	Z	CH077	R50K
    DonorZ_09	Z	CH077	teth_count
    DonorZ_10	Z	STC01	wt
    DonorZ_11	Z	STC01	stop
    DonorZ_12	Z	STC01	R50K
    DonorZ_13	Z	STC01	teth_count
    When I specify the design for the differential expression analysis as dds <- DESeqDataSetFromTximport(txi, samples, ~vpu+donor+virus), I get this error message:

    Error in checkFullRank(modelMatrix) :
    the model matrix is not full rank, so the model cannot be fit as specified.
    One or more variables or interaction terms in the design formula are linear
    combinations of the others and must be removed.

    I get the same error if the design is only ~vpu+virus.

    I understand that this is caused by collinearity among the variables, but I am not sure how to resolve it.

    Any help would be highly appreciated!

    Thanks!
    Chris

  • #2
    The problem is that virus=="none" and vpu=="mock" select exactly the same samples, so the model cannot separate the effect of one from the other.
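
    A minimal base R sketch of how this can be checked, assuming the sample table from the first post is rebuilt by hand (the names per_donor, samples and mm are made up for illustration, and no DESeq2 call is involved):

    ```r
    # Rebuild the sample table from the first post: 4 donors x 13 samples each.
    per_donor <- data.frame(
      virus = c("none", rep(c("CH293", "CH077", "STC01"), each = 4)),
      vpu   = c("mock", rep(c("wt", "stop", "R50K", "teth_count"), times = 3)),
      stringsAsFactors = FALSE
    )
    samples <- do.call(rbind, lapply(c("A1", "X", "Y", "Z"), function(d) {
      data.frame(donor = d, per_donor, stringsAsFactors = FALSE)
    }))

    # Set factor levels explicitly so the reference levels do not depend on locale.
    samples$donor <- factor(samples$donor, levels = c("A1", "X", "Y", "Z"))
    samples$virus <- factor(samples$virus,
                            levels = c("none", "CH293", "CH077", "STC01"))
    samples$vpu   <- factor(samples$vpu,
                            levels = c("mock", "wt", "stop", "R50K", "teth_count"))

    # The two conditions flag exactly the same rows.
    same_indicator <- identical(samples$virus == "none", samples$vpu == "mock")

    # Build the design matrix and compare its rank with its number of columns.
    mm <- model.matrix(~ vpu + donor + virus, data = samples)
    rank_mm <- qr(mm)$rank

    same_indicator  # TRUE
    ncol(mm)        # 11
    rank_mm         # 10, one less than the number of columns
    ```

    The rank is one short because the four non-mock vpu columns sum to the same vector as the three non-none virus columns.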

    • #3
      Thanks a lot!
      For the most important comparison, vpu "wt" vs. "stop", leaving out the non-infected samples works.
      But if I want to compare non-infected vs. infected samples, how could this be resolved? By using an additional dummy variable?
      What would that look like?

      Thanks!
      Chris
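
      The workaround mentioned in this post, leaving out the non-infected samples, can be sketched in base R as follows. This is only a check of the design matrix under the assumption that the sample table is rebuilt by hand (the names samples, infected and mm are hypothetical); the count input would also have to be restricted to the same samples, which is not shown here.

      ```r
      # Rebuild the sample table from the first post: 4 donors x 13 samples each.
      per_donor <- data.frame(
        virus = c("none", rep(c("CH293", "CH077", "STC01"), each = 4)),
        vpu   = c("mock", rep(c("wt", "stop", "R50K", "teth_count"), times = 3)),
        stringsAsFactors = FALSE
      )
      samples <- do.call(rbind, lapply(c("A1", "X", "Y", "Z"), function(d) {
        data.frame(donor = d, per_donor, stringsAsFactors = FALSE)
      }))

      # Keep only the infected samples, then build factors from what is left
      # so that the unused levels "none" and "mock" do not appear in the design.
      infected <- samples[samples$virus != "none", ]
      infected$donor <- factor(infected$donor)
      infected$virus <- factor(infected$virus)
      infected$vpu   <- factor(infected$vpu,
                               levels = c("wt", "stop", "R50K", "teth_count"))

      # The design matrix is now full rank: rank equals the number of columns.
      mm <- model.matrix(~ vpu + donor + virus, data = infected)
      full_rank <- qr(mm)$rank == ncol(mm)

      nrow(infected)  # 48
      ncol(mm)        # 9
      full_rank       # TRUE
      ```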

      • #4
        My guess (since I haven't a clue about the background to your experiment) is that you want vpu="wt" for the mock infected samples. Each combination of vpu and virus would then be distinct.

        *Edit*: Alternatively, set the virus="none" samples to what they're mock infected with (e.g., CH293), which I suspect will be clearer.
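
        Both recodings can be checked against the design matrix with a short base R sketch. The helper is_full_rank and the names opt_a and opt_b are made up for illustration, and the sample table is rebuilt by hand from the first post; which recoding is appropriate depends on the experiment.

        ```r
        # Rebuild the sample table from the first post: 4 donors x 13 samples each.
        per_donor <- data.frame(
          virus = c("none", rep(c("CH293", "CH077", "STC01"), each = 4)),
          vpu   = c("mock", rep(c("wt", "stop", "R50K", "teth_count"), times = 3)),
          stringsAsFactors = FALSE
        )
        samples <- do.call(rbind, lapply(c("A1", "X", "Y", "Z"), function(d) {
          data.frame(donor = d, per_donor, stringsAsFactors = FALSE)
        }))

        # Hypothetical helper: TRUE if the design matrix has full column rank.
        is_full_rank <- function(design, data) {
          mm <- model.matrix(design, data = data)
          qr(mm)$rank == ncol(mm)
        }

        # First suggestion: label the mock samples as vpu = "wt", keep virus = "none".
        opt_a <- samples
        opt_a$vpu[opt_a$vpu == "mock"] <- "wt"

        # Second suggestion: keep vpu = "mock", but relabel virus = "none" as "CH293".
        opt_b <- samples
        opt_b$virus[opt_b$virus == "none"] <- "CH293"

        original_ok <- is_full_rank(~ vpu + donor + virus, samples)
        opt_a_ok    <- is_full_rank(~ vpu + donor + virus, opt_a)
        opt_b_ok    <- is_full_rank(~ vpu + donor + virus, opt_b)

        original_ok  # FALSE
        opt_a_ok     # TRUE
        opt_b_ok     # TRUE
        ```

        With the second recoding, the non-infected vs. infected comparison would presumably be read off the vpu factor, for example via results(dds, contrast = c("vpu", "wt", "mock")) in DESeq2, but that depends on how the design is finally set up.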

        • #5
          Perfect, thanks a lot for your help, Devon!

          Best,
          Chris
