Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • younko
    Member
    • May 2014
    • 24

    EdgeR or GLM model design question

    Dear,

    I have a question related with my design matrix.
    My RNA-seq experimental design looks like this. (paried design)


    patient gender drug
    sample1 patient1 F Pre
    sample2 patinet1 F Post
    sample3 patient2 M Pre
    sample4 patient2 M Post
    sample5 patient3 F Pre
    sample6 patient3 F Post
    sample7 patient4 F Pre
    sample8 patient4 F Post
    sample9 patient5 F Pre
    sample10 patient5 F Post
    sample11 patient6 M Pre
    sample12 patient6 M Post

    =====================================

    What I want to test is this.

    Q1) DEG responsing to drug... (regardless of gender)
    Q2) DEG responsing to drug on female
    DEG responsing to drug on male..

    More ultimately, Does genes reponse to the drug differently based on gender??? (or not really?? ) In other words, gender is the significant factor to drug response??

    Q3) Which gender is more sensitive to the drug...??

    ==================================================
    In order to test or answer Q1 , my model looks like this.

    design<- model.matrix(~drug + drug:gender, data=data)

    colnames(design)
    ##################################################
    [1] "(Intercept)" "drugpost" "drugpre:genderM" "drugpost:genderM"
    ###################################################

    lrt1 <- glmLRT(fabry_fit, coef=2) # drugpost

    returns the genes responsing to the drug.. (Q1)


    ======================================


    lrt2 <- glmLRT(fabry_fit, coef=4) # drugpost:genderM

    does it returns that genes responding the drug depending on gender differently ?? In this case, we can answer "DEG responsing to drug on female
    DEG responsing to drug on male.. " ?? Q2 ...........

    =======================================================

    Then, what is the meaing of drugpre:genderF???

    Could you please somebody to explain the meaning of this two model??

    Again, I would like to differentally expressed genes under the drug treatment as well as gender.
    (e.g. up after drug treatment for male, but not for female.. something like this... )

    thanks in advance,
    Last edited by younko; 08-29-2014, 12:03 AM.
  • younko
    Member
    • May 2014
    • 24

    #2
    One more question:

    I am trying to do several changes..

    In the same experimental design,

    I created two model..

    MODEL1
    # model.matrix(~drug + drug:gender)
    #[1] "(Intercept)" "drugpost" "drugpre:genderM" "drugpost:genderM"

    MODEL2
    # model.matrix(~drug + gender + drug:gender)
    #[1] "(Intercept)" "drugpost" "genderM" "drugpost:genderM"

    When I apply with these two models, I thought that last coefficient should test the same thing....

    BUT MAYBE NOT

    When, I compare the result from these two coefficient results, the significant genes are totally different.. which I am confusing..

    out1 <- glmLRT(dd_model1, coef=4)
    out2 <- glmLRT(dd_model2, coef=4)


    Please help me with this!
    Last edited by younko; 08-29-2014, 12:04 AM.

    Comment

    • Gordon Smyth
      Member
      • Apr 2011
      • 91

      #3
      Your experimental design is the same as that covered in Section 3.5 of the edgeR User's Guide called "Comparisons Both Between and Within Subjects". You can simply follow the advice and code given in the User's Guide.

      Comment

      • younko
        Member
        • May 2014
        • 24

        #4
        Thank you so much.

        However, I still have a problem. After reading that example, I changed my model like this

        model.matrix(~drug + gender + gender:drug)
        #[1] "(Intercept)" "drugpost" "genderM" "drugpost:genderM"

        I exepct to see the following coefficients such as

        genderF:drugpost : genes responding to the drug in female:
        genderM : drugpost : genes responding to the drug in male


        But I can only see "drugpost:genderM" coefficient

        genes responding to the gender in drug treatment.


        Why this happens? Do I misunderstand??

        Comment

        • Gordon Smyth
          Member
          • Apr 2011
          • 91

          #5
          You haven't followed the example in the User's Guide. The User's Guide tells you to fit two interaction terms but you have fitted only one. You have to follow the example exactly.

          Your gender variable corresponds to Disease in the example, while your drug variable corresponds to Treatment in the example.

          Comment

          • younko
            Member
            • May 2014
            • 24

            #6
            Thank you Gorden,

            I tried as you suggest.. but it seems my result looks weird..

            when I try my model looks like this

            # model.matrix(~drug + drug:gender)
            #[1] "(Intercept)" "drugpost" "drugpre:genderM" "drugpost:genderM"

            * cpm value for top ranked gene based on drugpost:gender coefficent looks like this.

            KYM_pre(F) KYM_post(F) GDY_pre(M) GDY_post(M) HKS_pre(F) HKS_post(F) KWK_pre(F) KWK_post(F) PYM_pre(F) PYM_post(F) SHM_pre(M) SHM_post(M)
            RPS4Y1 0 0 149.5080373 79.3694069 0 0 0 0 0 0 201.5823721 265.40717
            KDM5D 0 0 63.02199964 50.10588553 0 0 0 0.186331781 0 0 30.12833879 88.66478468
            PRKY 0 0 35.19605708 37.68467861 0 0 0 0 0 0 15.26238215 54.41238442
            UTY 0 0 20.75664905 26.10558742 0 0 0 0 0 0 4.55889337 13.1137761
            DDX3Y 0 0 29.78127907 54.94805093 0 0.191243459 0 0 0 0 4.757106125 15.46251212

            But when I used the model like this as you suggested (If I understand correctly..

            # model.matrix(~gender + gender:drug)

            cpm value for top ranked genes for this coefficients genderM:drugpost looks like this.

            KYM_pre(F) KYM_post(F) GDY_pre(M) GDY_post(M) HKS_pre(F) HKS_post(F) KWK_pre(F) KWK_post(F) PYM_pre(F) PYM_post(F) SHM_pre(M) SHM_post(M)
            IL4I1 2.9742981 1.7673777 6.9276722 1.686450 2.626656 2.2986379 3.536214 1.679694 1.2629356 1.6349216 9.1263648 2.5477597
            FKBP5 75.6793627 52.6678567 99.2464349 669.520561 148.162849 86.5820270 113.480321 87.530734 65.2516736 72.1409155 40.8702425 65.2618450
            ARG1 5.2876411 4.5951821 2.7108283 23.610297 5.156028 3.4479568 4.982847 8.585104 6.1041888 3.6785736 0.7935969 1.9598152
            MYO5B 0.9914327 0.0000000 0.1506016 1.264837 2.821223 2.4901910 2.571792 1.306429 0.8419571 0.6130956 0.0000000 1.9598152
            LOC100130855 1.6523878 2.1208533 0.7530079 6.956605 6.226147 3.0648505 4.661373 3.359389 0.4209785 0.8174608 0.0000000 0.7839261

            If I undestand corrently,

            coefficents genderF:drugpost returns the genes responding the drug at gender F

            coefficents genderM:drugpost returns the genes responding the drug at gender M.

            But what I looked at the result cpm value, it seems that coeffient drugpost:gender results looks more reasonable..

            Am I missing something??
            Last edited by younko; 08-29-2014, 12:45 AM.

            Comment

            • Gordon Smyth
              Member
              • Apr 2011
              • 91

              #7
              You haven't tried what I suggested.

              Comment

              • younko
                Member
                • May 2014
                • 24

                #8
                Hm..

                This model is the one you suggest.. Isn't it?

                model.matrix(~gender + gender:drug)
                #[1] "(Intercept)" "genderM" "genderF:drugpost" "genderM:drugpost"

                In case, I also tested with this one (which is exactly same with 3.5 EdgeR example.

                model.matrix(~gender + genderatient + gender:drug)
                #[1] "(Intercept)" "genderM" "genderFatient" "genderMatient" "genderF:drugpost"
                #[6] "genderM:drugpost"
                ###########################
                Am I missing?

                =================================


                coefficient for "genderF:drugpost" and "genderM:drugpost" represents that the the drug response genes in Female, drug response genes in Male respectively..Isn't it?


                The first model and second model reports the similar genes as significant ones... which is similar results what I report above.. ;(
                Last edited by younko; 08-31-2014, 09:41 PM.

                Comment

                • Gordon Smyth
                  Member
                  • Apr 2011
                  • 91

                  #9
                  Originally posted by younko View Post
                  Hm..

                  This model is the one you suggest.. Isn't it?

                  model.matrix(~gender + gender:drug)
                  #[1] "(Intercept)" "genderM" "genderF:drugpost" "genderM:drugpost"
                  I have no idea where you got this model from. I did not suggest it. It is obviously wrong because it ignores the patient information.

                  In case, I also tested with this one (which is exactly same with 3.5 EdgeR example.

                  model.matrix(~gender + genderatient + gender:drug)
                  #[1] "(Intercept)" "genderM" "genderFatient" "genderMatient" "genderF:drugpost"
                  #[6] "genderM:drugpost"
                  ###########################
                  Yes, this is what I have been suggesting all along.

                  coefficient for "genderF:drugpost" and "genderM:drugpost" represents that the the drug response genes in Female, drug response genes in Male respectively..Isn't it?
                  Yes, that is what the edgeR User's Guide tells you.

                  The first model and second model reports the similar genes as significant ones... which is similar results what I report above.. ;(
                  The two models are not the same.

                  Have you defined the patient factor correctly as described in the edgeR User's Guide?

                  Comment

                  • younko
                    Member
                    • May 2014
                    • 24

                    #10
                    Thank you Gordon..

                    Now, I have a brief idea..

                    Yes. I followed the exact format as EdgeR user guide.


                    gender patient drug
                    1 M 1 pre
                    2 M 1 post
                    3 M 2 pre
                    4 M 2 post
                    5 F 1 pre
                    6 F 1 post
                    7 F 2 pre
                    8 F 2 post
                    9 F 3 pre
                    10 F 3 post
                    11 F 4 pre
                    12 F 4 post

                    This is my data structure.

                    My model is "~gender + genderatient + gender:drug"

                    In here, I have one more question regaring to the edgeR manual for 3.5

                    They say that Disease and treatment should be defined as a factor..
                    But, I don't think the patient columne should be a factor......

                    So, without defining the patient as factor., my model "~gender + genderatient + gender:drug" returns that

                    "[1] "(Intercept)" "genderM" "genderFatient" "genderMatient" "genderF:enzymepost"
                    #[6] "genderM:enzymepost"
                    ############################"

                    BUT, the edgeR manual shows that
                    > design <- model.matrix(~Disease+Disease:Patient+Disease:Treatment)
                    > colnames(design)
                    [1] "(Intercept)" "DiseaseDisease1"
                    [3] "DiseaseDisease2" "DiseaseHealthy:Patient2"
                    [5] "DiseaseDisease1:Patient2" "DiseaseDisease2:Patient2"
                    [7] "DiseaseHealthy:Patient3" "DiseaseDisease1:Patient3"
                    [9] "DiseaseDisease2:Patient3" "DiseaseHealthy:TreatmentHormone"
                    [11] "DiseaseDisease1:TreatmentHormone" "DiseaseDisease2:TreatmentHormone"
                    it seems that Patient is also defined as factor!!!

                    Do I have to define patient as factor?? ( I don't think so.. ;()

                    Also, in my case, since the number of patient is different in female (4) vs male (2) , when I defined as factor for patient column, I got the following error...

                    "Error in glmFit.default(y, design, dispersion = par^4, offset = offset, :
                    Design matrix not of full rank. The following coefficients not estimable:
                    genderMatient3 genderMatient4"
                    Last edited by younko; 08-31-2014, 11:18 PM.

                    Comment

                    • younko
                      Member
                      • May 2014
                      • 24

                      #11
                      Oh.. I got it.. It seems that we posted at the same time..
                      I have updated my posting.. but I got your points!!!!!!!!!

                      I really appreciate your helps!!!!!!!!!!!!!!!! Thank you SO MUCH!!

                      I have learned a lot through this thread!! Again, thank you so much!

                      Youn
                      Last edited by younko; 08-31-2014, 11:29 PM.

                      Comment

                      Latest Articles

                      Collapse

                      • SEQadmin2
                        Nine Things a Sample Prep Scientist Thinks About Before Sequencing
                        by SEQadmin2


                        I’m not a sequencing expert. I’m a purification scientist who uses NGS to evaluate workflows my group develops. With this perspective, we think about the sample first and the NGS workflow second. The sequencer is an exceptionally honest reporter, but it can only report on what you give it, so whether you get clean, interpretable data from an NGS workflow is largely determined before you begin.

                        Here are nine questions we think about, in roughly the order they matter, before...
                        06-18-2026, 07:11 AM
                      • SEQadmin2
                        From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                        by SEQadmin2


                        Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                        The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                        ...
                        06-02-2026, 10:05 AM

                      ad_right_rmr

                      Collapse

                      News

                      Collapse

                      Topics Statistics Last Post
                      Started by SEQadmin2, 06-26-2026, 11:10 AM
                      0 responses
                      13 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-17-2026, 06:09 AM
                      0 responses
                      48 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-09-2026, 11:58 AM
                      0 responses
                      107 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-05-2026, 10:09 AM
                      0 responses
                      125 views
                      0 reactions
                      Last Post SEQadmin2  
                      Working...