Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • r_sitaram
    Member
    • Apr 2012
    • 10

    DEXSeq output not convincing

    Hello,
    I'm running DEXSeq for finding out the differential exon usage between 3 Wild type and 3 transgenic rat data. I obtained the exon count using the scripts dexseq_count.py and the gff file using dexseq_prepare_annotation.py. After following all the steps given in your vignette,When I run the DEUresultTable command, I get a lot of NA's in the pvalue column and just 1's in the padjust column. Also after executing the estimateDispersions function, the column dispersion_CR_est is showing NULL. I have a feeling that I'm making a mistake in the exoncount data object but I'm not sure. Could someone throw some light on this?

    I could probably highlight the steps(in bold) I have done.

    annotationfile = file.path("/triton/ics/scratch/rnaseq/exoncount/Rattus_norvegicus.RGSC3.4.66_DEXSEQ.gff")

    annotationfile
    [1] "/triton/ics/scratch/rnaseq/exoncount/Rattus_norvegicus.RGSC3.4.66_DEXSEQ.gff"

    samples=data.frame(condition=c(rep("wild",3),rep("trans",3)),replicate=c(1,1,1,1,1,1),stringsAsFactors = TRUE,check.names = FALSE)
    row.names(samples) <- c("01_5b_WT.txt","02_6b_WT.txt","03_15b_WT.txt","04_4b_TG.txt","05_7b_TG.txt","06_16b_TG.txt")


    samples
    condition replicate
    01_5b_WT.txt wild 1
    02_6b_WT.txt wild 1
    03_15b_WT.txt wild 1
    04_4b_TG.txt trans 1
    05_7b_TG.txt trans 1
    06_16b_TG.txt trans 1

    countfiles = file.path("/triton/ics/scratch/rnaseq/exoncount",paste(row.names(samples))

    countfiles
    [1] "/triton/ics/scratch/rnaseq/exoncount/01_5b_WT.txt"
    [2] "/triton/ics/scratch/rnaseq/exoncount/02_6b_WT.txt"
    [3] "/triton/ics/scratch/rnaseq/exoncount/03_15b_WT.txt"
    [4] "/triton/ics/scratch/rnaseq/exoncount/04_4b_TG.txt"
    [5] "/triton/ics/scratch/rnaseq/exoncount/05_7b_TG.txt"
    [6] "/triton/ics/scratch/rnaseq/exoncount/06_16b_TG.txt"

    ecs = read.HTSeqCounts(countfiles,design=samples,flattenedfile=annotationfile)

    ExonCountSet (storageMode: environment)
    assayData: 236327 features, 6 samples
    element names: counts
    protocolData: none
    phenoData
    sampleNames: 01_5b_WT.txt 02_6b_WT.txt ... 06_16b_TG.txt (6 total)
    varLabels: sizeFactor condition replicate countfiles
    varMetadata: labelDescription
    featureData
    featureNames: ENSRNOG00000000001:001 ENSRNOG00000000001:002 ...
    ENSRNOG00000045521:001 (236327 total)
    fvarLabels: geneID exonID ... transcripts (13 total)
    fvarMetadata: labelDescription
    experimentData: use 'experimentData(object)'
    Annotation:

    sampleNames(ecs) = row.names(samples)

    ecs <- estimateSizeFactors(ecs)


    sizeFactors(ecs)
    01_5b_WT.txt 02_6b_WT.txt 03_15b_WT.txt 04_4b_TG.txt 05_7b_TG.txt
    1.6794610 0.5407073 0.8215439 1.7972423 0.8516481
    06_16b_TG.txt
    0.9878676

    ecs <- estimateDispersions(ecs)


    featureData(ecs)$dispersion_CR_est
    NULL

    ecs <- fitDispersionFunction(ecs)

    ecs <- testForDEU(ecs)

    res <- DEUresultTable(ecs)
  • ngsAnalytics
    Junior Member
    • Apr 2012
    • 3

    #2
    Hi,

    I would start by checking that the ExonContSet (ecs) was built correctly.

    A couple of things to try

    design(ecs) - This should match what you intended.

    sampleNames(ecs) - the row names of the design output should match this

    table(geneIDs(ecs)) will show you how many genes are in the set.

    you can also try head(modelFrameforGene(ecs, "Pick some gene" )) of course replace the Pick some gene with the name of a gene from your genome.
    This will show if something for counts was read in.

    Also normally the values for replicate would be 1,2, and 3 for each sample type. I don't think having them all set to one will cause a problem, but I have never tried it.



    Craig

    Comment

    • r_sitaram
      Member
      • Apr 2012
      • 10

      #3
      Thank you Craig. I will look into it immediately and get back here.

      Comment

      • Simon Anders
        Senior Member
        • Feb 2010
        • 995

        #4
        First of all: The command "featureData(ecs)$dispersion_CR_est" did not work because you are reading an outdated manual. The slot "dispersion_CR_est" is now called "dispBeforeSharing".

        Use the command "openVignette()" to get documentation that fits your installed package versions.

        In the current manual, you will see instructions to plot the estimated dispersions against the mean count value. Have a look, maybe your data is simply to noisy to get any results. (If you unsure how to interpret the plot, just post it here.)

        Also have a look at the count values plot for few genes (with plotDEXSeq(ecs, "my_favourite_geneID", norCounts = TRUE)). Does it look like you should detect something?

        Comment

        • r_sitaram
          Member
          • Apr 2012
          • 10

          #5
          Hello Craig,
          I checked my dataset with the commands that you mentioned. Everything looks fine. The counts were also read. I just did one change. I got information that my data had 3 replicates. So I re-ran the process by replacing the 1's with 1,2 and 3 for wild and 1,2 and 3 for trans.

          Comment

          • r_sitaram
            Member
            • Apr 2012
            • 10

            #6
            Hello Simon,
            Thanks for that. I downloaded the new vignette and proceeded as stated. Here is the plot between mean count value and dispersion.

            Could you interpret the plot ?

            I selected a particular gene for testing purpose. That gene coincidentally had data so I could visualise using the plotDEXSeq function. I'm trying to verify the result with the exon count to be sure that the output is valid.

            Comment

            • r_sitaram
              Member
              • Apr 2012
              • 10

              #7
              It looks like the image is not rendering. Here is it.
              Attached Files

              Comment

              • r_sitaram
                Member
                • Apr 2012
                • 10

                #8
                Hello Simon. I'm not sure if you received my other posts here. But I proceeded with my analysis and I have arrived at some results for which I need your help in the analysis. I sorted the dexseq output according to p value and visualised the exons.

                Gene ID: ENSRNOG00000017466
                Exon : E016
                p Value : 0.000166557183275073

                For the above mentioned gene, the fitted expression did not help me conclude anything so I went for the fitted splicing. I see that a lot of exons like E016,E017 and so on show similar results but none of their p values are even close to E016. In such a case, how can I make the required conclusion? Am I missing something in the analysis ? Is there another way to look at it?
                Attached Files

                Comment

                • Simon Anders
                  Senior Member
                  • Feb 2010
                  • 995

                  #9
                  Note how one of your three wild-type samples looks completely different than the other two. It seems that the whole left half is overrepresented. Hence, DEXSeq considers these exons as too noisy to detect anything. I'm a bit surprised that E016 got a low p value, as it does not look that much better.

                  Is this the gene with the strongest effect that you got? Then, it were justified if you got no significant hits.

                  Check a few more genes to see if it is always one sample that behaves strangely, and if so, if it is always the same. (You can instruct the plotting function to use different colours for the replicates.) If it is one sample that is behaving oddly, you may want to see what happens if you repeat the analysis without it.

                  Comment

                  • r_sitaram
                    Member
                    • Apr 2012
                    • 10

                    #10
                    Hello Simon,
                    This is the third best gene available. I checked the top 5 and I see this behaviour very distinct only in 2 of them. I’m attaching the pdf of that for your reference. The remaining do not highlight a lot. I should also mention that all the adjusted p values were 1. I also read another post of yours connected with DESeq where you had mentioned about a correlation between high variance and the output. I have posted a graph showing the connection between mean value and dispersion in reply #7. Does the graph hint anything? I'm not able to interpret that one very well.
                    Attached Files

                    Comment

                    • r_sitaram
                      Member
                      • Apr 2012
                      • 10

                      #11
                      I forgot to mention the p values for those 5 genes. They are as follows.
                      ENSRNOG00000005330 E035 0.000108786
                      ENSRNOG00000009864 E005 0.000152295
                      ENSRNOG00000017466 E016 0.000166557
                      ENSRNOG00000005434 E046 0.000262485
                      ENSRNOG00000032391 E004 0.00050834

                      Comment

                      Latest Articles

                      Collapse

                      • SEQadmin2
                        From Collection to Sequencing: Why Sample Preparation and Preservation Define Sequencing Data
                        by SEQadmin2


                        Data variability is still an issue in sequencing technologies despite the advances in reproducibility and accuracy of these platforms. But the problem does not originate in the sequencing itself, but in the previous steps, before the sample reaches the sequencer.


                        The first step is collection, followed by preservation and sample preparation for analysis. Most scientists overlook those steps, but not being careful might just be skewing the experiment’s results.
                        ...
                        06-02-2026, 10:05 AM
                      • SEQadmin2
                        Single-Cell Sequencing at an Inflection Point: Early Impacts of New Platforms and Emerging Trends
                        by SEQadmin2


                        With the launch of new single-cell sequencing platforms in 2026, the field stands at an exciting inflection point. This article surveys the most impactful advances in the field and discusses how they’re reshaping research in cancer, immunology, and beyond.


                        Introduction

                        Single-cell sequencing technologies have undergone remarkable advances over the past decade, transitioning from low-throughput experimental approaches to highly scalable platforms capable of...
                        05-22-2026, 06:42 AM
                      • SEQadmin2
                        Environmental Genomics in the Age of NGS: From Microbes to Conservation Strategies
                        by SEQadmin2

                        Studying ecosystems means dealing with complex, multi-species communities that are hard to observe at scale. This complexity, however, hides many important questions to be answered, from how biogeochemical cycles work and how climate change can affect species distribution to how conservation strategies can work best.


                        Genomics, particularly since the expansion of NGS, has transformed ecosystem ecology. By sequencing environmental DNA, we can now assess biodiversity without direct...
                        05-06-2026, 09:04 AM

                      ad_right_rmr

                      Collapse

                      News

                      Collapse

                      Topics Statistics Last Post
                      Started by SEQadmin2, 06-02-2026, 12:03 PM
                      0 responses
                      19 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-02-2026, 11:40 AM
                      0 responses
                      14 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 05-28-2026, 11:40 AM
                      0 responses
                      29 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 05-26-2026, 10:12 AM
                      0 responses
                      31 views
                      0 reactions
                      Last Post SEQadmin2  
                      Working...