  • DESeq: "NA" generated in the resulted differentially expressed genes

    I am using DESeq to analyze my RNA-seq data. However, the list of differentially expressed genes it generates contains a bunch of "NA" rows. Please see the table below for details. The number of "NA" genes differs between comparisons.

          id      baseMean     baseMeanA  baseMeanB    foldChange  log2FoldChange  pval      padj      resVarA  resVarB
    NA    NA      NA           NA         NA           NA          NA              NA        NA        NA       NA
    NA.1  NA      NA           NA         NA           NA          NA              NA        NA        NA       NA
    616   GPR128  187.5648803  0          234.4561004  Inf         Inf             1.19E-15  1.16E-12  0        19.90527498


    Has anyone else experienced this before? What could be the problem? Thanks.

  • #2
    Did you check that the same set of gene IDs was used for read counting in every sample (with genes that have no reads recorded as 0)?
    Marco



    • #3
      Yes, the gene names are consistent.

      I found the problem. In my input read count data, some genes have no reads mapped at all, and those genes cause the NA values in the results.



      • #4
        Specifically, you mean there are no reads mapped at all?

        Do you mean that most of the samples within one group have no reads at all, and thus show "0"?

        For example, suppose there are 2 samples per group:

        [group1] sample 1: 0, sample 2: 0
        [group2] sample 1: 1, sample 2: 2

        ?
        Marco



        • #5
          idyll_ty, have you checked your data??
          Marco



          • #6
            Typically, such entries appear in R when subsetting with a conditional expression that may contain or result in NA. Please post your full R code (and the output of sessionInfo()), then we can have a look.
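
A minimal sketch of the behaviour described in this reply, using an invented data frame as a stand-in for a DESeq result table. The column names `id` and `padj` follow the table in the first post; the values and the assumption that some `padj` entries are NA are made up for illustration.

```r
# Toy stand-in for a DESeq result table; the values are invented.
res <- data.frame(
  id   = c("geneA", "geneB", "geneC", "geneD"),
  padj = c(0.01, NA, 0.50, NA),
  stringsAsFactors = FALSE
)

# The comparison returns NA wherever padj is NA.
keep <- res$padj < 0.05
print(keep)

# Indexing rows with a logical NA yields a row filled with NA,
# so the two genes with a missing padj come back as all-NA rows.
resSig <- res[keep, ]
print(resSig)
```

The all-NA rows get row names such as NA and NA.1, which matches the pattern shown in the first post.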



            • #7
              Dear DESeq experts,

              Apologies for continuing this thread - I have an identical problem.

              I am trying to identify differentially expressed genes (DEGs) from a known dataset, and I would like to understand why over 80% of the entries extracted from the counts table via DESeq_1.8.2 have 'NA' values. I have seen similar queries in the forum, but I believe I am using the latest release of DESeq (not a development version). If this issue is fixed in an updated version, please let us know, and how do we load that library in R? Thanks.
              Best,
              sarosh


              My workflow has two steps:


              Part_1 - extract the dataset (pasilla dataset)
              Part_2 - use DESeq library calls to identify DEGs and show the 'NA' values.

              This dataset (countstable.txt) has
              14,470 entries with count information, of which
              ~2,500 entries have a count of 0 for all case replicates


              Code:
              ################################################
              #
              #Part_1- extract a dataset
              #
              
              rm(list = ls());
              
              #require(DESeq);
              require(pasilla);
              
              data("pasillaGenes");
              
              head(counts(pasillaGenes));
              
              #save_data to view and contrast
              write.table(counts(pasillaGenes), file="countstable.txt", quote=FALSE, sep="  ", row.names=TRUE);

              ################################################

              #edit countstable.txt - remove header
              #count the number of entries with all counts 0
              # (use grep command ..)
              #start R again


              Code:
              ################################################
              ################################################
              #
              #Part_2
              
              require(DESeq);
              require(pasilla);
              
              countsTable <-read.table("countstable.txt", header=TRUE, stringsAsFactors=TRUE)
              rownames( countsTable ) <- countsTable$gene
              countsTable <- countsTable[,-1]
              conds=c("U","U","U","U","T","T","T");
              
              cds <- newCountDataSet( countsTable, conds);
              cds <-estimateSizeFactors(cds);
              
              #normcds <- counts( cds, normalized=TRUE );
              #write.table(normcds, file="normalized.countstable.txt", quote=FALSE, sep="\t", row.names=TRUE);
              
              cds <- estimateDispersions( cds, sharingMode="fit-only" );
              res <- nbinomTest(cds, "U","T");
              
              resSig <- res[ res$padj < 0.05,];
              resSig <- resSig[ order(resSig$pval), ];
              write.table(resSig, file="DEGsig_list.txt", quote=FALSE, sep="\t", row.names=FALSE);
              
              #############################################
              The final list of DEGs consists mostly of NA entries.



              • #8
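                When you build resSig, the comparison res$padj < 0.05 evaluates to NA for every line of res whose padj is NA (for example, genes with no counts), and R writes a line filled with NA for each of them. Lines that simply fail the cutoff are dropped as expected. The presence of NA in these lines is not a problem.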

                To get rid of them, just use the na.omit function:

                Code:
                resSig<-na.omit(resSig)
                This will omit all lines that have an NA, leaving you only those lines with differentially expressed genes.
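
As a hedged sketch, here are a few base R alternatives that avoid creating the NA rows in the first place, next to the na.omit approach. The data frame is invented for illustration; only the column names `id`, `pval` and `padj` are taken from the thread.

```r
# Toy stand-in for a DESeq result table; the values are invented.
res <- data.frame(
  id   = c("geneA", "geneB", "geneC", "geneD"),
  pval = c(0.001, NA, 0.40, NA),
  padj = c(0.01, NA, 0.50, NA),
  stringsAsFactors = FALSE
)

# Option 1: which() returns only the positions where the test is TRUE.
sig_which <- res[which(res$padj < 0.05), ]

# Option 2: state the NA handling explicitly in the condition.
sig_explicit <- res[!is.na(res$padj) & res$padj < 0.05, ]

# Option 3: subset() treats an NA condition as FALSE.
sig_subset <- subset(res, padj < 0.05)

# Option 4: filter first, then drop the all-NA rows with na.omit().
sig_naomit <- na.omit(res[res$padj < 0.05, ])

print(sig_which)
```

All four keep the same single gene here. One difference to keep in mind: na.omit drops a row if any column is NA, so a gene with a valid padj but a missing value in another column would also be removed.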



                • #9
                  I usually trim out zero-count genes (across all samples) before calling newCountDataSet, like so:

                  Code:
                  mycounts <- mycounts[rowSums(mycounts) > 0,]
                  /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
                  Salk Institute for Biological Studies, La Jolla, CA, USA */
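
A small worked example of this filter on an invented count matrix with two samples per group, mirroring the layout asked about in #4. The gene and sample names are hypothetical; `mycounts` is the name used in the post above.

```r
# Toy count matrix (invented values): two samples per group.
mycounts <- matrix(
  c(0, 0, 0, 0,    # geneA: no reads in any sample
    0, 0, 1, 2,    # geneB: zero in group 1 only
    5, 7, 6, 9),   # geneC: reads in every sample
  ncol = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("g1_s1", "g1_s2", "g2_s1", "g2_s2"))
)

# Flag genes whose counts sum to zero across all samples.
all_zero <- rowSums(mycounts) == 0

# Keep everything else; drop = FALSE preserves the matrix shape
# even if only one gene survives.
filtered <- mycounts[!all_zero, , drop = FALSE]
print(filtered)
```

Only the gene with zero reads in every sample is removed. A gene that is zero in one group but has reads in the other stays in the table.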
