  • DESeq not recognizing row.names

    Hi,

    I'm trying to use DESeq to find the differentially expressed genes in my datasets, but DESeq does not seem to recognize my row.names, so I can't create my cds.

    My .csv input file looks like:

    Code:
    transcript_id,C4,CRL_2APR10,CRL_1_15JUL11,CRL_2_15JUL11 
    comp1000201_c0_seq1,5.00,0.00,0.00,0.00
    comp1000297_c0_seq1,7.00,0.00,0.00,0.00
    comp100036_c0_seq1,0.00,0.00,0.00,0.00
    comp10003_c1_seq1,2.00,0.00,0.00,0.00
    comp100041_c0_seq1,3.00,0.00,0.00,0.00
    comp100041_c0_seq2,0.00,0.00,0.00,0.00
    comp100041_c0_seq3,0.00,0.00,0.00,0.00
    comp100051_c0_seq1,0.00,0.00,0.00,0.00
    comp1000890_c0_seq1,3.00,0.00,0.00,0.00
    This is what I'm running:

    Code:
    > spercysts_vs_embryos = read.csv (
    +   file.choose(), 
    +   header = TRUE, 
    +   row.names=1, 
    +   sep = ",", 
    +   dec = ".")
    
    > head(spercysts_vs_embryos)
                        C4 CRL_2APR10 CRL_1_15JUL11 CRL_2_15JUL11
    comp1000201_c0_seq1  5          0             0             0
    comp1000297_c0_seq1  7          0             0             0
    comp100036_c0_seq1   0          0             0             0
    comp10003_c1_seq1    2          0             0             0
    comp100041_c0_seq1   3          0             0             0
    comp100041_c0_seq2   0          0             0             0
    
    > cond = factor(c("SP", "SP", "EB", "EB"))
    
    > spercysts_vs_embryosDesign = data.frame(
    +   row.names = colnames( spercysts_vs_embryos ), 
    +   condition = c( "SP", "SP", "EB", "EB" ), 
    +   libType = c( "paired-end", "paired-end", "paired-end", "paired-end" ) )
    > spercysts_vs_embryosDesign
                  condition    libType
    C4                   SP paired-end
    CRL_2APR10           SP paired-end
    CRL_1_15JUL11        EB paired-end
    CRL_2_15JUL11        EB paired-end
    
    > str(spercysts_vs_embryos)
    'data.frame':	307048 obs. of  4 variables:
     $ C4           : num  5 7 0 2 3 0 0 0 3 0 ...
     $ CRL_2APR10   : num  0 0 0 0 0 0 0 0 0 0 ...
     $ CRL_1_15JUL11: num  0 0 0 0 0 0 0 0 0 10 ...
     $ CRL_2_15JUL11: num  0 0 0 0 0 0 0 0 0 3 ...
    So, everything looks fine to me. But when I try to create my cds:

    Code:
    > cds <-newCountDataSet(spercysts_vs_embryos, cond )
    Error in newCountDataSet(spercysts_vs_embryos, cond) : 
      The countData is not integer.
    So I checked for NA values, but there are none:

    Code:
    > which( is.na(spercysts_vs_embryos), arr.ind=TRUE )
         row col
    Any suggestions???
    Thanks!

  • #2
    Looks like there are some non-integer values. Have you tried tail(spercysts_vs_embryos)? Once I had some non-integers in the tail.
    pbseq
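
Since tail() only shows the last six rows, and the is.na() check in the first post only finds missing values, a full scan for non-integer cells may be more reliable. A minimal sketch, assuming a small made-up data frame `toy_counts` (hypothetical values, not the poster's data), that lists every cell whose value is not a whole number:

```r
# Hypothetical toy data standing in for the real count table
toy_counts <- data.frame(
  C4         = c(5, 7.25, 0),
  CRL_2APR10 = c(0, 0, 1.5),
  row.names  = c("comp1", "comp2", "comp3")
)

# TRUE wherever a value differs from its rounded version
not_whole <- as.matrix(toy_counts) != round(as.matrix(toy_counts))

# Row/column positions of the offending cells
bad_cells <- which(not_whole, arr.ind = TRUE)
print(bad_cells)

# Names of the rows that contain at least one non-integer value
bad_rows <- rownames(toy_counts)[rowSums(not_whole) > 0]
print(bad_rows)
```

On the real data, the same two steps could be run on spercysts_vs_embryos instead of the toy table.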


    • #3
      Please do not crosspost the same question simultaneously in two forums (SeqAnswers and Bioconductor mailing list).


      • #4
        Originally posted by Simon Anders
        Please do not crosspost the same question simultaneously in two forums (SeqAnswers and Bioconductor mailing list).

        Sorry Simon, I was desperate...


        • #5
          Originally posted by pbseq
          Looks like there are some non-integer values. Have you tried tail(spercysts_vs_embryos)? Once I had some non-integers in the tail.
          pbseq

          > tail(spercysts_vs_embryos)
                             C4 CRL_2APR10 CRL_1_15JUL11 CRL_2_15JUL11
          comp99965_c0_seq1   3          0            11             0
          comp99972_c0_seq1   0          0            22             0
          comp99988_c0_seq2   0          0             0             0
          comp99995_c0_seq1   2          0             0             0
          comp999991_c0_seq1  3          0             9             0
          comp99999_c0_seq1   5          0             0             0


          • #6
            Hi,

            Run colSums(spercysts_vs_embryos) to see if there are any decimal values
            in the sums.

            Thanks
            --
            Muthu


            • #7
              Originally posted by muthu545
              Hi,

              Run colSums(spercysts_vs_embryos) to see if there are any decimal values
              in the sums.

              Thanks
              --
              Muthu

              > colSums(spercysts_vs_embryos)
                         C4    CRL_2APR10 CRL_1_15JUL11 CRL_2_15JUL11
                   17856472       4152157      27308366       3531719
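
Whole-looking sums do not necessarily rule out decimals: R prints 7 significant digits by default, so a fractional part on an eight-digit total is not displayed, which may explain the output above. A small sketch with made-up numbers (the vector `toy_sums` is hypothetical) that shows the effect and tests the fractional part directly:

```r
# Hypothetical column sums: the first one has a fractional part
toy_sums <- c(C4 = 17856472.53, CRL_2APR10 = 4152157)

# Default printing uses 7 significant digits, so the decimals are not shown
print(toy_sums)
shown <- format(toy_sums[["C4"]])

# Testing the remainder reveals which sums are not whole numbers
has_fraction <- toy_sums %% 1 != 0
print(has_fraction)

# Asking for more digits makes the hidden decimals visible
print(toy_sums, digits = 12)
```

On the real data, `colSums(spercysts_vs_embryos) %% 1 != 0` would be the equivalent check. Even then, fractional parts can in principle cancel out in a sum, so a cell-by-cell check is the safer test.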


              • #8
                Hi,

                Could you attach the CSV file (if you do not mind)
                so that we can replicate the problem you are encountering?

                Thanks
                --
                Muthu


                • #9
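                  It's not very polished, but I'd try:
                  new_DF = data.frame(cbind(as.integer(spercysts_vs_embryos[,1]), as.integer(spercysts_vs_embryos[,2]), as.integer(spercysts_vs_embryos[,3]), as.integer(spercysts_vs_embryos[,4])))

                  Then, to get back the proper colnames and row names:
                  colnames(new_DF) = colnames(spercysts_vs_embryos)
                  rownames(new_DF) = rownames(spercysts_vs_embryos)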
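
One caveat with as.integer() is that it truncates toward zero (2.9 becomes 2) rather than rounding. An alternative sketch, assuming that rounding to the nearest whole count is acceptable for the analysis (the toy data frame is hypothetical), which converts all columns at once and keeps the row and column names without restoring them by hand:

```r
# Hypothetical toy data with fractional "expected counts"
toy_counts <- data.frame(
  C4         = c(5.0, 2.9, 0.4),
  CRL_2APR10 = c(0.0, 1.6, 10.2),
  row.names  = c("comp1", "comp2", "comp3")
)

# Round to the nearest whole number, keeping the matrix layout
counts_int <- round(as.matrix(toy_counts))

# Switch the storage type to integer; dimnames are preserved
storage.mode(counts_int) <- "integer"

str(counts_int)

# For comparison: as.integer() truncates instead of rounding
truncated <- as.integer(toy_counts$C4)
print(truncated)
```

The resulting integer matrix could then presumably be passed to newCountDataSet() together with cond, as in the first post.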


                  • #10
                    Hi all,

                    I discovered that the problem was that RSEM was generating (for some reason that I cannot explain) decimal numbers in the expected count column, where you are supposed to have only integers... I fixed it with Excel (I know that is not a fancy way, but I didn't know how else to do it).

                    Thanks,
                    alisrpp
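
The same clean-up can probably be done in R instead of Excel. A rough sketch, assuming a CSV laid out like the one in the first post (the inline text and the temporary output file are made up for illustration) and assuming that rounding the RSEM expected counts is acceptable:

```r
# Made-up CSV text in the same layout as the input file from the first post
csv_text <- "transcript_id,C4,CRL_2APR10
comp1,5.00,0.00
comp2,7.43,2.81
comp3,0.00,10.60"

raw_counts <- read.csv(text = csv_text, header = TRUE, row.names = 1)

# Round every column and confirm nothing fractional is left
rounded_counts <- round(raw_counts)
stopifnot(all(rounded_counts == round(rounded_counts)))

# Write the cleaned table to a temporary file (hypothetical output location)
out_file <- tempfile(fileext = ".csv")
write.csv(rounded_counts, out_file)

# Reading it back gives whole numbers only
check <- read.csv(out_file, row.names = 1)
print(check)
```

With the real file, `read.csv(file.choose(), ...)` from the first post would replace the inline text.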
