  • DESeq Count Tables Prep

    Hi,

    I'm new to almost all of this - R, DESeq, bioinformatics etc... (obviously a bench biologist).

    I've been given a count table from an alignment run with a custom Perl script; it is a tab-delimited text file. The header includes the ContigID title and each sample name, and each row contains the contig number and the counts for each sample.

    How do I use this as the count file for DESeq? Can I import it as a tab-delimited txt file, or do I have to convert it into a format that DESeq can read?

    I just want to be able to get to the point where I can walk through the DESeq vignette on R with my own data.

    Thanks

  • #2
    An addition to my post:

    So I've successfully imported the txt file with read.table() and loaded library(DESeq), and am now trying to get my file to be the countsTable; this is where I'm running into problems. My R code:

    > countsTable <- counts( myFile )
    Error in function (classes, fdef, mtable) :
    unable to find an inherited method for function "counts", for signature "data.frame"

    How do I make the data frame from read.table() (i.e. the imported tab-delimited txt file) act as the countsTable?

    Thanks


    • #3
      Have you tried the newCountDataSet() function? It should work if your file that you imported with read.table() is in the correct format, which it may very well be from your description. You're not supposed to use the counts() function in this case - I think that's just a way to extract counts from a pre-existing CountDataSet object, which you don't have.
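
As a rough illustration of what "correct format" could mean here, the base-R sketch below inspects a data frame before it is handed to newCountDataSet(). The helper name `check_count_table` and the toy data are hypothetical, and the two checks (every column numeric, one condition label per sample column) are assumptions drawn from the advice in this thread, not an official DESeq validation routine.

```r
# Hypothetical helper (not part of DESeq): list reasons a data frame
# may not be usable as count data.
check_count_table <- function(df, conds) {
  problems <- character(0)
  numeric_cols <- vapply(df, is.numeric, logical(1))
  if (!all(numeric_cols)) {
    problems <- c(problems,
                  paste("non-numeric column(s):",
                        paste(names(df)[!numeric_cols], collapse = ", ")))
  }
  if (ncol(df) != length(conds)) {
    problems <- c(problems,
                  sprintf("%d column(s) but %d condition label(s)",
                          ncol(df), length(conds)))
  }
  problems
}

# Toy table where the contig IDs ended up as an ordinary column.
toy <- data.frame(X  = c("contigA", "contigB"),
                  S1 = c(10L, 20L),
                  S2 = c(5L, 0L),
                  stringsAsFactors = FALSE)
conds <- factor(c("S1", "S2"))

print(check_count_table(toy, conds))
```

An empty result means neither check found a problem; it does not guarantee that DESeq will accept the table.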


      • #4
        Ok - so I tried the newCountDataSet() function with the imported read.table() and I received the following error:
        Error in round(countData) : Non-numeric argument to mathematical function

        This is everything that I've input so far:
        > fullmoonDN <- read.table("filelocation",header=TRUE,sep="\t")
        > conds <- factor( c("AM0008","AM0009") )
        > cds <- newCountDataSet( fullmoonDN, conds )
        Error in round(countData) : Non-numeric argument to mathematical function
        > #show the first 5 rows
        > fullmoonDN[1:5,]
                 X AM0008 AM0009
        1 EZ031385   3045   3218
        2 EZ019649  12679   9190
        3 EZ010289    800  28050
        4 EF202590    816   3108
        5 EZ010226    633    459

        Is there something wrong with how my txt file is laid out? I did not have any text input for the first column name, and it automatically put an X (I tried with having the first column name as Contig_ID, and I came up with the same result). I'm not sure what I'm doing wrong here. Thoughts?
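
One way to narrow down an error like this is to check which columns of the imported data frame are numeric. The sketch below rebuilds the five rows shown above from an inline string (an assumption standing in for the real file, whose path is not given) and uses only base R. The exact wording of the error from round() may differ between R versions, so the code captures the message without relying on its text.

```r
# Inline stand-in for the tab-delimited file; the first header field is empty,
# as described in the post.
fullmoon_txt <- paste(
  "\tAM0008\tAM0009",
  "EZ031385\t3045\t3218",
  "EZ019649\t12679\t9190",
  "EZ010289\t800\t28050",
  "EF202590\t816\t3108",
  "EZ010226\t633\t459",
  sep = "\n"
)

fullmoonDN <- read.table(text = fullmoon_txt, header = TRUE, sep = "\t")

# Which columns are numeric? The contig ID column is not.
col_is_numeric <- vapply(fullmoonDN, is.numeric, logical(1))
print(col_is_numeric)

# round() on a data frame with a non-numeric column fails; keep the message.
err <- tryCatch(round(fullmoonDN), error = function(e) conditionMessage(e))
print(err)
```

The non-numeric first column is what the rounding step trips over; the layout of the text file itself is fine.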


        • #5
          Try adding "row.names=1" to the read.table command. That will turn the first column into row names instead of entries in a column. The newCountDataSet() function accepts only numerical values in the input.
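
A minimal sketch of this fix, using the five rows posted earlier as inline text in place of the real file (an assumption, since the actual path is not shown). The DESeq call is guarded so the example still runs where the package is not installed.

```r
# Inline stand-in for the tab-delimited file from the thread.
fullmoon_txt <- paste(
  "\tAM0008\tAM0009",
  "EZ031385\t3045\t3218",
  "EZ019649\t12679\t9190",
  "EZ010289\t800\t28050",
  "EF202590\t816\t3108",
  "EZ010226\t633\t459",
  sep = "\n"
)

# row.names = 1 moves the contig IDs out of the data and into the row names.
fullmoonDN <- read.table(text = fullmoon_txt, header = TRUE, sep = "\t",
                         row.names = 1)

# Every remaining column is numeric, so rounding now succeeds.
stopifnot(all(vapply(fullmoonDN, is.numeric, logical(1))))
rounded <- round(fullmoonDN)
print(head(fullmoonDN))

# Same condition labels as in the thread.
conds <- factor(c("AM0008", "AM0009"))

# Only attempted when DESeq is available.
if (requireNamespace("DESeq", quietly = TRUE)) {
  suppressPackageStartupMessages(library(DESeq))
  cds <- newCountDataSet(fullmoonDN, conds)
  print(head(counts(cds)))
}
```

With a real file, the inline `text = fullmoon_txt` argument would be replaced by the file path, as in the original read.table() call.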


          • #6
            That worked perfectly. Thank you!
