  • Error at Creating Count Table for DESeq2

    I have used the TopHat-Cuffdiff pipeline so far, but I want to give DESeq2 a try. I have 2 conditions with 3 replicates each, and the aim is to find the differentially expressed genes.

    For a couple of days I have been trying to use HTSeq to prepare my count files. I think I managed that, but now I am stuck at creating the count table that DESeq2 takes as input.

    I haven't used R much so far, so I am having difficulties. Here is the problem:

    > library('DESeq2')
    Loading required package: GenomicRanges
    Loading required package: BiocGenerics
    Loading required package: parallel
    Attaching package: ‘BiocGenerics’
    The following objects are masked from ‘package:parallel’:
        clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply,
        parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
    The following object is masked from ‘package:stats’:
    The following objects are masked from ‘package:base’:
        anyDuplicated, append, as.vector, cbind, colnames, duplicated, eval, evalq, Filter, Find,
        get, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmin, Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply, union,
        unique, unlist
    Loading required package: IRanges
    Loading required package: XVector
    Loading required package: Rcpp
    Loading required package: RcppArmadillo
    > setwd("C:/Python27/SKMEL-5")
    > directory<-"C:/Python27/SKMEL-5/ALL"
    > sampleFiles <- grep("SKMEL-5",list.files(directory),value=TRUE)
    > sampleCondition<-c("KD","KD","KD","WT","WT","WT")
    > sampleTable<-data.frame(sampleName=sampleFiles, fileName=sampleFiles, condition=sampleCondition)
    > sampleTable
           sampleName        fileName condition
    1 SKMEL-5_I-1.txt SKMEL-5_I-1.txt        KD
    2 SKMEL-5_I-2.txt SKMEL-5_I-2.txt        KD
    3 SKMEL-5_I-3.txt SKMEL-5_I-3.txt        KD
    4 SKMEL-5_L-1.txt SKMEL-5_L-1.txt        WT
    5 SKMEL-5_L-2.txt SKMEL-5_L-2.txt        WT
    6 SKMEL-5_L-3.txt SKMEL-5_L-3.txt        WT
    > ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design=~condition)
    Error in DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory,  : 
      Gene IDs (first column) differ between files.
    In addition: There were 36 warnings (use warnings() to see them)
    Here are the 36 warnings:

    Warning messages:
    1: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    2: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    3: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    4: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    5: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    6: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    7: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    8: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    9: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    10: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    11: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    12: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    13: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    14: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    15: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    16: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    17: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    18: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    19: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    20: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    21: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    22: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    23: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    24: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    25: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    26: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    27: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    28: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    29: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    30: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    31: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    32: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    33: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    34: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    35: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    36: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    Because the error says "Gene IDs (first column) differ between files.", I checked each file: all of them have the same number of rows, and I assume the first column is the same in all of them (I used the same GTF file for every sample, so it should be).

    I know the problem is at a very basic stage, but as an R noob I have no clue.
    Last edited by sazz; 03-23-2014, 04:27 AM.
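
The "same first column" assumption can be checked programmatically rather than by eye. The following is only a minimal sketch in base R: `read_gene_ids` and `ids_match` are hypothetical helper names, the temporary files stand in for real htseq-count output, and it assumes two-column, tab-separated files without a header.

```r
# Hypothetical helper: return the first column (gene IDs) of one count file.
read_gene_ids <- function(path) {
  read.table(path, sep = "\t", stringsAsFactors = FALSE,
             quote = "", comment.char = "")[[1]]
}

# Hypothetical helper: TRUE if every file has exactly the same IDs,
# in the same order, as the first file.
ids_match <- function(paths) {
  ids <- lapply(paths, read_gene_ids)
  all(vapply(ids[-1], identical, logical(1), ids[[1]]))
}

# Small stand-in files (made-up gene names and counts, for illustration only).
f1 <- tempfile(fileext = ".txt")
f2 <- tempfile(fileext = ".txt")
f3 <- tempfile(fileext = ".txt")
writeLines(c("geneA\t10", "geneB\t0"), f1)
writeLines(c("geneA\t7",  "geneB\t3"), f2)
writeLines(c("geneA\t7",  "geneC\t3"), f3)  # second ID differs

same_ids  <- ids_match(c(f1, f2))
mixed_ids <- ids_match(c(f1, f3))
```

If a check like this fails on files that look identical in an editor, the difference is likely in how the files are stored (delimiter or encoding) rather than in the gene annotation.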

  • #2
    Solved: my files were not in tab-delimited format :/
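
One rough way to test whether a count file is tab-delimited before handing it to DESeq2 is sketched below. It assumes each line should be exactly "gene ID, tab, count"; `is_tab_delimited` is a hypothetical helper name, and the temporary files are made-up examples.

```r
# Hypothetical helper: TRUE if every line splits into exactly two
# tab-separated fields (gene ID and count).
is_tab_delimited <- function(path) {
  lines <- readLines(path, warn = FALSE)
  n_fields <- lengths(strsplit(lines, "\t", fixed = TRUE))
  length(lines) > 0 && all(n_fields == 2)
}

# Stand-in files: one tab-delimited, one space-delimited.
tab_file   <- tempfile(fileext = ".txt")
space_file <- tempfile(fileext = ".txt")
writeLines(c("geneA\t10", "geneB\t0"), tab_file)
writeLines(c("geneA 10", "geneB 0"), space_file)

tab_ok   <- is_tab_delimited(tab_file)
space_ok <- is_tab_delimited(space_file)
```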


    • #3
      Additional answer

      I had the same issue and found your post helpful. To solve it, I opened the file in Notepad and changed the encoding from Unicode to ANSI; after that it imported cleanly into R.
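
The same conversion can probably be done from within R instead of Notepad. The sketch below assumes that "Unicode" here means UTF-16LE with a byte-order mark (what Notepad writes under that name), which would also fit the "embedded nulls" warnings in the first post. `has_utf16le_bom` and `reencode_utf16le` are hypothetical helper names, and the input file is generated on the fly for illustration.

```r
# Hypothetical helper: TRUE if the file starts with the UTF-16LE
# byte-order mark (bytes FF FE).
has_utf16le_bom <- function(path) {
  identical(readBin(path, what = "raw", n = 2), as.raw(c(0xff, 0xfe)))
}

# Hypothetical helper: read a UTF-16LE file through a re-encoding
# connection and write it back out in the native encoding.
reencode_utf16le <- function(infile, outfile) {
  con <- file(infile, open = "r", encoding = "UTF-16LE")
  on.exit(close(con))
  writeLines(readLines(con, warn = FALSE), outfile)
  invisible(outfile)
}

# Build a stand-in UTF-16LE file with a BOM (made-up gene names and counts).
utf16_file <- tempfile(fileext = ".txt")
payload <- iconv("geneA\t10\ngeneB\t0\n", from = "UTF-8",
                 to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(c(as.raw(c(0xff, 0xfe)), payload), utf16_file)

# Convert only if the BOM is present, then read the cleaned copy.
clean_file <- tempfile(fileext = ".txt")
if (has_utf16le_bom(utf16_file)) reencode_utf16le(utf16_file, clean_file)
counts <- read.table(clean_file, sep = "\t", stringsAsFactors = FALSE)
```

Writing to a new file rather than overwriting the original keeps the htseq-count output intact in case the encoding guess turns out to be wrong.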

