  • sazz
    Member
    • Oct 2012
    • 28

    Error at Creating Count Table for DESeq2

    I have used the TopHat-Cuffdiff pipeline so far, but I want to give DESeq2 a try. I have 2 conditions with 3 replicates each, and the aim is to find the differentially expressed genes.

    For a couple of days I have been trying to use HTSeq to prepare my count files. I think I managed that, but now I am stuck at creating the count table that DESeq2 takes as input.

    I haven't used R much so far, so I am having difficulties. Here is the problem:

    Code:
    > library('DESeq2')
    Loading required package: GenomicRanges
    Loading required package: BiocGenerics
    Loading required package: parallel
    
    Attaching package: ‘BiocGenerics’
    
    The following objects are masked from ‘package:parallel’:
    
        clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport, clusterMap, parApply, parCapply,
        parLapply, parLapplyLB, parRapply, parSapply, parSapplyLB
    
    The following object is masked from ‘package:stats’:
    
        xtabs
    
    The following objects are masked from ‘package:base’:
    
        anyDuplicated, append, as.data.frame, as.vector, cbind, colnames, duplicated, eval, evalq, Filter, Find,
        get, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
        pmin.int, Position, rank, rbind, Reduce, rep.int, rownames, sapply, setdiff, sort, table, tapply, union,
        unique, unlist
    
    Loading required package: IRanges
    Loading required package: XVector
    Loading required package: Rcpp
    Loading required package: RcppArmadillo
    
    > setwd("C:/Python27/SKMEL-5")
    > directory<-"C:/Python27/SKMEL-5/ALL"
    > sampleFiles <- grep("SKMEL-5",list.files(directory),value=TRUE)
    > sampleCondition<-c("KD","KD","KD","WT","WT","WT")
    > sampleTable<-data.frame(sampleName=sampleFiles, fileName=sampleFiles, condition=sampleCondition)
    > sampleTable
           sampleName        fileName condition
    1 SKMEL-5_I-1.txt SKMEL-5_I-1.txt        KD
    2 SKMEL-5_I-2.txt SKMEL-5_I-2.txt        KD
    3 SKMEL-5_I-3.txt SKMEL-5_I-3.txt        KD
    4 SKMEL-5_L-1.txt SKMEL-5_L-1.txt        WT
    5 SKMEL-5_L-2.txt SKMEL-5_L-2.txt        WT
    6 SKMEL-5_L-3.txt SKMEL-5_L-3.txt        WT
    > ddsHTSeq<-DESeqDataSetFromHTSeqCount(sampleTable=sampleTable, directory=directory, design=~condition)
    Error in DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory,  : 
      Gene IDs (first column) differ between files.
    In addition: There were 36 warnings (use warnings() to see them)
    Here are the 36 warnings:

    Code:
    Warning messages:
    1: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    2: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    3: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    4: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    5: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    6: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    7: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    8: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    9: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    10: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    11: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    12: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    13: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    14: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    15: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    16: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    17: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    18: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    19: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    20: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    21: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    22: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    23: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    24: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    25: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    26: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    27: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    28: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    29: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    30: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    31: In read.table(file.path(directory, fn)) :
      line 1 appears to contain embedded nulls
    32: In read.table(file.path(directory, fn)) :
      line 2 appears to contain embedded nulls
    33: In read.table(file.path(directory, fn)) :
      line 3 appears to contain embedded nulls
    34: In read.table(file.path(directory, fn)) :
      line 4 appears to contain embedded nulls
    35: In read.table(file.path(directory, fn)) :
      line 5 appears to contain embedded nulls
    36: In scan(file = file, what = what, sep = sep, quote = quote,  ... :
      embedded nul(s) found in input
    Because the error says "Gene IDs (first column) differ between files.", I checked each file. They all have the same number of rows, and I assume the first column is identical in all of them (I used the same GTF file for every sample, so it should be).

    I know the problem is at a very basic stage, but as an R noob I have no clue.
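
The assumption that the first column is identical across files can be tested rather than eyeballed. Below is a minimal sketch in Python (already installed for HTSeq), assuming each count file has been read into a string and uses a tab between gene ID and count. The helper names `gene_ids` and `first_mismatch` are hypothetical and not part of HTSeq or DESeq2.

```python
def gene_ids(text):
    """Return the first tab-separated field of every non-empty line."""
    return [line.split("\t", 1)[0] for line in text.splitlines() if line]


def first_mismatch(reference, other):
    """Compare the gene ID columns of two count files given as strings.

    Returns (line_number, reference_id, other_id) for the first line whose
    IDs differ, using None for an ID that is missing because one file is
    shorter. Returns None when the two ID columns are identical.
    """
    ref_ids, other_ids = gene_ids(reference), gene_ids(other)
    for number, (a, b) in enumerate(zip(ref_ids, other_ids), start=1):
        if a != b:
            return (number, a, b)
    if len(ref_ids) != len(other_ids):
        shorter = min(len(ref_ids), len(other_ids))
        longer_ids = ref_ids if len(ref_ids) > len(other_ids) else other_ids
        extra = longer_ids[shorter]
        # Report the first ID that only the longer file contains.
        if longer_ids is ref_ids:
            return (shorter + 1, extra, None)
        return (shorter + 1, None, extra)
    return None
```

To use it, read the first sample file as the reference and compare every other sample against it. A result other than `None` points at the first line worth opening in an editor.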
    Last edited by sazz; 03-23-2014, 04:27 AM.
  • sazz
    Member
    • Oct 2012
    • 28

    #2
    Solved: my files were not in tab-delimited format :/
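
A quick way to confirm whether a count file really is tab-delimited is to check every line for the shape "ID, tab, integer". This is a minimal sketch in Python, assuming the file content is already in a string; `malformed_lines` is a hypothetical helper name, not an HTSeq or DESeq2 function.

```python
def malformed_lines(text):
    """Return (line_number, line) pairs that are not '<id><TAB><integer>'."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split("\t")
        # A valid line has exactly two fields and a non-negative integer count.
        if len(fields) != 2 or not fields[1].strip().isdigit():
            bad.append((number, line))
    return bad
```

An empty list means every line has the expected two-column layout. Any entries returned show exactly which lines were separated by spaces or something else.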


    • angus878
      Junior Member
      • Nov 2014
      • 1

      #3
      Additional answer

      I had the same issue and found your post helpful. To solve it, I opened the file in Notepad and changed the encoding from Unicode to ANSI, after which it imported cleanly into R.
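
For more than a handful of files, the same re-encoding can be scripted. What Notepad calls "Unicode" is UTF-16, in which every ASCII character is stored with an accompanying zero byte. That would be consistent with the "embedded nulls" warnings in the first post. The sketch below is a minimal Python version under that assumption. The names `to_plain_text` and `convert_file` are hypothetical, and it writes UTF-8 rather than ANSI, which yields the same bytes as long as the gene IDs and counts are plain ASCII.

```python
import codecs
from pathlib import Path


def to_plain_text(raw):
    """Decode file bytes that may be UTF-16 (Notepad's "Unicode") to str."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The byte order mark selects the byte order and is dropped.
        text = raw.decode("utf-16")
    elif raw.startswith(codecs.BOM_UTF8):
        text = raw.decode("utf-8-sig")
    else:
        # Assumption: anything without a byte order mark is already UTF-8/ASCII.
        text = raw.decode("utf-8")
    # Normalize Windows line endings so all files look alike.
    return text.replace("\r\n", "\n")


def convert_file(src, dst):
    """Rewrite src as UTF-8 without a byte order mark at dst."""
    Path(dst).write_bytes(to_plain_text(Path(src).read_bytes()).encode("utf-8"))
```

Writing to a new path instead of overwriting keeps the original HTSeq output untouched in case the conversion needs to be redone.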
