  • b_wes
    replied
    I am receiving the same error message when attempting to create my dds from HTSeqCounts: Error in Ops.factor(a$V1, l[[1]]$V1) :
    level sets of factors are different.

    However, I am absolutely sure that the first column (features) of my htseq-count files is identical across files, and I have deleted any headers. My count files are identical in format to those of former lab members who ran DESeq successfully, and they are all the same length. I am at a loss as to why I still receive this error!


  • ronaldrcutler
    replied
    Good news, I tried it and started getting the right output files! Will let you know if anything else comes up when running DESeq.


  • antoshka
    replied
    Hi, I think you are having the same problem that I originally had when I started using htseq-count: namely using the wrong input to run DESeq.
    htseq-count writes the count table directly to STDOUT, while the "-o" option creates an additional SAM file, which in most cases you won't need.
    If you want htseq-count to write a separate text file for your counts, you can use "> yourfilename.txt".
    So you should change "-o" to ">" in your script.
    Let me know if it worked.
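
A minimal sketch of the change described above, assuming a POSIX shell. The `-f bam -s reverse -i Name` options are copied from the command quoted elsewhere in this thread; the function name `count_to_file` and its three arguments are hypothetical and only there for illustration.

```shell
# Hypothetical helper: run htseq-count on one BAM file and save the count table.
# Usage: count_to_file <alignments.bam> <annotation.gff3> <counts.txt>
count_to_file() {
    bam="$1"
    gff="$2"
    out="$3"
    # No -o option: the count table goes to STDOUT and ">" captures it.
    htseq-count -f bam -s reverse -i Name "$bam" "$gff" > "$out"
}
```

Called as `count_to_file sample.bam annotation.gff3 sample_counts.txt`, this should leave a two-column table (feature ID, count) in `sample_counts.txt`, which is the kind of file DESeqDataSetFromHTSeqCount expects.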


  • ronaldrcutler
    replied
    The input files were generated using HTSeq with this command line:
    Code:
    htseq-count -f bam -s reverse -i Name -o {0}_htseq_out.txt {1} /Volumes/cachannel/RNA_SEQ/Notch_RNASeq/9.1_Reference_Files/XENLA_UTAmayball_cdna_longest_CHRS2.gff3'.format(files, files)
    The txt file looks weird compared to some of my colleagues' previous runs. Mine is around 600 MB and looks something like:
    Code:
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    	XF:Z:__no_feature
    	XF:Z:__alignment_not_unique
    	XF:Z:__alignment_not_unique
    Theirs is around 3 MB and looks like this (which is what I know I should be getting):
    Code:
    AAGAB|c.Audic201207_X025945|JGIv7b.000058049_4975593-4992662-__chr3L	27
    AAGAB|c.Park201106_X000169|JGIv7a.000035880_844976-861439-__chr3L	0
    AAGAB|c.Taira201203egg_X008072|NIGv2.S00000107_247362-264101-__chr3L	0
    AAGAB|c.Ueno201210kidney_X002041|NIGv2.S00001669_498925-515626-__chr3L	0
    AAGAB|c.UniGene_Xl_S20337254|JGIv7b.000036401_3602075-3619253+__chr3S	424
    This leads me to believe it may be more of a problem with HT-Seq...
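
The two listings above differ in shape: the first has an `XF:Z:` tag and no count on each line, while the second has a feature ID, a tab, and an integer. A rough way to tell them apart before handing a file to DESeq2 might look like the sketch below. It assumes tab-separated, two-column count tables; the function name `is_count_table` is hypothetical.

```shell
# Hypothetical check: succeed only if every line is "<feature ID><TAB><integer count>".
# Usage: is_count_table <file>
is_count_table() {
    awk -F '\t' '
        # Any line without exactly two fields, or with a non-integer count, fails the check.
        NF != 2 || $2 !~ /^[0-9]+$/ { bad = 1; exit }
        # An empty file is not a usable count table either.
        END { exit (bad || NR == 0) ? 1 : 0 }
    ' "$1"
}
```

For example, `is_count_table merged_sample_2.bam_htseq_out.txt || echo "not a count table"` would flag a file that consists of `XF:Z:` lines like the first listing.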


  • antoshka
    replied
    I think the error you describe may be due to a mismatch between the count tables that you provided (e.g. different numbers of rows, non-unique rows, typos).
    How did you generate your input files? What do your count txt files look like?
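
One way to look for such mismatches is to compare the feature-ID column of two count files line by line. This is only a sketch, assuming tab-separated files with the ID in the first column; the function name `compare_ids` is hypothetical.

```shell
# Hypothetical helper: compare the feature-ID column (column 1) of two count files.
# Prints a diff of the IDs and returns non-zero if they differ.
# Usage: compare_ids <counts_a.txt> <counts_b.txt>
compare_ids() {
    ids_a=$(mktemp)
    ids_b=$(mktemp)
    cut -f 1 "$1" > "$ids_a"
    cut -f 1 "$2" > "$ids_b"
    status=0
    # diff exits 0 when the ID columns match line for line, 1 when they do not.
    diff "$ids_a" "$ids_b" || status=$?
    rm -f "$ids_a" "$ids_b"
    return "$status"
}
```

Any line that diff prints points at a missing row, an extra row, or an ID that is spelled differently in one of the two files.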


  • ronaldrcutler
    replied
    DESeqDataSet creation error

    Hi, I'm new to DESeq.

    I am having a problem similar to the one discussed here. This is the error:
    Code:
    Error in Ops.factor(a$V1, l[[1]]$V1) : 
      level sets of factors are different
    In addition: Warning message:
    In is.na(e1) | is.na(e2) :
      longer object length is not a multiple of shorter object length
    This is the script I am using:
    Code:
    library("DESeq2")

    # htseq-count output files, one per sample
    files <- c("merged_sample_2.bam_htseq_out.txt",
               "merged_sample_11.bam_htseq_out.txt",
               "merged_sample_20.bam_htseq_out.txt",
               "merged_sample_3.bam_htseq_out.txt",
               "merged_sample_12.bam_htseq_out.txt",
               "merged_sample_21.bam_htseq_out.txt")

    # Condition of each sample, in the same order as the files
    cond <- c("GFP", "GFP", "GFP", "DBM", "DBM", "DBM")

    sTable <- data.frame(sampleName = files, fileName = files, condition = cond)

    dds <- DESeqDataSetFromHTSeqCount(sampleTable = sTable, directory = "/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/DESeq", design = ~condition)
    I also tried running this code from the command line as mentioned above:
    Code:
    cut -f merged_sample_2.bam_htseq_out.txt | sort | uniq -c
    But got this error:
    Code:
    cut: [-cf] list: illegal list value
    Any help would be appreciated. Thanks!
    Last edited by ronaldrcutler; 05-09-2016, 07:35 PM.
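
The `cut` error in this post comes from `-f` being given no field list, so the file name is read as the list. A sketch of a corrected check follows, assuming the goal is to find feature IDs that appear more than once in the first column; the function name `duplicated_ids` is hypothetical.

```shell
# Hypothetical helper: list feature IDs that occur more than once in column 1.
# Usage: duplicated_ids <counts.txt>
duplicated_ids() {
    # -f 1 selects the first tab-separated field; uniq -d keeps only repeated IDs.
    cut -f 1 "$1" | sort | uniq -d
}
```

An empty result means every ID in the file is unique. Replacing `uniq -d` with `uniq -c` would instead print a tally for every ID.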


  • antoshka
    replied
    Originally posted by pm2012
    Thanks a lot for the help. It was indeed a problem with my count files. I didn't realize I had to redirect the output of HTSeq into a separate file; I was using the file generated with the -o option as input.
    I reran the script and was able to generate the correct file (I also filtered out the last few lines starting with __). The rest of the code seems to be working well now.
    I also got rid of the last column in sampleTable. It was just one of the many things I tried while solving my issue.
    Hello pm2012,
    I am having the same problem that you had back then.
    I also just used the file produced by -o option and got the same error message.
    How exactly did you redirect your output file to make it compatible with DESeq2?
    Thanks
    Last edited by antoshka; 05-09-2016, 07:43 PM. Reason: typo
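
The filtering pm2012 mentions in the quoted post (removing the trailing lines that start with `__`) could be done roughly as below. This is a sketch, not necessarily the exact command pm2012 used; the function name `drop_summary_rows` is hypothetical.

```shell
# Hypothetical helper: drop htseq-count summary rows (IDs starting with "__").
# Usage: drop_summary_rows <raw_counts.txt> <filtered_counts.txt>
drop_summary_rows() {
    # grep exits 1 when nothing is left to print, so that case is treated as success too.
    grep -v '^__' "$1" > "$2" || [ $? -eq 1 ]
}
```

Rows such as `__no_feature` and `__alignment_not_unique` are removed, and every other row is copied unchanged.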


  • Michael Love
    replied
    Thank you, Devon. Good to know.


  • dpryan
    replied
    Just to keep the group in the loop, there ended up being two problems. The error message posted here was due to an apparent typo in one of the count files. Fixing that solved that problem. There was an additional issue due to a header line having been added (I don't know if this was done by htseq-count or not, I should have asked). Removing that allowed for the creation of a proper DESeqDataSet object.
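
Removing a header line like the one described here can be sketched as follows. The rule used to detect the header (the last field of line 1 is not an integer) is an assumption, and the function name `strip_header` is hypothetical.

```shell
# Hypothetical helper: remove a leading header line from a count file.
# The first line is treated as a header when its last field is not an integer.
# Usage: strip_header <counts_with_header.txt> <counts.txt>
strip_header() {
    awk 'NR == 1 && $NF !~ /^[0-9]+$/ { next } { print }' "$1" > "$2"
}
```

A file that starts with a line such as `gene reads_WR1` loses that line, while a file that already starts with a count row is copied as it is.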


  • dpryan
    replied
    Sure, at least as long as those 2 files are sufficient to cause the problem. You can email me at [email protected].


  • essepf
    replied
    Hi dpryan

    I can send you 2 htseq-count output files by email, if that works.
    Can you give me your email address?


  • dpryan
    replied
    You might just post those files somewhere so we can reproduce and track down the cause of this problem.


  • essepf
    replied
    Hi pm2012

    Thanks for your reply.

    The output files I have are in this format:

    gene reads_WR1
    610005C13Rik 2473
    0610007N19Rik 15
    0610007P14Rik 1291
    0610008F07Rik 149
    0610009B14Rik 0
    0610009B22Rik 361
    0610009D07Rik 272
    0610009E02Rik 4
    0610009L18Rik 8

    When you say "filtered", what do you mean?

    The command I used to generate the counts:

    samtools view file.bam | htseq-count -s no -i gene_name - mus_musculus.gff > WT_results_counts.txt

    Thank you for your help


  • pm2012
    replied
    Did you check the count files generated by HTSeq? I had an issue with the count file itself, which is why I was getting the error. The count files need to be filtered. See my previous reply in the thread above.


  • essepf
    replied
    Hi

    this is my script:

    library("DESeq2")
    sampleFiles <- list.files(path="/Users/me/Desktop/RNASeq/htseq-count_Results_6Samples/htseq_Adp/")
    sampleCondition=factor(c(rep("pr",3), rep("wt",3)))
    sampleTable=data.frame(sampleName=sampleFiles, fileName=sampleFiles,condition=sampleCondition)
    directory <- c("/Users/me/Desktop/RNASeq/htseq-count_Results_6Samples/htseq_Adp/")
    des <- formula(~ condition)
    ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)

    If I do what you suggest, I get exactly the same error.

    > ddsHTSeq <- DESeqDataSetFromHTSeqCount(sampleTable = sampleTable, directory = directory, design = des)
    Error in Ops.factor(a$V1, l[[1]]$V1) :
    level sets of factors are different

    Thank you
