Announcement

Collapse

Welcome to the New Seqanswers!

Welcome to the new Seqanswers! We'd love your feedback, please post any you have to this topic: New Seqanswers Feedback.
See more
See less

Empty instance in cummeRbund

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Empty instance in cummeRbund

    Hi everyone,

    I'm using cuffdiff to do my differential expression analysis for my mouse data. I am using annotation file to map only known transcripts using TopHat. I'm using the same GTF file for my Cuffdiff analysis.
    Most of the output files look fine I think, but when I try to load the files into CummeRbund, the resulting variable is empty. The ouput:
    > cuff_data <- readCufflinks('diff_out_KOvsWT')
    > cuff_data
    CuffSet instance with:
    0 samples
    0 genes
    0 isoforms
    0 TSS
    0 CDS
    0 promoters
    0 splicing
    0 relCDS

    I've not been able to figure out why this is, does anybody have an idea? I have no idea whether the problem is in my installation of cummerbund,or if it has to do with the output from cuffdiff.

    thanks in advance!

    Below is some of the relevant information.

    My output in R:
    R version 2.15.3 (2013-03-01) -- "Security Blanket"
    Copyright (C) 2013 The R Foundation for Statistical Computing
    ISBN 3-900051-07-0
    Platform: x86_64-pc-linux-gnu (64-bit)

    R is free software and comes with ABSOLUTELY NO WARRANTY.
    You are welcome to redistribute it under certain conditions.
    Type 'license()' or 'licence()' for distribution details.

    Natural language support but running in an English locale

    R is a collaborative project with many contributors.
    Type 'contributors()' for more information and
    'citation()' on how to cite R or R packages in publications.

    Type 'demo()' for some demos, 'help()' for on-line help, or
    'help.start()' for an HTML browser interface to help.
    Type 'q()' to quit R.

    > library(cummeRbund)
    Loading required package: BiocGenerics

    Attaching package: ‘BiocGenerics’

    The following object(s) are masked from ‘package:stats’:

    xtabs

    The following object(s) are masked from ‘package:base’:

    anyDuplicated, cbind, colnames, duplicated, eval, Filter, Find,
    get, intersect, lapply, Map, mapply, mget, order, paste, pmax,
    pmax.int, pmin, pmin.int, Position, rbind, Reduce, rep.int,
    rownames, sapply, setdiff, table, tapply, union, unique

    Loading required package: RSQLite
    Loading required package: DBI
    Loading required package: ggplot2
    Loading required package: reshape2
    Loading required package: fastcluster

    Attaching package: ‘fastcluster’

    The following object(s) are masked from ‘package:stats’:

    hclust

    Loading required package: rtracklayer
    Loading required package: GenomicRanges
    Loading required package: IRanges
    Loading required package: Gviz
    Loading required package: grid
    > cuff_data <- readCufflinks('diff_out_KOvsWT')
    > cuff_data
    CuffSet instance with:
    0 samples
    0 genes
    0 isoforms
    0 TSS
    0 CDS
    0 promoters
    0 splicing
    0 relCDS


    The cuffdiff run output (last couple of lines):
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 45818586.86
    > Raw Map Mass: 50853518.89
    > Number of Multi-Reads: 2689569 (with 7422256 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 45818586.86
    > Raw Map Mass: 45467086.16
    > Number of Multi-Reads: 2561480 (with 7010002 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 45818586.86
    > Raw Map Mass: 45613387.71
    > Number of Multi-Reads: 2548387 (with 6777011 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 45818586.86
    > Raw Map Mass: 47467692.15
    > Number of Multi-Reads: 2705326 (with 7047321 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    > Map Properties:
    > Normalized Map Mass: 45818586.86
    > Raw Map Mass: 52463715.19
    > Number of Multi-Reads: 2987109 (with 7526927 total hits)
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [13:59:58] Calculating preliminary abundance estimates
    > Processed 21109 loci. [*************************] 100%
    [14:56:21] Learning bias parameters.
    [15:14:08] Testing for differential expression and regulation in locus.
    > Processed 21109 loci. [*************************] 100%
    Performed 12350 isoform-level transcription difference tests
    Performed 0 tss-level transcription difference tests
    Performed 10502 gene-level transcription difference tests
    Performed 0 CDS-level transcription difference tests
    Performed 0 splicing tests
    Performed 0 promoter preference tests
    Performing 0 relative CDS output tests
    Writing isoform-level FPKM tracking
    Writing TSS group-level FPKM tracking
    Writing gene-level FPKM tracking
    Writing CDS-level FPKM tracking
    Writing isoform-level count tracking
    Writing TSS group-level count tracking
    Writing gene-level count tracking
    Writing CDS-level count tracking
    Writing isoform-level read group tracking
    Writing TSS group-level read group tracking
    Writing gene-level read group tracking
    Writing CDS-level read group tracking
    Writing read group info
    Writing run info


    My files look like (first 10 lines):
    [email protected]:~/RNAanalysis/data/20130306_Martijn_Sander/diff_out_KOvsWT$ head -n10 *
    ==> bias_params.info <==
    condition_name replicate_num param pos_i pos_j value
    WT 0 start_seq 0 0 1.04493
    WT 0 start_seq 0 1 1.02886
    WT 0 start_seq 0 2 1.03362
    WT 0 start_seq 0 3 1.03065
    WT 0 start_seq 0 4 1.02129
    WT 0 start_seq 0 5 1.02415
    WT 0 start_seq 0 6 0.976724
    WT 0 start_seq 0 7 0.879226
    WT 0 start_seq 0 8 1.06228

    ==> cds.count_tracking <==
    tracking_id

    ==> cds.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 sqrt(JS) test_stat p_value q_value significant

    ==> cds_exp.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant

    ==> cds.fpkm_tracking <==
    tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage

    ==> cds.read_group_tracking <==
    tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status

    ==> cuffData.db <==
    SQLite format [email protected] A#-â&—AûöñìçâÝØÓÎÉÄ¿ºµ°«¦¡—%9indexsqlite_autoindex_TSS_1TSgtablesamplessamplesCREATE TABLE "samples"(
    "sample_index" INTEGER NOT NULL,
    "sample_name" VARCHAR(45) PRIMARY KEY NOT NULL
    )-Aindexsqlite_autoindex_samples_1samplesdablebiasDatabiasDataCREATE TABLE "biasData"(
    "biasData_id" INTEGER PRIMARY KEY NOT NULL
    )wƒMtablegenesgenesCREATE TABLE "genes"(
    "gene_id" VARCHAR(45) PRIMARY KEY NOT NULL,
    "class_code" VARCHAR(45),
    "nearest_ref_id" VARCHAR(45),
    "gene_short_name" VARCHAR(45),

    ==> gene_exp.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
    0610005C13Rik 0610005C13Rik 0610005C13Rik chr7:45567794-45589710 WT KO OK 0 0.604775 inf -nan 0.2076 0.639265 no
    0610007N19Rik 0610007N19Rik 0610007N19Rik chr15:32240567-32244662 WT KO OK 5.46571 7.5372 0.463622 0.26062 0.6309 0.891824 no
    0610007P14Rik 0610007P14Rik 0610007P14Rik chr12:85815454-85824545 WT KO OK 0.00531297 2.53232 8.89673 1.65699 0.13885 0.592104 no
    0610008F07Rik 0610008F07Rik 0610008F07Rik chr2:163492317-163541721 WT KO NOTEST 0 0 0 0 1 1 no
    0610009B14Rik 0610009B14Rik 0610009B14Rik chr12:80686374-80692466 WT KO NOTEST 0 0 0 0 1 1 no
    0610009B22Rik 0610009B22Rik 0610009B22Rik chr11:51685384-51688634 WT KO OK 115.715 87.1343 -0.409267 -0.755209 0.29795 0.701585 no
    0610009D07Rik 0610009D07Rik 0610009D07Rik chr12:4817607-4827659 WT KO OK 106.758 92.7211 -0.203369 -0.399578 0.5278 0.841627 no
    0610009L18Rik 0610009L18Rik 0610009L18Rik chr11:120348677-120351190 WT KO OK 26.0873 43.6863 0.743836 0.908239 0.3811 0.76145 no
    0610009O20Rik 0610009O20Rik 0610009O20Rik chr18:38238404-38262629 WT KO OK 26.684 43.2191 0.695695 1.08275 0.063 0.533389 no

    ==> genes.count_tracking <==
    tracking_id WT_count WT_count_variance WT_count_uncertainty_var WT_count_dispersion_var WT_status KO_count KO_count_variance KO_count_uncertainty_var KO_count_dispersion_var KO_status
    0610005C13Rik 0 2.01378 0 0 OK 19.7291 8094.36 0 6927.97 OK
    0610007N19Rik 353.184 92023.9 0 91760 OK 478.051 122087 0 122086 OK
    0610007P14Rik 0.300445 1 0 0.374955 OK 135.823 34661.3 0 33302.6 OK
    0610008F07Rik 0 0 0 0 OK 0 0 0 0 OK
    0610009B14Rik 0 0 0 0 OK 0 0 0 0 OK
    0610009B22Rik 3247.34 1.08884e+06 0 1.08884e+06 OK 2501.94 804035 0 804035 OK
    0610009D07Rik 4741.38 1.70538e+06 0 1.70538e+06 OK 4301.16 1.51118e+06 0 1.51118e+06 OK
    0610009L18Rik 397.699 104366 0 104365 OK 736.619 202041 0 201922 OK
    0610009O20Rik 2694.43 859079 0 859079 OK 4362.3 1.51431e+06 0 1.51431e+06 OK

    ==> genes.fpkm_tracking <==
    tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage WT_FPKM WT_conf_lo WT_conf_hi WT_status KO_FPKM KO_conf_lo KO_conf_hi KO_status
    0610005C13Rik - - 0610005C13Rik 0610005C13Rik - chr7:45567794-45589710 - - 0 0 0.0238527 OK 0.604775 0 3.14855 OK
    0610007N19Rik - - 0610007N19Rik 0610007N19Rik - chr15:32240567-32244662 - - 5.46571 0 16.2838 OK 7.5372 0.0170511 19.7281 OK
    0610007P14Rik - - 0610007P14Rik 0610007P14Rik - chr12:85815454-85824545 - - 0.00531297 0 0.0552537 OK 2.53232 0 9.22736 OK
    0610008F07Rik - - 0610008F07Rik 0610008F07Rik - chr2:163492317-163541721 - - 0 0 0 OK 0 0 0 OK
    0610009B14Rik - - 0610009B14Rik 0610009B14Rik - chr12:80686374-80692466 - - 0 0 0 OK 0 0 0 OK
    0610009B22Rik - - 0610009B22Rik 0610009B22Rik - chr11:51685384-51688634 - - 115.715 48.0154 183.03 OK 87.1343 33.1359 141.027 OK
    0610009D07Rik - - 0610009D07Rik 0610009D07Rik - chr12:4817607-4827659 - - 106.758 53.8199 159.367 OK 92.7211 46.1145 139.028 OK
    0610009L18Rik - - 0610009L18Rik 0610009L18Rik - chr11:120348677-120351190 - - 26.0873 1.30458 50.5611 OK 43.6863 5.88822 80.4606 OK
    0610009O20Rik - - 0610009O20Rik 0610009O20Rik - chr18:38238404-38262629 - - 26.684 9.45996 43.8886 OK 43.2191 19.8421 66.5767 OK

    ==> genes.read_group_tracking <==
    tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
    0610005C13Rik WT 1 0 0 0 0 - OK
    0610005C13Rik WT 0 0 0 0 0 - OK
    0610005C13Rik WT 2 0 0 0 0 - OK
    0610005C13Rik KO 1 62.0006 59.1874 59.1874 1.81433 - OK
    0610005C13Rik KO 0 0 0 0 0 - OK
    0610005C13Rik KO 2 0 0 0 0 - OK
    0610007N19Rik WT 1 2 1.85224 1.85224 0.0315828 - OK
    0610007N19Rik WT 0 545 491.228 491.228 6.70655 - OK
    0610007N19Rik WT 2 526 566.472 566.472 9.65898 - OK

    ==> isoform_exp.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant
    NM_001001130 Zfp85-rs1 Zfp85-rs1 chr13:67729261-67756838 WT KO OK 1.90369 0.00601979 -8.30487 -2.05805 0.1332 0.664166 no
    NM_001001144 Scap Scap chr9:110333292-110384946 WT KO NOTEST 0.0524217 0.040569 -0.369786 0 1 1 no
    NM_001001152 Zfp458 Zfp458 chr13:67254917-67269068 WT KO NOTEST 0.00205102 0 -inf 0 1 1 no
    NM_001001160 Fbxo41 Fbxo41 chr6:85469577-85502886 WT KO NOTEST 0 0.0688398 inf 0 1 1 no
    NM_001001176 Taf9b Taf9b chrX:106206873-106221158 WT KO NOTEST 8.89874e-05 0.000169376 0.928556 0 1 1 no
    NM_001001177 BC051142 BC051142 chr17:34398819-34460734 WT KO NOTEST 0 0 0 0 1 1 no
    NM_001001178 Ccdc148 Ccdc148 chr2:58821697-59018606 WT KO NOTEST 0.00455913 0.00357061 -0.352587 0 1 1 no
    NM_001001179 BC048546 BC048546 chr6:128539821-128581606 WT KO NOTEST 0 0.00134926 inf 0 1 1 no
    NM_001001180 Zfp941 Zfp941 chr7:140809676-140822178 WT KO NOTEST 0.0276003 0.0376914 0.449551 0 1 1 no

    ==> isoforms.count_tracking <==
    tracking_id WT_count WT_count_variance WT_count_uncertainty_var WT_count_dispersion_var WT_status KO_count KO_count_variance KO_count_uncertainty_var KO_count_dispersion_var KO_status
    NM_001001130 175.756 41657 0 41656.4 OK 0.55577 2 0 1.13185 OK
    NM_001001144 8.42453 6833.56 0 576.583 OK 7.58851 1855 0 1854.66 OK
    NM_001001152 0.308707 1 0 0.392771 OK 0 0 0 0 OK
    NM_001001160 0 0 0 0 OK 17.8208 1299.74 0 1294.42 OK
    NM_001001176 0.013785 2 0 1.01778 OK 0.0215626 764.266 0 2.2876 OK
    NM_001001177 0 0 0 0 OK 0 0 0 0 OK
    NM_001001178 0.717963 2 0 1.81451 OK 0.65445 1 0 0.866675 OK
    NM_001001179 0 0 0 0 OK 0.277885 1 0 0.326309 OK
    NM_001001180 5.28247 37.5317 0 36.6546 OK 6.40012 81.3041 0 79.5666 OK

    ==> isoforms.fpkm_tracking <==
    tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage WT_FPKM WT_conf_lo WT_conf_hi WT_status KO_FPKM KO_conf_lo KO_conf_hi KO_status
    NM_001001130 - - Zfp85-rs1 Zfp85-rs1 - chr13:67729261-67756838 2218 - 1.90369 0 4.39756 OK 0.00601979 0 0.0108314 OK
    NM_001001144 - - Scap Scap - chr9:110333292-110384946 4286 - 0.0524217 0 0.908718 OK 0.040569 0 0.106908 OK
    NM_001001152 - - Zfp458 Zfp458 - chr13:67254917-67269068 3488 - 0.00205102 0 0.0187717 OK 0 0 0 OK
    NM_001001160 - - Fbxo41 Fbxo41 - chr6:85469577-85502886 6562 - 0 0 0 OK 0.0688398 0 0.289362 OK
    NM_001001176 - - Taf9b Taf9b - chrX:106206873-106221158 2583 - 8.89874e-05 0 0 OK 0.000169376 0 0.0091703 OK
    NM_001001177 - - BC051142 BC051142 - chr17:34398819-34460734 1900 - 0 0 0 OK 0 0 0 OK
    NM_001001178 - - Ccdc148 Ccdc148 - chr2:58821697-59018606 3640 - 0.00455913 0 0.0179878 OK 0.00357061 0 0.0119919 OK
    NM_001001179 - - BC048546 BC048546 - chr6:128539821-128581606 4698 - 0 0 0 OK 0.00134926 0 0.0139369 OK
    NM_001001180 - - Zfp941 Zfp941 - chr7:140809676-140822178 3909 - 0.0276003 0 0.0949165 OK 0.0376914 0 0.134 OK

    ==> isoforms.read_group_tracking <==
    tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status
    NM_001001130 WT 1 517 478.804 478.804 5.18614 - OK
    NM_001001130 WT 0 0 0 0 0 - OK
    NM_001001130 WT 2 45 48.4625 48.4625 0.524918 - OK
    NM_001001130 KO 1 0 0 0 0 - OK
    NM_001001130 KO 0 0 0 0 0 - OK
    NM_001001130 KO 2 2 1.66731 1.66731 0.0180594 - OK
    NM_001001144 WT 1 0.00702877 0.00650949 0.00650949 3.47958e-05 - OK
    NM_001001144 WT 0 28.0258 25.2607 25.2607 0.157196 - OK
    NM_001001144 WT 2 0.00596457 0.0064235 0.0064235 3.43362e-05 - OK

    ==> promoters.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 sqrt(JS) test_stat p_value q_value significant

    ==> read_groups.info <==
    file condition replicate_num total_mass norm_mass internal_scale external_scale
    ./KO918_C1BGEACXX_3_3413/accepted_hits.bam WT 0 4.85925e+07 4.58186e+07 1.10947 1
    ./KO922_C1BGEACXX_6_3415/accepted_hits.bam WT 1 5.08535e+07 4.58186e+07 1.07977 1
    ./KO957_C1BGEACXX_6_3416/accepted_hits.bam WT 2 4.54671e+07 4.58186e+07 0.928554 1
    ./WT915_C1BGEACXX_3_3412/accepted_hits.bam KO 0 4.56134e+07 4.58186e+07 0.991352 1
    ./WT921_C1BGEACXX_3_3414/accepted_hits.bam KO 1 4.74677e+07 4.58186e+07 1.04753 1
    ./WT962_C1BGEACXX_6_3417/accepted_hits.bam KO 2 5.24637e+07 4.58186e+07 1.19954 1

    ==> run.info <==
    param value
    cmd_line cuffdiff -o diff_out_KOvsWT -b ../../RefData/index_files/Mouse/UCSC/Bowtie2Index/genome.fa -p 20 -L WT,KO -u ../../RefData/gtf_refs/mouse/refGene_mouse_UCSC_adapted.gtf ./KO918_C1BGEACXX_3_3413/accepted_hits.bam,./KO922_C1BGEACXX_6_3415/accepted_hits.bam,./KO957_C1BGEACXX_6_3416/accepted_hits.bam ./WT915_C1BGEACXX_3_3412/accepted_hits.bam,./WT921_C1BGEACXX_3_3414/accepted_hits.bam,./WT962_C1BGEACXX_6_3417/accepted_hits.bam
    version 2.1.1
    SVN_revision 4046M
    boost_version 104900

    ==> splicing.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 sqrt(JS) test_stat p_value q_value significant

    ==> tss_group_exp.diff <==
    test_id gene_id gene locus sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value significant

    ==> tss_groups.count_tracking <==
    tracking_id

    ==> tss_groups.fpkm_tracking <==
    tracking_id class_code nearest_ref_id gene_id gene_short_name tss_id locus length coverage

    ==> tss_groups.read_group_tracking <==
    tracking_id condition replicate raw_frags internal_scaled_frags external_scaled_frags FPKM effective_length status

    ==> var_model.info <==
    condition locus compatible_count_mean compatible_count_var total_count_mean total_count_var fitted_var
    WT chr1:3214481-3671498 0 0 9.5328 32.1283 0
    WT chr1:4290845-4409241 286.716 38731.3 531.42 257374 63261.5
    WT chr1:4490927-4496413 14.5177 362.867 16.2529 409.386 237.869
    WT chr1:4773199-4785726 11866.9 910828 12014 896817 5.98831e+06
    WT chr1:4807892-4846735 4219.87 1.56711e+06 4257.21 1.57699e+06 1.44602e+06
    WT chr1:4857693-4897909 1609.49 636541 1723.2 617485 472116
    WT chr1:4909575-5070285 13.033 2.7366 60.1232 8594.32 190.845
    WT chr1:5083172-5162549 2522.24 1.12535e+06 2666.59 1.21034e+06 776043
    WT chr1:5588492-5606133 0 0 419.29 13007.3 0

  • #2
    Hi there
    I got the same CuffSet instance with:
    4 samples
    39098 genes
    0 isoforms
    0 TSS
    0 CDS
    0 promoters
    0 splicing
    0 relCDS
    can you guess please ?

    Thank you

    Originally posted by rubbertjes View Post
    Hi everyone,

    I'm using cuffdiff to do my differential expression analysis for my mouse data. I am using annotation file to map only known transcripts using TopHat. I'm using the same GTF file for my Cuffdiff analysis.
    Most of the output files look fine I think, but when I try to load the files into CummeRbund, the resulting variable is empty. The ouput:
    > cuff_data <- readCufflinks('diff_out_KOvsWT')
    > cuff_data
    CuffSet instance with:
    0 samples
    0 genes
    0 isoforms
    0 TSS
    0 CDS
    0 promoters
    0 splicing
    0 relCDS

    ................................
    WT chr1:5083172-5162549 2522.24 1.12535e+06 2666.59 1.21034e+06 776043
    WT chr1:5588492-5606133 0 0 419.29 13007.3 0

    Comment

    Working...
    X