Hi All,
I am currently trying to use goseq to analyze my RNA-seq data. I believe I have all the required files, but I cannot seem to create the proper input for the gene2cat argument. I am working with a non-native species. Below is a snippet of my code:
> library(goseq)
Loading required package: BiasedUrn
Loading required package: geneLenDataBase
> de.genes<-scan('de_genes_GFOLD_24.txt', what=character())
Read 4078 items
> assayed.genes<-scan('all_genes_GFOLD_24.txt', what=character())
Read 17479 items
> gene.length=scan('gene_lengths_noIDs.txt', what=numeric())
Read 17479 items
> gene.vector=as.integer(assayed.genes%in%de.genes)
> names(gene.vector)=assayed.genes
> head(gene.vector)
AAEL000001 AAEL000002 AAEL000003 AAEL000004 AAEL000005 AAEL000006
0 0 0 0 0 1
> pwf=nullp(gene.vector,bias.data=gene.length)
> head(pwf)
DEgenes bias.data pwf
AAEL000001 0 1590 0.26119999
AAEL000002 0 198 0.05383339
AAEL000003 0 2093 0.28937882
AAEL000004 0 2571 0.31792557
AAEL000005 0 1429 0.24989351
AAEL000006 1 4345 0.40676721
> rownames(pwf) <- names(gene.length)
> GOterms=read.delim('Goaccesions.txt',header=TRUE)
> GOterms=as.data.frame.matrix(GOterms)
> head(GOterms)
Gomapping geneID
1 na AAEL000001
2 na AAEL000002
3 GO:0016772 AAEL000003
4 GO:0016757 AAEL000004
5 GO:0008152 AAEL000004
6 GO:0003676 AAEL000005
> GO.wall=goseq(pwf,gene2cat=go.ids)
Error in goseq(pwf, gene2cat = go.ids) :
Was expecting a dataframe or a list mapping categories to genes. Check gene2cat input and try again.
From the goseq package: "gene2cat: A data frame with two columns containing the mapping between genes and the categories of interest."
Could anyone provide an example of this file set-up? Thanks!
Heather
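For reference, here is a minimal sketch of one gene2cat layout that matches the quoted documentation: a two-column data frame with one row per gene/GO-term pair. It is an illustration, not a confirmed fix. The inline data frame is a hypothetical stand-in for 'Goaccesions.txt', built from the six rows printed by head(GOterms) above; the column names gene and category and the object go.list are assumptions made for this example; and dropping the "na" rows assumes they mark genes with no annotation. Note also that the transcript reads the mapping into GOterms but passes go.ids to goseq, an object that is never created in the snippet shown.

```r
# Hypothetical stand-in for 'Goaccesions.txt', using the rows shown by head(GOterms).
GOterms <- data.frame(
  Gomapping = c("na", "na", "GO:0016772", "GO:0016757", "GO:0008152", "GO:0003676"),
  geneID    = c("AAEL000001", "AAEL000002", "AAEL000003",
                "AAEL000004", "AAEL000004", "AAEL000005"),
  stringsAsFactors = FALSE
)

# Assumption: "na" marks a gene without any GO annotation, so those rows are dropped.
annotated <- GOterms[GOterms$Gomapping != "na", ]

# Two-column data frame: gene ID first, GO category second, one row per pair.
go.ids <- data.frame(
  gene     = annotated$geneID,
  category = annotated$Gomapping,
  stringsAsFactors = FALSE
)

# Alternative list form: names are gene IDs, each element is that gene's GO terms.
go.list <- split(go.ids$category, go.ids$gene)

print(go.ids)
str(go.list)

# With goseq loaded and pwf built by nullp() as in the post, the call would be:
# GO.wall <- goseq(pwf, gene2cat = go.ids)
```

The gene IDs in the mapping need to match rownames(pwf). Since gene.length is read with scan(), it has no names, so it is worth checking head(rownames(pwf)) after the rownames(pwf) <- names(gene.length) line in the transcript.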