Dear GO SQL Experts,
I would like to create a GO annotation text file which contains the set of genes connected to each GOids and its all descendants (children, grandchildren, etc.).
Can somebody give me some hints how to do this?
(I am working with D.melanogaster and FlyBase geneIDs - FBgn)
I already created an SQl query for this:
SELECT
term.acc AS superterm_acc,
dbxref.xref_key AS gp_acc,
evidence.code AS code
FROM term
INNER JOIN graph_path ON (term.id=graph_path.term1_id)
INNER JOIN association ON (graph_path.term2_id=association.term_id)
INNER JOIN evidence ON (association.term_id=evidence.id)
INNER JOIN gene_product ON (association.gene_product_id=gene_product.id)
INNER JOIN species ON (gene_product.species_id=species.id)
INNER JOIN dbxref ON (gene_product.dbxref_id=dbxref.id)
WHERE
species.genus = 'Drosophila'
AND
species = 'melanogaster';
But I am not sure if the query is correct since the evidence codes that I get are seems to be imperfect:
bash$ awk '{print $3}' downloaded_file | sort | uniq -c
6387 IDA
921948 IEA
2911 IEP
113 IGC
257 IMP
123 ISA
13 ISO
664 ISS
904 NAS
583144 ND
24598 RCA
accordnig to this query I got more kinds of evidence codes:
SELECT evidence.code, count(DISTINCT association.id)
FROM gene_product
INNER JOIN dbxref ON (gene_product.dbxref_id=dbxref.id)
INNER JOIN association ON (gene_product.id=association.gene_product_id)
INNER JOIN species ON (gene_product.species_id=species.id)
INNER JOIN evidence ON (association.id=evidence.association_id)
WHERE dbxref.xref_dbname = 'FB'
AND
species.genus = 'Drosophila'
AND
species = 'melanogaster'
GROUP BY evidence.code;
IBA 1254
IC 1130
IDA 11801
IEA 13541
IEP 974
IGC 18
IGI 2016
IKR 63
IMP 18364
IPI 1889
IRD 9
ISA 141
ISM 1301
ISO 1
ISS 14862
NAS 10277
ND 6899
RCA 19
TAS 4732
Cheers,
Eszter Ari
Institut für Populationsgenetik
Vetmeduni Vienna
Veterinärplatz 1
1210 Wien, Austria
I would like to create a GO annotation text file which contains the set of genes connected to each GOids and its all descendants (children, grandchildren, etc.).
Can somebody give me some hints how to do this?
(I am working with D.melanogaster and FlyBase geneIDs - FBgn)
I already created an SQl query for this:
SELECT
term.acc AS superterm_acc,
dbxref.xref_key AS gp_acc,
evidence.code AS code
FROM term
INNER JOIN graph_path ON (term.id=graph_path.term1_id)
INNER JOIN association ON (graph_path.term2_id=association.term_id)
INNER JOIN evidence ON (association.term_id=evidence.id)
INNER JOIN gene_product ON (association.gene_product_id=gene_product.id)
INNER JOIN species ON (gene_product.species_id=species.id)
INNER JOIN dbxref ON (gene_product.dbxref_id=dbxref.id)
WHERE
species.genus = 'Drosophila'
AND
species = 'melanogaster';
But I am not sure if the query is correct since the evidence codes that I get are seems to be imperfect:
bash$ awk '{print $3}' downloaded_file | sort | uniq -c
6387 IDA
921948 IEA
2911 IEP
113 IGC
257 IMP
123 ISA
13 ISO
664 ISS
904 NAS
583144 ND
24598 RCA
accordnig to this query I got more kinds of evidence codes:
SELECT evidence.code, count(DISTINCT association.id)
FROM gene_product
INNER JOIN dbxref ON (gene_product.dbxref_id=dbxref.id)
INNER JOIN association ON (gene_product.id=association.gene_product_id)
INNER JOIN species ON (gene_product.species_id=species.id)
INNER JOIN evidence ON (association.id=evidence.association_id)
WHERE dbxref.xref_dbname = 'FB'
AND
species.genus = 'Drosophila'
AND
species = 'melanogaster'
GROUP BY evidence.code;
IBA 1254
IC 1130
IDA 11801
IEA 13541
IEP 974
IGC 18
IGI 2016
IKR 63
IMP 18364
IPI 1889
IRD 9
ISA 141
ISM 1301
ISO 1
ISS 14862
NAS 10277
ND 6899
RCA 19
TAS 4732
Cheers,
Eszter Ari
Institut für Populationsgenetik
Vetmeduni Vienna
Veterinärplatz 1
1210 Wien, Austria