Hi,
I'm trying to understand the "-r no" option in the latest version of dexseq_prepare_annotation.py. It seems that the script only ignores an exonic part if it exactly overlaps an exonic part from another gene - is this correct? Couldn't gene-level differential expression also disrupt differential exon usage calls if an exonic part of one gene is a subset of an exonic part of another?
Thanks,
Leda
----------
For example, here are some lines from running the script on the GENCODE 14 annotation. The first exonic part contains the subsequent ones (and more, not shown), which belong to a different gene on the opposite strand, so I'm wary when DEXSeq tells me it's used differentially.
chr19 dexseq_prepare_annotation.py aggregate_gene 2252252 2269758 . - . gene_id "ENSG00000167476.5"
chr19 dexseq_prepare_annotation.py exonic_part 2269408 2269758 . - . transcripts "ENST00000593238.1"; exonic_part_number "012"; gene_id "ENSG00000167476.5"
...
chr19 dexseq_prepare_annotation.py aggregate_gene 2269519 2273487 . + . gene_id "ENSG00000104904.6"
chr19 dexseq_prepare_annotation.py exonic_part 2269519 2269519 . + . transcripts "ENST00000583542.2+ENST00000582888.2"; exonic_part_number "001"; gene_id "ENSG00000104904.6"
chr19 dexseq_prepare_annotation.py exonic_part 2269520 2269528 . + . transcripts "ENST00000583542.2+ENST00000322297.4+ENST00000582888.2"; exonic_part_number "002"; gene_id "ENSG00000104904.6"
chr19 dexseq_prepare_annotation.py exonic_part 2269529 2269529 . + . transcripts "ENST00000583542.2+ENST00000581150.1+ENST00000582888.2+ENST00000322297.4"; exonic_part_number "003"; gene_id "ENSG00000104904.6"
...
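
To see how many exonic parts are affected, the flattened GFF can be scanned for exonic parts from different genes that overlap at all, not only those with identical coordinates. The following is a minimal sketch of such a check, not a description of what dexseq_prepare_annotation.py does internally. The function names are made up for illustration, and it assumes whitespace-separated GFF columns and that strand should be ignored (as would matter for an unstranded library).

```python
import re
from collections import defaultdict

GENE_ID = re.compile(r'gene_id "([^"]+)"')

def read_exonic_parts(lines):
    """Collect (start, end, gene_id) for every exonic_part line, per chromosome."""
    parts = defaultdict(list)
    for line in lines:
        # Columns: chrom, source, feature, start, end, score, strand, frame, attributes
        fields = line.rstrip("\n").split(None, 8)
        if len(fields) < 9 or fields[2] != "exonic_part":
            continue
        match = GENE_ID.search(fields[8])
        if match:
            parts[fields[0]].append((int(fields[3]), int(fields[4]), match.group(1)))
    return parts

def cross_gene_overlaps(parts):
    """Return overlapping exonic-part pairs that belong to different genes.

    Strand is deliberately ignored (assumption: unstranded reads).
    Each hit is (chrom, first_part, second_part, kind), where kind is
    "exact" for identical coordinates and "partial" otherwise.
    """
    hits = []
    for chrom, records in parts.items():
        records = sorted(records)  # sort by start, then end
        for i, (s1, e1, g1) in enumerate(records):
            for s2, e2, g2 in records[i + 1:]:
                if s2 > e1:
                    break  # later records start even further right
                if g1 != g2:
                    kind = "exact" if (s1, e1) == (s2, e2) else "partial"
                    hits.append((chrom, (s1, e1, g1), (s2, e2, g2), kind))
    return hits
```

With a flattened annotation file (the path is hypothetical), `cross_gene_overlaps(read_exonic_parts(open("annotation.gff")))` lists each cross-gene pair. On the lines above, exonic part 012 of ENSG00000167476.5 would be reported three times as a "partial" overlap with parts 001 to 003 of ENSG00000104904.6.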