**dpryan** · 12-23-2014, 06:07 PM

Well, the flattening should always be stranded, since the genes from which exonic bins arise are stranded. However, the counting can still be unstranded, since that depends on the library type. The two need not be identical. If a genomic region has overlapping exonic bins on each strand, then it'll just receive a 0 count if you have a non-directional dataset. This is the same as with gene counts.
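To make the point above concrete, here is a minimal toy sketch in Python. It is not DEXSeq's actual counting code; the bin IDs, coordinates, and the `count_reads` helper are all hypothetical, and reads are simplified to single intervals. It only illustrates how bins defined per strand can be counted in either a stranded or an unstranded mode, and why bins overlapping on opposite strands end up with 0 in the unstranded case.

```python
from collections import Counter

# Hypothetical flattened bins: (bin_id, start, end, strand), half-open coordinates.
# geneA and geneB overlap each other, but sit on opposite strands.
BINS = [
    ("geneA:E001", 100, 200, "+"),
    ("geneB:E001", 150, 250, "-"),
    ("geneC:E001", 300, 400, "+"),
]


def count_reads(reads, bins, stranded):
    """Count reads per bin; reads hitting bins of more than one gene are ambiguous."""
    counts = Counter({bin_id: 0 for bin_id, _, _, _ in bins})
    ambiguous = 0
    for start, end, strand in reads:
        hits = [
            bin_id
            for bin_id, b_start, b_end, b_strand in bins
            if start < b_end and b_start < end          # intervals overlap
            and (not stranded or strand == b_strand)    # strand check only if stranded
        ]
        genes = {bin_id.split(":")[0] for bin_id in hits}
        if len(genes) == 1:
            for bin_id in hits:
                counts[bin_id] += 1
        elif len(genes) > 1:
            ambiguous += 1  # not assigned to any bin
    return dict(counts), ambiguous


# Toy reads: (start, end, strand).
READS = [(160, 190, "+"), (160, 190, "-"), (310, 350, "+")]

stranded_counts, stranded_ambiguous = count_reads(READS, BINS, stranded=True)
unstranded_counts, unstranded_ambiguous = count_reads(READS, BINS, stranded=False)
```

With a directional library, each read in the overlap goes to the bin on its own strand. With a non-directional library the same two reads match both genes, so neither bin is incremented.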

**jrounds** · 12-24-2014, 12:51 PM

I actually focused on the complicated case: exon counting bins overlapping only because of strand, but there is now a simpler case to discuss given your reply.

First, in UCSC KnownGene it wouldn't be entirely correct to say "genes are stranded". Transcripts are stranded; genes consist of multiple transcripts, and transcripts describe how to use exon ranges to construct a functional polypeptide.

The point to consider is this: a very common source of multiple transcripts for the same gene is literally the strand identifier being switched from positive to negative. Otherwise it is the same transcript with a new ID, and ostensibly the same RNA isoform, except observed to be generated from a different strand. I can't quantify how frequently this occurs, but having done a fair bit of work in this area by hand, it is fair to say it occurs a lot.

In the case of gene counting with the exon union model, this matters not at all, but in the case of exon counting bins it presents a first obvious challenge. Keep in mind I do not mean to say DEXSeq is "wrong", I think this is a topic with "defensible decisions", but one "defensible decision" does not imply there are not others.

If the only difference between two exons is the strand, but the transcripts they are embedded in are otherwise identical (including order, starts, stops, usage, and resulting RNA isoform), it does not seem best to count them separately or to count them as zero because they overlap. For all appearances they may be biologically identical except for strand, and with read data that carries no strand information you won't ever know which is being utilized.

Counting these identical exons as zero because they are in a transcript with a counterpart on the reverse strand is just taking a mulligan where it appears you do not need to. I assumed DEXSeq did not do that, but now I am curious.
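As a rough illustration of the case described here, the sketch below finds transcripts that share exactly the same exon coordinates and differ only in strand. It is a hypothetical pre-processing check on toy data, not something DEXSeq or the UCSC tools are known to do; the transcript IDs, the coordinates, and the `strand_only_duplicates` helper are made up for the example.

```python
from collections import defaultdict

# Hypothetical annotation: transcript_id -> (chrom, strand, exons as (start, end)).
TRANSCRIPTS = {
    "tx1": ("chr1", "+", ((100, 200), (300, 400))),
    "tx2": ("chr1", "-", ((100, 200), (300, 400))),  # same exons as tx1, other strand
    "tx3": ("chr1", "+", ((100, 200), (500, 600))),
}


def strand_only_duplicates(transcripts):
    """Return groups of transcript IDs whose exon structure matches on both strands."""
    groups = defaultdict(dict)
    for tx_id, (chrom, strand, exons) in transcripts.items():
        key = (chrom, tuple(sorted(exons)))  # the key deliberately ignores strand
        groups[key].setdefault(strand, []).append(tx_id)
    return [
        sorted(tx for ids in by_strand.values() for tx in ids)
        for by_strand in groups.values()
        if len(by_strand) > 1  # the same structure is annotated on more than one strand
    ]


DUPLICATES = strand_only_duplicates(TRANSCRIPTS)
# DUPLICATES == [["tx1", "tx2"]]
```

Whether such groups should then be merged into one set of counting bins for unstranded data is exactly the "defensible decision" in question; the sketch only identifies the candidates.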

**dpryan** · 12-24-2014, 01:03 PM

UCSC's annotations are a complete mess and will generally screw DEXSeq up. I can't recommend using them for any reason at all.

DEXSeq ignore strand issues.
