  • DEXSeq ignore strand issues.

    DEXSeq runs a flattening Python script and then a counting Python script. To my knowledge the flattening script does not have an ignore-strand option, while the counting script does.

    It has occurred to me that it is impossible to do a correct "flatten" operation without knowing whether the bins are going to be used with or without ignoring strand. The idea behind flattening is to split overlapping exons into bins that have no overlap, which eases some statistical modelling issues and enables better detection of differential splicing. However, if strand is respected during flattening and then ignored during counting, sub-exon counting bins can overlap even though the model treats them as non-overlapping.
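To make the concern concrete, here is a minimal sketch of the idea, not the actual DEXSeq flattening script. The coordinates are made up, and the helper names (`flatten`, `flatten_stranded`, `overlapping_pairs`) are hypothetical. It flattens exons per strand and then checks which of the resulting bins overlap once strand is ignored.

```python
# Toy illustration only; this is NOT DEXSeq's flattening code.
# Intervals are half-open [start, end); coordinates are invented.

def flatten(intervals):
    """Split possibly overlapping intervals into disjoint bins."""
    points = sorted({p for s, e in intervals for p in (s, e)})
    bins = []
    for left, right in zip(points, points[1:]):
        # Keep a segment only if some input interval covers it entirely.
        if any(s <= left and right <= e for s, e in intervals):
            bins.append((left, right))
    return bins

def flatten_stranded(exons):
    """Flatten each strand separately, as a strand-aware flatten would."""
    out = []
    for strand in ("+", "-"):
        ivs = [(s, e) for s, e, st in exons if st == strand]
        out.extend((s, e, strand) for s, e in flatten(ivs))
    return out

def overlapping_pairs(bins):
    """Return pairs of bins whose coordinates overlap, ignoring strand."""
    pairs = []
    for i, a in enumerate(bins):
        for b in bins[i + 1:]:
            if a[0] < b[1] and b[0] < a[1]:
                pairs.append((a, b))
    return pairs

exons = [(100, 200, "+"), (150, 250, "+"), (180, 300, "-")]

stranded_bins = flatten_stranded(exons)
unstranded_bins = flatten([(s, e) for s, e, _ in exons])

print(overlapping_pairs(stranded_bins))    # bins that collide if strand is ignored
print(overlapping_pairs(unstranded_bins))  # empty: bins are disjoint by construction
```

In this toy case the strand-aware bins are disjoint within each strand but two of them collide with the minus-strand bin once strand is dropped, whereas flattening without regard to strand yields bins that never overlap.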

    One way around this might be to always split exons without respect to strand, but that seems unlikely to be the default behavior.

    Does anyone know how DEXSeq handles the issue of strand when splitting exons?

  • #2
    Well, the flattening should always be stranded, since the genes from which exonic bins arise are stranded. However, the counting can still be unstranded, since that depends on the library type. The two need not be identical. If a genomic region has overlapping exonic bins on each strand, then it'll just receive a 0 count if you have a non-directional dataset. This is the same as with gene counts.
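The behavior described in this reply can be sketched with a toy counter. This is an illustrative model under stated assumptions, not the code of the DEXSeq counting script: the bin names, coordinates, and the function `count_unstranded` are hypothetical, and the rule that a read hitting bins on both strands is set aside as ambiguous is assumed from the description above.

```python
# Toy model of unstranded counting against strand-aware bins.
# NOT the DEXSeq counting script; names and coordinates are invented.

def count_unstranded(bins, reads):
    """Count reads per bin while ignoring read strand.

    bins:  list of (name, start, end, strand), half-open coordinates
    reads: list of (start, end), half-open coordinates
    A read overlapping bins from both strands is treated as ambiguous
    and assigned to no bin (assumption of this sketch).
    """
    counts = {name: 0 for name, _, _, _ in bins}
    ambiguous = 0
    for r_start, r_end in reads:
        hits = [b for b in bins if b[1] < r_end and r_start < b[2]]
        if len({b[3] for b in hits}) > 1:
            ambiguous += 1
            continue
        for name, _, _, _ in hits:
            counts[name] += 1
    return counts, ambiguous

bins = [("geneA:E001", 100, 200, "+"), ("geneB:E001", 180, 300, "-")]
reads = [(110, 160), (185, 195), (250, 290)]

counts, ambiguous = count_unstranded(bins, reads)
print(counts, ambiguous)
```

Here the middle read falls in the region covered by bins on both strands, so it contributes to neither bin; only reads outside the shared region are counted.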



    • #3
      I actually focused on the complicated case: exon counting bins overlapping only because of strand, but there is now a simpler case to discuss given your reply.

      First, in UCSC KnownGene it wouldn't be entirely correct to say "genes are stranded". Transcripts are stranded, genes consist of multiple transcripts, and transcripts describe how to use exon ranges to construct a functional polypeptide.


      The point to consider is this: a very common source of multiple transcripts for the same gene is literally the strand identifier being switched from positive to negative. Otherwise it is the same transcript with a new ID, and ostensibly it is the same RNA isoform, except observed to be generated from a different strand. I don't have a figure for how frequently this occurs, but having done a fair bit of work in this area by hand, I think it is fair to say it occurs a lot.


      In the case of gene counting with the exon union model, this matters not at all, but in the case of exon counting bins it presents a first obvious challenge. Keep in mind I do not mean to say DEXSeq is "wrong"; I think this is a topic with "defensible decisions", but one "defensible decision" does not imply there are no others.

      If the only difference between two exons is the strand, but the transcript they are embedded in is otherwise identical (including order, starts, stops, usage, and resulting RNA isoform), it does not seem best to count them separately or to count them as zero because they overlap. To all appearances they may be biologically identical except for strand, and with read data that carries no strand information you will never know which is being used.


      Counting these identical exons as zero because they are in a transcript with a counterpart on the reverse strand is just taking a mulligan where it appears you do not need to. I assumed DEXSeq did not do that, but now I am curious.



      • #4
        UCSC's annotations are a complete mess and will generally screw DEXSeq up. I can't recommend using them for any reason at all.
