  • DEXSeq ignore strand issues.

    DEXSeq runs a Python flattening script and then a Python counting script. To my knowledge the flattening script does not have an ignore-strand option, while the counting script does.

    It has occurred to me that it is impossible to do a correct "flatten" operation without knowing whether the bins are going to be used with or without ignoring strand. The idea behind flattening is to split overlapping exons into bins that have no overlap, which eases some statistical modelling issues and enables better detection of differential splicing. But if strand is respected during "flattening" and then ignored during counting, sub-exon counting bins can overlap even though the model treats them as non-overlapping.
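To make the concern above concrete, here is a toy sketch in Python. It is not DEXSeq's flattening script; the function names and coordinates are made up for illustration, and intervals are assumed to be half-open. It flattens the same exons once per strand and once ignoring strand, then checks whether any resulting bins still overlap when strand is disregarded.

```python
def flatten(exons):
    """Split possibly overlapping (start, end) intervals into disjoint bins.

    Intervals are treated as half-open [start, end). This is a toy
    illustration, not the algorithm used by DEXSeq.
    """
    # Every exon boundary is a potential bin boundary.
    points = sorted({p for s, e in exons for p in (s, e)})
    bins = []
    for left, right in zip(points, points[1:]):
        # Keep the segment only if at least one exon covers it entirely.
        if any(s <= left and right <= e for s, e in exons):
            bins.append((left, right))
    return bins


def flatten_stranded(exons):
    """Flatten (start, end, strand) exons separately for each strand."""
    out = []
    for strand in ("+", "-"):
        same = [(s, e) for s, e, st in exons if st == strand]
        out += [(s, e, strand) for s, e in flatten(same)]
    return out


def overlapping_pairs(bins):
    """Return pairs of bins whose coordinates overlap, ignoring strand."""
    pairs = []
    for i, a in enumerate(bins):
        for b in bins[i + 1:]:
            if a[0] < b[1] and b[0] < a[1]:
                pairs.append((a, b))
    return pairs


# Hypothetical exons: two on the plus strand, one on the minus strand.
exons = [(100, 200, "+"), (150, 250, "+"), (180, 300, "-")]

stranded_bins = flatten_stranded(exons)
unstranded_bins = flatten([(s, e) for s, e, _ in exons])

print(overlapping_pairs(stranded_bins))    # bins that collide across strands
print(overlapping_pairs(unstranded_bins))  # empty: bins are truly disjoint
```

With these made-up coordinates, the strand-aware bins still collide across strands, while the strand-blind bins do not.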

    One way around this might be to always split exons without respect to strand, but that seems unlikely to be a default case.

    Anyone know how DEXSeq handles the issue of strand when splitting exons?

  • #2
    Well, the flattening should always be stranded, since the genes from which exonic bins arise are stranded. However the counting can still be unstranded, since that depends on the library type. The two need not be identical. If a genomic region has overlapping exonic bins on each strand then it'll just receive a 0 count if you have a non-directional dataset. This is the same as with gene counts.
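A toy sketch of the behaviour described in this reply, in Python. It is not the DEXSeq counting script; the gene names, coordinates, and the rule "drop a read that hits bins from more than one gene" are assumptions chosen to mirror the description above.

```python
def count_reads(bins, reads, stranded):
    """Count reads per bin; reads hitting bins of several genes are dropped.

    bins:  list of (start, end, strand, gene), half-open coordinates
    reads: list of (start, end, strand)
    Returns (counts per bin, number of ambiguous reads).
    """
    counts = {b: 0 for b in bins}
    ambiguous = 0
    for r_start, r_end, r_strand in reads:
        hits = [
            b for b in bins
            if b[0] < r_end and r_start < b[1]
            and (not stranded or b[2] == r_strand)
        ]
        genes = {b[3] for b in hits}
        if len(genes) > 1:
            # Read cannot be assigned to a single gene: count it nowhere.
            ambiguous += 1
            continue
        for b in hits:
            counts[b] += 1
    return counts, ambiguous


# Hypothetical bins from two genes on opposite strands that overlap at 150-200.
bin_a = (100, 200, "+", "geneA")
bin_b = (150, 250, "-", "geneB")
reads = [(110, 140, "+"), (160, 190, "+"), (210, 240, "-")]

stranded_counts, stranded_amb = count_reads([bin_a, bin_b], reads, stranded=True)
unstranded_counts, unstranded_amb = count_reads([bin_a, bin_b], reads, stranded=False)

print(stranded_counts, stranded_amb)
print(unstranded_counts, unstranded_amb)
```

In this sketch the read that falls in the shared region is assigned to geneA when strand is used, and is discarded as ambiguous when strand is ignored.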

    • #3
      I actually focused on the complicated case: exon counting bins overlapping only because of strand, but there is now a simpler case to discuss given your reply.

      First, in UCSC KnownGene it wouldn't be entirely correct to say "genes are stranded". Transcripts are stranded; genes consist of multiple transcripts, and transcripts describe how to use exon ranges to construct a functional polypeptide.


      The point to consider here is this: a very common source of multiple transcripts for the same gene is literally the strand identifier being switched from positive to negative. Otherwise it is the same transcript with a new ID, and ostensibly it is the same RNA isoform, except observed to be generated from a different strand. I don't have a figure for how frequently this occurs, but having done a fair bit of work in this area "by hand", it is fair to say it occurs a lot.


      In the case of gene counting with the exon union model, this matters not at all, but in the case of exon counting bins it presents a first obvious challenge. Keep in mind I do not mean to say DEXSeq is "wrong", I think this is a topic with "defensible decisions", but one "defensible decision" does not imply there are not others.

      If the only difference between two exons is the strand, but otherwise the transcripts they are embedded in are identical (including order, starts, stops, usage, and resulting RNA isoform), it does not seem best to count them separately or to count them as zero because they overlap. For all appearances they may be biologically identical except for strand, and with read data that carries no strand information you will never know which one is being used.
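As a rough way to see how often this happens in a given annotation, one could group transcripts by their exon coordinates and look for groups present on both strands. A minimal Python sketch follows; the transcript IDs and coordinates are hypothetical, and loading the annotation into this dict is left out.

```python
from collections import defaultdict


def strand_only_duplicates(transcripts):
    """Find transcripts that share exon coordinates but differ in strand.

    transcripts: dict mapping transcript id -> (strand, list of (start, end))
    Returns a list of sorted id lists, one per coordinate-identical group
    that appears on both strands.
    """
    by_coords = defaultdict(list)
    for tid, (strand, exons) in transcripts.items():
        by_coords[tuple(sorted(exons))].append((tid, strand))
    groups = []
    for members in by_coords.values():
        if {s for _, s in members} == {"+", "-"}:
            groups.append(sorted(t for t, _ in members))
    return groups


# Hypothetical annotation: tx1 and tx2 differ only in strand.
transcripts = {
    "tx1": ("+", [(100, 200), (300, 400)]),
    "tx2": ("-", [(100, 200), (300, 400)]),
    "tx3": ("+", [(100, 200), (350, 400)]),
}

duplicates = strand_only_duplicates(transcripts)
print(duplicates)
```

Whether such pairs should then be merged into one set of counting bins is exactly the "defensible decision" under discussion; the sketch only identifies them.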


      Counting these identical exons as zero because they are in a transcript with a counterpart on the reverse strand is just taking a mulligan where it appears you do not need to. I assumed DEXSeq did not do that, but now I am curious.

      • #4
        UCSC's annotations are a complete mess and will generally screw DEXSeq up. I can't recommend using them for any reason at all.
