  • Six reading frame question... why all contain ORF??

    Hi,

    So I have a conceptual question I'm trying to get my head around. I have some RNA-seq data and was trying to determine the ORF of each read. Of course a six reading frame translation of a given nucleotide sequence would be expected to have a significant ORF in at least one frame, as long as the sequence comes from a gene region.

    However, I find short stretches of sequence (my 150 bp RNA-seq reads) that appear to have continuous ORFs in all 6 frames of translation, without any stop codons at all... how can this be? Since 3 of the 64 possible codons are stop codons, we should statistically see a stop every 21 aa (63 bases) or so.
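As a rough check of that expectation, here is a minimal sketch that assumes every base is independent and equally likely, which is an idealization rather than a property of real transcripts. The names `prob_no_stop` and `codons_per_frame` are made up for this illustration. Multiplying the six frames together treats them as independent, which they are not (they overlap the same bases), so the last number is only an order-of-magnitude guide.

```python
# Idealized model: each base is A, C, G or T with probability 1/4, independently.
P_STOP = 3 / 64  # TAA, TAG and TGA out of 64 possible codons


def prob_no_stop(n_codons, p_stop=P_STOP):
    """Probability that n_codons consecutive random codons contain no stop."""
    return (1 - p_stop) ** n_codons


def codons_per_frame(read_length):
    """Complete codons available at frame offsets 0, 1 and 2 of one strand."""
    return [(read_length - offset) // 3 for offset in range(3)]


# Mean spacing of stops under this model (about 21 codons, as in the post).
mean_gap = 1 / P_STOP

# Chance that each forward frame of a 150 bp read is free of stops.
per_frame = [prob_no_stop(n) for n in codons_per_frame(150)]

# Crude approximation: multiply over the three offsets on both strands,
# pretending the six frames are independent of one another.
all_six = 1.0
for p in per_frame * 2:
    all_six *= p

print(f"mean codons between stops: {mean_gap:.1f}")
print(f"P(no stop) per forward frame: {[round(p, 3) for p in per_frame]}")
print(f"P(no stop in all six frames), independence assumed: {all_six:.2e}")
```

Under this model a single 50-codon frame is stop-free roughly 9% of the time, but all six frames being open at once should be far rarer than 10% of reads, so the uniform-base assumption itself is the first thing to question.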

    I realize this could be an anomaly, but this seems to be the case with about 10% of all my RNA-seq reads. I realize this could happen with repetitive sequence, but I don't think that is the case, since it is RNA-seq data.

    Any thoughts or speculations are gladly welcomed!!
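To make the observation above reproducible, here is a minimal sketch of a six-frame stop-codon scan in plain Python. It assumes reads are given as DNA or RNA strings and uses the standard stop codons TAA, TAG and TGA. The function names (`stop_free_frames`, `fraction_fully_open`) are hypothetical and this is not the poster's actual method.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def stop_free_frames(read):
    """Return labels (+1..+3, -1..-3) of the frames that contain no stop codon."""
    read = read.upper().replace("U", "T")
    frames = []
    for sign, strand in (("+", read), ("-", reverse_complement(read))):
        for offset in range(3):
            # Walk complete codons only; a trailing 1-2 base remainder is ignored.
            codons = (strand[i:i + 3] for i in range(offset, len(strand) - 2, 3))
            if not any(codon in STOP_CODONS for codon in codons):
                frames.append(f"{sign}{offset + 1}")
    return frames


def fraction_fully_open(reads):
    """Fraction of reads with no stop codon in any of the six frames."""
    reads = list(reads)
    if not reads:
        return 0.0
    open_reads = sum(1 for r in reads if len(stop_free_frames(r)) == 6)
    return open_reads / len(reads)


print(stop_free_frames("ATGTAAGGG"))
print(stop_free_frames("GC" * 30))
```

Every stop codon contains both a T and an A, so a read made only of G and C (like the second example) can never show a stop in any frame. Printing a handful of the fully open reads is a quick way to see whether they share such a composition.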

  • #2
    RNA-seq libraries are almost never full length; the strands are fragmented into shorter fragments before sequencing. Therefore the reads you get are only a portion of the full mRNA. If you want to get the complete AA sequence of an RNA, you'll have to assemble your reads back together first.

    • #3
      Thanks for the reply. I understand this is just a small fragment of a whole mRNA, but over a span of 150 bases, I can't understand why we would find no stop codons in any of the 6 reading frames.

      • #4
        Originally posted by all_your_base
        Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.
        ...assuming the same frequency for each base, which is usually not the case. What is the GC% of this genome? Also, base composition is not uniform and often differs between regions (gene/intergenic, exon/intron, etc.). You might find GC-rich repeats in 3' UTRs, for instance. Lastly, this subset of 10% might come from the same genomic locus.
        Have you tried running FastQC on your reads first?
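The effect of GC content mentioned above can be illustrated with a minimal sketch. It assumes bases are still independent but drawn with a given GC fraction (with A = T and G = C), which is a simplification of real sequence. The function names are hypothetical.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def stop_probability(gc):
    """P(a random codon is a stop) for independent bases with GC fraction gc."""
    base_p = {
        "A": (1 - gc) / 2,
        "T": (1 - gc) / 2,
        "G": gc / 2,
        "C": gc / 2,
    }
    return sum(base_p[c[0]] * base_p[c[1]] * base_p[c[2]] for c in STOP_CODONS)


def prob_open_frame(gc, n_codons=50):
    """P(a frame of n_codons has no stop) under the same model."""
    return (1 - stop_probability(gc)) ** n_codons


for gc in (0.3, 0.5, 0.7):
    p = stop_probability(gc)
    print(
        f"GC={gc:.0%}: one stop per {1 / p:.1f} codons on average, "
        f"P(50-codon frame open)={prob_open_frame(gc):.3f}"
    )
```

Because all three stop codons are AT-rich, raising the GC fraction from 50% to 70% lifts the chance of a stop-free 50-codon frame from roughly 0.09 to roughly 0.38 in this model, so GC-rich reads are much more likely to look open in several frames.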

        • #5
          Originally posted by all_your_base
          Since there are 3 stop codons out of 64 possibilities, we should statistically see a stop every 21AA (63 bases) or so.
          Bases and codons aren't randomly distributed, nor should one expect them to be.
