  • Louis_Lemire
    Junior Member
    • Jun 2011
    • 5

    Alternative splice or RNA-seq generated error?

    I am about as new to this as anyone can be, so to be honest I'm a little reluctant to post. However, I have found a result in the raw Illumina reads that I cannot readily explain this early in my RNA-seq workflow, and with my limited knowledge at this point I thought I would ask for an opinion on the results below.

    I am particularly interested in a hypothetical unannotated paralog or alternative splice form that may not align with my genome, so I am spending quite a bit of time perusing the raw Illumina reads, taking a close look at some of the more conserved regions of my research proteins for paralogs, etc. (and getting a 'feel' for the raw reads and how they behave on manual queries). After I finish this preliminary analysis I have a few hundred hours of learning ahead of me before I can comment with any confidence on sequence assembly matters. I enjoy computers, but I am far from a Linux wizard.

    Below is a result I found generated from high quality reads (>Q30) which suggests an alternative splice. In the code box below there are three lines:

    Line 1: partial exon 2 of one of my research proteins
    Line 2: raw Illumina reads linked by grep query, all have >Q30, and all cross the putative splice site
    Line 3: partial exon 7 of the same protein in Line 1

    The <....> bracket indicates the beginning of an intronic sequence at the end of exon 2.
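For anyone who wants to repeat this kind of manual query without grep, here is a minimal Python sketch of the same idea. It assumes standard 4-line FASTQ records and Phred+33 quality encoding (older Illumina pipelines used Phred+64, so the `offset` may need changing), and it reads "high quality" as every base in the read scoring at or above the threshold. The function names and the toy records are hypothetical, not taken from my actual data.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = next(it).strip()
        next(it)  # the '+' separator line
        qual = next(it).strip()
        yield header.strip(), seq, qual


def min_phred(qual: str, offset: int = 33) -> int:
    """Lowest per-base Phred score in a quality string (assumed encoding offset)."""
    return min(ord(c) - offset for c in qual)


def high_quality_hits(lines: Iterable[str], query: str,
                      min_q: int = 30, offset: int = 33) -> List[Tuple[str, str]]:
    """Reads that contain `query` and whose every base scores >= min_q."""
    return [(h, s) for h, s, q in read_fastq(lines)
            if q and query in s and min_phred(q, offset) >= min_q]


# Toy records (hypothetical): 'I' is Q40 and '5' is Q20 under Phred+33.
records = [
    "@r1", "ACGTACGT", "+", "IIIIIIII",
    "@r2", "ACGTACGT", "+", "IIII5III",
    "@r3", "TTTTTTTT", "+", "IIIIIIII",
]
hits = high_quality_hits(records, "GTAC")
```

Like a plain grep, this only finds the query on the strand as sequenced; reads from the opposite strand would need the reverse complement of the query.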

    Code:
                                  ********** ***::****:*
    e2                         ...RGHTGLFAGG<ASTYQVGLELC...>
         ...GHALLFRTSVMAKVEIQAVSTCRGHTGLFAGG<ASTFHVGLEAC...>
    e7   ...GHALLYRTTVMAKLEIQAVSTCR...      <--- intron --->
            *****:**:****:*********
    The above result appears to be an alternative splice. However, I was wondering whether it could be an error generated during RNA-seq preparation of the experimental material, i.e., two pieces of DNA randomly cut and joined. There were about 10 copies of the middle sequence above, all yielding high quality reads and all crossing the apparent splice site.
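To make the comparison in the code box concrete, here is a small Python sketch that checks the translated read against the two exon fragments. The three strings are copied from the alignment above (only the residues before the '<' are used); the function names are hypothetical, and this is a plain position-by-position comparison, not a real aligner.

```python
E7_END = "GHALLYRTTVMAKLEIQAVSTCR"          # exon 7 fragment from the code box
E2_END = "RGHTGLFAGG"                       # exon 2 fragment, up to the '<'
READ = "GHALLFRTSVMAKVEIQAVSTCRGHTGLFAGG"   # translated read, up to the '<'


def mismatches(a, b):
    """Positions (0-based) where two equal-length segments differ."""
    if len(a) != len(b):
        raise ValueError("segments must be the same length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


def junction_report(read, left_ref, right_ref):
    """Compare the start of the read to left_ref and the end to right_ref."""
    left = read[:len(left_ref)]
    right = read[-len(right_ref):]
    return {
        "left_mismatches": mismatches(left, left_ref),
        "right_mismatches": mismatches(right, right_ref),
        # residues that could belong to either fragment at the junction
        "shared_residues": len(left_ref) + len(right_ref) - len(read),
    }


report = junction_report(READ, E7_END, E2_END)
```

On these strings the read matches the exon 2 fragment exactly and differs from the exon 7 fragment at three positions (the ':' columns in the code box), with the single R at the junction shared by both fragments.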

    Q: What is the likelihood that the above is real and not a machine artifact?
    Last edited by Louis_Lemire; 07-17-2011, 08:36 AM. Reason: grammer
  • Louis_Lemire
    Junior Member
    • Jun 2011
    • 5

    #2
    I may have found an explanation for this strange exon7-exon2 splice. Li et al. (2008) discuss a statistical approach to identifying the degree of alternative splicing in a differential gene expression paper. [1] Li points out that the splicing of exons in reverse order is 'impossible' [2,3] and uses these rare events as a measure of false discoveries among alternative splice calls. Li found that these types of splicing events amount to roughly 1% of mapped junctions.

    Presumably during the workup of the cDNA for Illumina reading there is a low probability that the DNA can form a hairpin turn back on itself and purportedly recombine, which would be a rare event. That this happened with my research protein was coincidental.

    [1] H. Li et al. (2008) Determination of tag density required for digital transcriptome analysis: Application to an androgen-sensitive prostate cancer model. PNAS 105(52):20179-20184.
    [2] D. L. Black (2003) Mechanisms of alternative pre-messenger RNA splicing. Annu Rev Biochem 72:291-336.
    [3] J. M. Johnson et al. (2003) Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302:2141-2144.
