  • question about RNA-seq

    Hi everyone,

    I have a question about RNA-seq.

    "Some manipulations during library construction also complicate the analysis
    of RNA-Seq results. For example, many short reads that are identical to each
    other can be obtained from cDNA libraries that have been amplified. These could be a genuine reflection of abundant RNA species, or they could be PCR artefacts. One way to discriminate between these possibilities is to determine whether the same sequences are observed in different biological replicates." (Nat Rev Genet. 2009 Jan;10(1):57-63)

    In this situation, is there any other way to distinguish a genuine reflection of abundant RNA from PCR artefacts?

    Thanks,

    Hai-Ri Li

  • #2
    If your library has been randomly fragmented then it should be possible to look for fragments which appear way more frequently than would be expected by chance. In even a short mRNA you should have a range of different possible fragments and if this is an abundant transcript most or all of these should appear more frequently. PCR artefacts usually affect only a small subset of all possible fragments and would produce a very uneven distribution of fragments over the transcript.
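One way to make "more frequently than would be expected by chance" concrete is sketched below. It is only an illustration under strong assumptions: read starts are treated as uniformly random along a single transcript (real libraries have priming and fragmentation biases), and the function names, the `alpha` level and the Bonferroni correction are hypothetical choices, not something taken from this thread.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summing terms computed in log space."""
    if k <= 0:
        return 1.0
    total = 0.0
    i = k
    while True:
        term = math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        total += term
        # Past the mode (i > lam) the terms only shrink, so stop once
        # they no longer change the running sum.
        if i > lam and term <= total * 1e-16:
            break
        i += 1
    return min(1.0, total)


def overrepresented_starts(starts, transcript_length, alpha=1e-3):
    """Return {start: count} for start positions seen more often than chance predicts.

    Assumes uniform random fragmentation, so every start position expects
    len(starts) / transcript_length reads.
    """
    counts = Counter(starts)
    lam = len(starts) / transcript_length
    # Bonferroni-style correction over all possible start positions (assumption).
    cutoff = alpha / transcript_length
    return {pos: n for pos, n in counts.items() if poisson_sf(n, lam) < cutoff}
```

An abundant transcript raises the expected count at every position, so evenly spread reads are not flagged however many there are, whereas a single tower of identical starts on an otherwise sparse transcript stands out.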



    • #3
      Thank you for the explanation. You mean we can ignore PCR artefacts because they affect only a small subset of fragments and will not affect the final analysis results. However, sometimes we find so many tags hitting the same positions that we cannot ignore them.

      Again, is there any other way in this situation to distinguish a genuine reflection of abundant RNA from PCR artefacts?

      Thanks,

      Hai-Ri Li



      • #4
        Originally posted by lihairi:
        You mean we can ignore PCR artefacts because they affect only a small subset of fragments and will not affect the final analysis results.
        No, that's not what I'm saying. What I was trying to say was that you can normally distinguish PCR artefacts from expression changes because expression changes normally involve the even enrichment of a large number of different fragments over the expressed region, whereas PCR artefacts usually take only a small number of fragments and amplify them to an unnatural degree when viewed in the context of the surrounding fragments.

        We usually filter our data by measuring the percentage of reads in a region which come from exact overlaps. If this value is above 5-10% then we reject it as a likely PCR artefact. I'm intending to move this to an observed/expected calculation though as this is less prone to errors in very short regions with high coverage.
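As a rough illustration of that kind of filter (not the poster's actual code), here is a minimal sketch. It assumes each read is a `(start, end)` tuple already assigned to a region, that "exact overlap" means sharing both coordinates with at least one other read, and it uses a hypothetical `max_fraction` default of 10%, the upper end of the range quoted above.

```python
from collections import Counter


def exact_overlap_fraction(reads):
    """Fraction of reads sharing both start and end with at least one other read.

    `reads` is a list of (start, end) tuples for the reads overlapping one region.
    """
    if not reads:
        return 0.0
    counts = Counter(reads)
    duplicated = sum(n for n in counts.values() if n > 1)
    return duplicated / len(reads)


def filter_regions(regions, max_fraction=0.10):
    """Split {region_name: reads} into lists of kept and rejected region names."""
    kept, rejected = [], []
    for name, reads in regions.items():
        if exact_overlap_fraction(reads) > max_fraction:
            rejected.append(name)
        else:
            kept.append(name)
    return kept, rejected
```

For example, a region with 20 distinct reads plus a stack of 30 identical ones has an exact-overlap fraction of 0.6 and would be rejected.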



        • #5
          Last time you mentioned "We usually filter our data by measuring the percentage of reads in a region which come from exact overlaps". By region, do you mean a window of, say, 100 or 200 bases?

          How do you do the observed/expected calculation?

          Thanks.



          • #6
            Originally posted by lihairi:
            Last time you mentioned "We usually filter our data by measuring the percentage of reads in a region which come from exact overlaps". By region, do you mean a window of, say, 100 or 200 bases?
            In our case region is pretty generic - sometimes we use fixed-size windows (with a size that normally depends on our data density), in other cases we construct contigs from sets of overlapping reads, or we might design probes over particular classes of annotation feature (genes, exons, microRNAs, whatever). These things change depending on what kind of experiment you're running.


            Originally posted by lihairi:
            How do you do the observed/expected calculation?
            We're actually not using a proper O/E calculation at the moment (though it would be nice to move to that). Our filter calculates what percentage of reads which overlap a particular region come from exact overlaps, with the same start and end position. For randomly placed reads this value is usually very low (below 5%), but in some cases you will see towers of exactly duplicated reads which usually indicate a mapping or PCR problem, and we filter these out. You can also get high values from low absolute numbers of reads, so you either need to account for this, or ignore it if you're going to filter those regions anyway.
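For what a simple observed/expected version might look like, here is a hedged sketch; it is not the filter described above. It assumes reads land uniformly at random on `n_placements` equally likely (start, end) placements (for fixed-length reads on one strand, roughly region length minus read length plus one), in which case a given read shares its placement with at least one of the other N - 1 reads with probability 1 - (1 - 1/M)^(N - 1). The function names and the `min_reads` default are hypothetical.

```python
def expected_shared_fraction(n_reads, n_placements):
    """Expected fraction of reads sharing a placement with another read,
    assuming uniform random placement on n_placements positions."""
    if n_reads < 2:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_placements) ** (n_reads - 1)


def observed_over_expected(observed_fraction, n_reads, n_placements, min_reads=10):
    """Ratio of the observed exact-overlap fraction to its random expectation.

    Returns None when there are too few reads for the ratio to be meaningful.
    """
    if n_reads < max(min_reads, 2):
        return None
    expected = expected_shared_fraction(n_reads, n_placements)
    if expected == 0.0:
        return None
    return observed_fraction / expected
```

Under these assumptions, 50 reads on only 100 possible placements already expect about 39% sharing by chance, which is why a fixed percentage cutoff can misfire in short regions with high coverage, while the same 50 reads on 10,000 placements expect under 1%.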

            I hope that makes things a bit clearer.



            • #7
              Recently I downloaded a lot of RNA-seq data from NCBI and mapped it to reference RNA using eland_25. I found that around 30-40% of tags mapped to exactly the same positions as other tags (even after removing the effect of RNA isoforms), which is much higher than 5%. I do not know how to interpret these results.
