
  • Simon Anders
    replied
    Yes, your calculation is correct if you want to compute RPKM values according to the original definition suggested by Mortazavi et al. However, once a gene has alternative splicing, dividing by the summed length of all exons, regardless of whether they are actually used (or are even mutually exclusive), can cause severe problems. This is why Trapnell et al. (Nature Biotechnology 28: 511 (2010)) argued that the definition is not a good one.

    On the other hand, how you normalize for transcript length does not matter that much. In most use cases you are not interested in absolute expression anyway, because you end up comparing the expression of the same gene across samples rather than comparing expression between different genes.
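
The point about length normalization can be made explicit with a short worked equation. This is a sketch under an assumption the post does not spell out: the same length L_g is used for gene g in both samples. The symbols C (reads mapped to the gene), N (total mapped reads), and L_g (length in bp) are introduced here for illustration.

```latex
\mathrm{RPKM}_{g,s} = \frac{10^{9}\, C_{g,s}}{N_{s}\, L_{g}}
\qquad\Longrightarrow\qquad
\frac{\mathrm{RPKM}_{g,A}}{\mathrm{RPKM}_{g,B}}
  = \frac{10^{9}\, C_{g,A} / (N_{A} L_{g})}{10^{9}\, C_{g,B} / (N_{B} L_{g})}
  = \frac{C_{g,A} / N_{A}}{C_{g,B} / N_{B}}
```

The length term cancels in the ratio between samples A and B, so the choice of L_g only matters when comparing different genes or reporting absolute values.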



  • mauro.pala
    replied
    hi kwatts59,

    Your calculation seems to be correct.

    About the second question: you need to define the "transcriptional units". If your aim is to establish gene-level expression, your transcriptional unit should be the full set of the gene's exons. If you are interested in isoform-level expression, you should calculate RPKM for each isoform.

    here you can find some examples:

    http://woldlab.caltech.edu/rnaseq/


    http://sandberg.cmb.ki.se/media/data/rnaseq/instructions-rpkmforgenes.html

    http://cufflinks.cbcb.umd.edu/



    cheers


    M.
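
The difference between a gene-level and an isoform-level length can be sketched as follows. This is a minimal illustration, not the method of any of the tools linked above: the isoform names and exon coordinates are made up, and the helper names `merge_intervals` and `covered_length` are hypothetical. Coordinates are treated as 1-based and inclusive, as in GFF3.

```python
def merge_intervals(intervals):
    """Merge overlapping or directly adjacent (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_length(intervals):
    """Number of bases covered by the intervals, counting each base once."""
    return sum(e - s + 1 for s, e in merge_intervals(intervals))


# Hypothetical exon coordinates for two isoforms of one gene.
isoforms = {
    "isoform_A": [(100, 199), (300, 399), (500, 599)],
    "isoform_B": [(100, 199), (350, 449), (500, 599)],
}

# Gene level: the union of all exons from all isoforms.
all_exons = [iv for exons in isoforms.values() for iv in exons]
gene_level_length = covered_length(all_exons)

# Isoform level: each isoform's own exons only.
isoform_lengths = {name: covered_length(exons) for name, exons in isoforms.items()}

print(gene_level_length)  # 350
print(isoform_lengths)    # {'isoform_A': 300, 'isoform_B': 300}
```

In this toy case the union is longer than either isoform, so a gene-level RPKM divides by a length that no single transcript actually has.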



  • kwatts59
    replied
    I really need to know if this calculation is correct or not.
    Is there anybody out there that knows how to calculate RPKM?



  • kwatts59
    replied
    Any comments anybody?



  • kwatts59
    started a topic RPKM calculation help

    RPKM calculation help

    I am trying to write a Perl script to calculate the RPKM for genes of interest, and I need some verification that I am doing this calculation correctly. There are 31.8 million reads mapped to the genome.

    Here is the GFF3 file of a gene for example. There are 4,011 reads that map to this gene (between positions 4542759 and 4544980).

    Chr2 MSU_osa1r6 gene 4542759 4544980 . + . ID=13102.t00754;Name=unknown gene
    Chr2 MSU_osa1r6 mRNA 4542759 4544980 . + . ID=13102.m00974;Parent=13102.t00754
    Chr2 MSU_osa1r6 five_prime_UTR 4542759 4543030 . + . Parent=13102.m00974
    Chr2 MSU_osa1r6 CDS 4543031 4543177 . + 0 Parent=13102.m00974
    Chr2 MSU_osa1r6 CDS 4543287 4543709 . + 0 Parent=13102.m00974
    Chr2 MSU_osa1r6 CDS 4543836 4543952 . + 0 Parent=13102.m00974
    Chr2 MSU_osa1r6 CDS 4544064 4544423 . + 0 Parent=13102.m00974
    Chr2 MSU_osa1r6 three_prime_UTR 4544424 4544980 . + . Parent=13102.m00974

    There are 4 exons for this particular gene which contain a total of 1,043 base pairs.

    So the RPKM for this particular gene is (4,011 reads / 1.043 kb of exon) / 31.8 million mapped reads = 120.9 RPKM

    Is my calculation correct?

    Also, if there are reads that map to the intron regions or partially to intron regions, should those reads be excluded from the calculation?
    This gene also has 3 other alternatively spliced forms. Which splice form is the correct one to use?

    Thanks in advance
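
The calculation in this post can be reproduced with a short script. The sketch below is in Python rather than Perl, and the function names are hypothetical. It assumes the feature rows are whitespace-separated as pasted above and that the "exon" length is the sum of the four CDS rows. One thing it makes visible: GFF3 coordinates are 1-based and inclusive, so each feature's length is end - start + 1, which gives 1,047 bp for the four CDS rows. The 1,043 bp figure corresponds to end - start. The CDS rows also exclude the UTRs, and whether to count UTR bases is a choice the thread does not settle.

```python
GFF3_TEXT = """
Chr2 MSU_osa1r6 gene 4542759 4544980 . + . ID=13102.t00754;Name=unknown gene
Chr2 MSU_osa1r6 mRNA 4542759 4544980 . + . ID=13102.m00974;Parent=13102.t00754
Chr2 MSU_osa1r6 five_prime_UTR 4542759 4543030 . + . Parent=13102.m00974
Chr2 MSU_osa1r6 CDS 4543031 4543177 . + 0 Parent=13102.m00974
Chr2 MSU_osa1r6 CDS 4543287 4543709 . + 0 Parent=13102.m00974
Chr2 MSU_osa1r6 CDS 4543836 4543952 . + 0 Parent=13102.m00974
Chr2 MSU_osa1r6 CDS 4544064 4544423 . + 0 Parent=13102.m00974
Chr2 MSU_osa1r6 three_prime_UTR 4544424 4544980 . + . Parent=13102.m00974
"""


def parse_features(gff3_text, feature_type):
    """Return (start, end) pairs for rows whose type column equals feature_type."""
    intervals = []
    for line in gff3_text.strip().splitlines():
        fields = line.split()
        if len(fields) < 9 or fields[2] != feature_type:
            continue
        intervals.append((int(fields[3]), int(fields[4])))
    return intervals


def total_length(intervals):
    """Summed feature length in bp; GFF3 coordinates are 1-based and inclusive."""
    return sum(end - start + 1 for start, end in intervals)


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count / (length_bp / 1e3) / (total_mapped_reads / 1e6)


cds = parse_features(GFF3_TEXT, "CDS")
cds_length = total_length(cds)

print(cds_length)                               # 1047
print(round(rpkm(4011, 1043, 31.8e6), 1))       # 120.9, the value in the post
print(round(rpkm(4011, cds_length, 31.8e6), 1)) # 120.5, with inclusive lengths
```

The two results differ by well under one percent here, so the off-by-one mainly matters for short features or when results need to match another tool exactly.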
