  • sum of FPKMs?

    I'm analyzing the expression levels of certain genes in different tissues with data from a database and I need to count two different genes as one because I know by experimental data that they were erroneously annotated.

    The expression levels in the database are in FPKM, and I know I can't simply sum the values of the two genes to count them as one.

    If I had raw counts what I would do is

    gene A = 400, gene B = 300, counting them as a single gene = 700.

    What would be the best way to do this with FPKMs?

    gene A = 12 FPKM
    gene B = 20 FPKM
    as single gene = x ?
    Last edited by dlepe; 07-23-2014, 11:51 AM.

  • #2
    FPKM is fragments per kilobase of transcript per million mapped reads.

    So then
    x = total number of fragments / ((total number of bases of transcript / 1000) * (mapped fragments / 1000000))
      = (fragments mapped to gene A + fragments mapped to gene B) / (((bases of gene A + bases of gene B) / 1000) * (mapped fragments / 1000000))

    This would be if gene A and gene B did not overlap (and by that I mean that no read is mapped to both gene A and gene B). If they do, you'll have to use something like the inclusion-exclusion principle. I don't think you can simply add the two FPKM values, like you mentioned.
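As a rough sketch of the formula above, the calculation from raw counts could look like the following in Python. The counts 400 and 300 come from the original question; the gene lengths, the library size, and the function names `fpkm` and `merged_fpkm` are made up for illustration, and the sketch assumes no fragment maps to both genes.

```python
def fpkm(fragments, length_bp, total_mapped_fragments):
    """FPKM = fragments / (kilobases of transcript * millions of mapped fragments)."""
    kilobases = length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions_mapped)


def merged_fpkm(frag_a, len_a, frag_b, len_b, total_mapped_fragments):
    """FPKM of genes A and B treated as one gene.

    Assumes the genes do not overlap, i.e. no fragment is counted for both.
    """
    return fpkm(frag_a + frag_b, len_a + len_b, total_mapped_fragments)


# Counts (400, 300) are from the question; lengths and library size are hypothetical.
LEN_A, LEN_B = 2_000, 1_000
TOTAL_MAPPED = 20_000_000

fpkm_a = fpkm(400, LEN_A, TOTAL_MAPPED)
fpkm_b = fpkm(300, LEN_B, TOTAL_MAPPED)
x = merged_fpkm(400, LEN_A, 300, LEN_B, TOTAL_MAPPED)
print(fpkm_a, fpkm_b, x)
```

With these hypothetical inputs the merged value falls between the two individual FPKMs rather than equalling their sum.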

    • #3
      The thing is, I don't have the total number of mapped fragments for the libraries. I would have to see whether the raw data is available somewhere and do the mapping myself.

      Since I'm only trying to estimate how the expression of the gene in question correlates with that of another gene, a friend suggested simply using the average of gene A and gene B as the expression value.

      His reasoning is that, since FPKMs are normalized by length, and assuming that the raw counts for gene A and gene B are similar, the FPKM for gene A or gene B alone should be very close to the FPKM we would get by treating both as a single gene.

      • #4
        I suppose you could do an average. I think a weighted average would be better suited for this. You could weight each FPKM value by the length of the corresponding gene.
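To illustrate the difference between the two suggestions, here is a small Python sketch comparing a plain average with a length-weighted one. The FPKM values 12 and 20 are from the original question; the gene lengths and the function names are hypothetical.

```python
def simple_mean(fpkm_a, fpkm_b):
    """Unweighted average of two FPKM values."""
    return (fpkm_a + fpkm_b) / 2


def length_weighted_mean(fpkm_a, len_a, fpkm_b, len_b):
    """Average of two FPKM values, each weighted by its gene length in bases."""
    return (fpkm_a * len_a + fpkm_b * len_b) / (len_a + len_b)


# FPKMs (12, 20) are from the question; the lengths below are hypothetical.
plain = simple_mean(12, 20)
equal_lengths = length_weighted_mean(12, 1_500, 20, 1_500)
unequal_lengths = length_weighted_mean(12, 3_000, 20, 1_000)
print(plain, equal_lengths, unequal_lengths)
```

The two averages agree only when the genes have the same length; when the lengths differ, the weighted value is pulled toward the FPKM of the longer gene.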

        • #5
          Yeah I guess, I'll see how that goes, thanks.

          • #6
            I just did the math, and the weighted average is what you want, provided the genes don't overlap, as I stated previously. So if gene A has FPKM a, and gene B has FPKM b, you want:

            (a * |A| + b * |B|) / (|A| + |B|)

            where |x| is the length of gene x.
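One way to sketch the reasoning behind this weighted average, using symbols that are not in the original posts: let n_A and n_B be the fragment counts for the two genes and N the total number of mapped fragments, and assume no fragment is counted for both genes.

```latex
\begin{align*}
a &= \frac{10^{9}\, n_A}{|A|\, N}
  \quad\Longrightarrow\quad n_A = \frac{a\,|A|\,N}{10^{9}},
  \qquad
  n_B = \frac{b\,|B|\,N}{10^{9}} \\[4pt]
\mathrm{FPKM}_{A \cup B}
  &= \frac{10^{9}\,(n_A + n_B)}{(|A| + |B|)\, N}
   = \frac{a\,|A| + b\,|B|}{|A| + |B|}
\end{align*}
```

The total N cancels, which suggests the merged value can be obtained from the two FPKMs and the gene lengths alone, without the library size.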

            Edit: If you want, I can type up my reasoning in latex. I just don't know of a nice way to display fractions on seqanswers.

            • #7
              awesome, I'll look into it, thanks again.
