Seqanswers Leaderboard Ad

Collapse

Announcement

Collapse
No announcement yet.
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

  • Binary characters in cuffcompare result & Questions on cuffdiff

    Hi,

    I am using tophat/cufflinks packages analyzing my RNA-seq data. I found a small bug in cuffcompare.

    After I compared my reference gtf with transcript.gtf, I got the combined.gtf. But, sometimes, I found some of the strand information was in binary character. For example, if I use "less" to check the combined.gtf, for some transcripts, the strand information is "^@". If I submit this combined.gtf to UCSC genome browser, it will say "cannot read xxx.gtf file". After I changed these binary characters into ".", it works fine.

    Another question is, does anyone know how to set up the minimal threshold in the cuffdiff to do the test. For example, I have a gene expressed mildly in one sample (FPKM 8), but no expression in the other sample (FPKM 0). It is actually one of the most interesting genes I was looking for. But in the cuffdiff, it has the mark of "NOTEST", thus the significance is "no". Can anyone give me any help on this? Can I manually select these genes as differentially expressed genes, because they are expressed and actually the pvalue is also 0?

    Plus, can I remove genes expressed in the low level manually, e.g. for genes with FPKM < 1? These genes dont look very promising...

    Cheers,
    Jun

  • #2
    I'm glad I found this post. I was having the exact same problem and changing the binary character to "." fixed my (current) issues as well.

    Sam

    Comment


    • #3
      Originally posted by nkwuji View Post
      Hi,

      Another question is, does anyone know how to set up the minimal threshold in the cuffdiff to do the test. For example, I have a gene expressed mildly in one sample (FPKM 8), but no expression in the other sample (FPKM 0). It is actually one of the most interesting genes I was looking for. But in the cuffdiff, it has the mark of "NOTEST", thus the significance is "no". Can anyone give me any help on this? Can I manually select these genes as differentially expressed genes, because they are expressed and actually the pvalue is also 0?

      Plus, can I remove genes expressed in the low level manually, e.g. for genes with FPKM < 1? These genes dont look very promising...

      Cheers,
      Jun
      The cuffdiff -c option might be what you are looking for
      Code:
      -c/--min-alignment-count <int>
      This limits the differential testing based on counts - rather than FPKM. However, do you think it is wise/necessary to use this feature if what you want to say is that it is present in one condition and not the other?

      Comment


      • #4
        Thx RockChalkJayhawk.

        I will think about this part, though the result seems to be a little weird on genes expressed at low levels. For example, for this gene expressed in one sample with FPKM of 8, and in the other sample with FPKM of 0, the result is shown as NOTEST. But for the other gene, in one sample, the FPKM is 0.25, and in the other sample is 0. THe result is OK, and significant.

        Possibly it can be explained by the second gene is longer, and the min-alignment-count could be higher than default setting and got the test significant. But I think it may be better to limit the result by FPKM (or average coverage) other than total fragments(or reads), otherwise, it may have bias on longer genes.

        Comment

        Latest Articles

        Collapse

        • seqadmin
          Recent Advances in Sequencing Analysis Tools
          by seqadmin


          The sequencing world is rapidly changing due to declining costs, enhanced accuracies, and the advent of newer, cutting-edge instruments. Equally important to these developments are improvements in sequencing analysis, a process that converts vast amounts of raw data into a comprehensible and meaningful form. This complex task requires expertise and the right analysis tools. In this article, we highlight the progress and innovation in sequencing analysis by reviewing several of the...
          Yesterday, 07:48 AM
        • seqadmin
          Essential Discoveries and Tools in Epitranscriptomics
          by seqadmin




          The field of epigenetics has traditionally concentrated more on DNA and how changes like methylation and phosphorylation of histones impact gene expression and regulation. However, our increased understanding of RNA modifications and their importance in cellular processes has led to a rise in epitranscriptomics research. “Epitranscriptomics brings together the concepts of epigenetics and gene expression,” explained Adrien Leger, PhD, Principal Research Scientist...
          04-22-2024, 07:01 AM

        ad_right_rmr

        Collapse

        News

        Collapse

        Topics Statistics Last Post
        Started by seqadmin, Yesterday, 07:17 AM
        0 responses
        11 views
        0 likes
        Last Post seqadmin  
        Started by seqadmin, 05-02-2024, 08:06 AM
        0 responses
        19 views
        0 likes
        Last Post seqadmin  
        Started by seqadmin, 04-30-2024, 12:17 PM
        0 responses
        20 views
        0 likes
        Last Post seqadmin  
        Started by seqadmin, 04-29-2024, 10:49 AM
        0 responses
        28 views
        0 likes
        Last Post seqadmin  
        Working...
        X