  • Binary characters in cuffcompare result & Questions on cuffdiff

    Hi,

    I am using the TopHat/Cufflinks packages to analyze my RNA-seq data, and I found a small bug in cuffcompare.

    After comparing my reference GTF with transcript.gtf, I got combined.gtf. But for some transcripts the strand field contains a binary character instead of "+", "-" or ".": when I view combined.gtf with "less", the strand is shown as "^@" (a NUL byte). If I submit this combined.gtf to the UCSC Genome Browser, it reports "cannot read xxx.gtf file". After I changed these binary characters to ".", it works fine.
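For anyone who wants to script the workaround described above, here is a minimal Python sketch. It assumes the standard 9-column, tab-separated GTF layout with the strand in column 7; the function names `fix_strand` and `fix_gtf` are made up for illustration and are not part of Cufflinks. It only rewrites strand fields that are not "+", "-" or ".", and leaves every other line untouched.

```python
def fix_strand(line: str) -> str:
    """Return the GTF line with an invalid strand field replaced by '.'."""
    fields = line.rstrip("\n").split("\t")
    # A GTF record has 9 tab-separated columns; the strand is column 7 (index 6).
    # Comment or malformed lines with fewer columns are passed through unchanged.
    if len(fields) >= 9 and fields[6] not in {"+", "-", "."}:
        fields[6] = "."
    return "\t".join(fields)


def fix_gtf(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, repairing the strand column line by line."""
    # latin-1 maps every byte to a character, so a stray NUL byte cannot
    # raise a decoding error while reading.
    with open(src_path, encoding="latin-1") as src, \
            open(dst_path, "w", encoding="latin-1") as dst:
        for line in src:
            dst.write(fix_strand(line) + "\n")
```

Usage would be something like `fix_gtf("combined.gtf", "combined.fixed.gtf")`, writing to a new file so the original cuffcompare output is kept for comparison.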

    Another question: does anyone know how to set the minimum threshold that cuffdiff requires before it performs the test? For example, I have a gene that is mildly expressed in one sample (FPKM 8) but not expressed in the other (FPKM 0). It is actually one of the most interesting genes I was looking for. But cuffdiff marks it "NOTEST", so the significance is "no". Can anyone help with this? Can I manually select such genes as differentially expressed, given that they are expressed and the reported p-value is also 0?

    Also, can I manually remove genes expressed at low levels, e.g. genes with FPKM < 1? These genes don't look very promising...
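A manual low-expression filter of this kind could be sketched as below. This is only an illustration: the column names `value_1` and `value_2` are an assumption about where the per-sample FPKM values sit in the cuffdiff output and may differ between versions, and the helper names are hypothetical. The sketch keeps a gene when at least one sample reaches the cutoff, so an "8 versus 0" gene is not thrown away together with the genuinely low ones.

```python
def passes_fpkm_filter(fpkm_1: float, fpkm_2: float, min_fpkm: float = 1.0) -> bool:
    """Keep a gene if it reaches min_fpkm in at least one of the two samples."""
    return max(fpkm_1, fpkm_2) >= min_fpkm


def filter_rows(rows, min_fpkm=1.0, col_1="value_1", col_2="value_2"):
    """Filter an iterable of dict-like rows (e.g. from csv.DictReader).

    col_1 and col_2 name the per-sample FPKM columns; the defaults are an
    assumption and should be checked against the header of the actual file.
    """
    return [
        row for row in rows
        if passes_fpkm_filter(float(row[col_1]), float(row[col_2]), min_fpkm)
    ]
```

Rows could come from `csv.DictReader(handle, delimiter="\t")` on the tab-separated results file. Filtering on the maximum rather than on both samples is a design choice; requiring both samples to pass would remove exactly the on/off genes of interest here.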

    Cheers,
    Jun

  • #2
    I'm glad I found this post. I was having the exact same problem and changing the binary character to "." fixed my (current) issues as well.

    Sam


    • #3
      Originally posted by nkwuji
      Hi,

      Another question: does anyone know how to set the minimum threshold that cuffdiff requires before it performs the test? For example, I have a gene that is mildly expressed in one sample (FPKM 8) but not expressed in the other (FPKM 0). It is actually one of the most interesting genes I was looking for. But cuffdiff marks it "NOTEST", so the significance is "no". Can anyone help with this? Can I manually select such genes as differentially expressed, given that they are expressed and the reported p-value is also 0?

      Also, can I manually remove genes expressed at low levels, e.g. genes with FPKM < 1? These genes don't look very promising...

      Cheers,
      Jun
      The cuffdiff -c option might be what you are looking for:
      Code:
      -c/--min-alignment-count <int>
      This limits differential testing based on alignment counts rather than FPKM. However, do you think it is wise or necessary to use this feature if what you want to say is that the gene is present in one condition and not the other?


      • #4
        Thx RockChalkJayhawk.

        I will think about this, though the results for genes expressed at low levels seem a little weird. For example, the gene with FPKM 8 in one sample and FPKM 0 in the other is reported as NOTEST. But another gene, with FPKM 0.25 in one sample and 0 in the other, has status OK and is significant.

        Possibly this is because the second gene is longer, so its fragment count exceeded the default min-alignment-count and the test was run and came out significant. But I think it would be better to set the threshold on FPKM (or average coverage) rather than on total fragments (or reads); otherwise the test is biased toward longer genes.
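The length effect suggested above can be illustrated with the textbook FPKM definition (fragments per kilobase of transcript per million mapped fragments), inverted to give an approximate fragment count. This is a rough sketch only: it ignores the effective-length and isoform-level corrections that Cufflinks applies, and the gene lengths and library size below are made-up numbers, not values from this dataset.

```python
def approx_fragments(fpkm: float, length_bp: float, total_mapped: float) -> float:
    """Invert FPKM = fragments / (length in kb) / (mapped fragments in millions)."""
    return fpkm * (length_bp / 1e3) * (total_mapped / 1e6)


# Hypothetical example: two genes with the same FPKM but different lengths,
# in a library of 10 million mapped fragments.
short_gene = approx_fragments(fpkm=1.0, length_bp=1_000, total_mapped=10_000_000)
long_gene = approx_fragments(fpkm=1.0, length_bp=10_000, total_mapped=10_000_000)

# The longer gene collects ten times as many fragments at equal FPKM, so it
# crosses a fixed count threshold much more easily.
print(short_gene, long_gene)
```

Under these assumptions a count-based cutoff such as min-alignment-count admits long genes at lower FPKM than short ones, which is consistent with the explanation proposed above but does not prove that it is what happened for these two genes.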
