
  • ysnapus
    replied
    Originally posted by ravipatel4
    Hi Kcornelius,

    Yes, I found very long transcripts, and it seems multiple loci are merged into a single locus when using cuffmerge!

    I am totally confused about whether to use cuffmerge or cuffcompare to merge assemblies from different experimental conditions.

    Did you still get the long merged loci after specifying a reference GFF to cuffmerge and also using reference-based assembly in cufflinks (providing -g gff to cufflinks)?


  • pengchy
    replied
    I think one alternative is to merge all the BAM files together and run cufflinks once.
    Or, do in silico normalization of the large FASTQ files before running tophat.


  • ravipatel4
    replied
    Originally posted by Kcornelius
    Sure,

    I have posted one in a related thread:

    http://seqanswers.com/forums/showthread.php?t=19533

    Hi Kcornelius,

    Yes, I found very long transcripts, and it seems multiple loci are merged into a single locus when using cuffmerge!

    I am totally confused about whether to use cuffmerge or cuffcompare to merge assemblies from different experimental conditions.


  • Kcornelius
    replied
    Sure,

    I have posted one in a related thread:

    http://seqanswers.com/forums/showthread.php?t=19533


  • upper
    replied
    Originally posted by Kcornelius
    Hi Shen,

    when you use cuffmerge, do you see any skipped regions with mouse?
    Have you checked the transcript lengths of the new assembly?
    I have seen in my output extremely long merged genes which were in fact several different RefSeq IDs.

    Hi Kcornelius,

    Do you mean the merged.gtf that cuffmerge outputs?
    I checked some transcripts, but almost all of them are in known regions. Can you show an example?


  • Kcornelius
    replied
    Hi Shen,

    when you use cuffmerge, do you see any skipped regions with mouse?
    Have you checked the transcript lengths of the new assembly?
    I have seen in my output extremely long merged genes which were in fact several different RefSeq IDs.
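
One way to run the transcript-length check suggested above is to sum exon lengths per transcript in the merged GTF and list the longest ones. The following is only a minimal sketch: it assumes a standard nine-column, tab-separated GTF with a `transcript_id "..."` attribute, and the two example records are reconstructed from the exon coordinates quoted later in this thread (the attribute strings are illustrative, not copied from a real file).

```python
import re
from collections import defaultdict


def transcript_lengths(gtf_lines):
    """Sum exon lengths per transcript_id from an iterable of GTF lines."""
    lengths = defaultdict(int)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Only exon records contribute to the spliced transcript length.
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match is None:
            continue
        start, end = int(fields[3]), int(fields[4])
        # GTF coordinates are 1-based and inclusive.
        lengths[match.group(1)] += end - start + 1
    return dict(lengths)


def longest(lengths, n=10):
    """Return the n longest transcripts as (transcript_id, length) pairs."""
    return sorted(lengths.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Illustrative records built from the Mreg exon coordinates posted in this thread.
_attrs_444 = 'gene_id "Sample_A.444"; transcript_id "Sample_A.444.1";'
_attrs_442 = 'gene_id "Sample_A.442"; transcript_id "Sample_A.442.1";'
example_gtf = [
    "1\tCufflinks\ttranscript\t72206430\t72258706\t.\t-\t.\t" + _attrs_444,
    "1\tCufflinks\texon\t72206430\t72207593\t.\t-\t.\t" + _attrs_444,
    "1\tCufflinks\texon\t72208896\t72209059\t.\t-\t.\t" + _attrs_444,
    "1\tCufflinks\texon\t72210646\t72210736\t.\t-\t.\t" + _attrs_444,
    "1\tCufflinks\texon\t72238617\t72238776\t.\t-\t.\t" + _attrs_444,
    "1\tCufflinks\texon\t72258591\t72258706\t.\t-\t.\t" + _attrs_444,
    "1\tCufflinks\texon\t72205812\t72206054\t.\t.\t.\t" + _attrs_442,
]

if __name__ == "__main__":
    for transcript_id, length in longest(transcript_lengths(example_gtf)):
        print(transcript_id, length)
```

For a real run, the same function could be fed an open file handle, for example `transcript_lengths(open("merged.gtf"))`. Transcripts that are far longer than any annotated gene in the region are candidates for the merged-loci problem described here.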


  • upper
    started a topic cuffcompare or cuffmerge for cuffdiff

    cuffcompare or cuffmerge for cuffdiff

    Hi all,
    This is an old topic in our community (see here and here).
    Although C. Trapnell recommends the cufflinks -> cuffmerge -> cuffdiff workflow for differential expression analysis (here and in this new paper), I must bring it up again because there is too much confusion.

    I have 3 paired-end samples and have two goals:
    [1] discover new isoforms and their structure
    [2] differential gene and transcript expression analysis, including the structure of those transcripts
    tophat + cufflinks ran without problems for the 3 samples.

    For these two aims, I use cuffcompare to analyze the transfrags that cufflinks assembled, and cuffdiff to analyze differential expression.

    One workflow:
    cuffcompare -o compare -s genomic_seq.fa -r known.gtf transcriptA.gtf transcriptB.gtf transcriptC.gtf
    cuffmerge -g known.gtf -s genomic_seq.fa 3_assembly_GTF_list.txt
    cuffdiff -o -b genomic_seq.fa -L A,B,C -u -p 6 merged.gtf A.bam B.bam C.bam
    I use cuffcompare because it outputs .refmap and .tmap files for each sample. From these and the cufflinks transcripts.gtf I can extract the reference gene and region for every cufflinks transcript, like this:
    For transcript ENSMUST00000048860
    In sample A:
    Gene_name Transcript_id Class_code Cufflinks_transcript_id FPKM Coverage Transcript_length Ref_Transcript_length Chromosome Strand Start End Exon_num Exon_start-Exon_end;ditto
    Mreg ENSMUST00000048860 c Sample_A.442.1 9.998678 41.470300 243 2493 1 . 72205812 72206054 1 72205812-72206054;
    Mreg ENSMUST00000048860 = Sample_A.444.1 25.753304 108.941130 1695 2493 1 - 72206430 72258706 5 72206430-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72258591-72258706;
    In sample B:
    Mreg ENSMUST00000048860 j Sample_B.478.1 0.355742 1.460597 1682 2493 1 - 72206370 72243058 5 72206370-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72243016-72243058;
    Mreg ENSMUST00000048860 = Sample_B.478.2 1.652110 6.783196 1742 2493 1 - 72206370 72258693 5 72206370-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72258591-72258693;
    And in compare.tracking:
    TCONS_00001024 XLOC_000542 Mreg|ENSMUST00000048860 c q1:Sample_A.442|Sample_A.442.1|100|9.998678|4.225939|15.771418|41.470300|- - -
    TCONS_00001025 XLOC_000542 Mreg|ENSMUST00000048860 = q1:Sample_A.444|Sample_A.444.1|100|25.753304|24.064335|27.442272|108.941130|1695 q2:Sample_B.478|Sample_B.478.2|100|1.652110|1.194891|2.109330|6.783196|1742 -
    TCONS_00002413 XLOC_000542 Mreg|ENSMUST00000048860 j - q2:Sample_B.478|Sample_B.478.1|22|0.355742|0.086913|0.624571|1.460597|- -
    And this gene in the cuffdiff result (treated):
    Tracking_id Gene_id Gene_name Class_code Nearest_ref_id TSS Locus Sample_1 Sample_2 FPKM_1 FPKM_2 Foldchange log2(fold_change) test_stat p_value q_value Significant
    TCONS_00004275 XLOC_001277 Mreg j ENSMUST00000048860 TSS2418 1:72205806-72258881 sample_A sample_B 8.99002 0.315128 0.0350531 -4.83432 3.98775 6.67042e-05 0.00727599 yes
    If I focus on ENSMUST00000048860 because of its fold change in the cuffdiff result, I need to go back to the cuffcompare result and find the cufflinks-assembled transcripts that matched this known transcript, so I can decide whether each assembled transcript is known (class code = or c) or novel (class code j).
    But the cuffdiff ID TCONS_00004275 is not the same as any cuffcompare TCONS ID, and the locus 1:72205806-72258881 is not the same either. So I cannot find the nearest structure for ENSMUST00000048860 in sample A and sample B. Is it Sample_A.442.1, Sample_A.444.1, or something else?
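
One rough way to narrow down the candidates when the TCONS IDs do not match is to test which assembled transcripts overlap the cuffdiff locus. This is only a sketch: the helper names are made up for illustration, the coordinates are the ones posted above, and coordinate conventions (0-based versus 1-based) are ignored.

```python
def parse_locus(locus):
    """Split a locus string such as '1:72205806-72258881' into (chrom, start, end)."""
    chrom, _, span = locus.partition(":")
    start, _, end = span.partition("-")
    return chrom, int(start), int(end)


def overlaps(locus, chrom, start, end):
    """True if the interval chrom:start-end overlaps the given locus string."""
    l_chrom, l_start, l_end = parse_locus(locus)
    return chrom == l_chrom and start <= l_end and end >= l_start


# Per-sample cufflinks transcripts for Mreg, copied from the tables above.
candidates = {
    "Sample_A.442.1": ("1", 72205812, 72206054),
    "Sample_A.444.1": ("1", 72206430, 72258706),
    "Sample_B.478.1": ("1", 72206370, 72243058),
    "Sample_B.478.2": ("1", 72206370, 72258693),
}

cuffdiff_locus = "1:72205806-72258881"  # locus reported for TCONS_00004275
hits = [
    name
    for name, (chrom, start, end) in candidates.items()
    if overlaps(cuffdiff_locus, chrom, start, end)
]

if __name__ == "__main__":
    print(hits)
```

With these numbers all four candidates fall inside the cuffdiff locus, so overlap alone does not identify a single source transcript; comparing exon boundaries against merged.gtf would be the next step.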


    So I changed the workflow (without cuffmerge):

    cuffcompare -o compare -s genomic_seq.fa -r known.gtf transcriptA.gtf transcriptB.gtf transcriptC.gtf
    cuffdiff -o -b genomic_seq.fa -L A,B,C -u -p 6 combined.gtf A.bam B.bam C.bam
    Again using ENSMUST00000048860 as the example,
    the cuffcompare result (treated) is:
    In sample A:
    Mreg ENSMUST00000048860 c Sample_A.443.1 9.998678 41.470300 243 2493 1 . 72205812 72206054 1 72205812-72206054;
    Mreg ENSMUST00000048860 = Sample_A.444.1 25.753304 108.941130 1695 2493 1 - 72206430 72258706 5 72206430-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72258591-72258706;
    In sample B:
    Mreg ENSMUST00000048860 j Sample_B.479.1 0.355742 1.460597 1682 2493 1 - 72206370 72243058 5 72206370-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72243016-72243058;
    Mreg ENSMUST00000048860 = Sample_B.479.2 1.652110 6.783196 1742 2493 1 - 72206370 72258693 5 72206370-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72258591-72258693;
    Another point of confusion: these are the same cufflinks + cuffcompare programs, but the cufflinks IDs differ between the two runs (Sample_A.442.1 and Sample_A.444.1 versus Sample_A.443.1 and Sample_A.444.1), and the same happens in sample B.

    In compare.tracking:
    TCONS_00001024 XLOC_000542 Mreg|ENSMUST00000048860 c q1:Sample_A.443|Sample_A.443.1|100|9.998678|4.225939|15.771418|41.470300|- - -
    TCONS_00001025 XLOC_000542 Mreg|ENSMUST00000048860 = q1:Sample_A.444|Sample_A.444.1|100|25.753304|24.064335|27.442272|108.941130|1695 q2:Sample_B.479|Sample_B.479.2|100|1.652110|1.194891|2.109330|6.783196|1742 -
    TCONS_00002413 XLOC_000542 Mreg|ENSMUST00000048860 j - q2:Sample_B.479|Sample_B.479.1|22|0.355742|0.086913|0.624571|1.460597|- -
    TCONS_00003426 XLOC_000542 Mreg|ENSMUST00000048860 c - - q3:Sample_C.463|Sample_C.463.1|100|2.478294|1.853823|3.102766|10.598946|-
    TCONS_00003427 XLOC_000542 Mreg|ENSMUST00000048860 c - - q3:Sample_C.464|Sample_C.464.1|100|2.878927|1.557985|4.199870|11.712125|-
    This gene in the cuffdiff result (treated):
    TCONS_00001025 XLOC_000542 Mreg = ENSMUST00000048860 TSS1655 1:72206327-72258693 sample_A sample_B 18.4693 1.30708 0.0707704 -3.82072 3.82424 0.000131174 0.0148275 yes

    Here the ID for ENSMUST00000048860, TCONS_00001025, is the same as one of the cuffcompare TCONS IDs, and I know it maps to Sample_A.444.1 and Sample_B.479.2. So I can find the structures of Sample_A.444.1 and Sample_B.479.2:
    Strand Start End Exon_num Exon_start-Exon_end;ditto
    - 72206430 72258706 5 72206430-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72258591-72258706;
    - 72206370 72258693 5 72206370-72207593;72208896-72209059;72210646-72210736;72238617-72238776;72258591-72258693;
    Then I can do the next analysis.
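
The lookup described above, from a cuffdiff TCONS ID back to the per-sample cufflinks transcripts, can be scripted against compare.tracking. The following is only a sketch: it assumes whitespace-separated columns (TCONS ID, XLOC ID, reference gene|transcript, class code, then one `qN:gene|transcript|FMI|FPKM|...` entry or `-` per sample), and the function name is made up for illustration. The two example lines are copied from the tracking output posted above.

```python
def parse_tracking_line(line):
    """Parse one cuffcompare .tracking line into a dictionary."""
    fields = line.split()
    tcons_id, xloc_id, ref, class_code = fields[:4]
    ref_gene, _, ref_transcript = ref.partition("|")
    samples = {}
    for sample_index, entry in enumerate(fields[4:], start=1):
        if entry == "-":
            continue  # no matching transfrag in this sample
        # Drop the "qN:" prefix, then split the pipe-delimited values.
        parts = entry.split(":", 1)[1].split("|")
        samples[sample_index] = {
            "gene_id": parts[0],
            "transcript_id": parts[1],
            "fpkm": float(parts[3]),
        }
    return {
        "tcons_id": tcons_id,
        "xloc_id": xloc_id,
        "ref_gene": ref_gene,
        "ref_transcript": ref_transcript,
        "class_code": class_code,
        "samples": samples,
    }


tracking_lines = [
    "TCONS_00001025 XLOC_000542 Mreg|ENSMUST00000048860 = "
    "q1:Sample_A.444|Sample_A.444.1|100|25.753304|24.064335|27.442272|108.941130|1695 "
    "q2:Sample_B.479|Sample_B.479.2|100|1.652110|1.194891|2.109330|6.783196|1742 -",
    "TCONS_00003426 XLOC_000542 Mreg|ENSMUST00000048860 c - - "
    "q3:Sample_C.463|Sample_C.463.1|100|2.478294|1.853823|3.102766|10.598946|-",
]

# Index by TCONS ID so a cuffdiff tracking_id can be looked up directly.
by_tcons = {rec["tcons_id"]: rec for rec in map(parse_tracking_line, tracking_lines)}

if __name__ == "__main__":
    record = by_tcons["TCONS_00001025"]
    for sample_index, info in sorted(record["samples"].items()):
        print(sample_index, info["transcript_id"], info["fpkm"])
```

This only works when cuffdiff was run on the cuffcompare combined GTF, because the TCONS IDs assigned by cuffmerge are a separate numbering.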

    But the cuffdiff results from these two workflows are very different for this transcript, ENSMUST00000048860.
    cuffdiff result (treated):
    Tracking_id Gene_id Gene_name Class_code Nearest_ref_id TSS Locus Sample_1 Sample_2 FPKM_1 FPKM_2 Foldchange log2(fold_change) test_stat p_value q_value Significant
    TCONS_00004275 XLOC_001277 Mreg j ENSMUST00000048860 TSS2418 1:72205806-72258881 sample_A sample_B 8.99002 0.315128 0.0350531 -4.83432 3.98775 6.67042e-05 0.00727599 yes
    TCONS_00001025 XLOC_000542 Mreg = ENSMUST00000048860 TSS1655 1:72206327-72258693 sample_A sample_B 18.4693 1.30708 0.0707704 -3.82072 3.82424 0.000131174 0.0148275 yes
    The class code, FPKM, and fold change all differ, and there are other differences between the two pipelines as well. The known transcripts do not match between the two cuffdiff results.
    I want to know which cuffdiff result is more credible, and which workflow can meet the needs of my analysis.
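
As a sanity check on the two rows above, the fold-change columns can be recomputed from the reported FPKM values. This is only a sketch using the numbers posted in this thread; it assumes the log2 fold change is taken as log2(FPKM_2 / FPKM_1), which is consistent with both rows.

```python
import math


def log2_fold_change(fpkm_1, fpkm_2):
    """log2 of sample 2 over sample 1, matching the sign used in the rows above."""
    return math.log2(fpkm_2 / fpkm_1)


# (FPKM_1, FPKM_2, reported log2 fold change) for Mreg / ENSMUST00000048860.
rows = {
    "cuffmerge workflow (TCONS_00004275)": (8.99002, 0.315128, -4.83432),
    "cuffcompare workflow (TCONS_00001025)": (18.4693, 1.30708, -3.82072),
}

if __name__ == "__main__":
    for label, (fpkm_1, fpkm_2, reported) in rows.items():
        recomputed = log2_fold_change(fpkm_1, fpkm_2)
        print(f"{label}: recomputed {recomputed:.4f}, reported {reported}")
```

Both rows are internally consistent, so the disagreement comes from the FPKM estimates themselves, which depend on the set of transcript models each GTF gives cuffdiff for this locus.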

    Thanks
    Shen
    Last edited by upper; 05-08-2012, 12:39 AM.
