  • Cufflinks 1.0.0: Major new features in assembly and differential expression

    Hi all,

    I'm extremely pleased to announce the release of Cufflinks 1.0.0. We've incorporated feedback from users here and elsewhere to make Cufflinks much more powerful and accessible. Please don't hesitate to try it out and post questions and feedback here. Highlights of the release are listed below, and further information can be found at:

    http://cufflinks.cbcb.umd.edu

    Thanks,

    --The Cufflinks Team

    ***********************

    1.0.0 release - 5/5/2011

    This release represents a huge leap for Cufflinks in terms of performance and features. It is highly recommended that all users upgrade to this version of Cufflinks. Updates and improvements include:

    * A new Reference Annotation Based Transcript (RABT) assembly mode has been added. More details can be found in the How Cufflinks Works section.
    * Major improvements to Cuffdiff. Handling of replicates in Cuffdiff has been dramatically overhauled. Cuffdiff now models fragment count overdispersion with a beta negative binomial distribution in each condition prior to testing. See the substantially updated How Cufflinks Works page for more details.
    * Bias correction described here is now enabled with the -b/--frag-bias-correct option (-r/--reference-seq is no longer in use). A path to the reference multi-fasta used in mapping must be supplied following the option.
    * Added support for improved handling of multi-mapping reads. Enable with the -u/--multi-read-correct option. See How Cufflinks Works for more details.
    * Trimming has been instituted to more accurately locate the 3' ends of transcripts during assembly based on coverage.
    * Cufflinks now includes a new tool called Cuffmerge to help merge assemblies from multiple samples into a single GTF for use with Cuffdiff. The tool also helps integrate a reference annotation file. See the Getting Started page for more details.
    * Output file formats have been made consistent between Cufflinks and Cuffdiff. See the Manual for more details on the new formats.
    * Both GFF3 and GTF2.2 annotations are now fully supported as input to all programs (see here).
    * Improved reporting of map properties.
    * The programs now check for available updates automatically on launch.
    * Upper-quartile normalization has been fixed to be consistent with published literature (enable with -N/--upper-quartile-norm).
    * Fixed a bug where some splice-junction reads were lost in quantitation.
    * Fixed a bug where reads landing in introns were over-filtered in assembly.
    * Numerous improvements in speed for both assembly and quantitation.
    * Cuffdiff now uses dramatically less memory. Cufflinks' memory footprint has also shrunk.
    * Numerous minor bug fixes.
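
As a rough illustration of how the options named in the notes above fit together on one command line, here is a minimal Python sketch that assembles an argument list for cufflinks. The file names (`accepted_hits.bam`, `genome.fa`, `cufflinks_out`) are hypothetical, and the `-o` output directory option is not mentioned in the release notes and is assumed here. The sketch only builds the command; it does not run it.

```python
import shlex


def build_cufflinks_command(reads, genome_fasta=None, multi_read_correct=False,
                            upper_quartile_norm=False, output_dir=None):
    """Assemble a cufflinks argument list from options named in the release notes."""
    cmd = ["cufflinks"]
    if genome_fasta is not None:
        # -b/--frag-bias-correct is followed by the multi-FASTA used in mapping.
        cmd += ["-b", genome_fasta]
    if multi_read_correct:
        # -u/--multi-read-correct enables the multi-mapping read handling.
        cmd.append("-u")
    if upper_quartile_norm:
        # -N/--upper-quartile-norm enables upper-quartile normalization.
        cmd.append("-N")
    if output_dir is not None:
        # Assumption: -o selects the output directory (not stated in the notes).
        cmd += ["-o", output_dir]
    cmd.append(reads)
    return cmd


# Hypothetical file names, for illustration only.
example = build_cufflinks_command(
    "accepted_hits.bam",
    genome_fasta="genome.fa",
    multi_read_correct=True,
    upper_quartile_norm=True,
    output_dir="cufflinks_out",
)
print(" ".join(shlex.quote(arg) for arg in example))
```

The resulting list can be passed to `subprocess.run` once the paths point at real files.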

  • #2
    Awesome!

    This is great for many reasons. I think the new option to have Reference Annotation Based Transcript (RABT) assembly will be a really nice feature.

    Furthermore, I'm excited to see how Cuffdiff now handles biological replicates for differential splicing and promoter usage. We have a lot of samples (19 vs. 20 in a case-control experiment), and what Cuffdiff previously identified as significant didn't represent the variation across our RNA-seq samples well: when we plotted the individual samples run through Cufflinks, it seemed that outliers could really throw the results off.

    Your use of the JS divergence metric is such a cool way to find differences between conditions that we were working on our own way of using it while still accounting for the variability of our biological replicates. If the new version of Cuffdiff does that better, we're definitely excited to use it instead.
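
For readers unfamiliar with the JS (Jensen-Shannon) divergence mentioned here, the following is a minimal sketch of the textbook definition applied to two vectors of relative isoform abundances. It is not Cuffdiff's implementation, and it ignores replicate variability entirely. The function names and example numbers are made up for illustration.

```python
import math


def normalize(values):
    """Scale non-negative abundances so that they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return [v / total for v in values]


def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so between 0 and 1) of two distributions."""
    if len(p) != len(q):
        raise ValueError("both conditions must list the same isoforms")
    p, q = normalize(p), normalize(q)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


def js_distance(p, q):
    """Square root of the divergence, which satisfies the triangle inequality."""
    return math.sqrt(max(js_divergence(p, q), 0.0))


# Hypothetical isoform abundances for one gene in two conditions.
condition_a = [80.0, 15.0, 5.0]
condition_b = [20.0, 30.0, 50.0]
print(js_divergence(condition_a, condition_b))
```

Identical isoform proportions give 0, and completely non-overlapping isoform usage gives 1.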



    • #3
      Dear all,
      FPKM calculation with 0.9.3 and 1.0.0 gives significantly different results. The top genes in our cancer samples had FPKM values of approx. 15000; these values have now changed to ~1.5M. What's more, when you sort genes.fpkm_tracking, you find many different regions with the same FPKM value.

      Btw, great update Cole.
      Tomasz Stokowy
      www.sequencing.io.gliwice.pl
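
For reference when comparing values between versions, the conventional definition of FPKM (fragments per kilobase of transcript per million mapped fragments) can be sketched as below. This is the plain textbook formula, not the estimator Cufflinks uses internally, and the numbers are invented. It does not diagnose the difference reported above; it only shows that a change in the normalizing denominator rescales every value by a common factor.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and total fragment count must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Invented numbers: 500 fragments on a 2 kb transcript, 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))

# Dividing the denominator by 100 multiplies the result by 100.
print(fpkm(500, 2000, 100_000))
```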



      • #4
        Thanks!

        I'm very curious to see how the new cuffdiff goes for you. We're doing similar analyses, but on a more diverse collection of tissues and conditions and with fewer replicates in each. I'd love to hear how cuffdiff performs in designs like yours, so please don't hesitate to contact us with questions, suggestions, or problems.



        • #5
          I am looking forward to testing this version, thanks for the news!

          One thing I noticed: the gffread program mentioned on the website for validating GFF3 files is not present in the Mac OS or Linux binaries. Any comments on that?



          • #6
            Originally posted by berath
            I am looking forward to testing this version, thanks for the news!

            One thing I noticed: the gffread program mentioned on the website for validating GFF3 files is not present in the Mac OS or Linux binaries. Any comments on that?
            Sure - my comment is that I'm a bozo for not including them in the binary packages. The script that builds those packages hadn't been correctly updated. I just posted a microrelease (v1.0.1) that fixes this and several other issues.



            • #7
              Thanks for all your awesome work Cole and team!

              cuffmerge is an awesome new feature, particularly when cuffdiff is on tap. Thanks for adding this. I found that giving cuffmerge the stats.combined.gtf files, rather than the transcripts.gtf files, was the way to go. Using transcripts.gtf gives these errors:

              Code:
              Error: duplicate GFF ID 'SAMPLE1-T24-MEDIA.ENSG00000105855.2' encountered!
                      [FAILED]
              Error: could not execute gtf_to_sam
              Traceback (most recent call last):
                File "/packages/cufflinks/cufflinks-1.0.1/bin/cuffmerge", line 513, in ?
                  sys.exit(main())
                File "/packages/cufflinks/cufflinks-1.0.1/bin/cuffmerge", line 492, in main
                  sam_input_files = convert_gtf_to_sam(gtf_input_files)
                File "/packages/cufflinks/cufflinks-1.0.1/bin/cuffmerge", line 287, in convert_gtf_to_sam
                  sam_out = gtf_to_sam(line)
                File "/packages/cufflinks/cufflinks-1.0.1/bin/cuffmerge", line 247, in gtf_to_sam
                  exit(1)
              TypeError: 'str' object is not callable
              I assume this is as expected?



              • #8
                Hi Cole,

                In the Cufflinks manual, one cuffmerge option should be -g/--ref-gtf, not -r/--ref-gtf as shown on the current website, right?

                Thank you,



                • #9
                  Hi Cole,

                  I am trying to run cuffmerge on my data - 3 x gtf-files created by cufflinks (galaxy).
                  Now running cuffmerge on the commandline I'm using a reference gtf and fasta file + the manifest file.

                  Everything works until I reach the stage where it compares against the reference file (gtf). Then I get the error below:

                  [Wed May 11 15:09:34 2011] Comparing against reference file /Users/aneldavanderwalt/data/maize/ZmB73_5a_WGS.gtf
                  You are using Cufflinks v1.0.1, which is the most recent release.
                  No fasta index found for /Users/aneldavanderwalt/Desktop/ZmB73_RefGen_v2.fasta. Rebuilding, please wait..
                  Fasta index rebuilt.
                  Error: duplicate GFF ID 'CUFF.AC155624.2' encountered!
                  [FAILED]
                  Error: could not execute cuffcompare


                  I looked at the run log and saw this:

                  cuffcompare -o tmp_meta_asm -r /Users/aneldavanderwalt/data/maize/ZmB73_5a_WGS.gtf -s /Users/aneldavanderwalt/Desktop/ZmB73_RefGen_v2.fasta R.cuffmerge//transcripts.gtf R.cuffmerge//transcripts.gtf


                  It seems like cuffmerge is comparing the same file twice. I'm not sure whether that assumption is right, or how to solve it if it is.

                  Any suggestions?

                  Thanks!

                  Anelda



                  • #10
                    Originally posted by Anelda
                    Hi Cole,

                    I am trying to run cuffmerge on my data - 3 x gtf-files created by cufflinks (galaxy).
                    Now running cuffmerge on the commandline I'm using a reference gtf and fasta file + the manifest file.

                    Everything works until I reach the stage where it compares against the reference file (gtf). Then I get the error below:

                    [Wed May 11 15:09:34 2011] Comparing against reference file /Users/aneldavanderwalt/data/maize/ZmB73_5a_WGS.gtf
                    You are using Cufflinks v1.0.1, which is the most recent release.
                    No fasta index found for /Users/aneldavanderwalt/Desktop/ZmB73_RefGen_v2.fasta. Rebuilding, please wait..
                    Fasta index rebuilt.
                    Error: duplicate GFF ID 'CUFF.AC155624.2' encountered!
                    [FAILED]
                    Error: could not execute cuffcompare


                    I looked at the run log and saw this:

                    cuffcompare -o tmp_meta_asm -r /Users/aneldavanderwalt/data/maize/ZmB73_5a_WGS.gtf -s /Users/aneldavanderwalt/Desktop/ZmB73_RefGen_v2.fasta R.cuffmerge//transcripts.gtf R.cuffmerge//transcripts.gtf


                    It seems like cuffmerge is comparing the same file twice. I'm not sure whether that assumption is right, or how to solve it if it is.

                    Any suggestions?

                    Thanks!

                    Anelda
                    What's in the assembly manifest that you provide as input to cuffmerge?



                    • #11
                      Cole, I am having the same problem with cuffcompare. I wonder if this is related to my previous post regarding problems with cuffmerge?

                      This is using ensembl Homo_sapiens.GRCh37.61.

                      Code:
                      ~/bin/cuffcompare \
                              -r ${REFPATH}/Homo_sapiens.GRCh37.61.gtf \
                              -s ${REFPATH}/Homo_sapiens.GRCh37.61.ensembl.fa \
                              -p SAMPLE1-T24-MEDIA \
                              -o SAMPLE1-T24-MEDIA.stats \
                              SAMPLE1-T24-MEDIA.ensembl.transcripts.gtf
                      Dies:
                      Code:
                      Error: duplicate GFF ID 'SAMPLE1-T24-MEDIA.ENSG00000105855.2' encountered!
                      I have 56 samples in this experiment; about half work fine and the other half throw these errors. I'm using Cufflinks 1.0.1 with the patched version of cuffcompare that fixed the GFF code changes regarding the treatment of gene_id vs. gene_name. FWIW, I had this same problem before the patch.
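
To narrow down which records might be involved in a "duplicate GFF ID" error, one option is to scan the GTF for transcript IDs that appear on more than one chromosome or strand. The sketch below does that in Python. Whether this is actually what triggers the error in cuffcompare is an assumption, not something confirmed in this thread, and the example records (apart from the ID quoted from the error message) are invented.

```python
import re
from collections import defaultdict

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_loci(gtf_lines):
    """Map each transcript_id to the set of (seqname, strand) pairs it occurs on."""
    loci = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match:
            loci[match.group(1)].add((fields[0], fields[6]))
    return loci


def suspicious_ids(gtf_lines):
    """Return transcript IDs seen on more than one chromosome/strand combination."""
    return {
        tid: sorted(places)
        for tid, places in transcript_loci(gtf_lines).items()
        if len(places) > 1
    }


# Invented records reusing one ID on two chromosomes.
demo = [
    'chr7\tCufflinks\texon\t100\t200\t.\t+\t.\tgene_id "ENSG00000105855"; transcript_id "ENSG00000105855.2";',
    'chr9\tCufflinks\texon\t100\t200\t.\t-\t.\tgene_id "ENSG00000105855"; transcript_id "ENSG00000105855.2";',
    'chr1\tCufflinks\texon\t10\t20\t.\t+\t.\tgene_id "G2"; transcript_id "G2.1";',
]
print(suspicious_ids(demo))
```

To check a real file, pass an open file handle, for example `suspicious_ids(open(path))` with `path` pointing at the assembly GTF.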



                      • #12
                        Hi,

                        I notice that the manual states that cufflinks does not currently support SAM files with CIGAR strings using operators other than 'M' and 'N'. Is this still accurate for cufflinks 1.0.x? I've just run cufflinks on a SAM produced by tophat with --allow-indels and it completed without any errors. Should I trust this output?
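
One way to see how much of a SAM file is affected is to count alignments whose CIGAR strings use operators other than 'M' and 'N'. The Python sketch below does this for SAM text lines. It does not answer whether Cufflinks handles such reads correctly; it only reports how many there are. The example records are invented.

```python
import re
from collections import Counter

# CIGAR operators defined by the SAM format.
CIGAR_OP = re.compile(r"\d+([MIDNSHP=X])")


def cigar_operators(cigar):
    """Return the set of operators used in a CIGAR string ('*' means unavailable)."""
    if cigar == "*":
        return set()
    return set(CIGAR_OP.findall(cigar))


def count_non_mn(sam_lines):
    """Count alignments per operator for operators other than M and N."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        for op in cigar_operators(fields[5]) - {"M", "N"}:
            counts[op] += 1
    return dict(counts)


# Invented records: one with an insertion, one spliced read.
demo_sam = [
    "@HD\tVN:1.0",
    "r1\t0\tchr1\t100\t50\t30M2I18M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\tchr1\t200\t50\t25M100N25M\t*\t0\t0\tACGT\tIIII",
]
print(count_non_mn(demo_sam))
```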



                        • #13
                          Originally posted by Cole Trapnell
                          What's in the assembly manifest that you provide as input to cuffmerge?
                          The paths to the three cufflinks gtf files

                          /Users/aneldavanderwalt/Downloads/R1.gtf
                          /Users/aneldavanderwalt/Downloads/R2.gtf
                          /Users/aneldavanderwalt/Downloads/R3.gtf


                          Each one is on its own line, with a newline after the last file.

                          Let me know if you need more info.
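
A quick sanity check of a manifest like the one above can rule out simple causes such as a repeated or mistyped path. The Python sketch below reads manifest text with one path per line and reports duplicates and paths that do not exist. It is a hypothetical helper, not part of cuffmerge, and it does not explain the error in this thread. The `exists` parameter is there only so the check can be tried without real files.

```python
import os
from collections import Counter


def check_manifest(manifest_text, exists=os.path.exists):
    """Report duplicate and missing paths in a one-path-per-line manifest."""
    paths = [line.strip() for line in manifest_text.splitlines() if line.strip()]
    counts = Counter(paths)
    return {
        "paths": paths,
        "duplicates": sorted(p for p, n in counts.items() if n > 1),
        "missing": sorted(p for p in counts if not exists(p)),
    }


# Invented manifest with one repeated entry; pretend only R3.gtf is absent.
demo_manifest = "R1.gtf\nR2.gtf\nR2.gtf\nR3.gtf\n"
report = check_manifest(demo_manifest, exists=lambda p: p != "R3.gtf")
print(report)
```

For a real manifest, read the file contents first and call `check_manifest(text)` so that the default `os.path.exists` check is used.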



                          • #14
                            Originally posted by Anelda
                            The paths to the three cufflinks gtf files

                            /Users/aneldavanderwalt/Downloads/R1.gtf
                            /Users/aneldavanderwalt/Downloads/R2.gtf
                            /Users/aneldavanderwalt/Downloads/R3.gtf


                            Each one is on its own line, with a newline after the last file.

                            Let me know if you need more info.
                            Oh yes, I'm running cuffmerge version 1.0.1 on Mac OS X 10.6.7.

                            Thanks,
                            Anelda



                            • #15
                              Hm. Very strange. Can you send us a small amount of GTF from one or more of your assemblies along with the reference GTF so that we can reproduce this? It's hard to say what's happening here, but the log looks very odd. We'll keep the data to ourselves of course, and chuck it once the bug is fixed.
