  • Cufflinks memory usage

    Hi all,
    I am having trouble running Cufflinks on paired-end (PE) RNA-Seq libraries generated on a HiSeq machine. I used TopHat to map the PE reads successfully (about 160 million reads), which gave me a BAM file of about 6 GB for each library. I then fed the BAM file to Cufflinks, running with 20 cores. The problem is that Cufflinks seems to be taking more than 120 GB of RAM and a very long time (about a week) to process one library. Have any of you had a similar experience? Am I doing something wrong? Any suggestions? Thanks!

  • #2
    Which reference annotations are you using? I had a similar experience with the GENCODE annotations (>2.5 million annotations). I then switched to the RefSeq annotations, and the Cufflinks runs are now much shorter. Of course, it quantifies far fewer potential transcripts, but for most applications that can be sufficient.
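
One way to compare the size of two annotation files before committing to a long run is to count their records per feature type. The following is a minimal sketch, assuming a standard tab-delimited GTF in which the third column holds the feature type; the function name `count_gtf_features` and the toy records are hypothetical and only serve as illustration, they are not taken from GENCODE or RefSeq.

```python
from collections import Counter
import io


def count_gtf_features(lines):
    """Count GTF records per feature type (third tab-separated column)."""
    counts = Counter()
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # A GTF record has nine tab-separated columns; ignore malformed lines.
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


# Toy, made-up records for illustration only.
toy_gtf = io.StringIO(
    "##description: toy annotation\n"
    "chr1\ttoy\tgene\t100\t900\t.\t+\t.\tgene_id \"g1\";\n"
    "chr1\ttoy\ttranscript\t100\t900\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";\n"
    "chr1\ttoy\texon\t100\t300\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";\n"
    "chr1\ttoy\texon\t600\t900\t.\t+\t.\tgene_id \"g1\"; transcript_id \"t1\";\n"
)
feature_counts = count_gtf_features(toy_gtf)
for feature, n in sorted(feature_counts.items()):
    print(feature, n)
```

For a real file, the same function can be fed an open file handle, e.g. `with open(path) as fh: count_gtf_features(fh)`, where `path` points at whichever annotation you are testing.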



    • #3
      In addition to posting your reference, you may want to post which options you're using in Cufflinks.



      • #4
        Right. I am also using the GENCODE annotation. I think I will experiment with another annotation file to see how it goes. The only option I am using in Cufflinks is the one specifying the library type for the PE reads (i.e. --library-type fr-unstranded).



        • #5
          oscar,

          Did you ever get an answer or a workaround? I am running Cufflinks on a similar-sized BAM file and am also running out of memory.

          cheers,



          • #6
            I couldn't get the job done until I upgraded to the latest version of Cufflinks, which seems to use less memory. Good luck!



            • #7
              Hey Oscar,

              I am running the newest version of Cufflinks (I literally downloaded it this week). I got Cufflinks to run on a small subset of reads (about 4 million paired-end reads, 100 nt with a ~200 nt inner gap), but the whole data set is ~50X bigger. How many reads did you use? And what (approximately) was the memory usage for your file size (RAM per GB of the BAM/SAM file)?

              Cheers,



              • #8
                Hi,
                I don't remember the exact numbers, as it was about a year ago. One thing I do remember is that I was lucky enough to have access to a machine with 1 TB of RAM, and I used about 500 GB for about 160 million reads. I hope this helps. Good luck!
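
As a back-of-envelope check, the figures in this thread (about 500 GB for about 160 million reads) can be turned into a per-read rate and scaled to another data set. This is only a rough sketch: it assumes memory grows linearly with read count, which nothing in this thread confirms, and in practice the annotation and Cufflinks version also seem to matter. The function names are hypothetical.

```python
def gb_per_million_reads(ram_gb, reads_millions):
    """RAM used per million reads for one observed run."""
    if reads_millions <= 0:
        raise ValueError("reads_millions must be positive")
    return ram_gb / reads_millions


def extrapolate_ram_gb(observed_ram_gb, observed_reads_millions, target_reads_millions):
    """Scale an observed RAM figure to another read count.

    Assumption: memory scales linearly with the number of reads.
    """
    rate = gb_per_million_reads(observed_ram_gb, observed_reads_millions)
    return rate * target_reads_millions


# Reported above: roughly 500 GB of RAM for roughly 160 million reads.
rate = gb_per_million_reads(500, 160)

# A data set about 50x larger than a 4-million-read subset, i.e. ~200 million reads.
estimate = extrapolate_ram_gb(500, 160, 4 * 50)

print(f"about {rate:.1f} GB per million reads")
print(f"about {estimate:.0f} GB for 200 million reads under the linear assumption")
```

Treat the result as an order-of-magnitude guide for requesting cluster resources, not as a prediction.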



                • #9
                  We only have one compute node with that much memory on our cluster and I didn't want to usurp it if I didn't have to. But I guess that's what the resources are for. Thanks Oscar. (Also, about how long did it take to run?)

                  Cheers,



                  • #10
                    Expect it to run longer than a week.
