The patched version is working more than twice as fast (actually it went just a little bit faster, but using -p 16 instead of -p 32), so that's good.
But now I'm having a problem that I've encountered with other Tuxedo tools (cuffquant and cuffdiff), with both patched and unpatched 2.2.1: I'm getting a segfault even though I'm not running out of memory. In this case I ran patched Cufflinks 2.2.1 on two different .bam files and got a segfault for both files at the same locus, near the beginning of "Re-estimating abundances with bias and multi-read correction". I don't have an overabundance of reads at or near this location when I visualize the aligned files. I also don't think it's an issue with a specific locus: in cuffdiff I tried masking the offending locus and it just segfaulted somewhere else, in both cases near the beginning of "Testing for differential expression and regulation in locus".
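One way to double-check that a crash really is a segfault rather than an out-of-memory kill is to look at how the process exited. The sketch below is only an illustration and not part of Cufflinks: `classify_exit` and `run_and_classify` are hypothetical helpers, and it assumes a POSIX system, where Python's `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys


def classify_exit(returncode):
    """Describe how a child process ended, based on subprocess's return code.

    On POSIX, a negative return code -N means the child was killed by signal N.
    """
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGSEGV:
        return "segfault (SIGSEGV)"
    if returncode == -signal.SIGKILL:
        # The Linux OOM killer sends SIGKILL, but so can other things,
        # so this is a hint rather than proof of running out of memory.
        return "killed (SIGKILL, possibly out of memory)"
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


def run_and_classify(cmd):
    """Run cmd (a list of arguments) and return (returncode, description)."""
    result = subprocess.run(cmd)
    return result.returncode, classify_exit(result.returncode)


# Example call (assumes cufflinks is on PATH; arguments are placeholders):
# code, description = run_and_classify(["cufflinks", "-o", "outputFolder", "inputSorted.bam"])
```

If the description comes back as SIGKILL rather than SIGSEGV, the kernel log is worth checking for OOM-killer messages before assuming a bug in the tool.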
-
Originally posted by offspring:
One "long runtime" problem in cufflinks 2.2.1 related to an inefficient data structure has been reported and fixed recently: https://groups.google.com/forum/#!to...rs/UzLCJhj3lUE
It will be part of cufflinks 2.2.2 (not released yet); it would be interesting to know if this fixes your issue.
The commit in question: https://github.com/cole-trapnell-lab...a0292d507f17b6
Chris
Last edited by biocomputer; 01-07-2015, 02:43 PM.
-
One "long runtime" problem in cufflinks 2.2.1 related to an inefficient data structure has been reported and fixed recently: https://groups.google.com/forum/#!to...rs/UzLCJhj3lUE
It will be part of cufflinks 2.2.2 (not released yet); it would be interesting to know if this fixes your issue.
The commit in question: https://github.com/cole-trapnell-lab...a0292d507f17b6
Chris
-
Thank you. Yes, I definitely plan to try other programs besides Cufflinks/Tuxedo to compare results.
-
Originally posted by biocomputer:
Cufflinks 2.2.1 is taking a really long time. I start with 45 million 100 bp paired-end, rRNA-depleted, stranded reads aligned with STAR: 24 million align uniquely and 15 million are multimappers. I'm using a sorted .bam as the input for cufflinks. The cufflinks command is:
Code:
cufflinks -o outputFolder -p 32 -g gencode.v2.annotation.gtf -M maskFile.gtf -b mm10.fa -u --library-type fr-secondstrand inputSorted.bam
With -p 32, all 32 CPUs are in use for pretty much the entire time. Here's the usage for the past 24 hours from the 32-CPU node I've been using; you can see it going down as threads complete at the end of the cufflinks run.
The mask file masks out a few very highly expressed genes which make up almost 20% of all reads. When I didn't mask these out, it got hung up at these loci.
The library type is reversed because I'm following these instructions.
After 3.5 days I think it's just about done (it's at "waiting for 18 threads to complete"). Given that the number of input reads isn't huge (especially once all the masked reads are accounted for) and I'm using 32 CPUs, I'm surprised it's taking so long. It doesn't seem to be getting hung up at any specific spots, but it does seem to slow down as it goes, until it's taking many hours for each of the last few percent.
Is this runtime normal? Anything I can do to speed it up?
Currently, DESeq seems to be a better choice than Cufflinks.
-
Cufflinks runtime
Cufflinks 2.2.1 is taking a really long time. I start with 45 million 100 bp paired-end, rRNA-depleted, stranded reads aligned with STAR: 24 million align uniquely and 15 million are multimappers. I'm using a sorted .bam as the input for cufflinks. The cufflinks command is:
Code:
cufflinks -o outputFolder -p 32 -g gencode.v2.annotation.gtf -M maskFile.gtf -b mm10.fa -u --library-type fr-secondstrand inputSorted.bam
With -p 32, all 32 CPUs are in use for pretty much the entire time. Here's the usage for the past 24 hours from the 32-CPU node I've been using; you can see it going down as threads complete at the end of the cufflinks run.
The mask file masks out a few very highly expressed genes which make up almost 20% of all reads. When I didn't mask these out, it got hung up at these loci.
The library type is reversed because I'm following these instructions.
After 3.5 days I think it's just about done (it's at "waiting for 18 threads to complete"). Given that the number of input reads isn't huge (especially once all the masked reads are accounted for) and I'm using 32 CPUs, I'm surprised it's taking so long. It doesn't seem to be getting hung up at any specific spots, but it does seem to slow down as it goes, until it's taking many hours for each of the last few percent.
Is this runtime normal? Anything I can do to speed it up?
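For anyone wanting to pick loci to mask in a similar way, ranking genes by their share of the aligned reads is one option. This is a minimal sketch with made-up gene names and counts; it assumes per-gene read counts are already available from some counting tool, and `top_read_share` is a hypothetical helper rather than a Cufflinks feature.

```python
def top_read_share(counts, threshold=0.05):
    """Return (gene, fraction) pairs for genes holding at least `threshold` of all reads.

    counts: dict mapping gene name -> read count (hypothetical input).
    The result is sorted from the largest share to the smallest.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    shares = [(gene, n / total) for gene, n in counts.items()]
    heavy = [(gene, frac) for gene, frac in shares if frac >= threshold]
    return sorted(heavy, key=lambda pair: pair[1], reverse=True)


# Made-up counts for illustration only.
example_counts = {"geneA": 120, "geneB": 60, "geneC": 15, "geneD": 5, "geneE": 800}
for gene, frac in top_read_share(example_counts, threshold=0.1):
    print(f"{gene}\t{frac:.1%}")
```

The genes it reports would still need to be written out as GTF records before they could be passed to -M, which this sketch does not do.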