  • cufflinks stuck at a locus?

    When I run cufflinks v2.0.2 (command lines below), it processes loci quickly for a few moments and then gets stuck at locus chr1:16765605-16765782. I've read on other threads that I should wait it out, but the job has been running for over 51 hours on 8 processors. It makes no difference whether I use the -g option, or whether I mask with a .gtf of all known rRNAs and tRNAs (the -M option in the commands below; the entries were downloaded from rmsk in the UCSC Table Browser, filtered for rRNA and tRNA, and the two files were then joined with cat). The reads were aligned with tophat2 without problems (~90% aligned, or 50,000,000 paired-end reads, from ribosomal depletion). Any suggestions? My only other thought is to mask with all of rmsk, but I'm not sure that will help.

    Code:
    cufflinks -p 8 -M rrnatrnacatandsort.gtf -o C31_cufout accepted_hits.bam
    or
    Code:
    cufflinks -p 8 -M rrnatrnacatandsort.gtf -g ucscknowngenes.gtf -o C31_cufout accepted_hits.bam
    or
    Code:
    cufflinks -p 8 -g ucscknowngenes.gtf -o C31_cufout accepted_hits.bam
    or
    Code:
    cufflinks -p 8 -o C31_cufout accepted_hits.bam
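For anyone wanting to reproduce the mask file described above, here is a minimal sketch of the "cat and sort" step. The input names rrna.gtf and trna.gtf are placeholders (the original post does not name them), and the helper build_mask is hypothetical; it only assumes standard tab-separated GTF with the chromosome in column 1 and the start coordinate in column 4.

```shell
#!/usr/bin/env bash
# build_mask OUT IN1 [IN2 ...]
# Concatenate GTF files, drop comment and blank lines, and sort by
# chromosome (column 1), then by start coordinate (column 4, numeric).
build_mask() {
    local out="$1"
    shift
    cat "$@" \
        | grep -v -e '^#' -e '^$' \
        | LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k4,4n > "$out"
}

# Hypothetical usage (rrna.gtf and trna.gtf are placeholder file names):
#   build_mask rrnatrnacatandsort.gtf rrna.gtf trna.gtf
#   cufflinks -p 8 -M rrnatrnacatandsort.gtf -o C31_cufout accepted_hits.bam
```

It is also worth checking that the chromosome names in the mask (chr1 versus 1, for example) match those in accepted_hits.bam, since a mask with non-matching names would not overlap any reads.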

  • #2
    Hi Caballien,

    Have you found out what's going on with cufflinks getting stuck at a locus?

    I've got a similar problem, starting at the "Inspecting reads and determining fragment length distribution" stage, and I have no idea what is causing it. I've used a rather tricky workaround to get my data through, but that doesn't mean I've solved the problem.

    Code:
    #cufflinks -p 8 -M mask.gff -o ./data.th.cl ./data.th/accepted_hits.bam
    Here is how I ran cufflinks with the same dataset on 3 computers. The mask.gff is a file to exclude some genes I don't need.

    Computer-1:
    HP G7 server with CentOS 6 and 28GB RAM. It got stuck at "Processing Locus Tb427_01_v4:1064380-1064569".
    The same problem occurred when running cufflinks compiled from source code.

    Computer-2:
    DELL PC with CentOS 5.7 and 8GB RAM. It got stuck at "Processing Locus Tb427_01_v4:1064380-1064569".

    PS: 1) Tb427 is a T. brucei strain, 01 is chr1 and v4 is just a version. The length of chr1 is 1064569.
    2) It works when the mask.gff includes all items except CDS and exon, but that doesn't make sense.

    Computer-3:
    Rocks cluster server with CentOS 5.6 and 256GB RAM. It ran fine here, but figuring out why it works is over my head.
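One detail in the numbers above: the stuck locus Tb427_01_v4:1064380-1064569 ends exactly at the stated chr1 length of 1064569, and both loci reported in this thread are under 200 bp wide. Whether either fact matters is not established here, but it is easy to check for other datasets. The two helpers below (locus_span and ends_at_contig_end, both hypothetical names) are a small sketch that only parses the NAME:START-END string cufflinks prints.

```shell
#!/usr/bin/env bash
# locus_span "NAME:START-END" -> prints the locus width, END - START + 1
locus_span() {
    local range="${1##*:}"      # keep only START-END (text after the last ':')
    local start="${range%-*}"
    local end="${range#*-}"
    echo $(( end - start + 1 ))
}

# ends_at_contig_end "NAME:START-END" LENGTH -> exit status 0 if END equals LENGTH
ends_at_contig_end() {
    local range="${1##*:}"
    [ "${range#*-}" -eq "$2" ]
}

# Usage with the loci quoted in this thread:
#   locus_span "Tb427_01_v4:1064380-1064569"
#   locus_span "chr1:16765605-16765782"
#   ends_at_contig_end "Tb427_01_v4:1064380-1064569" 1064569 && echo "locus reaches contig end"
```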

    ---

    What I found across these three computers: 1) Computer-3 makes heavy use of virtual memory and stack, as seen with the "top" command; 2) the other two computers get stuck at a locus once virtual memory and stack reach about 3GB, even if the job runs all day; 3) there is plenty of physical memory left on both Computer-1 and Computer-2; and 4) all three computers have more than 10GB of swap space.

    Does anyone have an idea that explains this, or does anyone think something is wrong with how cufflinks moves data among swap, stack and physical memory?
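To make the memory observations above easier to compare between machines, here is a sketch that logs a running process's virtual and resident memory from /proc instead of reading it off "top". It is Linux-only, and the helper names vm_kb and watch_mem are hypothetical. As an untested assumption, not something confirmed in this thread, a ceiling near 3GB with free RAM can point to a per-process limit or a 32-bit binary, so the comments at the end list standard commands for checking both.

```shell
#!/usr/bin/env bash
# vm_kb PID FIELD -> prints the value (in kB) of FIELD, e.g. VmSize or VmRSS,
# as reported in /proc/PID/status. Linux only.
vm_kb() {
    awk -v f="$2:" '$1 == f { print $2 }' "/proc/$1/status"
}

# watch_mem PID [INTERVAL_SECONDS] -> prints one sample per interval
# until the process exits.
watch_mem() {
    local pid="$1" interval="${2:-60}"
    while [ -r "/proc/$pid/status" ]; do
        printf '%s VmSize=%s kB VmRSS=%s kB\n' \
            "$(date +%T)" "$(vm_kb "$pid" VmSize)" "$(vm_kb "$pid" VmRSS)"
        sleep "$interval"
    done
}

# Hypothetical usage:
#   watch_mem "$(pgrep -n cufflinks)" 60
# Checks for limits (assumption: a limit is involved at all):
#   ulimit -v                      # per-process virtual memory limit
#   ulimit -s                      # stack size limit
#   file "$(command -v cufflinks)" # reports whether the binary is 32-bit or 64-bit
```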

    Many thanks in advance.


    • #3
      From my tests with Cufflinks over the past couple of weeks, ignoring reads annotated as rRNA, tRNA and so forth solves the problem of getting stuck at "Processing Locus ...". This is mentioned on the Cufflinks website.

      Hopefully this information will help people who meet the same problem in the future.
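As a rough sketch of the fix described above, the mask can also be pulled straight out of an existing annotation rather than downloaded separately. This assumes the annotation is a tab-separated GTF whose attribute column (column 9) contains the string rRNA or tRNA for the relevant records, for example in a gene_biotype or similar tag; that depends on the annotation source and should be verified. The function name filter_rna_classes and the file names in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# filter_rna_classes IN.gtf -> prints the records whose attribute column
# (column 9) mentions rRNA or tRNA, skipping comment lines.
filter_rna_classes() {
    awk -F '\t' '!/^#/ && $9 ~ /rRNA|tRNA/' "$1"
}

# Hypothetical usage (annotation.gtf and mask.gtf are placeholder names):
#   filter_rna_classes annotation.gtf > mask.gtf
#   cufflinks -p 8 -M mask.gtf -o C31_cufout accepted_hits.bam
```

The pattern is deliberately loose, so it is worth inspecting mask.gtf before use to confirm it caught only the intended features.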
