
  • CuffDiff cannot open BAM file made in TopHat

    I am running a number of Tuxedo pipelines, starting with TopHat and ending in CuffDiff. Most of these pipelines work fine. However, I encountered the following error in CuffDiff for one of my pipelines:

    open: No such file or directory
    File /gpfs/group1/f/flyinv/Outputs_TopHat/transcriptomeSequence_exonCDS/AR_DM1005_Female/accepted_hits.bam doesn't appear to be a valid BAM file, trying SAM...
    Error: cannot open alignment file /gpfs/group1/f/flyinv/Outputs_TopHat/transcriptomeSequence_exonCDS/AR_DM1005_Female/accepted_hits.bam for reading

    I have checked over the input file carefully, and cannot find any errors in the file paths.

    The accepted_hits.bam file was made in TopHat. I have tried remaking it in TopHat and get the same result. I have looked over it in SamTools and can see no obvious errors.

    The only thing that separates this pipeline from my others is that I am using a GFF file (made in TopHat from a Flybase GFF3 file) that contains exon and CDS data. Other GFF files with different data combinations seem to work (e.g. exons only, CDS + UTR). Has anyone encountered a similar problem? Is this a bug in CuffDiff?

    My full CuffDiff and CuffMerge pipeline is below.

    module load tophat/2.0.9
    module load bowtie/2.1.0
    module load cufflinks/2.1.1
    cuffmerge \
    -o /gpfs/group1/f/flyinv/Outputs_CuffMerge/exonCDS/Test_ARDM1005 \
    -g /gpfs/group1/f/flyinv/working_index/ExonCDS_.gff \
    -s /gpfs/group1/f/flyinv/RNASeq/Dpse3_0.fasta \
    cuffdiff \
    -o "/gpfs/group1/f/flyinv/Outputs_CuffDiff/exonCDS/Test_ARDM1005" \
    -L AR_DM1005_Male,AR_DM1005_Female \
    --total-hits-norm \
    --frag-bias-correct /gpfs/group1/f/flyinv/working_index/Dpse3_0_1.fa \
    --multi-read-correct \
    --library-norm-method classic-fpkm \
    /gpfs/group1/f/flyinv/Outputs_CuffMerge/exonCDS/Test_ARDM1005/merged.gtf \
    /gpfs/group1/f/flyinv/Outputs_TopHat/transcriptomeSequence_ExonCDS/AR_DM1005_Male/accepted_hits.bam \
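    Since the error starts with "open: No such file or directory", one way to double-check a long path is to walk it one component at a time and report the first piece that does not exist. The sketch below is only an illustration: first_missing is a hypothetical helper (not part of TopHat, Cufflinks or samtools) and it assumes an absolute path and a POSIX shell. On a case-sensitive filesystem it would also expose a capitalization difference such as "exonCDS" versus "ExonCDS", which may be worth ruling out because both spellings appear in the paths above.

```shell
# first_missing: hypothetical helper, assumes an ABSOLUTE path.
# Prints the shortest prefix of the path that does not exist and returns 1;
# prints nothing and returns 0 if every component exists.
first_missing() {
    rest="${1#/}"        # remaining path without the leading slash
    prefix=""            # the part of the path verified so far
    while [ -n "$rest" ]; do
        part="${rest%%/*}"               # next path component
        case "$rest" in
            */*) rest="${rest#*/}" ;;    # drop that component
            *)   rest="" ;;              # it was the last one
        esac
        if [ -z "$part" ]; then
            continue                     # skip empty pieces from "//"
        fi
        prefix="$prefix/$part"
        if [ ! -e "$prefix" ]; then
            printf '%s\n' "$prefix"
            return 1
        fi
    done
    return 0
}
```

    For example, "first_missing /gpfs/group1/f/flyinv/Outputs_TopHat/transcriptomeSequence_exonCDS/AR_DM1005_Female/accepted_hits.bam" would print the first directory or file name in that path that is not present on disk, if any.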

  • #2
    Help!! Cuffdiff cannot open alignment file for reading

    Hi gwilymh!

    I am getting exactly the same error message as you when trying to run cuffdiff:

    open: No such file or directory
    File /Volumes/Data/2013-08-20_Gp_Cell_Cycle_Transcriptome/Olson_Samples_Run2/SampleXp1/accepted_hits.bam doesn't appear to be a valid BAM file, trying SAM...
    Error: cannot open alignment file /Volumes/Data/2013-08-20_Gp_Cell_Cycle_Transcriptome/Olson_Samples_Run2/SampleXp1/accepted_hits.bam for reading

    My accepted_hits.bam file was generated in Tophat v2.0.8. I used Cufflinks v2.1.1 but got an error when I tried to use cuffmerge, so I used a slightly older version of Cuffmerge (v2.0.2) to merge my transcript.gtf files generated by Cufflinks (I can provide this error message later if anyone is interested).

    I have successfully used previous versions of the Tuxedo software to analyze my RNA-seq data (Tophat v2.0.8, Bowtie, and Cufflinks v2.0.2), so maybe it is a bug with the new version of Cufflinks?

    I did notice that the version of samtools has changed from the one I used for my analyses with the previous Tuxedo software. Could it also be a samtools issue?

    If anyone else is getting this error message, please help us!!!


    • #3
      The first thing to check is whether you can "samtools view" the accepted_hits.bam file. If not, it was likely corrupted at some point. gwilymh's problem might be a bug in cuffdiff, so it would be good to make stripped-down BAM and annotation files and see if the error persists.

      I should note that the initial "open: No such file or directory" error would suggest that the path for the accepted_hits.bam file is simply being misspecified.
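      The two checks above (does the path exist, and can samtools read the file) can be sketched as a small shell function. This is only an illustration: check_bam and its return codes are made up for this example, and it assumes samtools is on your PATH. It uses "samtools view -H", which prints only the header and exits non-zero when the file cannot be opened as BAM.

```shell
# check_bam: hypothetical helper, not part of TopHat, Cufflinks or samtools.
# Return codes (chosen for this sketch): 0 = header readable, 1 = path missing,
# 2 = file not readable, 3 = samtools not on PATH, 4 = samtools cannot read it.
check_bam() {
    bam=$1
    if [ ! -e "$bam" ]; then
        echo "missing: $bam" >&2
        return 1
    fi
    if [ ! -r "$bam" ]; then
        echo "not readable: $bam" >&2
        return 2
    fi
    if ! command -v samtools >/dev/null 2>&1; then
        echo "samtools not found on PATH" >&2
        return 3
    fi
    # Print only the header; discard it, we just want the exit status.
    if ! samtools view -H "$bam" >/dev/null 2>&1; then
        echo "samtools cannot read header: $bam" >&2
        return 4
    fi
    echo "ok: $bam"
}
```

      Running "check_bam /path/to/accepted_hits.bam" on each BAM file passed to cuffdiff separates a misspecified path (code 1) from a file that exists but cannot be parsed (code 4). A readable header does not prove the whole file is intact, so a full "samtools view" pass is still the stronger test.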
      Last edited by dpryan; 09-12-2013, 03:05 AM. Reason: I should really proof read before posting!

