
  • cuffmerge failed

    Any idea what causes this failure? How do I make the FASTA lines the same length, and what does the error mean? Thank you!

    [Wed Sep 28 20:19:22 2011] Beginning transcriptome assembly merge
    -------------------------------------------

    [Wed Sep 28 20:19:22 2011] Preparing output location ./merged_asm/
    [Wed Sep 28 20:19:22 2011] Converting GTF files to SAM
    gtf_to_sam: /usr/lib64/libz.so.1: no version information available (required by gtf_to_sam)
    [20:19:22] Loading reference annotation.
    gtf_to_sam: /usr/lib64/libz.so.1: no version information available (required by gtf_to_sam)
    [20:19:22] Loading reference annotation.
    gtf_to_sam: /usr/lib64/libz.so.1: no version information available (required by gtf_to_sam)
    [20:19:23] Loading reference annotation.
    [Wed Sep 28 20:19:23 2011] Quantitating transcripts
    cufflinks: /usr/lib64/libz.so.1: no version information available (required by cufflinks)
    You are using Cufflinks v1.1.0, which is the most recent release.
    [bam_header_read] EOF marker is absent.
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    File ./merged_asm/tmp/mergeSam_fileu98BBT doesn't appear to be a valid BAM file, trying SAM...
    [20:19:23] Loading reference annotation.
    [20:19:24] Inspecting reads and determining fragment length distribution.
    Processed 4610 loci.
    > Map Properties:
    > Total Map Mass: 27713.00
    > Read Type: 0bp single-end
    > Fragment Length Distribution: Truncated Gaussian (default)
    > Default Mean: 200
    > Default Std Dev: 80
    [20:19:24] Assembling transcripts and estimating abundances.
    Processed 4610 loci.
    [Wed Sep 28 20:19:53 2011] Comparing against reference file xxx.gtf
    You are using Cufflinks v1.1.0, which is the most recent release.
    No fasta index found for ../bowtie-0.12.7/genomes/chr.fasta. Rebuilding, please wait..
    Error: sequence lines in a FASTA record must have the same length!
    [FAILED]
    Error: could not execute cuffcompare

    Traceback (most recent call last):
    File "/home/student/yujinhai/cufflinks-1.1.0.Linux_x86_64/cuffmerge", line 573, in ?
    sys.exit(main())
    File "/home/student/yujinhai/cufflinks-1.1.0.Linux_x86_64/cuffmerge", line 556, in main
    compare_meta_asm_against_ref(params.ref_gtf, params.fasta, output_dir+"/transcripts.gtf")
    File "/home/student/yujinhai/cufflinks-1.1.0.Linux_x86_64/cuffmerge", line 406, in compare_meta_asm_against_ref
    tmap = compare_to_reference(gtf_input_file, ref_gtf, fasta_file)
    File "/home/student/yujinhai/cufflinks-1.1.0.Linux_x86_64/cuffmerge", line 342, in compare_to_reference
    exit(1)
    TypeError: 'str' object is not callable
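
    A note on the traceback at the end of this log: the final `TypeError: 'str' object is not callable` is most likely a secondary failure rather than the real problem. The `in ?` frames suggest a Python 2.4-era interpreter, where the builtin `exit` was a plain string rather than a callable, so the script's `exit(1)` call itself fails. The error to act on is the FASTA message printed just before `[FAILED]`. A minimal sketch of the portable form, where `abort` is a hypothetical helper name and not something taken from cuffmerge:

    ```python
    import sys


    def abort(status=1):
        """Stop the program with the given exit status.

        sys.exit raises SystemExit and does not depend on the
        interactive-only `exit` builtin.
        """
        sys.exit(status)
    ```

    Replacing `exit(1)` with `sys.exit(1)` would only tidy up the traceback; cuffcompare would still fail on the FASTA file.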

  • #2
    The command line that you ran would be useful, as well as the first few lines of the input files. The warnings during the GTF conversion about missing version information (for libz) are a little odd.

    how to make fasta have the same length? and what does this mean?
    Looking at the Cufflinks code, it expects a fairly standard FASTA format in which every line of a record has the same length, except for the last line of the record. Something like this:
    Code:
    >gi|347448407|gb|JN582205.1| Dorylomorpha spinosa voucher KNWR:Ento:4382 cytochrome oxidase subunit 1 (COI) gene, partial cds; mitochondrial
    AACATTATATTTTATATTTGGTGCCTGAGCAGGAATAGTGGGTACATCCCTAAGAATCCTTATTCGAGCT
    GAACTAGGACATCCAGGATCACTAATTGGAGATGACCAAATTTATAACGTAATTGTAACAGCTCATGCTT
    TTGTGATAATTTTTTTTATAGTAATACCTATTATAATTGGAGGATTCGGGAATTGACTAGTACCCCTAAT
    ACTAGGAGCTCCTGACATAGCATTCCCTCGTATAAACAATATAAGATTTTGAATATTACCCCCATCATTA
    TCCCTTCTACTCCTTAGAAGAATAACTAACAACGGAGCTGGTACCGGATGAACGGTATACCCACCACTAT
    CATCAAACATCGCCCACGAAGGTGCATCAGTTGATTTAGCTATTTTTTCATTACATTTAGCAGGAATTTC
    ATCAATTCTAGGAGCAGTAAATTTTATTACTACAGTAATTAATATACGTTCAACAGGAATTTCATTTGAC
    CGAATACCTTTATTTGTATGGGCAGTAGTAATTACAGCATTATTACTTCTTTTATCATTACCAGTTCTTG
    CAGGAGCCATTACTATACTATTAACAGACCGAAATTTTAATACTTCATTCTTTGACCCGGCTGGAGGAGG
    TGACCCAATTTTATACCAACATTTATTT
    rather than this:
    Code:
    >gi|347448407|gb|JN582205.1| Dorylomorpha spinosa voucher KNWR:Ento:4382 cytochrome oxidase subunit 1 (COI) gene, partial cds; mitochondrial
    AACATTATATTTTATATTTGGTGCCTGAGCAGGAATAGTGGGTACATCCCTAAGAATCCTTATT
    CGAGCTGAACTAGGACATCCAGGATCACTAATTGGAGATGACCAAATTTATAACGTAATTGTAACAGCTCATGCTT
    TTGTGATAATTTTTTTTATAGTAATACCTATTATAATTGGAGGATTCGG
    GAATTGACTAGTACCCCTAATACTAGGAGCTCCTGACATAGCATTCCCTCGTATAAACAATATAAGATTTTGAATATTACCCCCATCATTA
    TCCCTTCTACTCCTTAGAAGAATAACTAACAACGGAGCTGGTACCGGATGAACGGTATACCCACCAC
    TATCATCAAACATCGCCCACGAAGGTGCATCAGTTGATTTAGCTATTTTTTCATTACATTTAGCAGGAATTTC
    ATCAATTCTAGGAGCAGTAAATTTTATTACTACAGTAATTAATATACGTTCAACAGGAATTTCATTTGA
    CCGAATACCTTTATTTGTATGGGCAGTAGTAATTACAGCATTATTACTTCTTTTATCATTACCAGTTCTTG
    CAGGAGCCATTACTATACTATTAACAGACCGAAATTTTAATACTTC
    ATTCTTTGACCCGGCTGGAGGAGGTGACCCAATTTTATACCAACATTTATTT
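
    To find where a file breaks this rule, a small checker along the following lines may help. It is only a sketch of the rule as described above (all lines of a record equal in length, the last one allowed to be shorter) and is not the Cufflinks implementation. The function name `find_uneven_fasta_lines` is hypothetical, and two assumptions are built in: an empty line counts as a sequence line of length 0, and only the newline is stripped, so trailing spaces or a stray carriage return count towards the length.

    ```python
    def _check_record(name, body, problems):
        """Append one entry per line that breaks the equal-length rule."""
        if len(body) < 2:
            return
        expected = body[0][1]  # length of the record's first sequence line
        # Every line except the last must match the first line's length.
        for line_number, length in body[:-1]:
            if length != expected:
                problems.append((name, line_number, length, expected))
        # The last line may be shorter, but not longer.
        last_number, last_length = body[-1]
        if last_length > expected:
            problems.append((name, last_number, last_length, expected))


    def find_uneven_fasta_lines(lines):
        """Return (record, line_number, length, expected_length) tuples.

        `lines` is any iterable of text lines, e.g. an open file handle.
        Line numbers are 1-based. Empty lines are treated as length 0
        (an assumption, so that stray blank lines get reported).
        """
        problems = []
        name = None
        body = []  # (line_number, length) for the current record
        for number, raw in enumerate(lines, start=1):
            line = raw.rstrip("\n")  # keep spaces and "\r" so they are counted
            if line.startswith(">"):
                if name is not None:
                    _check_record(name, body, problems)
                fields = line[1:].split()
                name = fields[0] if fields else ""
                body = []
            elif name is not None:
                body.append((number, len(line)))
        if name is not None:
            _check_record(name, body, problems)
        return problems
    ```

    Passing an open file handle for the reference FASTA to `find_uneven_fasta_lines` gives a list of offending lines; an empty list means the file satisfies the rule as read here. Note that a blank line after a short final line makes that short line the reported one, since it is then no longer the last line of its record.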



    • #3
      Thank you for the reply

      Thank you for your reply. I checked my FASTA file, and it seems to match the format you described (the lines may not look the same length here, but they actually are).

      >chr
      CCGCGGCGCTGCTCCCGGCGCTCCGCGCCGGGAGACGGGGCGAGTCGCTGCGCTCCCCGCCAGGGAGCCG
      CTGCGCGGCTCGCAGTGGGTCGATTCCCGTTGCCGTCGATCGAGTCGCTTCGCTCCTCTGAGTTTCCGAG
      ATTAGGTTCTCGCCTGCACTTTTCATCGTCCCGTTCGATCCGGTCCCCCGCACCCCAACGGGGCTGGAGA
      AGCGGGAGGGTGTGCCCGACCCGCCGCCCACTCGCCTTCCCGCACCGCTCCATGTCATACCCACAGCATA
      CCACCCGGCACCCTCGAATCCCAAAACAGACGAAAAACTTAAAACACCCATATCTGTTGATAATCAACCT
      TTTTCGAACCTTACAATCTGAAAAACGTGCACAACCCACGTAAAAACTTACTCACCAAGTAATTACCCAA
      ACATGTTGTCAATCAATACCTTTCAGAAACGGCTCGAAAACGGACGAAGCAGACACCCCCACGCCCGCCG
      ACACCCCGGCGCCGGCACTCACCCGACAGGTCGCCGACACCCCACATCACAAACCGGAGACATGTCATCA
      CACCAGGTCACAGCACCATTACGCCCGCGGGTGCCGAGGTCGCATCGATCCCACCCAGAATGGGCAGCAG
      AGATTCAGCAGCGGATCGCGTCGCTCGCGGCGACGCCTTGGGGCGGGAGCGCGTCATCGTGTTCGAGAGA
      TTCTCCGATCAAGCCCGCCACGTGGTCGTCCTCGCCGCCGGCGCCGCCCGCACCCACCACCAGAACTGGC


      So I don't know how to fix it.

      The warnings are really weird; I have no idea what is going wrong.


      [QUOTE=gringer;52590]The command line that you ran would be useful, as well as the first few lines of input files. The warnings about conversion and lack of GTF version information are a little odd.





      • #4
        This is a really old thread, but just in case somebody has the same problem:
        I solved it by removing the extra newline at the end of my FASTA file.

        Since my FASTA file had only one record, I removed only one empty line.
        I don't know whether all empty lines would need to be removed in a multi-record FASTA file.



        • #5
          I was getting this error until I removed all empty lines between separate sequences.
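
          For anyone who wants to apply the fix from the last two posts in one pass, here is a minimal sketch. It drops every empty line, and it also rewraps each sequence to a fixed width, which goes beyond what was reported to work in this thread and is included only so that the equal-length rule holds by construction. The names `wrap` and `normalize_fasta` and the default width of 70 are assumptions for illustration.

          ```python
          def wrap(sequence, width):
              """Split a sequence string into chunks of at most `width` characters."""
              return [sequence[i:i + width] for i in range(0, len(sequence), width)]


          def normalize_fasta(lines, width=70):
              """Return cleaned FASTA lines without trailing newlines.

              Empty lines are dropped, surrounding whitespace is stripped, and
              each record's sequence is rewrapped so that every line except
              the last has exactly `width` characters.
              """
              if width < 1:
                  raise ValueError("width must be at least 1")
              out = []
              chunks = []  # pieces of the current record's sequence
              for raw in lines:
                  line = raw.strip()
                  if not line:
                      continue  # skip empty lines anywhere in the file
                  if line.startswith(">"):
                      # Flush the previous record's sequence before the new header.
                      out.extend(wrap("".join(chunks), width))
                      chunks = []
                      out.append(line)
                  else:
                      chunks.append(line)
              out.extend(wrap("".join(chunks), width))
              return out
          ```

          Write the result to a new file (one returned item per line) rather than overwriting the original, and point the tools at the new file so the FASTA index is rebuilt from it. The whole input is held in memory per record, so for very large chromosomes a streaming approach may be preferable.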
