  • Missing RepeatMasker output? Literally nothing.

    It is a real head-scratcher for me.

    RepeatMasker runs through all of its cycles without reporting any errors and finishes, but there are no output files. The only trace of the analysis is the rmblastdb.log file in the RepeatMasker/Libraries directory, which reads:

    Building a new DB, current time: 05/29/2014 10:54:02
    New DB name: /home/mtollis/RepeatMasker/Libraries/20140131/anolis/specieslib
    New DB title: /home/mtollis/RepeatMasker/Libraries/20140131/anolis/specieslib
    Sequence type: Nucleotide
    Keep Linkouts: T
    Keep MBits: T
    Maximum file size: 1000000000B
    Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 100% ambiguous nucleotides (shouldn't be over 40%)
    Error: (1431.1) FASTA-Reader: Warning: FASTA-Reader: First data line in seq is about 100% ambiguous nucleotides (shouldn't be over 40%)
    Adding sequences from FASTA; added 776 sequences in 0.108892 seconds.

    Perhaps the makeblastdb "error" is harmless and it is merely a coincidence that my analysis fails. I don't see how either true ambiguities or line endings (as in a Unix-to-Mac conversion or vice versa) could be the problem, as my database is hardly novel: I am using the RepBase update and the -species option. The option appears to work, as it creates the species-specific library as well as the general library in the RepeatMasker/Libraries directory.

    Does anyone know why RepeatMasker would run without throwing any errors and then leave no output files whatsoever?

  • #2
    Have you tried to run some test data through to see if the install of repeatmasker is working as expected?

    • #3
      Yes, a smaller test data set worked and produced the expected output. So that rules out the makeblastdb error as the source (I think). Perhaps it's the size and number of sequences in the full data set? It's an NGS genome assembly with several thousand scaffolds, although I have run RepeatMasker on similar data sets in the past with no problems.

      • #4
        It happened again, and this time I found this error in the standard output:

        Can't call method "getScore" on unblessed reference at /home/mtollis/RepeatMasker/PRSearchResult.pm line 164.
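A run that dies partway can leave no output files while the error message scrolls past in the terminal. Here is a minimal sketch, assuming a POSIX shell, of capturing both standard output and standard error in a log file and then searching it for a Perl runtime error like the one above. The helper name `run_logged` and the log file name are hypothetical and not part of RepeatMasker; the RepeatMasker command line itself is whatever you normally run.

```shell
# Hypothetical helper: run any command, saving stdout and stderr to a log file.
# Usage: run_logged LOGFILE COMMAND [ARGS...]
run_logged() {
    log="$1"
    shift
    # Send both output streams to the log so nothing is lost in the terminal.
    "$@" > "$log" 2>&1
    status=$?
    # Print any Perl "Can't call method" errors, prefixed with log line numbers.
    if grep -n "Can't call method" "$log"; then
        echo "Perl error found in $log (exit status $status)" >&2
    fi
    # Pass the wrapped command's exit status back to the caller.
    return "$status"
}
```

It would be called as `run_logged repeatmasker_run.log <your usual RepeatMasker command>`, and the log stays on disk even when no output files are written.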

        • #5
          Problem solved

          The RepeatMasker developer suggested the following two fixes:

          "The culprit is the processing of the alignment data using the "-a" flag. I tracked it down to a bug in a routine which handles joining DNA transposons. The ugly match set was:

          334 C21533332 2812 2859 + HAT1_DR#DNA/hAT-Ac 598 645
          299 C21533332 2812 2859 C hAT-N76_DR#DNA/hAT 2324 2371

          And the line in ProcessRepeats is (line 7852):

          # add fused element to our derived from list
          if ( $options{'source'} ) {
            $lastAnnot->addDerivedFromAnnot( $member );
          }

          This should be:

          # add fused element to our derived from list
          if ( $options{'source'} ) {
            $lastAnnot->addDerivedFromAnnot( $member->{'annot'} );
          }
          "
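The one-line change above can be made by hand in an editor. As an alternative, here is a minimal sketch, assuming a POSIX shell and `sed`, that applies it with a backup. The helper name `patch_process_repeats` is hypothetical, and the pattern assumes the call is spaced exactly as quoted above; if your copy of ProcessRepeats differs, the file is left unchanged and should be edited by hand.

```shell
# Hypothetical helper: apply the one-line ProcessRepeats fix quoted above.
# Usage: patch_process_repeats /path/to/ProcessRepeats
patch_process_repeats() {
    file="$1"
    # Keep an untouched copy so the change can be reverted; an existing
    # backup is never overwritten.
    if [ ! -e "$file.bak" ]; then
        cp -p "$file" "$file.bak" || return 1
    fi
    # Rewrite "addDerivedFromAnnot( $member );" so it passes
    # $member->{'annot'} instead; every other line passes through unchanged.
    sed "s/addDerivedFromAnnot( \\\$member );/addDerivedFromAnnot( \$member->{'annot'} );/" \
        "$file.bak" > "$file"
}
```

Comparing `ProcessRepeats.bak` with `ProcessRepeats` using `diff` afterwards should show exactly one changed line.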

          "I found something which causes ProcessRepeats to go into an infinite loop. It keeps expanding an array until the computer runs out of memory and the process is killed. It didn't print the
          "Can't call method "getScore" on unblessed reference at /home/mtollis/RepeatMasker/PRSearchResult.pm line 164"
          error you have seen before, though. I am not sure how you got that a second time. In any case, I fixed this problem and wondered if you might rerun this file on your system. The fix is in the PRSearchResult.pm module. You can download a patched copy of the module here:

          http://www.repeatmasker.org/~rhubley...chResult.pm.gz

          Copy this into your RepeatMasker directory, back up your old file, and unzip this one:

          mv PRSearchResult.pm PRSearchResult.pm.bak
          gunzip PRSearchResult.pm.gz

          I hope this fixes your problem. Thanks for reporting this!"
          Last edited by marct; 02-02-2015, 03:04 PM.
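The backup-and-unzip steps in the developer's message can also be wrapped so that a damaged download never replaces a working module. This is a minimal sketch, assuming a POSIX shell with `gzip` and `gunzip`; the helper name `install_patched_module` and its arguments are hypothetical, and the download location is whatever the developer gave you.

```shell
# Hypothetical helper: swap in a downloaded, gzipped replacement module.
# Usage: install_patched_module REPEATMASKER_DIR PATH_TO_PRSearchResult.pm.gz
install_patched_module() {
    dir="$1"
    patched_gz="$2"
    module="$dir/PRSearchResult.pm"
    # Verify the download is an intact gzip file before touching anything.
    gzip -t "$patched_gz" || return 1
    # Back up the old module, as in the quoted instructions.
    mv "$module" "$module.bak" || return 1
    # Unpack to the module's path; the .gz file itself is left in place.
    if ! gunzip -c "$patched_gz" > "$module"; then
        mv "$module.bak" "$module"   # restore the original on failure
        return 1
    fi
}
```

If the patched module causes trouble, moving `PRSearchResult.pm.bak` back to `PRSearchResult.pm` restores the previous state.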

          • #6
            Thank you for taking the time to come back to an old thread to submit the solution.
