  • chrisbioinfo
    Junior Member
    • Oct 2012
    • 1

    RepeatModeler

    Hi,

    I want to use RepeatModeler.
    I created my database without error, but when I launch the RepeatModeler script, I get this error in the output file:


    nohup: ignoring input
    RepeatModeler Version open-1.0.5
    ================================
    Search Engine = ncbi
    Database = PM ..
    - Sequences = 12940
    - Bases = 93195127
    Using temporary directory = /home/chris/ReapeatModeler/PM/RM_15527.TueOct161418482012


    RepeatModeler Round # 1
    ========================
    Searching for Repeats
    -- Sampling from the database...
    BeGINNING...
    - Gathering up to 40000000 bp

    RepeatModeler::sampleFromDB() Could not obtain sequence ncbi ( entry = 1-0-5, start = 1 end = 46138 ) from the database!





    Do you have any idea?
    Thanks in advance

    Chris
  • mkdir
    Member
    • Feb 2012
    • 19

    #2
    I got the same problem and cannot figure it out.

    Comment

    • themwg
      Junior Member
      • Jan 2011
      • 6

      #3
      RepeatModeler

      Any luck with figuring out your problem? I'm similarly lost.

      Comment

      • Lyn Hsiong
        Member
        • Sep 2011
        • 14

        #4
        Hi, I think you could try changing the engine with "-engine abblast". I don't know why, but it worked for me when I had a similar problem.
        lyn

        Comment

        • tando
          Junior Member
          • Nov 2012
          • 2

          #5
          I also got the same problem using the NCBI rpsblast engine.
          After the fixes below, RepeatModeler started to run, though I don't know whether other problems remain, and it is still possible that my FASTA input was incorrect. At least randomly selected genomic DNA sequences were generated for the statistical calculation of repeats.
          Anyway, the main problem was the call to "blastdbcmd" from the RepeatModeler Perl script.

          (The line numbers below might be inaccurate because I modified the file.)

          Line 281: Modification
          `$RepModelConfig::NCBIDBCMD_PRGM -db $genomeDB -entry all -outfmt "%g %l"`
          ( "%t %l" -> "%g %l" )
          #In my environment, the outfmt %t output nothing, so I used %g instead.

          Line 1779: Modification
          my $openCoordStart = $start;
          ( $start - 1 -> $start )
          #In my database, $start was often zero (0), but the blastdbcmd program doesn't accept zero as input to the -range option. So I deleted "- 1" in the script.

          Line 1780: Insertion
          $seq = `$RepModelConfig::NCBIDBCMD_PRGM -db $dbFile -entry $seqID -range $openCoordStart-$openCoordEnd`;
          #It seems that the program does not accept input unless our rmsblast database is registered with gi| tags. So I ignored the " if ( $seqID =~ /gi\|(\d+)/ ) { ..." statement and inserted another input line.

          Line 1783: Modification
          `$RepModelConfig::NCBIDBCMD_PRGM -db $dbFile -entry $seqID -range $openCoordStart-$openCoordEnd`;
          ( -range $openCoordEnd-$openCoordStart -> -range $openCoordStart-$openCoordEnd )
          #The correct format of the coordinate values for the "-range" option of blastdbcmd is "Start"-"End". However, the order was reversed in the script.
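The range-related changes above can be sketched as one small helper. This is only an illustration, not code from RepeatModeler: `build_range_cmd` is a hypothetical name, clamping a start of 0 up to 1 is an assumption about what the script intends, and quoting the entry is an extra precaution because IDs such as `gi|1` contain a shell pipe character. The helper only builds the command string and does not run blastdbcmd.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of RepeatModeler: builds a blastdbcmd call
# for one region. The -range option takes 1-based coordinates as start-end.
sub build_range_cmd {
    my ( $prog, $db, $seqID, $start, $end ) = @_;

    # Assumption: a start of 0 means "from the first base", so clamp it to 1,
    # because blastdbcmd rejects 0 in -range.
    $start = 1 if $start < 1;
    die "start ($start) is greater than end ($end)\n" if $start > $end;

    # Quote the entry so that IDs containing "|" are not treated as a pipe.
    return "$prog -db $db -entry '$seqID' -range $start-$end";
}

print build_range_cmd( "blastdbcmd", "PM", "gi|1", 0, 46138 ), "\n";
```

The string returned here could then be run with backticks in the same way the script does, once the path to blastdbcmd and the database name are the real ones.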

          Comment

          • themwg
            Junior Member
            • Jan 2011
            • 6

            #6
            I think you might be my new hero Tando, thanks!

            I will point out that my copy of the 1.0.5 script is slightly different.
            For me these changes got the program to work:

            line 281: change ( %t --> %g )

            line 1775: remove the -1 ( $start - 1 --> $start )

            line 1776: remove the if condition
            It seems that when I use BuildDatabase, the seqID takes the form gi|1:3333 (as opposed to gi|1). I just removed the statement, so my $seqID is the full gi|1:3333 and not just a number. If this becomes a problem, I should just redefine $seqID.

            line 1778: my script was
            $seq = `$RepModelConfig::NCBIDBCMD_PRGM -db $dbFile -entry $1 -range $openCoordStart-$openCoordEnd`;
            I changed ( -entry $1 --> -entry $seqID ).
            $1 is defined as the seqID earlier in the script, but that value doesn't get passed to the sampleFromDB() subroutine. Instead it uses some other definition of $1, and it ended up using "1.0.5" (the script version number) as the entry number. My Perl skills are pretty weak and I couldn't determine exactly what was happening here, but your version makes more sense and seems to work.

            Thanks again, I for one, appreciate it!

            Comment

            • tando
              Junior Member
              • Nov 2012
              • 2

              #7
              $1, $2, $3 ... are the special variables that receive the 1st, 2nd, 3rd ... captured groups of a regular expression match, respectively.

              The script assumes that $1 receives the sequence ID when the regular expression match at line 1780 ( if ( $seqID =~ /gi\|(\d+)/...) is performed.

              However, when nothing matches in this line (there is no gi| tag), $1 is not updated (and neither is $seq in the following if statement), so $1 unfortunately still holds the previously matched characters of the script version, "1-0-5".

              This causes the abort at the next "die if ($seq eq "") ..." lines and the "1-0-5" in the output message.
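The behaviour described above can be reproduced in a few lines. This is a minimal sketch with made-up stand-in values (`$version_string`, `$seqID`, and the `entry_for` helper are hypothetical and are not the names used in the RepeatModeler script). It shows that a failed match leaves the old capture in place, and one possible way to avoid reading a stale `$1`.

```perl
use strict;
use warnings;

# Hypothetical stand-ins; not the actual RepeatModeler variables.
my $version_string = "open-1-0-5";
my $seqID          = "scaffold_12";    # an ID without a gi| tag

# An earlier, successful match leaves "1-0-5" in $1.
$version_string =~ /open-(\S+)/;

# This match fails, so $1 is NOT reset and still holds "1-0-5".
my $stale = ( $seqID =~ /gi\|(\d+)/ ) ? "matched" : $1;

# Safer: read $1 only when the match succeeded, otherwise keep the full ID.
sub entry_for {
    my ($id) = @_;
    return $id =~ /gi\|(\d+)/ ? $1 : $id;
}

print "stale capture: $stale\n";
print "entry: ", entry_for($seqID),  "\n";
print "entry: ", entry_for("gi|42"), "\n";
```

Whether falling back to the full ID is right depends on how the database was built, so treat `entry_for` as one option rather than the fix.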

              Comment

              • jaZt
                Junior Member
                • Oct 2010
                • 2

                #8
                Hi guys,

                I got the same problem with RepeatModeler_1.0.5

                I changed the script like you proposed, except the Lines 1776 and 1778:
                if ( $seqID =~ /gi\|(\d+)/ ) {
                $seq =
                `$RepModelConfig::NCBIDBCMD_PRGM -db $dbFile -entry $1 -range $openCoordStart-$openCoordEnd`;
                }


                Tando, you said, that you ignored the 1st line and inserted another one.

                What exactly do these lines have to look like then?

                I would be very grateful for some help.
                Thanks in advance!

                Comment

                • HeyIamNuria
                  Member
                  • Dec 2012
                  • 19

                  #9
                  Hi guys!
                  I am trying to install RepeatModeler, but when I give it the RepeatMasker path it returns:
                  “RepeatMasker is too old. Must be open-4.0.0 or later. Install a newer version of RepeatMasker and re-run configure.”

                  So I re-installed the latest version of RepeatMasker (Latest Released Version: 1/10/2013: RepeatMasker-open-4-0-0.tar.gz) and tried again with RepeatModeler, but it keeps saying the same thing, even though this is the version it is asking for.

                  It may be because of the name of the file. Mine doesn't include the version number (open-4.0.0); when I unpacked it, the name changed to just RepeatMasker. But that may not be the cause.

                  Any ideas?
                  Thanks in advance

                  Nuria

                  Comment

                  • stephrom
                    Junior Member
                    • Jan 2013
                    • 1

                    #10
                    Hi Nuria,

                    In the configure script, modify line 214:
                    '$version <= 400' should be '$version < 400'

                    Stephane
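To see why the comparison operator matters, here is a small sketch. It assumes the version string is reduced to a number by dropping everything that is not a digit (so "open-4.0.0" becomes 400); the real configure script may derive `$version` differently, and `version_number` is a hypothetical helper.

```perl
use strict;
use warnings;

# Assumption: the version is reduced to a number by dropping separators,
# e.g. "open-4.0.0" -> 400. The real configure script may do this differently.
sub version_number {
    my ($v) = @_;
    ( my $digits = $v ) =~ s/\D//g;
    return $digits + 0;
}

my $version = version_number("open-4.0.0");

# With <=, version 4.0.0 itself is rejected as "too old".
print "too old with <=: ", ( $version <= 400 ? "yes" : "no" ), "\n";

# With <, only versions below 4.0.0 are rejected.
print "too old with < : ", ( $version < 400 ? "yes" : "no" ), "\n";
```

Under that assumption, `<=` rejects exactly the minimum version that the error message asks for, which matches what Nuria saw.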

                    Comment

                    • HeyIamNuria
                      Member
                      • Dec 2012
                      • 19

                      #11
                      Thank you

                      Thank you very much for your help Stephane

                      I changed it and it worked!!

                      Nuria

                      Comment

                      • antben
                        Junior Member
                        • Mar 2011
                        • 3

                        #12
                        RepeatScout fails in RepeatModeler

                        Hello all,

                        I am able to successfully run RepeatModeler (1-0-7) and it returns several hundred repeat models in my genome. However, all of these models are a result of RECON; nothing is returned by RepeatScout. RepeatScout is called during RepeatModeler round 1 but at the end it says "NOTE: RepeatScout did not return any models." RepeatScout is not called again by RepeatModeler. However, when I run RepeatScout directly on my genome it returns several hundred repeat models.

                        Has anybody successfully gotten RepeatScout to return repeat models within RepeatModeler? I don't understand why this would happen since RepeatScout works when I run it outside of RepeatModeler.

                        Any ideas?

                        Thanks,
                        Ben

                        Comment

                        • abaten
                          Junior Member
                          • Sep 2009
                          • 2

                          #13
                          Hi Everyone,
                          I have exactly the same problem as Ben described above: no models are returned by 'RepeatScout' within a 'RepeatModeler' run, yet an independent 'RepeatScout' run returns many repeat models.
                          Also, is there any option to make 'RepeatModeler' run faster (e.g. parallel processing like that of RepeatMasker)?

                          Cheers.

                          Comment

                          • rhubley
                            Member
                            • Sep 2012
                            • 10

                            #14
                            Hi Ben,

                            RepeatScout is a great program for finding highly conserved repetitive elements. As a consequence, we run RepeatScout first (and for only one round) in order to find and remove the young elements before moving on to RECON. RepeatScout will often find tandem repeats and low-complexity sequences in its return set. These are filtered out in RepeatModeler. You may want to check that your hand-run result set isn't entirely simple/low-complexity sequence by running nseg/trf on it. Another consideration is your choice of lmer size for RepeatScout. To compare the results of the two runs fairly, you need to use the same lmer size and the same sample (from the input) sequence. I rarely check seqanswers, so please feel free to contact us through our website if you have further questions ( www.repeatmasker.org ).

                            -R

                            Comment

                            • antben
                              Junior Member
                              • Mar 2011
                              • 3

                              #15
                              Thank you for your input, Robert. My problem turned out to be with RepeatScout, not RepeatModeler. Lines 26 and 27 of the RepeatScout script "filter-stage-1.prl" are:

                              my $TRF_COMMAND = $ENV{'TRF_COMMAND'} || "trf";
                              my $NSEG_COMMAND = $ENV{'NSEG_COMMAND'} || "nseg";

                              I changed this to:

                              my $TRF_COMMAND = "trf";
                              my $NSEG_COMMAND = "nseg";

                              Note that both "trf" and "nseg" are executables in my path.

                              I don't know perl so I don't fully understand what is going on, but I think that RepeatScout was failing to find tandem repeat finder (TRF) and, without anything back from TRF, it determined that everything was a tandem repeat and filtered it all out. However, this must have something to do with calling TRF from within RepeatModeler, as RepeatScout returned models for me when I used it independently, so something funny appears to be happening with paths. Regardless, the RepeatModeler pipeline is now fully functional for me and recovers repeat models from RepeatScout as well as RECON.
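For readers who, like me, don't know Perl: the original two lines use an "environment variable, else default" idiom. The sketch below shows how that idiom behaves. `resolve_cmd` is a hypothetical helper and the path is made up; it does not show what RepeatModeler actually puts in the environment, only how a value set by a calling program would take precedence over the plain "trf" found on the path.

```perl
use strict;
use warnings;

# Hypothetical helper using the same "environment variable or default" idiom
# as the original filter-stage-1.prl lines.
sub resolve_cmd {
    my ( $env_name, $default ) = @_;
    return $ENV{$env_name} || $default;
}

{
    # Variable unset inside this block: the default is used.
    delete local $ENV{TRF_COMMAND};
    print "unset: ", resolve_cmd( "TRF_COMMAND", "trf" ), "\n";
}
{
    # Variable set (made-up path): it overrides the default.
    local $ENV{TRF_COMMAND} = "/some/other/path/trf";
    print "set:   ", resolve_cmd( "TRF_COMMAND", "trf" ), "\n";
}
```

So if the variable is set to something that is not a working TRF executable, the hard-coded version would behave differently from the original lines, which would be consistent with what I observed.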

                              Comment
