  • Jeca
    Junior Member
    • Jan 2010
    • 1

    Running 454 mapper by command line CLI

    Hello everybody,

    I am a total Linux novice trying to figure out the CLI syntax for the 454 mapper, and I don't know how to specify the annotation and target sequence files. I know how to start a mapping project and pull in reference and data files (i.e. runMapping referencepath [email protected]), but I don't know how to specify, or where to put, snp130.txt, geneRef.txt, or my targetsequences.giff files. I know this must be a pretty basic question for a bioinformatician. Many thanks.

    Jeca
  • westerman
    Rick Westerman
    • Jun 2008
    • 1104

    #2
    You need to set the GOLDENPATH environment variable to point to the directory containing those files. See the documentation. Play around a bit.

    • sulicon
      Member
      • Aug 2010
      • 41

      #3
      I have run into the same problem.

      I have put the genomic sequences in a "chromosomes" directory, and the annotation files, refGene.txt and snp131.txt, in a neighboring directory called "database". According to the Roche documentation, the annotation files should be found this way, but I could not get it to work.

      I have also set the parent directory of those two directories, hg19, as "GOLDENPATH":
      echo $GOLDENPATH
      /home/sulicon/data/hg19

      That didn't work either...

      I have also tried setting "GOLDENPATH" to "/home/sulicon/data" and using "hg19" as the name of the reference genome. Unfortunately, gsMapper wasn't able to recognize the folder structure.

      Any suggestions are appreciated.

      • sklages
        Senior Member
        • May 2008
        • 628

        #4
        The GOLDENPATH is pointing to "/home/sulicon/data" which contains a subfolder "hg19" which contains the subfolders "chromosomes" (single fa files) and "database" (containing snp131.txt, refLink.txt, refGene.txt and productName.txt), correct?

        And how did you start the mapper?

        Minimal example (assuming EST data as input):
        $ runMapping -cdna -gref hg19 READS.sff

        This should work ...

        Sven
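
The layout described in this post can be checked before starting the mapper. The sketch below is only an illustration: `check_gref_layout` is a made-up helper name, not part of the Roche software, and it tests only for the two subdirectories named in this thread ("chromosomes" and "database"). It makes no claim about which annotation files the mapper actually requires. The demo runs against a throwaway mock tree so that it works without real data.

```shell
#!/bin/sh
# Hypothetical helper (not a Roche tool): checks that $GOLDENPATH/<genome>
# contains the "chromosomes" and "database" subdirectories from this thread.
check_gref_layout() {
    genome="$1"
    if [ -z "${GOLDENPATH:-}" ]; then
        echo "GOLDENPATH is not set" >&2
        return 1
    fi
    for sub in chromosomes database; do
        if [ ! -d "$GOLDENPATH/$genome/$sub" ]; then
            echo "missing directory: $GOLDENPATH/$genome/$sub" >&2
            return 1
        fi
    done
    echo "layout looks OK: $GOLDENPATH/$genome"
}

# Demo on a mock tree in a temporary directory (an assumption made for
# illustration; point GOLDENPATH at your real data instead).
demo_root=$(mktemp -d)
mkdir -p "$demo_root/hg19/chromosomes" "$demo_root/hg19/database"
GOLDENPATH="$demo_root"
export GOLDENPATH
check_gref_layout hg19
```

With real data, set and export GOLDENPATH to the parent of the genome directory and call `check_gref_layout hg19`; a non-zero exit status indicates that one of the expected directories was not found.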

        • sulicon
          Member
          • Aug 2010
          • 41

          #5
          Hi Sven,

          Thanks very much! I have tried what you said:

          $ echo $GOLDENPATH
          /home/shuli/data
          $ ls /home/sulicon/data/hg19
          chromosomes database
          $ runMapping -cdna -gref hg19 /path/to/reads/reads.sff
          Error: Reference file/directory does not exist: hg19

          I noticed you mentioned that "single fa files" should be put into the "chromosomes" folder, whereas I have put in FASTA files that each correspond to one chromosome. Maybe this is the problem? I will try that later...

          • sklages
            Senior Member
            • May 2008
            • 628

            #6
            1) This is what my (probably too large) UCSC directory tree looks like: [directory tree attachment not preserved]

            2) You set GOLDENPATH to /home/shuli/data but stored the data under a different path, /home/sulicon/data. You are telling gsMapper to look in /home/shuli/data/hg19, which is probably not the correct directory.

            hth, Sven

            • sulicon
              Member
              • Aug 2010
              • 41

              #7
              Thanks again.
              The GOLDENPATH variable is corrected now, but the reference sequence still can't be recognized...

              The following is the structure of my hg19 directory. It looks similar to yours.
              Code:
              $ tree hg19
              hg19
              |-- chromosomes
              |   |-- chr1.fa
              |   |-- chr10.fa
              |   |-- chr11.fa
              |   |-- chr11_gl000202_random.fa
              |   |-- chr12.fa
              |   |-- chr13.fa
              |   |-- chr14.fa
              |   |-- chr15.fa
              |   |-- chr16.fa
              |   |-- chr17.fa
              |   |-- chr17_ctg5_hap1.fa
              |   |-- chr17_gl000203_random.fa
              |   |-- chr17_gl000204_random.fa
              |   |-- chr17_gl000205_random.fa
              |   |-- chr17_gl000206_random.fa
              |   |-- chr18.fa
              |   |-- chr18_gl000207_random.fa
              |   |-- chr19.fa
              |   |-- chr19_gl000208_random.fa
              |   |-- chr19_gl000209_random.fa
              |   |-- chr1_gl000191_random.fa
              |   |-- chr1_gl000192_random.fa
              |   |-- chr2.fa
              |   |-- chr20.fa
              |   |-- chr21.fa
              |   |-- chr21_gl000210_random.fa
              |   |-- chr22.fa
              |   |-- chr3.fa
              |   |-- chr4.fa
              |   |-- chr4_ctg9_hap1.fa
              |   |-- chr4_gl000193_random.fa
              |   |-- chr4_gl000194_random.fa
              |   |-- chr5.fa
              |   |-- chr6.fa
              |   |-- chr6_apd_hap1.fa
              |   |-- chr6_cox_hap2.fa
              |   |-- chr6_dbb_hap3.fa
              |   |-- chr6_mann_hap4.fa
              |   |-- chr6_mcf_hap5.fa
              |   |-- chr6_qbl_hap6.fa
              |   |-- chr6_ssto_hap7.fa
              |   |-- chr7.fa
              |   |-- chr7_gl000195_random.fa
              |   |-- chr8.fa
              |   |-- chr8_gl000196_random.fa
              |   |-- chr8_gl000197_random.fa
              |   |-- chr9.fa
              |   |-- chr9_gl000198_random.fa
              |   |-- chr9_gl000199_random.fa
              |   |-- chr9_gl000200_random.fa
              |   |-- chr9_gl000201_random.fa
              |   |-- chrM.fa
              |   |-- chrUn_gl000211.fa
              |   |-- chrUn_gl000212.fa
              |   |-- chrUn_gl000213.fa
              |   |-- chrUn_gl000214.fa
              |   |-- chrUn_gl000215.fa
              |   |-- chrUn_gl000216.fa
              |   |-- chrUn_gl000217.fa
              |   |-- chrUn_gl000218.fa
              |   |-- chrUn_gl000219.fa
              |   |-- chrUn_gl000220.fa
              |   |-- chrUn_gl000221.fa
              |   |-- chrUn_gl000222.fa
              |   |-- chrUn_gl000223.fa
              |   |-- chrUn_gl000224.fa
              |   |-- chrUn_gl000225.fa
              |   |-- chrUn_gl000226.fa
              |   |-- chrUn_gl000227.fa
              |   |-- chrUn_gl000228.fa
              |   |-- chrUn_gl000229.fa
              |   |-- chrUn_gl000230.fa
              |   |-- chrUn_gl000231.fa
              |   |-- chrUn_gl000232.fa
              |   |-- chrUn_gl000233.fa
              |   |-- chrUn_gl000234.fa
              |   |-- chrUn_gl000235.fa
              |   |-- chrUn_gl000236.fa
              |   |-- chrUn_gl000237.fa
              |   |-- chrUn_gl000238.fa
              |   |-- chrUn_gl000239.fa
              |   |-- chrUn_gl000240.fa
              |   |-- chrUn_gl000241.fa
              |   |-- chrUn_gl000242.fa
              |   |-- chrUn_gl000243.fa
              |   |-- chrUn_gl000244.fa
              |   |-- chrUn_gl000245.fa
              |   |-- chrUn_gl000246.fa
              |   |-- chrUn_gl000247.fa
              |   |-- chrUn_gl000248.fa
              |   |-- chrUn_gl000249.fa
              |   |-- chrX.fa
              |   |-- chrY.fa
              |   `-- chromFa.tar.gz
              `-- database
                  |-- refGene.txt
                  |-- refLink.txt
                  `-- snp131.txt

              • sklages
                Senior Member
                • May 2008
                • 628

                #8
                What about "productName.txt"?

                • sulicon
                  Member
                  • Aug 2010
                  • 41

                  #9
                  I don't have this file. Is it required? And the problem is that even the reference genome can't be recognized:
                  "Error: Reference file/directory does not exist: hg19"

                  Maybe I need the "bigZips" folder, as you have?

                  • sklages
                    Senior Member
                    • May 2008
                    • 628

                    #10
                    Since Roche states in their manual that the suite recognizes the UCSC directory structure, I went the lazy way and just used the whole tree. I have not really tested which files or folders can be omitted.

                    It is very strange, though, that you get an error message stating that the reference has not been found. It sounds as if there is still a mismatch between the GOLDENPATH path and the actual data location.

                    It may be best to try the whole tree and, if you are patient, remove the parts that are not necessary for your mapping (though it is probably not worth removing files).

                    • sulicon
                      Member
                      • Aug 2010
                      • 41

                      #11
                      It turns out that the reason for this is that I forgot to 'export' the GOLDENPATH variable... Everything is OK now.
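
Why a missing 'export' produces this symptom can be sketched as follows. A plain assignment creates a shell variable that `echo` in the same shell can print, but programs started from that shell only inherit exported variables. In the sketch, `sh -c` stands in for any child process; the path is the one used earlier in this thread, and `not_exported` and `exported` are illustrative variable names.

```shell
#!/bin/sh
# Start from a clean state so an already exported GOLDENPATH does not
# mask the effect being shown.
unset GOLDENPATH

# Plain assignment: visible to 'echo' in this shell, but not passed on
# to child processes.
GOLDENPATH=/home/sulicon/data
not_exported=$(sh -c 'echo "${GOLDENPATH:-unset}"')
echo "without export: $not_exported"

# After export, the variable is part of the environment that child
# processes inherit.
export GOLDENPATH
exported=$(sh -c 'echo "${GOLDENPATH:-unset}"')
echo "with export:    $exported"
```

This is also why `echo $GOLDENPATH` looked correct in the earlier posts: the check ran in the same shell that held the unexported variable. Running `env | grep GOLDENPATH` instead shows only what a child process would actually receive.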

                      • G.Chevignon
                        Junior Member
                        • Apr 2011
                        • 1

                        #12
                        Hello everybody,

                        I want to map 454 reads to a genome with a threshold of 5 or 10 reads for forming the consensus. This setting is called "Minimum contig depth" in the GUI version of GSMapper, but I cannot find it in the CLI version of GSMapper.

                        Is this parameter available in the CLI version?

                        Thank you for your help
