  • muzz56
    Member
    • Sep 2010
    • 14

    How to get index files using picard

    Hi,
    I'm new to this and am reading the manuals to get some work done quickly. I'm trying to generate an index file to use with Scripture using picard-tools, but I'm getting an error that I don't understand. Does anyone have a clue what I'm missing here? The error is below:

    /Data/RNA_seq$ java -jar /home/mh/Data/seq/picard-tools-1.36/SortSam.jar I=GSM520_ES.aligned.sam O=GSM520.sorted.sam SO=coordinate
    [Sat Feb 12 23:29:02 EST 2011] net.sf.picard.sam.SortSam INPUT=GSM520_ES.aligned.sam OUTPUT=GSM520.sam SORT_ORDER=unsorted TMP_DIR=/tmp/mh VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    [Sat Feb 12 23:29:02 EST 2011] net.sf.picard.sam.SortSam done.
    Runtime.totalMemory()=252379136
    Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Empty sequence dictionary.; Line 1
    Line: SL-XAS:8:97:1621:1389#0 0 chr1 3044508 0 76M * 0 0 AGAGCGCATAGCCCAAGCCTTACCACTCCCACTATTCGGCCATTTCCCTTATATGAAAGAGGAGCGAGGACCTTCC abab`b`abaaa^`aab`ababaYa`aa_Za_aaaa`aa^_aaaa_]ZbaabXbbV\Wb_aa\a]aUa_^\VV\_W NM:i:0
    at net.sf.samtools.SAMTextReader.reportErrorParsingLine(SAMTextReader.java:220)
    at net.sf.samtools.SAMTextReader.access$500(SAMTextReader.java:40)
    at net.sf.samtools.SAMTextReader$RecordIterator.parseLine(SAMTextReader.java:424)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:268)
    at net.sf.samtools.SAMTextReader$RecordIterator.next(SAMTextReader.java:240)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:612)
    at net.sf.samtools.SAMFileReader$AssertableIterator.next(SAMFileReader.java:590)
    at net.sf.picard.sam.SortSam.doWork(SortSam.java:58)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:156)
    at net.sf.picard.cmdline.CommandLineProgram.instanceMainWithExit(CommandLineProgram.java:117)
    at net.sf.picard.sam.SortSam.main(SortSam.java:66)
    Thanks
  • jstjohn
    Member
    • Jun 2010
    • 35

    #2
    It looks like the error is complaining about not finding a "sequence dictionary" in your SAM file. Here is something to try, though I'm not sure it will work:

    First run CreateSequenceDictionary.jar, which takes a reference sequence and outputs a SAM file containing just the sequence dictionary.

    Next run MergeSamFiles.jar to merge that dictionary-only SAM file with the original SAM file, which looks like it is missing a sequence dictionary.

    Now maybe SortSam.jar will work on the merged file?


    • muzz56
      Member
      • Sep 2010
      • 14

      #3
      Thanks. However, that doesn't seem to solve the problem. Is there anything else I can try?


      • nilshomer
        Nils Homer
        • Nov 2008
        • 1283

        #4
        You probably don't have a header in your SAM file. Tell me what "samtools view -SH GSM520_ES.aligned.sam" outputs.
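If samtools is not at hand, a rough stand-in for that check is to count the sequence dictionary lines directly. This is only a minimal sketch, assuming a plain-text (uncompressed) SAM file; the function name `count_sq_lines` is made up for illustration and is not part of samtools or Picard.

```shell
#!/bin/sh
# Count @SQ (sequence dictionary) header lines in a text SAM file.
# A result of 0 means the file carries no sequence dictionary.
count_sq_lines() {
    # grep -c prints the number of matching lines. It exits non-zero
    # when that number is 0, so "|| true" keeps the function from failing.
    grep -c '^@SQ' "$1" || true
}
```

For example, `count_sq_lines GSM520_ES.aligned.sam` printing 0 would be consistent with the "Empty sequence dictionary" error in the first post.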


        • muzz56
          Member
          • Sep 2010
          • 14

          #5
          I was probably too quick to declare failure. I've just looked at the merged output file and it's empty. I tried running MergeSamFiles again and it reports an error: "dictionaries are not of the same size (0, 34)". I'm still scratching my head over where this comes from and how to solve it. Any ideas would be appreciated.


          • muzz56
            Member
            • Sep 2010
            • 14

            #6
            I sorted out the problem and it's now working just fine. Thanks for your help


            • Graham Etherington
              Member
              • Apr 2010
              • 22

              #7
              Originally posted by muzz56:
              "I sorted out the problem and it's now working just fine. Thanks for your help"
              Would you care to share your solution, so that people searching for this problem have a possible answer?


              • polarise
                Member
                • Jan 2011
                • 13

                #8
                Can Someone Verify This Solution?

                1. Create the dictionary, say, dict.sam
                java -jar CreateSequenceDictionary.jar OUTPUT=dict.sam R=ref.fa

                2. Create a new file (unsorted_file.sam) that has both the dictionary and the aligned reads.
                cat dict.sam file.sam > unsorted_file.sam

                3. Sort the SAM file
                java -jar SortSam.jar INPUT=unsorted_file.sam OUTPUT=sorted_file.sam SO=coordinate

                That's what has worked for me.
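The concatenation in step 2 only makes sense when the reads file has no header of its own. A hedged sketch of that step with a guard added; the function name `prepend_dict` and the guard are illustrative assumptions, not part of Picard or of the steps as posted:

```shell
#!/bin/sh
# Prepend a dictionary-only SAM file to a headerless SAM file.
# Refuses to run if the reads file already contains @SQ lines, because
# concatenating would then produce a second set of header lines.
prepend_dict() {
    dict=$1; reads=$2; out=$3
    if grep -q '^@SQ' "$reads"; then
        echo "error: $reads already has @SQ header lines" >&2
        return 1
    fi
    # Header lines first, then the alignment records.
    cat "$dict" "$reads" > "$out"
}
```

With the file names used above, this would be called as `prepend_dict dict.sam file.sam unsorted_file.sam` before running SortSam.jar.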

                P.K.


                • azroger
                  Junior Member
                  • Oct 2010
                  • 7

                  #9
                  This worked for me exactly as described above.

                  It did not work for me when I tried this:
                  java -jar CreateSequenceDictionary.jar OUTPUT=dict.sam R=references_directory/ref.fas

                  I copied my reference file to the same directory and changed the ending to .fa. I'm not sure which change (or both) was important. Thanks for the help!


                  • Shani_A
                    Junior Member
                    • Sep 2014
                    • 1

                    #10
                    change .fas to .fa

                    I think it is simply a matter of changing the .fas extension to .fa. I tried it a while ago: .fas gave an error, but .fa created the dict file smoothly.
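One way to try the extension change without duplicating a large reference file is a symlink whose name ends in .fa. This is a sketch under assumptions: the helper name `link_as_fa` is made up, and it assumes the tool only inspects the name of the path it is given. If the symlink does not help, copying the file to a .fa name, as described in the post above, is the fallback.

```shell
#!/bin/sh
# Create a symlink ending in .fa next to an existing reference file that
# has another extension (for example .fas). The original is left untouched.
link_as_fa() {
    src=$1
    dir=$(cd "$(dirname "$src")" && pwd)   # absolute directory of the source
    base=$(basename "$src")
    dest="$dir/${base%.*}.fa"              # swap the last extension for .fa
    # ln fails if dest already exists; the new path is printed only on success.
    ln -s "$dir/$base" "$dest" && echo "$dest"
}
```

For example, `link_as_fa references_directory/ref.fas` would create `references_directory/ref.fa`, which can then be passed as `R=` to CreateSequenceDictionary.jar.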
