  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    Originally posted by gen2prot View Post
    Hello nilshomer,

    I downloaded Picard. I have the .jar files on Mac OS X 10.6, but they won't open. I have them saved on the Desktop. How do I run it?

    Thanks
    Abhijit
    I am assuming you are familiar with the Terminal and a Unix-based environment. If not, you need to become familiar with them first (search this site for recommended books and tutorials). I cannot teach you how to use the Terminal or answer such basic questions.

    Use the command for the respective jar:
    Code:
    java -jar SortSam.jar

    Comment

    • gen2prot
      Member
      • Apr 2010
      • 68

      Thank you.

      Comment

      • gen2prot
        Member
        • Apr 2010
        • 68

        Hello,

        I ran the following command using the Picard tool SortSam:

        java -jar SortSam.jar I=../test/testsam.sam O=../test/sortedtest.sam SO=queryname

        My Input file looks like this:

        @HD VN:1.0 SO:sorted
        @PG ID:TopHat VN:1.0.13 CL:/share/apps/bin/tophat -o ./s1 --solexa1.3-quals -p 2 GeneIndex /home/asanyal/data/Flydata/Exp_100423/100423_HWI-EAS313_0001_61G2CAAXX.birchlerj/s_1_sequence.txt
        HWI-EAS313_0001:1:80:8942:6680#0 0 FBgn0000003 1 3 42M * 0 0 CGGACTGGAAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGA GGGFGGGGGGGFFEGGGGGGGFGGGGGGDDGFGGGGGGGGFE NM:i:2
        HWI-EAS313_0001:1:108:8254:11808#0 0 FBgn0000003 9 3 42M * 0 0 AAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGAGGTCTGAT C::?>ACCCCD?EDEB=EEEEEECEE?:E??@C@CEBED=4? NM:i:0

        However, I get the following error message.

        Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Empty sequence dictionary.; Line 3
        Line: HWI-EAS313_0001:1:80:8942:6680#0 0 FBgn0000003 1 3 42M * 0 0 CGGACTGGAAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGA GGGFGGGGGGGFFEGGGGGGGFGGGGGGDDGFGGGGGGGGFE NM:i:2

        Do I have to give the program the reference sequences? Or do I need to create a sequence dictionary using CreateSequenceDictionary?

        Thanks
        Abhijit

        Comment

        • nilshomer
          Nils Homer
          • Nov 2008
          • 1283

          Originally posted by gen2prot View Post
          Hello,

          I ran the following command using the Picard tool SortSam:

          java -jar SortSam.jar I=../test/testsam.sam O=../test/sortedtest.sam SO=queryname

          My Input file looks like this:

          @HD VN:1.0 SO:sorted
          @PG ID:TopHat VN:1.0.13 CL:/share/apps/bin/tophat -o ./s1 --solexa1.3-quals -p 2 GeneIndex /home/asanyal/data/Flydata/Exp_100423/100423_HWI-EAS313_0001_61G2CAAXX.birchlerj/s_1_sequence.txt
          HWI-EAS313_0001:1:80:8942:6680#0 0 FBgn0000003 1 3 42M * 0 0 CGGACTGGAAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGA GGGFGGGGGGGFFEGGGGGGGFGGGGGGDDGFGGGGGGGGFE NM:i:2
          HWI-EAS313_0001:1:108:8254:11808#0 0 FBgn0000003 9 3 42M * 0 0 AAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGAGGTCTGAT C::?>ACCCCD?EDEB=EEEEEECEE?:E??@C@CEBED=4? NM:i:0

          However, I get the following error message.

          Exception in thread "main" net.sf.samtools.SAMFormatException: Error parsing text SAM file. Empty sequence dictionary.; Line 3
          Line: HWI-EAS313_0001:1:80:8942:6680#0 0 FBgn0000003 1 3 42M * 0 0 CGGACTGGAAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGA GGGFGGGGGGGFFEGGGGGGGFGGGGGGDDGFGGGGGGGGFE NM:i:2

          Do I have to give the program the reference sequences? Or do I need to create a sequence dictionary using CreateSequenceDictionary?

          Thanks
          Abhijit

          There are no "@SQ" header lines in your SAM file. If they are not present, you could try giving it the reference sequence.
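
The check described in this reply can be sketched in Python. This is only an illustration, not part of Picard or samtools: the function name `missing_sq_names` is hypothetical, the input is assumed to be tab-delimited SAM text, and the example records are abbreviated stand-ins for the ones posted above. It collects the reference names declared in `@SQ` header lines and reports any reference name (column 3) used by an alignment record that has no matching `@SQ SN:` entry.

```python
def missing_sq_names(sam_lines):
    """Return reference names used by records but not declared in @SQ headers."""
    declared = set()  # names found in "@SQ ... SN:<name>" header lines
    used = set()      # names found in column 3 (RNAME) of alignment records
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if line.startswith("@"):
            # Header line: only @SQ lines declare reference sequences.
            if fields[0] == "@SQ":
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        declared.add(field[3:])
            continue
        # Alignment record: RNAME is the third column, "*" means unmapped.
        if len(fields) > 2 and fields[2] != "*":
            used.add(fields[2])
    return sorted(used - declared)


# Hypothetical, abbreviated input: one declared gene, two genes actually used.
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:FBgn0000008\tLN:1000",
    "read1\t0\tFBgn0000003\t1\t3\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t0\tFBgn0000008\t9\t3\t4M\t*\t0\t0\tACGT\tIIII",
]
print(missing_sq_names(example))
```

When every reference is a separate gene, as in this thread, each gene name that appears in column 3 would need its own `@SQ` line; an empty result from a check like this suggests the header covers all of them.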

          Comment

          • gen2prot
            Member
            • Apr 2010
            • 68

            Hello nilshomer,

            I specified the SQ field but I still get an error, probably because the reference sequences are Drosophila gene sequences. The names of the reference sequences are therefore different unless two or more reads match the same gene. I cannot convert the names to a single reference sequence name, since I would lose information. I am stuck. Maybe I need to do a traditional Perl sort (very time consuming for a 6GB file). Is there a better way of doing this?

            Thanks
            Abhijit

            Comment

            • gen2prot
              Member
              • Apr 2010
              • 68

              Hello,

              I was wondering if there was a score associated with each read in the SAM file, that would give an indication on the strength of the match between the read and the subject sequence. The CIGAR string helps to some extent, but since "M" denotes match or mismatch, I was wondering if there was a way to differentiate between the two. Sort of like an E-value or a blast score.

              Abhijit

              Comment

              • JohnK
                Senior Member
                • Feb 2010
                • 106

                Hi guys,

                I have a quick question about the SAM CIGAR column. I understand that the M operator covers both matches and mismatches. Is there a way to get the actual number of mismatches from the SAM format without comparing the tags or the sequence to the reference? Sorry if this question is a re-post. I tried searching, but couldn't find anything.

                Comment

                • nilshomer
                  Nils Homer
                  • Nov 2008
                  • 1283

                  Originally posted by gen2prot View Post
                  Hello,

                  I was wondering if there was a score associated with each read in the SAM file, that would give an indication on the strength of the match between the read and the subject sequence. The CIGAR string helps to some extent, but since "M" denotes match or mismatch, I was wondering if there was a way to differentiate between the two. Sort of like an E-value or a blast score.

                  Abhijit
                  Probably a good idea to create a new thread (this one is getting long!).
                  See the mapping quality field.

                  Originally posted by JohnK View Post
                  Hi guys,

                  I have a quick question about the SAM CIGAR column. I understand that the M operator covers both matches and mismatches. Is there a way to get the actual number of mismatches from the SAM format without comparing the tags or the sequence to the reference? Sorry if this question is a re-post. I tried searching, but couldn't find anything.
                  Probably a good idea to create a new thread (this one is getting long!).
                  Try the NM optional tag if it is available (aligner specific).
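
Both pointers in this reply can be illustrated with a short Python sketch. It assumes tab-delimited SAM text; the function name `mapq_and_nm` is hypothetical. Mapping quality is the fifth column of a record, and `NM`, when the aligner writes it, is an optional field after column 11. Note that the SAM specification defines `NM` as the edit distance to the reference, so it counts inserted and deleted bases as well as mismatches.

```python
def mapq_and_nm(sam_record):
    """Return (MAPQ, NM) for one SAM alignment line; NM is None if absent."""
    fields = sam_record.rstrip("\n").split("\t")
    mapq = int(fields[4])  # column 5: mapping quality
    nm = None
    for tag in fields[11:]:  # optional fields look like TAG:TYPE:VALUE
        name, value_type, value = tag.split(":", 2)
        if name == "NM" and value_type == "i":
            nm = int(value)
    return mapq, nm


# The first record posted earlier in this thread, rebuilt with tab separators.
record = "\t".join([
    "HWI-EAS313_0001:1:80:8942:6680#0", "0", "FBgn0000003", "1", "3", "42M",
    "*", "0", "0",
    "CGGACTGGAAGGTTGGCAGCTTCTGTAATCACGCTTCTGTGA",
    "GGGFGGGGGGGFFEGGGGGGGFGGGGGGDDGFGGGGGGGGFE",
    "NM:i:2",
])
print(mapq_and_nm(record))  # prints (3, 2)
```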

                  Comment

                  • win804
                    Member
                    • Apr 2010
                    • 18

                    Hi All,
                    Is a "sorted" BAM file smaller than an unsorted BAM file?
                    If so, why?

                    I sort a lot of BAM files with samtools, using this command:
                    samtools sort chr1-aligned.bam chr1-aligned.sorted

                    file size of chr1-aligned.bam ==> 353,618,735 bytes
                    but the file size of chr1-aligned.sorted.bam ==> 295,208,534 bytes

                    I have checked all my unsorted and sorted BAM files. All of the sorted BAM files are smaller than the unsorted ones.

                    Thanks.

                    Comment

                    • lh3
                      Senior Member
                      • Feb 2008
                      • 686

                      Sorted files are compressed better.
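
One plausible reason, sketched here with synthetic data rather than real BAM files: in coordinate order, neighbouring reads overlap and so share sequence, which a DEFLATE-style compressor can exploit. The example below uses Python's `zlib` on plain text as a stand-in for BAM's block compression; the reference, read length, and read count are all made-up values chosen only for illustration.

```python
import random
import zlib


def compressed_size(reads):
    """Size in bytes of the reads after zlib compression, one read per line."""
    return len(zlib.compress("\n".join(reads).encode("ascii")))


rng = random.Random(0)  # fixed seed so the run is repeatable
reference = "".join(rng.choice("ACGT") for _ in range(100_000))
read_length = 50

# Random start positions, in random (unsorted) arrival order.
starts = [rng.randrange(len(reference) - read_length) for _ in range(20_000)]

unsorted_reads = [reference[s:s + read_length] for s in starts]
sorted_reads = [reference[s:s + read_length] for s in sorted(starts)]

unsorted_size = compressed_size(unsorted_reads)
sorted_size = compressed_size(sorted_reads)
print("unsorted:", unsorted_size, "bytes")
print("sorted:  ", sorted_size, "bytes")
```

Both lists hold exactly the same reads; only the order differs, yet the position-sorted list compresses to fewer bytes.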

                      Comment

                      • maubp
                        Peter (Biopython etc)
                        • Jul 2009
                        • 1544

                        Originally posted by win804 View Post
                        Hi All,
                        Is a "sorted" BAM file smaller than an unsorted BAM file?
                        If so, why?...
                        You already asked this on a separate thread

                        Comment

                        • win804
                          Member
                          • Apr 2010
                          • 18

                          Thanks Li Heng. I just want to confirm that nothing is wrong with the sorted bam file.

                          Thanks a lot.

                          Comment

                          • win804
                            Member
                            • Apr 2010
                            • 18

                            Originally posted by maubp View Post
                            You already asked this on a separate thread
                            http://seqanswers.com/forums/showthread.php?t=5684
                            Yes, I wanted to delete the previous thread, but I have no idea how to do it. Any ideas?

                            Thanks.

                            Comment

                            • glacierbird
                              Member
                              • Dec 2009
                              • 15

                              Originally posted by lh3 View Post
                              @corthay

                              You can convert with "blast2sam.pl -s" to save the sequence in SAM. Currently, samtools cannot parse SAM without sequence, although the specification allows this.

                              Hi Li,
                              I parsed a .sam file that includes the sequence, but samtools view still gave the same error message. Could you please take a look at the post:


                              Thanks.

                              Comment

                              • gsjlucky
                                Junior Member
                                • May 2010
                                • 1

                                maq2sam-long

                                Originally posted by lh3 View Post
                                maq2sam-short is for the .map files generated by maq-0.6.x, while maq2sam-long for files generated by maq-0.7.x. Sorry for the confusion, and one of the aims of SAM is to avoid such confusions in future.
                                The usage is maq2sam <in.map> [<readGroup>]. I would like to know how to use the optional parameter 'readGroup'. Can it add library information from the .map file to the SAM output?

                                Comment
