  • Solyris
    Junior Member
    • Mar 2010
    • 1

    Hi,

    I am quite new to NGS data and I work with commercial software from CLCbio, called Genomics Workbench, which offers a mapping algorithm of its own.

    I would like to convert the SAM output from the software to BAM so that I can use samtools functions such as pileup.

    I get the following error when I run the command on Ubuntu:

    >./samtools view -huS -o DATA/test.bam DATA/s_2_1_sequence_SS200_LAwMM.sam
    [samopen] SAM header is present: 24 sequences.
    Parse error at line 113: CIGAR and sequence length are inconsistent
    Aborted

    I read somewhere in this thread that samtools currently cannot process a SAM file without the reference sequence. Is that what is causing the problem? If so, can anyone point me to a way to generate a correctly formatted reference sequence file? I read through the manual, but it does not say how the reference file should be formatted. I am working with the whole human reference genome as 24 gbk files from NCBI.

    Any help is appreciated.

    Thanks
    Sol


    • drio
      Senior Member
      • Oct 2008
      • 323

      samtools performs some sanity checks on the CIGAR string, and it is telling you that something is not right. Have you looked at that particular alignment (line 113) to confirm whether the CIGAR is correct?
      -drd
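To illustrate the kind of check drio describes, here is a minimal Python sketch that looks for records whose CIGAR string disagrees with the length of the read sequence. It is not samtools' own code. It assumes a tab-separated SAM file with CIGAR in column 6 and SEQ in column 10, and that the operations M, I, S, = and X are the ones that consume read bases. The function names are made up for this example.

```python
import re

# CIGAR operations assumed to consume bases of the read sequence.
QUERY_CONSUMING_OPS = set("MIS=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases implied by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar)
               if op in QUERY_CONSUMING_OPS)


def find_inconsistent_lines(sam_lines):
    """Yield (line_number, cigar, seq_length) for records whose CIGAR
    disagrees with the length of SEQ. Header lines start with '@'."""
    for line_number, line in enumerate(sam_lines, start=1):
        if line.startswith("@") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        cigar, seq = fields[5], fields[9]
        if cigar == "*" or seq == "*":
            continue  # nothing to compare
        if cigar_query_length(cigar) != len(seq):
            yield line_number, cigar, len(seq)
```

Passing an open file handle for the SAM file as `sam_lines` lists the offending records, which can then be inspected by hand as suggested above.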


      • GoneSouth
        Member
        • Aug 2008
        • 11

        why do deletions in the pileup-file have a quality attached

        Hi guys,

        Does anyone know why deletions in the pileup file have a quality attached? How can a deletion have a quality, and how is it calculated?

        For example:

        YHet 23690 N 1 a-1n Q
        YHet 23691 N 1 * [
        YHet 23692 N 1 c [


        or

        YHet 25409 N 5 AAA-2NNa-2nnA-2NN VTW`a
        YHet 25410 N 5 A$A$*** USR`a
        YHet 25411 N 3 *** SG`


        best ro


        • jeffhsu3
          Junior Member
          • Jan 2010
          • 5

          If an insertion or deletion occurs at the end of the pileup read bases string, it doesn't seem to have the extra character after the '\+[0-9]+[ACGTNacgtn]+' pattern.

          For example:
          chr1 2263 C 4 ,$.$.,+1t CC9C FFFF.

          Am I missing something? The pattern is described in the pileup format documentation, which mentions the in/del pattern '\+[0-9]+[ACGTNacgtn]+', but there appears to be an extra character in the examples given on that page:

          seq2 156 A 11 .$......+2AG.+2AG.+2AGGG <975;:<<<<<

          That extra character appears to be missing when the in/del occurs at the end of the read bases string. Counting that extra character as part of the insertion/deletion makes the read bases string match the read count.
          Last edited by jeffhsu3; 04-05-2010, 12:03 PM. Reason: Made more clear and added examples.
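One way to test how the read bases string lines up with the read count is to parse it so that the number after '+' or '-' fixes exactly how many indel characters follow. A greedy '[ACGTNacgtn]+' match can run on past the end of the indel instead. The Python sketch below takes that approach. It assumes '^' is followed by one mapping-quality character and '$' marks a read end, and the function name is made up for this example.

```python
def count_reads(read_bases):
    """Count the reads represented in a pileup read-bases string."""
    count = 0
    i = 0
    while i < len(read_bases):
        ch = read_bases[i]
        if ch == "^":
            i += 2  # '^' plus one mapping-quality character
        elif ch == "$":
            i += 1  # end-of-read marker, not a read of its own
        elif ch in "+-":
            # Read the declared indel length, then skip exactly that many bases.
            j = i + 1
            while j < len(read_bases) and read_bases[j].isdigit():
                j += 1
            indel_length = int(read_bases[i + 1:j])
            i = j + indel_length
        else:
            count += 1  # '.', ',', a base letter, or '*'
            i += 1
    return count
```

Under these assumptions the strings quoted in this thread give counts that agree with their read-count columns: `,$.$.,+1t` gives 4 and `.$......+2AG.+2AG.+2AGGG` gives 11. In the second string the two trailing G characters are counted as bases of further reads, not as part of the last insertion.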


          • jdiezperezj
            Junior Member
            • Mar 2010
            • 3

            So, is it already possible to convert SOAP aligner output to SAM or BAM format?
            Best.
            Javi

            Originally posted by lh3:
            To corthay:

            You are quick. I am planning a new bwa release as I realized that I could improve it a little without much work (PS: the new version is released now). Wgsim, wgsim_eval.pl and converters for soap and bowtie are available from SVN only:

            svn co https://samtools.svn.sourceforge.net...s/dev/samtools samtools


            • RockChalkJayhawk
              Senior Member
              • Mar 2009
              • 192

              FLAGS for fusion detection

              Let's say I have paired-end RNA-Seq data and I want to find out whether the mates map more than 1 Mb apart on the same chromosome or map to two different chromosomes. How do I determine that from the FLAGS?


              • nilshomer
                Nils Homer
                • Nov 2008
                • 1283

                You can use the MRNM and MPOS fields in the SAM file.


                • RockChalkJayhawk
                  Senior Member
                  • Mar 2009
                  • 192

                  So in that case, either MRNM does not equal "=" (the mates are on different chromosomes), or MRNM equals "=" and the difference between POS and MPOS is greater than 1 million.

                  Is this correct?
                  Last edited by RockChalkJayhawk; 04-13-2010, 01:17 PM. Reason: Incorrect assumption


                  • nilshomer
                    Nils Homer
                    • Nov 2008
                    • 1283

                    Perfect!
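The rule agreed on above can be sketched in Python as follows. This is only an illustration. It assumes tab-separated SAM records with RNAME, POS, MRNM and MPOS in columns 3, 4, 7 and 8. It also treats a "*" in RNAME or MRNM as "no information", and an MRNM that spells out the same name as RNAME as equivalent to "="; neither case came up in the discussion. The function name and return labels are made up for this example.

```python
def classify_pair(sam_line, min_distance=1_000_000):
    """Classify one SAM record by where its mate maps."""
    fields = sam_line.rstrip("\n").split("\t")
    rname, pos = fields[2], int(fields[3])
    mrnm, mpos = fields[6], int(fields[7])
    if rname == "*" or mrnm == "*":
        return "unknown"  # read or mate has no mapping information
    if mrnm != "=" and mrnm != rname:
        return "different_chromosomes"
    if abs(pos - mpos) > min_distance:
        return "distant_same_chromosome"
    return "proximal"
```

Records labelled `different_chromosomes` or `distant_same_chromosome` are the ones of interest for the fusion question. The `min_distance` default of 1,000,000 mirrors the 1 Mb threshold in the question.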


                    • RockChalkJayhawk
                      Senior Member
                      • Mar 2009
                      • 192

                      Thanks Nils! You're the best!


                      • menenuh
                        Junior Member
                        • Jan 2010
                        • 8

                        non-unique reads

                        Hello,
                        In my SAM file I have both unique and non-unique reads. What happens to the non-unique reads when I call SNPs from the SAM file? Are they included in the SNP calling process?

                        thanks


                        • bair
                          Member
                          • Jan 2010
                          • 65

                          denovo on sam format

                          Dear all,

                          I have alignment results in a BAM file that includes paired-end and mate-pair reads of different lengths (101, 35, and 36 bp). Does anybody know whether SOAP or another de novo program can handle the BAM format directly, or do I have to use the raw read files?

                          Many thanks!


                          • gen2prot
                            Member
                            • Apr 2010
                            • 68

                            Hello All,

                            Does anybody know how I can sort a .sam file by its first column, that is, the column containing the unique read identifiers? Right now it's sorted on the 3rd.

                            Thanks
                            Abhijit


                            • nilshomer
                              Nils Homer
                              • Nov 2008
                              • 1283

                              SAMtools and Picard will both sort by read name. See their documentation.
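For small files, the idea of sorting by read name can also be sketched directly in Python. This is not how SAMtools or Picard implement it: it loads everything into memory and uses plain string comparison, which may order names differently from those tools. It assumes header lines start with '@' and the read name is the first tab-separated column. The function name is made up for this example.

```python
def sort_sam_by_read_name(sam_lines):
    """Return SAM lines with the header first and records sorted by read name."""
    header = [line for line in sam_lines if line.startswith("@")]
    records = [line for line in sam_lines if not line.startswith("@")]
    # list.sort is stable, so records sharing a name keep their original order.
    records.sort(key=lambda line: line.split("\t", 1)[0])
    return header + records
```

For real data sets the dedicated tools remain the better choice, since they handle BAM input and files that do not fit in memory.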


                              • gen2prot
                                Member
                                • Apr 2010
                                • 68

                                Hello nilshomer,

                                I downloaded Picard. I have the .jar files on Mac OS X 10.6, saved on the Desktop, yet these jar files won't open. How do I run them?

                                Thanks
                                Abhijit
