Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • knostrov
    Junior Member
    • Oct 2012
    • 7

    Converting ELAND export -> SAM -> BAM

    Hi everyone,

    I apologize because a variation of this question has been asked several times before, but I still can't seem to figure out what's wrong.

    I have ELAND export files that I converted to SAM (I believe), using Samtools 0.1.18 and the command

    perl samtools-0.1.18/misc/export2sam.pl --read1=exportfile > exportfile.sam

    When I try to make a bam file using

    samtools view -Sbh exportfile.sam > exportfile.bam

    I get the following error message:
    [samopen] no @SQ lines in the header.
    [sam_read1] missing header? Abort!

    This is day 1 for me, so I'd appreciate any pointers.

    Thanks
  • swbarnes2
    Senior Member
    • May 2008
    • 910

    #2
    My guess is that the sam didn't get made with headers. Find out what they are supposed to look like, and add them yourself.

    Comment

    • maubp
      Peter (Biopython etc)
      • Jul 2009
      • 1544

      #3
      Are these unaligned reads? If so it is legitimate to have a SAM file with no @SQ lines.

      Comment

      • knostrov
        Junior Member
        • Oct 2012
        • 7

        #4
        They are aligned reads (using ELAND). I am now trying to make my own header, as suggested, but additionally I suppose I would like to understand WHY there is no @SQ header -- is there something I need to supply to the export2sam.pl script that I am not supplying?

        Comment

        • BAMseek
          Senior Member
          • Apr 2011
          • 124

          #5
          I don't believe eland2sam.pl will give the header. For one, it doesn't know the length of the sequences - so at best it could give you a length as long as the largest alignment position. Also, it would need to go through the Eland file twice: once to calculate the header and the second to write the reads.

          This post suggests using the FASTA reference file to get the headers. One issue would be if your data has splice junctions - since the "chromosome" in the Eland alignment would be represented as a concatenation of exon coordinates. I think you would be able to create a legitimate BAM file, but the splice junctions will have coordinates relative to the exon junction rather than the chromosome.

          Justin

          Comment

          • knostrov
            Junior Member
            • Oct 2012
            • 7

            #6
            Thank you, that is very helpful. This is DNA sequence, so I think there should be no issues.

            So what I need is a ref.fa file. Do I make this myself by concatenating the FASTA files for all chromosomes?

            Comment

            • swbarnes2
              Senior Member
              • May 2008
              • 910

              #7
              Originally posted by knostrov View Post
              Thank you, that is very helpful. This is DNA sequence, so I think there should be no issues.

              So what I need is a ref.fa file. Do I make this myself by concatenating the FASTA files for all chromosomes?
              Yes, you can do that. You shouldn't need a single reference fasta to make the header.

              Is it possible that lying around somewhere is an old sample that someone analyzed the right way? You can grab the headers off of that. It's something you'll only need once.

              Comment

              • knostrov
                Junior Member
                • Oct 2012
                • 7

                #8
                So this is basically a shot in the dark, but what I did is to make a ref.fa file by concatenating fasta files for all chromosomes.
                Then I ran

                samtools faidx chromFa/ref.fa

                and got back a file that looked like this:
                chr1 249250621 6 50 51
                chr10 135534747 254235647 50 51
                chr11 135006516 392481096 50 51
                chr12 133851895 530187750 50 51
                chr13 115169878 666716690 50 51
                chr14 107349540 784189973 50 51
                chr15 102531392 893686511 50 51
                chr16 90354753 998268538 50 51
                chr17 81195210 1090430394 50 51
                chr18 78077248 1173249516 50 51

                etc.

                Then I ran

                samtools view -t chromFa/ref.fa.fai -bSh exportfile.sam -o exportfile.bam

                and got back this very impressive error message (verbatim!):
                [sam_header_read2] 24 sequences loaded.
                ?1`?-!??2?p??yF??b?ZQ?_TbT???W???ge?g?g???c?Z?ZT???X?Z?_?????VR?X??T\???S?k?g?gh???Y???b???nQjb????_???a?s?{??{????y|?!T????u?@??Qd?`?|??
                ߳~??x?
                aQw?`<?48ϔ??Q?3cpܒ?
                [main_samview] random alignment retrieval only works for indexed BAM files.
                ?3??p?Ƴ`p?]?Y2L?ic?:̈??m?aF
                ?x?
                ????`<#??0m?
                rS?rC9&
                )'R`S??\P??5?1?cΐ?\? ?X0????r,??˂q"?:”E2???

                Really!

                If there are any comments, thank you. Unfortunately I don't have any files to steal headers from...

                Comment

                • swbarnes2
                  Senior Member
                  • May 2008
                  • 910

                  #9
                  Then I ran

                  samtools view -t chromFa/ref.fa.fai -bSh exportfile.sam -o exportfile.bam
                  Why? What you need to do is make header lines, and append them to the top of your .sam file.

                  For each chromosome in your reference, you need a line like

                  @SQ SN:chr10 LN:135534747
                  You can make the header file, and then cat it to your .sam file.

                  Comment

                  • BAMseek
                    Senior Member
                    • Apr 2011
                    • 124

                    #10
                    I think you are on the home stretch. According to the samtools reference, you can use the .fai file as input for the -t option. I think the message you see being printed out is actually the BAM output.

                    I would try this command instead

                    PHP Code:
                    samtools view -t chromFa/ref.fa.fai -bSh exportfile.sam exportfile.bam 
                    To check the bam file, you can do

                    PHP Code:
                    samtools view -h exportfile.bam less -
                    You should see the header lines (starting with the @ symbol) followed by the alignments.

                    Justin

                    Comment

                    • knostrov
                      Junior Member
                      • Oct 2012
                      • 7

                      #11
                      Yes! It worked! Thank you very much!

                      Ah, the satisfaction...

                      Comment

                      Latest Articles

                      Collapse

                      ad_right_rmr

                      Collapse

                      News

                      Collapse

                      Topics Statistics Last Post
                      Started by SEQadmin2, Yesterday, 10:09 AM
                      0 responses
                      10 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-04-2026, 08:59 AM
                      0 responses
                      20 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-02-2026, 12:03 PM
                      0 responses
                      27 views
                      0 reactions
                      Last Post SEQadmin2  
                      Started by SEQadmin2, 06-02-2026, 11:40 AM
                      0 responses
                      21 views
                      0 reactions
                      Last Post SEQadmin2  
                      Working...