  • sequence_hard
    Junior Member
    • Feb 2016
    • 5

    Separating a .sam or .bam file that is based on alignment to multiple sequences

    Hey people!
    I have the following setup: a .sam file from aligning E. coli genome reads to a multi-FASTA file containing the sequences of several plasmids. I now want to split the .sam file into several .sam files, each containing the header and mapping information for one plasmid, so that I can later calculate the % coverage per plasmid.

    My plan is to create a dictionary in Python with the header of each reference sequence as key and the corresponding alignment lines as value (identifiable because the key should also be present in each line of the alignment/mapping block). Then I would write each entry to its own file, convert to .bam and so on.
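
The dictionary approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `split_sam_by_reference` and the example records are hypothetical, the SAM lines are assumed to be tab-separated with the reference name in column 3 (RNAME), and unmapped reads (RNAME `*`) are dropped. One thing that can be lost this way: a read whose mate maps to a different plasmid keeps a mate reference (RNEXT) that no longer appears in the per-plasmid header.

```python
from collections import defaultdict


def split_sam_by_reference(sam_lines):
    """Group SAM lines by reference name (RNAME, column 3).

    Returns a dict: reference name -> list of lines (header + alignments).
    Assumption: every reference of interest has an @SQ header line.
    """
    shared_header = []            # @HD, @RG, @PG, @CO lines, copied to every output
    sq_header = {}                # reference name -> its own @SQ line
    records = defaultdict(list)   # reference name -> alignment lines

    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                # The SN tag carries the reference sequence name.
                for field in line.split("\t")[1:]:
                    if field.startswith("SN:"):
                        sq_header[field[3:]] = line
                        break
            else:
                shared_header.append(line)
            continue
        rname = line.split("\t")[2]
        if rname == "*":          # unmapped read, belongs to no plasmid
            continue
        records[rname].append(line)

    # Keep @HD first, as the SAM format expects, then the single @SQ line.
    hd = [h for h in shared_header if h.startswith("@HD")]
    other = [h for h in shared_header if not h.startswith("@HD")]
    return {
        name: hd + [sq_line] + other + records.get(name, [])
        for name, sq_line in sq_header.items()
    }


# Hypothetical example records, for illustration only.
example = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:plasmidA\tLN:100",
    "@SQ\tSN:plasmidB\tLN:200",
    "@PG\tID:bowtie2",
    "r1\t0\tplasmidA\t1\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t0\tplasmidB\t5\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]
per_ref = split_sam_by_reference(example)
```

Each value in `per_ref` can then be written to its own .sam file, one line per list element.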

    Now my questions: Is this an appropriate way to do what I want to do or would I lose information? Is there a better way to do this when the file is in .bam format?

    Thanks for your help!
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    Unless you want to write your own code you can use "bamtools split" (or other programs) as noted in this thread: https://www.biostars.org/p/46327/

    • WhatsOEver
      Senior Member
      • Apr 2012
      • 215

      #3
      Which program did you use for aligning your reads? If you used a multi-FASTA file as reference, you should normally have the sequence headers of that multi-FASTA as header lines in your .sam file. Likewise, you should have the sequence IDs as "reference id" in each .sam line. Then it's quite straightforward to use the .sam/.bam file as it is and compute the coverage of the individual reference sequences.
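
A minimal sketch of computing the % of covered positions per reference directly from the unsplit SAM lines, assuming the header has one `@SQ` line per plasmid with `SN` and `LN` tags. The function name `percent_covered` and the example records are hypothetical. Only the CIGAR operations M, = and X are counted as covering the reference; positions spanned by D or N are skipped, which is a choice made for this example and may differ from what other tools report.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
ALIGNED = set("M=X")    # read bases placed on the reference
SKIPPING = set("DN")    # reference advances, but no read base covers it


def percent_covered(sam_lines):
    """Return {reference name: % of positions covered by at least one read}."""
    lengths = {}                  # reference name -> length from @SQ LN tag
    covered = defaultdict(set)    # reference name -> covered 1-based positions

    for line in sam_lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                tags = dict(f.split(":", 1) for f in fields[1:])
                lengths[tags["SN"]] = int(tags["LN"])
            continue
        if len(fields) < 11:      # not a complete alignment line
            continue
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        if flag & 4 or rname == "*" or cigar == "*":
            continue              # unmapped read
        ref_pos = pos
        for count, op in CIGAR_OP.findall(cigar):
            count = int(count)
            if op in ALIGNED:
                covered[rname].update(range(ref_pos, ref_pos + count))
                ref_pos += count
            elif op in SKIPPING:
                ref_pos += count
            # I, S, H and P do not advance along the reference.

    return {name: 100.0 * len(covered[name]) / length
            for name, length in lengths.items()}


# Hypothetical example records, for illustration only.
example = [
    "@SQ\tSN:plasmidA\tLN:100",
    "@SQ\tSN:plasmidB\tLN:200",
    "r1\t0\tplasmidA\t1\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t0\tplasmidB\t5\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t0\tplasmidB\t10\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]
coverage = percent_covered(example)
```

Storing every covered position in a set is fine for plasmid-sized references; for large genomes a dedicated tool is the better choice.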

      • sequence_hard
        Junior Member
        • Feb 2016
        • 5

        #4
        @WhatsOEver: I used bowtie2. But how can I calculate the coverage of each reference sequence using only my file?

        @GenoMax: Thanks, that looks useful!
        Last edited by sequence_hard; 02-09-2016, 06:41 AM.

        • WhatsOEver
          Senior Member
          • Apr 2012
          • 215

          #5
          Either you use a different mapper that has this capability (e.g. bbmap does this with the scafstats parameter) or (what I prefer) you use a program like bedtools genomecov (e.g. bedtools genomecov -ibam yourAlignedData.bam -g yourMultiFasta.fasta). The function of bedtools is nicely explained here: http://bedtools.readthedocs.org/en/l...genomecov.html
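
As a sketch of how the result could be turned into % coverage per plasmid: in its default mode, bedtools genomecov prints a histogram with five columns per line (sequence name, depth, number of bases at that depth, sequence length, fraction), plus summary lines under the name `genome`. The helper below, with the hypothetical name `breadth_from_histogram`, takes such lines and reports the percentage of bases with depth above zero. The example numbers are made up for illustration. Note also that, as far as the bedtools documentation describes it, `-g` expects a tab-separated file of sequence names and lengths rather than a FASTA file, and it is not required when `-ibam` is used.

```python
def breadth_from_histogram(hist_lines):
    """Return {sequence name: % of bases with depth >= 1}.

    Expects lines in the default `bedtools genomecov` histogram layout:
    name, depth, bases at that depth, sequence length, fraction.
    """
    sizes = {}        # sequence name -> total length
    uncovered = {}    # sequence name -> bases with depth 0

    for line in hist_lines:
        parts = line.split()
        if len(parts) != 5:
            continue                      # skip blank or malformed lines
        name, depth, n_bases, size = parts[0], int(parts[1]), int(parts[2]), int(parts[3])
        sizes[name] = size
        uncovered.setdefault(name, 0)     # no depth-0 line means fully covered
        if depth == 0:
            uncovered[name] = n_bases

    return {name: 100.0 * (sizes[name] - uncovered[name]) / sizes[name]
            for name in sizes}


# Made-up histogram lines, for illustration only.
example_hist = [
    "plasmidA\t0\t25\t100\t0.25",
    "plasmidA\t1\t75\t100\t0.75",
    "plasmidB\t1\t200\t200\t1",
    "genome\t0\t25\t300\t0.0833333",
    "genome\t1\t275\t300\t0.916667",
]
breadth = breadth_from_histogram(example_hist)
```

The `genome` entry in the result is the overall figure across all plasmids and can be dropped if only per-plasmid values are needed.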

          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            Samtools depth may work in a pinch. You could also use Qualimap to get visual maps of coverage.
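
If samtools depth is used, its output has three tab-separated columns (reference name, 1-based position, depth), so counting covered positions per plasmid is short. The sketch below assumes that layout; the function names and the `lengths` dictionary (plasmid name to length, which you would have to supply yourself, for instance from the @SQ header lines) are hypothetical.

```python
from collections import Counter


def covered_positions(depth_lines, min_depth=1):
    """Count positions per reference whose depth is at least min_depth."""
    counts = Counter()
    for line in depth_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue                      # skip blank or malformed lines
        if int(parts[2]) >= min_depth:
            counts[parts[0]] += 1
    return counts


def percent_from_depth(depth_lines, lengths, min_depth=1):
    """Return {reference name: % covered}, given user-supplied lengths."""
    counts = covered_positions(depth_lines, min_depth)
    return {name: 100.0 * counts[name] / length
            for name, length in lengths.items()}


# Made-up depth lines and lengths, for illustration only.
example_depth = [
    "plasmidA\t1\t3",
    "plasmidA\t2\t0",
    "plasmidB\t7\t1",
]
example_lengths = {"plasmidA": 4, "plasmidB": 10}
percent = percent_from_depth(example_depth, example_lengths)
```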

            • sequence_hard
              Junior Member
              • Feb 2016
              • 5

              #7
              @WhatsOEver: Awesome, thanks! I am already using bedtools genomecov but I did not know it had that function. Nice!

              • WhatsOEver
                Senior Member
                • Apr 2012
                • 215

                #8
                Originally posted by sequence_hard
                @WhatsOEver: Awesome, thanks! I am already using bedtools genomecov but I did not know it had that function. Nice!
                You are welcome.
                In my experience, there is pretty much nothing you can't do with bedtools when it comes to sequence coverage.
