  • m_elena_bioinfo
    Member
    • Oct 2009
    • 99

    Separate bam file

    Hi NGS users,
    Does anyone know how I can split a BAM file by chromosome?

    The starting BAM file is too large (about 10 GB for a whole exome), so I want to divide it into smaller BAM files, one per chromosome. But using:

    > samtools view *sorted.bam | awk '$3=="chr1"' > onlychr1.bam.sorted.bam

    gives me an invalid file.
    In fact, when I run
    > samtools index onlychr1.bam.sorted.bam

    it returns
    [bam_header_read] EOF marker is absent.
    [bam_header_read] invalid BAM binary header (this is not a BAM file).
    Segmentation fault

    Thanx
    ME
  • quinlana
    Senior Member
    • Sep 2008
    • 119

    #2
    You are trying to index a SAM file. You need to change your command to create BAM as follows. Note that you need headers as well.

    samtools view -h *sorted.bam | awk '$3=="chr1" || /^@/' | samtools view -Sb - > onlychr1.bam.sorted.bam

    Aaron
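The point about headers can be checked without samtools. The following is a minimal sketch on a made-up SAM stream (the `make_sam` helper, read names, and reference lengths are all hypothetical): filtering on `$3` alone drops every `@` header line, while adding `/^@/` keeps them.

```shell
# Hypothetical SAM text: three header lines followed by two reads.
make_sam() {
  printf '@HD\tVN:1.0\tSO:coordinate\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf '@SQ\tSN:chr3\tLN:1000\n'
  printf 'read1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read2\t0\tchr3\t20\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
}

# Count the header lines (those starting with "@") arriving on stdin.
header_lines() { awk '/^@/ {n++} END {print n+0}'; }

# Filter from the original question: headers are lost.
plain=$(make_sam | awk '$3=="chr1"' | header_lines)

# Filter from the reply: headers pass through alongside the chr1 reads.
kept=$(make_sam | awk '$3=="chr1" || /^@/' | header_lines)

echo "header lines without /^@/: $plain"
echo "header lines with /^@/: $kept"
```

With real data, `samtools view -h` supplies the header lines that the `/^@/` pattern lets through.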


    • m_elena_bioinfo
      Member
      • Oct 2009
      • 99

      #3
      It's perfect!
      Thank you very much Aaron!!


      • m_elena_bioinfo
        Member
        • Oct 2009
        • 99

        #4
        A problem!
        I aligned my reads against hg19.fasta (which includes sequences such as chr9_gl000201_random, chrUn_gl000235....).
        If I want to split the BAM into smaller files that contain alignments on two or more chromosomes, following Aaron's advice:

        > samtools view -h *sorted.bam | awk '$3=="chr1" && $3=="chr3" || /^@/' | samtools view -Sb -> onlychr1_3.bam.sorted.bam

        the program returns this error:
        [sam_read1] reference 'SN:chr18_gl000207_random LN:4262 ' is recognized as '*'.
        [main_samview] truncated file.

        If I split out a single chromosome (only $3=="chr1"), I have no problem.

        Can someone help me?

        Thanxxx!!!
        ME


        • westerman
          Rick Westerman
          • Jun 2008
          • 1104

          #5
          I am not an 'awk' expert, but I suspect that requiring $3 to be both chr1 and chr3 will indeed give a truncated file: perhaps not zero length, but I doubt it would contain much useful information.

          E.g., the
          $3=="chr1" && $3=="chr3"
          looks bad to me. I would put parentheses around the parts to group together.

          But take this with caution, given my lack of awk experience.
          Last edited by westerman; 09-21-2010, 11:44 AM.
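The suspicion about `&&` can be tested on a toy input. This is only a sketch with invented reads (the `make_sam` and `count_reads` helpers are hypothetical): a single field cannot equal both "chr1" and "chr3" at once, so the `&&` filter passes header lines only, whereas `||` passes the reads from either chromosome.

```shell
# Hypothetical SAM text: header lines, then one read on each of three references.
make_sam() {
  printf '@HD\tVN:1.0\tSO:coordinate\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf '@SQ\tSN:chr3\tLN:1000\n'
  printf '@SQ\tSN:chr21\tLN:1000\n'
  printf 'read1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read2\t0\tchr3\t20\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read3\t0\tchr21\t30\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
}

# Count the non-header lines (alignment records) arriving on stdin.
count_reads() { awk '!/^@/ {n++} END {print n+0}'; }

# && binds tighter than ||, so this reads as ($3=="chr1" && $3=="chr3") || /^@/.
and_reads=$(make_sam | awk '$3=="chr1" && $3=="chr3" || /^@/' | count_reads)

# With || a read passes when it sits on either chromosome.
or_reads=$(make_sam | awk '$3=="chr1" || $3=="chr3" || /^@/' | count_reads)

echo "reads kept with &&: $and_reads"
echo "reads kept with ||: $or_reads"
```

Because `&&` already takes precedence over `||` in awk, adding parentheses would not change the result here; the condition itself has to change.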


          • m_elena_bioinfo
            Member
            • Oct 2009
            • 99

            #6
            You're right, westerman; I'm a beginner with awk. But I can't find a way to search for two (or more) chromosomes in a BAM file...


            • quinlana
              Senior Member
              • Sep 2008
              • 119

              #7
              To find chr1 or chr3, do the following:

              Code:
              samtools view -h *sorted.bam | awk '$3=="chr1" || $3=="chr3" || /^@/' | samtools view -Sb - > onlychr1_3.bam.sorted.bam

              To find chr1 or chr3 or chr21, do the following:

              Code:
              samtools view -h *sorted.bam | awk '$3=="chr1" || $3=="chr3" || $3=="chr21" || /^@/' | samtools view -Sb - > onlychr1_3_21.bam.sorted.bam
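When the list of chromosomes grows, chaining `||` conditions gets unwieldy. One possible alternative, sketched here with a hypothetical helper named `keep_chroms` and invented test reads, passes the wanted names to awk as a comma-separated list and looks each read's `$3` up in an array. The lookup is an exact match, so "chr2" is not picked up by "chr21", and "chr1" would not match a name such as chr1_gl000191_random.

```shell
# keep_chroms LIST: keep header lines plus reads whose reference name ($3)
# is exactly one of the comma-separated names in LIST (hypothetical helper).
keep_chroms() {
  awk -v list="$1" '
    BEGIN {
      n = split(list, names, ",")
      for (i = 1; i <= n; i++) want[names[i]] = 1
    }
    /^@/ || ($3 in want)
  '
}

# Hypothetical SAM text used only to exercise the filter.
make_sam() {
  printf '@HD\tVN:1.0\tSO:coordinate\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf '@SQ\tSN:chr2\tLN:1000\n'
  printf '@SQ\tSN:chr21\tLN:1000\n'
  printf 'read1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read2\t0\tchr2\t20\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
  printf 'read3\t0\tchr21\t30\t60\t4M\t*\t0\t0\tACGT\tIIII\n'
}

# Reference names of the reads that survive the filter, joined by commas.
kept=$(make_sam | keep_chroms "chr1,chr21" | awk '!/^@/ {print $3}' | paste -sd, -)
echo "kept reads on: $kept"

# With real data the helper would sit in the same place as the awk command above:
#   samtools view -h in.sorted.bam | keep_chroms "chr1,chr3,chr21" | samtools view -Sb - > out.bam
```

The file names in the final comment are placeholders. Note that all `@SQ` header lines are kept, including those for references that were filtered out.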


              • us13
                Junior Member
                • Jul 2012
                • 2

                #8
                Originally posted by quinlana
                You are trying to index a SAM file. You need to change your command to create BAM as follows. Note that you need headers as well.

                samtools view -h *sorted.bam | awk '$3=="chr1" || /^@/' | samtools view -Sb - > onlychr1.bam.sorted.bam

                Aaron

                I am new to this field, but the following worked for me (please let me know if I am wrong):
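                # samtools 1.x syntax; on 0.1.x the second argument is an output prefix: samtools sort file.bam file.sorted
                samtools sort -o file.sorted.bam file.bam
                samtools index file.sorted.bam
                samtools view -bh file.sorted.bam chr1 > chr1.file.bam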
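The region-query approach extends naturally to every reference in the file. The sketch below is a dry run under stated assumptions: `print_split_commands` is a hypothetical helper, the header text is invented, and reference names are assumed to contain no shell-special characters. It reads SAM header lines and prints one `samtools view -b` command per `@SQ` entry, without running anything.

```shell
# print_split_commands BAM: read SAM header lines on stdin and print one
# region-query command per @SQ reference (hypothetical helper, dry run only).
print_split_commands() {
  awk -v bam="$1" '
    /^@SQ/ {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^SN:/) {
          name = substr($i, 4)   # strip the leading "SN:"
          printf "samtools view -b %s %s > %s.%s\n", bam, name, name, bam
        }
    }
  '
}

# Hypothetical header standing in for the output of: samtools view -H file.sorted.bam
make_header() {
  printf '@HD\tVN:1.0\tSO:coordinate\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf '@SQ\tSN:chr3\tLN:500\n'
}

cmds=$(make_header | print_split_commands file.sorted.bam)
echo "$cmds"
```

On a real sorted and indexed BAM, the header would come from `samtools view -H file.sorted.bam`, and the printed commands could be reviewed before being passed to `sh`.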
