  • Remove unwanted contigs and update bam header

    Hi,
    I have some human genome data in BAM format that I want to upload to variant-filtering software (Congenica).
    The problem is that the data were mapped to a reference genome containing alternate contigs, which the software does not accept; it only accepts the standard chromosomes chr1-22, chrX, chrY and chrM.

    I therefore want to remove the alternate contigs and then update the header so that it lists only the standard chromosomes. It sounds simple, but nothing I have tried gives me a usable BAM file to upload.

    I started with a BAM file whose header looks like this:

    [Attached image: dataurl230738.png]

    There are thousands of alt contigs, followed by some @RG and @PG lines.

    I first used the following samtools command:

    samtools view -L bam_contigs_to_keep.bed -O BAM -o Foo_edit_1.bam Foo.bam

    where the BED file looks like this:
    [Attached image: dataurl230743.png]
    This reduced the BAM file size from ~12 GB to ~9 GB, which I assume reflects the alternate contigs being removed.
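    For readers without the attached image, a BED file of this kind could be derived from the BAM header itself. This is a minimal sketch, assuming the contigs are named chr1-22, chrX, chrY and chrM as in the post; `header_to_bed` is a hypothetical helper name, not a samtools command.

```shell
#!/bin/sh
# Sketch: turn SAM header text (e.g. from `samtools view -H Foo.bam`) into a
# BED file covering only chr1-22, chrX, chrY and chrM.
# header_to_bed is a hypothetical helper, not part of samtools.
header_to_bed() {
    awk -F '\t' '
        $1 == "@SQ" {
            name = ""; len = ""
            # Pull the contig name (SN) and length (LN) out of the @SQ line.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4)
            }
            # Keep only the primary chromosomes; every other contig is skipped.
            if (name ~ /^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$/)
                printf "%s\t0\t%s\n", name, len
        }'
}

# Intended use (needs samtools on PATH; file names follow the post):
#   samtools view -H Foo.bam | header_to_bed > bam_contigs_to_keep.bed
```

    Each output line spans a whole chromosome (start 0, end equal to the LN value), so the file should select every read placed on those chromosomes.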
    However, the alternate contigs are still listed in the header, so I next used the following command to update the header:
    samtools reheader reordered_head_GRCh38.dict Foo_edit_1.bam > Foo_edit_2.bam
    The .dict file looks like this:
    [Attached image: dataurl230744.png]

    This reduces the BAM file size by only ~100 KB, which I assume reflects the alternate contigs being removed from the header.

    My problem is that when I run samtools flagstat, the original BAM and Foo_edit_1.bam each have >1 billion reads, but Foo_edit_2.bam has only ~600 and is truncated.
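    One thing that may be worth checking at this point: BAM alignment records refer to their contig by its position in the header's @SQ list rather than by name, and samtools reheader swaps the header without rewriting the records. If the new header reorders the contigs, or if a read or its mate points at a contig that is no longer listed, the file can become inconsistent. The sketch below only compares the @SQ order of two header files; `sq_names` and `sq_order_preserved` are hypothetical helper names, and whether this explains the truncation seen here is an assumption, not a diagnosis.

```shell
#!/bin/sh
# sq_names FILE: print the SN value of every @SQ line, in file order.
# (hypothetical helper, not part of samtools)
sq_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4)
    }' "$1"
}

# sq_order_preserved OLD_HEADER NEW_HEADER
# Succeeds only if contig number N in the new header is also contig number N
# in the old header, i.e. the new @SQ list is a leading slice of the old one.
sq_order_preserved() {
    old_names=$(sq_names "$1")
    new_names=$(sq_names "$2")
    # Arithmetic expansion strips any padding that wc adds on some systems.
    n=$(( $(printf '%s\n' "$new_names" | wc -l) ))
    [ "$(printf '%s\n' "$old_names" | head -n "$n")" = "$new_names" ]
}

# Intended use (needs samtools on PATH; file names follow the post):
#   samtools view -H Foo_edit_1.bam > old_header.sam
#   sq_order_preserved old_header.sam reordered_head_GRCh38.dict \
#       && echo "order preserved" || echo "order differs"
```

    Even when the order is preserved, a retained read whose mate sits on a removed contig would still carry an index past the end of the new list, so a passing check does not guarantee a valid file.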

    I have also tried Picard's ReplaceSamHeader, and this example from another forum:
    samtools view Foo.bam chr1 chr2 chr3 chr4 chr5 chr6 chr7 chr8 chr9 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22 chrX chrY chrM | samtools view -bo Foo_edit_2.bam -t corrected_bam_head.sam -
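    A variation on the idea above is to pass the whole file through SAM text, so that the header and the records are filtered together and then re-encoded against the reduced header. This is a minimal sketch under stated assumptions: `filter_sam_stream` and the output name `Foo_std.bam` are hypothetical, and the filter drops every read that is unmapped, placed on a non-standard contig, or paired with a mate on one, which may or may not be acceptable for the downstream software.

```shell
#!/bin/sh
# filter_sam_stream: hypothetical helper that reads SAM text (header plus
# records) on stdin and writes SAM text restricted to chr1-22, chrX, chrY, chrM.
filter_sam_stream() {
    awk -F '\t' '
        function is_std(name) {
            return name ~ /^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$/
        }
        # Header: drop @SQ lines for non-standard contigs, keep all other
        # header lines (@HD, @RG, @PG, ...) unchanged and in order.
        /^@/ {
            if ($1 == "@SQ") {
                for (i = 2; i <= NF; i++)
                    if ($i ~ /^SN:/ && is_std(substr($i, 4))) print
            } else print
            next
        }
        # Records: field 3 is RNAME and field 7 is RNEXT. "=" means the mate
        # is on the same contig as the read, "*" means no mate position.
        is_std($3) && ($7 == "=" || $7 == "*" || is_std($7))
    '
}

# Intended use (needs samtools on PATH; Foo_std.bam is a made-up output name):
#   samtools view -h Foo.bam | filter_sam_stream | samtools view -b -o Foo_std.bam -
```

    Converting through text is slower than reheader on a file of this size, but the records are re-encoded against the header that actually accompanies them. Optional tags that mention other contigs by name are left untouched by this sketch.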

    Nothing has worked so far; I still do not have a usable edited BAM file.

    I would really appreciate any help or advice someone could give about this.

    Many thanks

    Hywel
