No announcement yet.

Extracting HLA-E allele sequences from BAM for gene editing

  • Filter
  • Time
  • Show
Clear All
new posts

  • Extracting HLA-E allele sequences from BAM for gene editing

    I have WGS data (2x100bp) that has been aligned to hg19. Using HLA-VBSeq, I HLA typed the data, and now the researcher I am working with is interested in two allele sequences for HLA-E for gene editing purposes. I would really appreciate some feedback on my workflow since I have never done this type of analysis before.

    Using samtools, I subset the list of reads covering the genomic region of HLA-E, then used picard FilterSamReads to output the filtered bam file. I converted the bam to paired fastq and then aligned the fastq to the HLA-E genomic DNA sequences for alleles E*01:03:01:01 and E*01:01:01:01 from the IMGT/HLA database using BWA-MEM. I then sorted and indexed my bam files for each allele for viewing in IGV. Now, I have a couple questions:

    1. Should I have aligned the filtered bam to the nucleotide coding sequences for both alleles instead of the genomic sequences?
    2. The researcher wants all sequences aligned to E*01:03:01:01 and E*01:01:01:01. Would it be appropriate to convert the bam files to fasta files using seqtk to deliver the sequences?
    samtools bam2fq input.bam | seqtk seq -A > output.fa
    I'm not sure if my workflow is correct, or if delivering the fasta for each allele is appropriate for gene editing downstream. Help?