Hi,
I would like to use some data from the 1000 Genomes Project. As I understand it, the BAM files for each individual contain reads stored as differences relative to a reference sequence (HG18/HG19).
Now, given the reference sequence as (compressed) FASTA, one file per chromosome, and the BAM file for, let's say, NA12283, I would like to obtain a FASTA file for each chromosome of NA12283. Is this possible, and if so, how?
I have tried a number of (s/b)am2X tools, but they only extract all the reads from the BAM file and write them into a single FASTA file, without (re-)creating the actual chromosomes with respect to the reference.
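To make the goal concrete, here is a toy sketch in plain Python of the kind of result I am after: take a reference sequence, apply an individual's differences to it, and write the resulting sequence as FASTA. This is only an illustration under strong assumptions. The function names `apply_substitutions` and `to_fasta`, the sequence and the hard-coded calls are all made up, no BAM file is parsed, and only single-base substitutions are handled (no indels, no heterozygous sites).

```python
def apply_substitutions(reference, substitutions):
    """Return a copy of `reference` with single-base substitutions applied.

    `substitutions` maps a 1-based position to a (ref_base, alt_base) pair.
    Hypothetical helper for illustration only.
    """
    seq = list(reference)
    for pos, (ref_base, alt_base) in substitutions.items():
        # Sanity check: the call must agree with the reference at this position.
        if seq[pos - 1] != ref_base:
            raise ValueError(
                "reference mismatch at position %d: expected %s, found %s"
                % (pos, ref_base, seq[pos - 1])
            )
        seq[pos - 1] = alt_base
    return "".join(seq)


def to_fasta(name, sequence, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy data (assumed, not taken from any real sample).
reference = "ACGTACGTAC"
calls = {3: ("G", "T"), 8: ("T", "A")}

sample = apply_substitutions(reference, calls)
print(to_fasta("toy_chromosome_sample", sample), end="")
```

What I am missing is the step that derives such calls from the BAM alignments for a whole chromosome and applies them to the reference.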
Thanks and best regards,
Daniel