Hello,
I am trying to use GATK, and I think that my BAM files are not well formatted. I read that:
The file must be binary (.bam).
The file must be indexed.
The file must be sorted in coordinate order with respect to the reference (i.e. the contig ordering in your bam must exactly match that of the reference you are using).
The file must have a proper bam header with read groups. Each read group must contain the platform (PL) and sample (SM) tags. For the platform value, we currently support 454, LS454, Illumina, Solid, ABI_Solid, and CG (all case-insensitive).
Each read in the file must be associated with exactly one read group.
I did the first three steps with samtools, but I'm not sure whether my file is correctly sorted.
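To make the header and read group requirements above more concrete, here is a minimal sketch in Python of how they could be checked. It assumes the header has been dumped as plain text (for example with `samtools view -H`) and that a few reads are available as SAM text lines. The function names are hypothetical, and the platform list is simply the one quoted above. Note that the `SO:coordinate` tag is only a declaration in the header, so this does not prove that the reads really are sorted, and the contig order still has to be compared by hand with the reference.

```python
# Platform values taken from the requirement list quoted above (case-insensitive).
SUPPORTED_PLATFORMS = {"454", "ls454", "illumina", "solid", "abi_solid", "cg"}


def parse_header(header_text):
    """Split SAM header lines into (record_type, {tag: value}) pairs."""
    records = []
    for line in header_text.splitlines():
        if not line.startswith("@") or line.startswith("@CO"):
            continue  # skip blank lines and free-text comment lines
        fields = line.split("\t")
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        records.append((fields[0], tags))
    return records


def check_header(header_text):
    """Return a list of problems found for the header-related requirements."""
    records = parse_header(header_text)
    problems = []

    # The sort order is declared in the SO tag of the @HD line.
    hd = [tags for kind, tags in records if kind == "@HD"]
    if not hd or hd[0].get("SO") != "coordinate":
        problems.append("header does not declare SO:coordinate")

    # Every @RG line should carry a sample (SM) and a platform (PL) tag.
    read_groups = [tags for kind, tags in records if kind == "@RG"]
    if not read_groups:
        problems.append("no @RG (read group) lines in header")
    for rg in read_groups:
        rg_id = rg.get("ID", "?")
        if "SM" not in rg:
            problems.append(f"read group {rg_id} lacks SM")
        if "PL" not in rg:
            problems.append(f"read group {rg_id} lacks PL")
        elif rg["PL"].lower() not in SUPPORTED_PLATFORMS:
            problems.append(f"read group {rg_id} has unsupported PL {rg['PL']}")
    return problems


def contig_order(header_text):
    """Contig names in header order, taken from the SN tag of @SQ lines."""
    return [tags["SN"] for kind, tags in parse_header(header_text)
            if kind == "@SQ" and "SN" in tags]


def reads_without_read_group(sam_lines, known_ids):
    """Yield names of reads whose RG:Z tag is missing, repeated, or unknown."""
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        # Optional tags start after the 11 mandatory SAM columns.
        rg = [f[5:] for f in fields[11:] if f.startswith("RG:Z:")]
        if len(rg) != 1 or rg[0] not in known_ids:
            yield fields[0]
```

An empty list from `check_header` would mean the header at least declares coordinate sorting and has usable read groups. The output of `contig_order` can then be compared with the contig order of the reference.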
My main problem is with the last steps. I don't know how to do them, and I don't understand the concept of a read group.
Could someone clarify this please?
Thanks,
Jane