Hello
I am puzzled by the way GATK explains sample, library, and lane for SNP detection.
It seems to me that their "sample" is equivalent to an individual. However, after reading the SAM Format Specification, I thought individuals were indicated in the read group @RG ID tag and not in the SM (sample) tag...
I want to study SNPs in various individuals from Illumina data, so I would like to know how to label individuals in the BAM file:
@RG ID:individual_1
or
@RG SM:individual_1 (and in this case, what is the ID tag for?)
And about the library: if various individuals were tagged and sequenced together in the same Illumina run, do these individuals form a library? But then what is the difference between a lane and a library?
And just to be sure: is the lane encoded in the PU tag?
Many thanks for your help.