Hi,
I'm learning data analysis with R, using RStudio.
I have my data in one file: variants from 3 tissue types, from 130 individuals. I would like to filter the variants within each individual, so that I get: 1. variants unique to each of the 3 tissues, 2. variants common to all 3 tissues, 3. variants shared between exactly 2 tissue types.
The file that I'm working on has 118 columns (it is an annotated VCF file), and I would like to keep all of them.
I've tried the VennDiagram package, but it filters out variants that are also present in any other individual, so I lose some variants in the overlapping sets.
I understand that I would need some kind of loop: pick each individual by the "ID" column, then sort the variants by the "variant" column and compare them across the "tissue" column, and output every filtered type of variant (unique and common).
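To make the idea concrete, here is a rough sketch in base R of the per-individual comparison described above. It is only an illustration on made-up toy data, not a tested solution for the real file: the column names "ID", "variant" and "tissue" are the ones from my data, while the tissue labels and the helper columns `n_tissues`, `tissues` and `category` are names I invented for the example. It uses grouping with `ave()` instead of an explicit loop.

```r
# Toy stand-in for the annotated VCF table (the real one has 118 columns).
# Values are made up; only the column names ID, variant, tissue come from the data.
df <- data.frame(
  ID      = c("P1", "P1", "P1", "P1", "P1", "P1", "P2", "P2"),
  variant = c("v1", "v1", "v1", "v2", "v2", "v3", "v1", "v3"),
  tissue  = c("blood", "liver", "skin", "blood", "liver", "skin", "blood", "blood"),
  stringsAsFactors = FALSE
)

# For every row, count the distinct tissues carrying that variant
# within the same individual. ave() keeps row order and all columns.
df$n_tissues <- ave(
  seq_along(df$tissue), df$ID, df$variant,
  FUN = function(i) length(unique(df$tissue[i]))
)

# Record which tissues those are, e.g. "blood+liver" for a 2-tissue overlap.
df$tissues <- ave(
  df$tissue, df$ID, df$variant,
  FUN = function(x) paste(sort(unique(x)), collapse = "+")
)

# Translate the count into one of the three categories asked for.
labels <- c("unique to one tissue", "shared by 2 tissues", "common to all 3 tissues")
df$category <- labels[df$n_tissues]

# One data frame per category, each still holding every original column.
by_category <- split(df, df$category)
```

Because the counting is grouped by both "ID" and "variant", a variant found in another individual does not affect the result: in the toy data, v1 is common to all 3 tissues in P1 but counts as unique to one tissue in P2. With the real data, `df` would be the table read from the file, and each element of `by_category` could be written out separately.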
Does anyone know how to do this properly in R?
Cheers,
A.