Hello seqanswers,
I am interested in compiling, to the degree that we can, a list of potential actions to take when deciding which variants to prioritize after a whole-exome or whole-genome analysis.
I understand that there are numerous issues associated with this.
First of all, we don't all work on the same types of disease. Someone studying a Mendelian trait would be most interested in rare variants that produce non-synonymous changes and the like, while someone studying a disease that may at one time have been adaptive might be looking for other patterns of variants.
Further, we all know that the p-values from gene burden tests are at best loose guidelines, and that it is ultimately our own assessment, drawing on numerous tools such as UCSC, PubMed, SnpEff, SIFT, PolyPhen, IVA, ANNOVAR, etc., that convinces us whether to spend money following up on a gene or variant.
Because of all this, I could have titled this post "Towards a unified framework for VCF analysis", but I didn't. I think the specifics of our diseases, the variants we find, the evidence available for a given gene, etc. preclude the possibility that one pipeline could analyze every variant optimally - at least, not without a tremendous amount of work.
As such, what I am asking for instead is this: What are all the things that you as analysts take into account when assessing a particular class of variant?
Has anyone gone so far as to write a script that pulls together more than one tool (e.g., takes a VCF and generates information on the gene, its function, SIFT scores of the variants, methylation and DNase hypersensitivity, etc.)? Or maybe you don't have a script, but you have a written or mental list of, say, 8 tests you always do for a given kind of variant (e.g., for exonic variants there are a variety of prediction tools that don't exist for other types of variants)?
Also, I have learned of this list (http://seqanswers.com/wiki/Software). I guess my question relates more to how to organize these tools so that they can be leveraged in a semi-programmatic fashion, only when you want them.
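To make the question concrete, here is a minimal Python sketch of the kind of organization I have in mind. It is only an illustration under stated assumptions, not a working pipeline: it assumes the VCF has already been annotated by SnpEff (so the INFO column carries an ANN field whose second subfield is the effect), and every name in it (CHECKS, register, checklist, the two example checks, the "exonic"/"noncoding" split) is hypothetical. The checks just return reminder strings; in practice each would call out to the relevant tool.

```python
from collections import defaultdict

# Hypothetical registry: variant class -> checks to run for that class.
CHECKS = defaultdict(list)

def register(variant_class):
    """Decorator that files a check under a variant class."""
    def wrap(func):
        CHECKS[variant_class].append(func)
        return func
    return wrap

# Assumption: a small, incomplete set of effect terms treated as "exonic".
CODING_EFFECTS = {"missense_variant", "synonymous_variant",
                  "stop_gained", "frameshift_variant"}

def parse_vcf_line(line):
    """Split one tab-delimited VCF data line into the fields used here."""
    fields = line.rstrip("\n").split("\t")
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")  # flags get an empty value
        info[key] = value
    return {"chrom": fields[0], "pos": int(fields[1]),
            "ref": fields[3], "alt": fields[4], "info": info}

def classify(variant):
    """Coarse class from the first ANN entry; anything else is 'noncoding'."""
    parts = variant["info"].get("ANN", "").split(",")[0].split("|")
    effects = parts[1].split("&") if len(parts) > 1 else []
    return "exonic" if CODING_EFFECTS.intersection(effects) else "noncoding"

@register("exonic")
def protein_prediction(variant):
    # Placeholder: a real check would query a prediction tool.
    return "look up SIFT/PolyPhen predictions"

@register("noncoding")
def regulatory_evidence(variant):
    # Placeholder: a real check would query regulatory tracks.
    return "check methylation and DNase hypersensitivity tracks"

def checklist(line):
    """Return the to-do list for a single VCF data line."""
    variant = parse_vcf_line(line)
    return [check(variant) for check in CHECKS[classify(variant)]]
```

For example, checklist("1\t12345\t.\tA\tG\t50\tPASS\tANN=G|missense_variant|MODERATE|GENE1") would return only the exonic check. The point of the registry is that anyone could add a check for their own variant class without touching the rest, which is the "only when you want to" part of my question.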
Thank you so much, and I am happy to share my list such as it is, although the purpose of my post is to help make a better framework.