Disease-specific non-coding variant annotation
What is it about?
DIVAN is a machine learning-based algorithm that predicts whether a mutation occurring anywhere in the genome is likely to be disease-associated. It is related to popular algorithms such as GWAVA, CADD, Eigen and GenoCanyon. The key difference is that DIVAN is disease-specific: it makes different predictions for the same mutation depending on the disease or trait in question.
Why is it important?
90% of the disease-associated variants found by genome-wide association studies (GWAS) are non-coding. Annotating non-coding variants is therefore important, yet challenging. In recent years, popular tools including GWAVA, CADD, Eigen and GenoCanyon have been developed to address this problem. These methods predict whether a mutation is likely to be "risky" or neutral, but they do not say which disease it is risky for. It is likely, however, that a particular variant is associated with only one particular disease. We believe that disease-specific annotation of non-coding variants is much more important, but it has so far received little attention. A secondary but very surprising finding is that the most important feature for distinguishing risk variants from benign ones is not the enrichment of open chromatin marks, as many have previously noticed and reported, but the depletion of closed chromatin marks around the risk variants, especially H3K9me3.
The following have contributed to this page: Dr Zhaohui S. Qin