Many current projects involve statistical methods for epigenetics, an exciting field of genome-wide biological inquiry. Epigenetics is the study of heritable changes in gene function that do not involve changes in DNA sequence; the best-known examples are DNA methylation and histone modification. Because these processes help regulate gene expression, but may themselves be affected by the expression of certain genes, miRNAs, and other molecular factors, their genome-wide study requires methods for integrative genomics and offers a rich opportunity for challenging statistical work. In addition, because environmental factors can influence epigenetic processes, epigenetics may play a central role in gene-environment interactions, which opens further opportunities for epidemiologic research.
The literature on DNA methylation emphasizes the "methylator phenotype", whose statistical description essentially amounts to clustering or latent class modeling. Recent statistical research suggests that, for epigenetic data, model-based clustering may outperform metric-based and other nonparametric approaches. My work has focused on finding computationally efficient solutions to model-based clustering problems for DNA methylation data, and on integrating DNA methylation data with other genomic data types.