Longhai Li  
(Ph.D. in Statistics, University of Toronto)
Assistant Professor
Performing classification and regression with high-dimensional measurements (e.g., gene expression data) is challenging for practical, computational, and statistical reasons. I am interested in developing Bayesian methodologies for modeling high-dimensional variables, in order to obtain a better predictive distribution of the response given those variables.
A response variable (e.g., a certain disease) may be related to high-order interactions among a set of covariates (e.g., interactions between genes and environmental exposure). However, using these interactions in parametric statistical models is challenging, both statistically and computationally. I am interested in using Bayesian methodologies to model this relationship and in solving the computational problems that arise.
Only about 1-2% of the entire human genome corresponds to gene regions. It is believed that much of the mechanism controlling when, where, and how much protein is produced lies in the "noncoding" region upstream of the gene sequence. Identifying motifs and their transcription-factor binding sites is a key step toward understanding transcription regulation. The statistical problem here is how to perform cluster analysis of DNA sequences. I am interested in developing Bayesian methodologies that can model the dependency in DNA sequences, in order to find better methods for cluster analysis.
Bayesian Classification and Regression, Monte Carlo Methods, Machine Learning, Bioinformatics