Longhai Li's Research

My Research Objectives

Fitting complex models is necessary in many scientific areas and practical applications. For example, it arises in many classification and regression problems, such as: the diagnosis of diseases (e.g. cancer, diabetes) using gene expression data or other high-throughput data; the diagnosis of diseases based on high-order interactions between genes and environmental factors (i.e. the disease depends on a group of factors acting together, rather than on a single factor or a subset of them); the prediction of the next character in a sequence of human language by considering high-order interactions between the preceding characters; and the forecasting of financial and economic indices based on a long historical sequence.

Many traditional statistical methods, such as those based on maximizing likelihood functions, are not well justified, either in their applicability or in their theoretical foundations, when used to fit complex models with a limited number of observations. The fundamental difficulty is that the information in the data alone is insufficient to support inference for complex models. The Bayesian approach is therefore becoming necessary for such models, as its prior allows extra information from other sources to be incorporated into the inference. That is, the Bayesian approach makes it possible to fit necessarily complex models to limited observations without overfitting. In practice, however, Bayesian analysis poses computational difficulties, as inferring unknown quantities requires calculating high-dimensional and analytically intractable integrals with respect to the posterior distribution. In the past decade, the development of Markov chain Monte Carlo (MCMC) algorithms has greatly eased this difficulty for many problems. However, applying MCMC to integrals over a large number (e.g. thousands) of variables remains largely infeasible in practice.

My long-term research objective is to develop Bayesian methodologies for complex models that can be computed efficiently. My approach is to investigate the specification of a model and the sampling method for it simultaneously. I will publicly release software packages for my proposed methods in order to facilitate routine use by other researchers and statistical practitioners.

Applied Topics That Interest Me

Classification and Regression with High-dimensional Measurements: There is an increasing demand for efficient and accurate classification and regression algorithms based on high-dimensional features (e.g. hyperspectral data generated by remote sensing technology, gene expression data). I am interested in developing Bayesian methodologies to model the high-dimensional variables, in order to find a better predictive distribution of the response given those variables and to search for differential features.

Classification and Regression with High-order Interactions: A response variable (e.g. a certain disease) may be related to high-order interactions of a set of covariates (e.g. interactions between genes and environmental exposures). I am interested in developing and applying efficient Bayesian methodologies to find such interactions.

Modelling DNA Sequences: It is hypothesized that there is long-range dependency among DNA sequences. Modelling this dependency is crucial in many problems in bioinformatics, such as discovering transcription-factor binding sites and motifs, and haplotype inference. I am interested in developing Bayesian methodologies to model such dependency.

Feel free to talk to me about your own problems. I may well be interested in them too.

To Prospective Students

Oh yes, I am interested in working with and supporting graduate students and post-doctoral researchers. You are welcome to talk to me :).

Here I post a page of comments from an anonymous peer on my NSERC application. It carries no other implication; it only serves to attract graduate students to work with me (which is somewhat difficult for a researcher at this stage). I also want to thank this peer for his great encouragement of my research.

