Overview

I work on statistical problems that commonly arise in the analysis of proteomics and genomics data. For example, in these applications some variables may be measured with error, the number of features may be much larger than the number of observations, the observations are not always independent, and outliers may be present in the data. Extensions of classical estimation approaches have been proposed to address some of these characteristics, while others remain open problems for future research.

I have suggested the following papers to students, but they are welcome to choose any topic that suits their interests. As a general goal, I expect the student to understand the theoretical contribution of the chosen paper, to know the related literature, to reproduce and/or complete some of the theoretical results, to explain their relevance, and to test the proposed methodology on simulated and/or real data. Once the student selects one of these papers, I will give more specific details in an individual meeting.

List of Suggested Papers:

Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996), Identification of Causal Effects Using Instrumental Variables (with discussion), Journal of the American Statistical Association, 91, 444–455.
Gertheiss, J., and Tutz, G. (2010), Sparse Modeling of Categorial Explanatory Variables, The Annals of Applied Statistics, 4(4), 2150–2180.
Efron, B. (1988), Logistic Regression, Survival Analysis, and the Kaplan-Meier Curve, Journal of the American Statistical Association, 83, 414–425.