Christopher Fragneau (université Paris-Nanterre) : High dimensional estimation in the monotone single-index model.

Date: 22/02/2021
14:00 - 16:00

Summary:

High dimensional estimation in the monotone single-index model

Fadoua Balabdaoui, Cécile Durot and Christopher Fragneau

Abstract

I study the monotone single-index model, which assumes that a real variable Y is linked to a d-dimensional real vector X through the relationship E[Y | X] = Ψ₀(α₀^T X) a.s., where the monotone real function Ψ₀ and the index α₀ are unknown. This model is well known in economics, medicine and biostatistics, where the monotonicity of Ψ₀ appears naturally. Given n replications of (X, Y) and assuming that α₀ belongs to S, the unit sphere of ℝ^d, my aim is to estimate (α₀, Ψ₀) in the high-dimensional context, where d is allowed to depend on n and to grow to infinity with n.
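To make the model concrete, here is a minimal simulation sketch. The ridge function (tanh), the sparse index with three nonzero components, the Gaussian design and the noise level are all illustrative assumptions, not choices taken from the talk.

```python
import numpy as np

def simulate_single_index(n, d, rng, noise_sd=0.1):
    """Draw n samples from a monotone single-index model (illustrative choices)."""
    # Hypothetical sparse index with 3 nonzero components, scaled to unit norm
    # so that it lies on the unit sphere S of R^d (requires d >= 3).
    alpha0 = np.zeros(d)
    alpha0[:3] = 1.0
    alpha0 /= np.linalg.norm(alpha0)
    # Hypothetical non-decreasing ridge function (an assumption for this sketch).
    psi0 = np.tanh
    # Standard Gaussian design and additive centered noise,
    # so that E[Y | X] = psi0(alpha0^T X).
    X = rng.standard_normal((n, d))
    Y = psi0(X @ alpha0) + noise_sd * rng.standard_normal(n)
    return X, Y, alpha0, psi0

rng = np.random.default_rng(0)
X, Y, alpha0, psi0 = simulate_single_index(n=200, d=50, rng=rng)
```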

To address this issue, I consider two different M-estimation procedures: a least-squares procedure and a variant, both of which consist in minimizing an appropriate criterion over general classes K = F × C, where F is a given closed subset of S and C is the set of all non-decreasing real-valued functions. The facts that the unknown index α₀ is bundled into the unknown ridge function, and that no smoothness assumption is made on the ridge function, make the estimation problem very challenging.
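As an illustration of the least-squares procedure, assume the empirical criterion takes the standard form (1/n) Σᵢ (Yᵢ − Ψ(α^T Xᵢ))². For a fixed α, minimizing over non-decreasing Ψ is then an isotonic regression of Y on α^T X, which the pool-adjacent-violators algorithm solves. The sketch below is not the authors' implementation; the function names `pava` and `profile_criterion` are hypothetical.

```python
import numpy as np

def pava(y):
    """Least-squares fit of a non-decreasing sequence to y (pool adjacent violators)."""
    means, sizes = [], []
    for value in y:
        means.append(float(value))
        sizes.append(1)
        # Merge the last two blocks while they violate monotonicity.
        while len(means) > 1 and means[-2] > means[-1]:
            total = sizes[-2] + sizes[-1]
            merged = (means[-2] * sizes[-2] + means[-1] * sizes[-1]) / total
            means[-2:] = [merged]
            sizes[-2:] = [total]
    # Expand each block mean back to the observations it covers.
    return np.repeat(means, sizes)

def profile_criterion(alpha, X, Y):
    """Empirical least-squares criterion at alpha, minimized over non-decreasing Psi.

    Assumes the index values X @ alpha are distinct (ties are not pooled here).
    """
    index = X @ alpha
    order = np.argsort(index, kind="stable")
    fitted = pava(Y[order])
    return float(np.mean((Y[order] - fitted) ** 2))

# Small check: the violating pair (3, 2) is pooled into its mean 2.5.
fit = pava([1.0, 3.0, 2.0, 4.0])
```

Minimizing `profile_criterion` over α in F is the hard part of the problem, since the profiled criterion is not smooth in α; the sketch only covers the inner minimization over C.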

With the goal of studying the asymptotic behavior of both M-estimation procedures, I first consider the population least-squares criterion

(α, Ψ) → M(α, Ψ) := E[(Ψ₀(α₀^T X) − Ψ(α^T X))²].

I establish the pointwise convergence over K of the least-squares criterion, as the sample size goes to infinity, to the population least-squares criterion. Moreover, I prove existence of minimizers of the population least-squares criterion over K and I study the direction of variation of this criterion in order to describe the minimizers.

Second, I focus on constrained least-squares estimators over K. In a setting where d depends on n and the distribution of X is either bounded or sub-Gaussian, I establish the rates of convergence of the estimators of Ψ₀(α₀^T ·), α₀ and Ψ₀ in the case where (α₀, Ψ₀) ∈ K, as well as the consistency of the estimators of Ψ₀(α₀^T ·) otherwise. A simulation study of the estimators of Ψ₀(α₀^T ·) and α₀, in the case where F is the set of vectors of S with few nonzero components, has shown good performance, particularly in terms of support recovery of α₀.

Third, I consider an estimation method for (α₀, Ψ₀) when X is assumed to be a Gaussian vector. This method fits a misspecified linear model and estimates its parameter vector using the de-sparsified Lasso method of Zhang and Zhang (2014). I show that the resulting estimator, divided by its Euclidean norm, is Gaussian and converges to α₀ at the parametric rate. I provide estimators of Ψ₀(α₀^T ·) and Ψ₀, and I establish their rates of convergence. The advantage of this estimator compared to the previous one is that it is less computationally expensive; in return, it requires the choice of a tuning parameter and X must be Gaussian. A simulation study of the estimators of Ψ₀(α₀^T ·) and α₀ from both M-estimation procedures on simulated data has shown good performance, particularly in terms of support recovery of α₀.
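The idea of fitting a misspecified linear model can be illustrated in low dimension. When X is a centered Gaussian vector with identity covariance, the population least-squares coefficient E[XY] equals c·α₀ with c = E[(α₀^T X) Ψ₀(α₀^T X)], so normalizing a linear fit recovers the direction α₀ whenever c > 0. The sketch below swaps in ordinary least squares for the de-sparsified Lasso (a deliberate simplification that only makes sense when d is small relative to n); the ridge function, index and noise level are again illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 5000, 5

# Hypothetical true index on the unit sphere and non-decreasing ridge function.
alpha0 = np.ones(d) / np.sqrt(d)
X = rng.standard_normal((n, d))           # Gaussian design, identity covariance
Y = np.tanh(X @ alpha0) + 0.1 * rng.standard_normal(n)

# Ordinary least squares on the (misspecified) linear model Y ~ X beta.
# This stands in for the de-sparsified Lasso, which targets the d >> n regime.
beta_hat, *_ = np.linalg.lstsq(X, Y - Y.mean(), rcond=None)

# Normalize to obtain an estimate of the direction alpha0.
alpha_hat = beta_hat / np.linalg.norm(beta_hat)

# Cosine of the angle between the estimate and the true index.
cosine = float(alpha_hat @ alpha0)
```

Once a direction estimate is available, Ψ₀ can be estimated by a one-dimensional monotone regression of Y on the fitted index, which is what makes this route cheaper than a joint minimization over (α, Ψ).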

Categories

Statistics Seminar