Date(s) : 22/02/2021
14 h 00 min - 16 h 00 min
High dimensional estimation in the monotone single-index model
Fadoua Balabdaoui, Cécile Durot and Christopher Fragneau
I study the monotone single-index model, which assumes that a real variable Y is linked to a d-dimensional real vector X through the relationship E[Y | X] = Ψ₀(α₀ᵀX) a.s., where the monotone real function Ψ₀ and the index vector α₀ are both unknown. This model is well known in economics, medicine and biostatistics, where the monotonicity of Ψ₀ arises naturally. Given n replications of (X, Y) and assuming that α₀ belongs to S, the unit sphere of ℝᵈ, my aim is to estimate (α₀, Ψ₀) in the high-dimensional setting, where d is allowed to depend on n and to grow to infinity with n.
To address this issue, I consider two different M-estimation procedures, a least-squares procedure and a variant of it, both of which consist in minimizing an appropriate criterion over general classes K = F × C, where F is a given closed subset of S and C is the set of all non-decreasing real-valued functions. Two facts make the estimation problem very challenging: the unknown index α₀ is bundled into the unknown ridge function, and no smoothness assumption is made on the ridge function.
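As an illustration only, the least-squares procedure can be sketched in a toy setting. For a fixed direction α, minimizing the empirical least-squares criterion over all non-decreasing Ψ is an isotonic regression of the Y values ordered by αᵀX, which the pool-adjacent-violators algorithm solves. The sketch below assumes that F is replaced by a small finite list of candidate directions and that the index values have no ties. The names `pava`, `profile_criterion` and `best_candidate` are hypothetical and are not taken from the talk.

```python
def pava(y):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y, in the given order."""
    # Each block stores [sum of values, count]; merge while block means decrease.
    blocks = []
    for v in y:
        blocks.append([v, 1])
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def profile_criterion(alpha, X, Y):
    """Empirical least-squares criterion at alpha, minimized over non-decreasing Psi."""
    # Index values alpha^T X_i (assumed distinct in this sketch).
    index = [sum(a * x for a, x in zip(alpha, row)) for row in X]
    order = sorted(range(len(Y)), key=lambda i: index[i])
    fit = pava([Y[i] for i in order])
    return sum((Y[i] - f) ** 2 for i, f in zip(order, fit)) / len(Y)


def best_candidate(candidates, X, Y):
    """Pick the candidate direction with the smallest profiled criterion."""
    # Assumption: a finite candidate list stands in for the closed set F.
    return min(candidates, key=lambda a: profile_criterion(a, X, Y))


# Tiny noiseless example: Y is increasing in the first coordinate of X.
X = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0]]
Y = [0.0, 1.0, 8.0]
candidates = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0)]
alpha_hat = best_candidate(candidates, X, Y)
```

In this toy example the direction (1, 0) orders the responses monotonically, so its profiled criterion is zero and it is selected. A search over a continuous or sparse subset of the sphere would need a different optimization strategy, which this sketch does not attempt.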
With the goal of studying the asymptotic behavior of both M-estimation procedures, I first consider the population least-squares criterion
(α, Ψ) ↦ M(α, Ψ) := E[(Ψ₀(α₀ᵀX) − Ψ(αᵀX))²].
I establish the pointwise convergence over K of the least-squares criterion to the population least-squares criterion as the sample size goes to infinity. Moreover, I prove the existence of minimizers of the population least-squares criterion over K, and I study the direction of variation of this criterion in order to describe these minimizers.
Second, I focus on constrained least-squares estimators over K. In a setting where d depends on n and the distribution of X is either bounded or sub-Gaussian, I establish the rates of convergence of the estimators of Ψ₀(α₀ᵀ·), α₀ and Ψ₀ in the case where (α₀, Ψ₀) ∈ K, as well as the consistency of the estimators of Ψ₀(α₀ᵀ·) otherwise. A simulation study of the estimators of Ψ₀(α₀ᵀ·) and α₀, in the case where F is the set of vectors of S with few nonzero components, has shown good performance, particularly in terms of support recovery of α₀.
Third, I consider an estimation method for (α₀, Ψ₀) when X is assumed to be a Gaussian vector. This method fits a misspecified linear model and estimates its parameter vector using the de-sparsified Lasso method of Zhang and Zhang (2014). I show that the resulting estimator divided by its Euclidean norm is Gaussian and converges to α₀ at the parametric rate. I provide estimators of Ψ₀(α₀ᵀ·) and Ψ₀, and I establish their rates of convergence. Compared with the previous estimator, this one is less computationally expensive, but it requires the choice of a tuning parameter and relies on the assumption that X is Gaussian. A simulation study of the estimators of Ψ₀(α₀ᵀ·) and α₀ from both M-estimation procedures has shown good performance, particularly in terms of support recovery of α₀.
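To see why fitting a linear model can recover the direction of α₀ under a Gaussian design, here is a minimal low-dimensional sketch. It does not implement the de-sparsified Lasso. It assumes the simpler case X ~ N(0, I_d), in which the population linear-model coefficient equals E[XY] and is proportional to α₀ (decompose X into its component along α₀ and an independent orthogonal part), and it uses the plain empirical cross-moment as a stand-in. The function names, the choice Ψ₀ = tanh, the noise level and the sample size are all illustrative assumptions.

```python
import math
import random


def simulate(n, alpha0, psi0, rng):
    """Draw n pairs (X, Y) with X ~ N(0, I_d) and Y = psi0(alpha0^T X) + Gaussian noise."""
    d = len(alpha0)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    Y = [psi0(sum(a * x for a, x in zip(alpha0, row))) + rng.gauss(0.0, 0.5)
         for row in X]
    return X, Y


def normalized_direction(X, Y):
    """Empirical cross-moment (1/n) sum_i Y_i X_i, divided by its Euclidean norm."""
    n, d = len(X), len(X[0])
    moment = [sum(Y[i] * X[i][j] for i in range(n)) / n for j in range(d)]
    norm = math.sqrt(sum(m * m for m in moment))
    return [m / norm for m in moment]


rng = random.Random(0)            # fixed seed, for reproducibility only
alpha0 = [0.6, 0.8, 0.0]          # unit-norm index with one zero component
X, Y = simulate(5000, alpha0, math.tanh, rng)
alpha_hat = normalized_direction(X, Y)

# Cosine of the angle between the estimate and the true index.
cosine = sum(a * b for a, b in zip(alpha_hat, alpha0))
```

With an increasing Ψ₀ the proportionality constant E[(α₀ᵀX) Ψ₀(α₀ᵀX)] is positive, so the normalized vector points towards α₀ rather than −α₀. In high dimension the plain cross-moment is no longer adequate, which is where a regularized and de-biased fit such as the one cited above comes in.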