DESCRIPTION:Vivian VIALLON (Université Pierre Bernard Lyon and IARC) The ana
lysis of case-control studies with several disease subtypes is increasingl
y common\, e.g. in cancer epidemiology. For matched designs\, we show that
a natural strategy is based on a stratified conditional logistic regressi
on model. Then\, to account for the expected similarities among disease su
btypes\, we adapt the ideas of data shared lasso\, which has recently been
proposed for the estimation of regression models in a stratified setting.
For unmatched designs\, we compare two standard methods based on L1-norm
penalized multinomial logistic regression. We describe formal connections
between these two approaches\, from which practical guidance can be derive
d. We show that one of these approaches\, which is based on a symmetric fo
rmulation of the multinomial logistic regression model\, actually reduces
to a data shared lasso version of the other. Consequently\, the relative p
erformance of the two approaches critically depends on the level of simila
rity that exists among the disease subtypes: more precisely\, when similar
ity is moderate to high\, the non-symmetric formulation with controls as t
he reference is not recommended. Empirical results obtained from synthetic
data are presented\, which confirm the benefit of properly accounting for
similarity under both matched and unmatched designs. We also present prel
iminary results from the analysis of a case-control study nested within the E
PIC cohort\, where the objective is to identify metabolites associated wit
h the risk of cancer subtypes. This is joint work with Nadim Ballout and
Cedric Garcia. https://sites.google.com/site/vivianviallon/
