Covariate selection for the semiparametric additive risk model

Publikation: Bidrag til tidsskriftTidsskriftartikelForskningfagfællebedømt

This paper considers covariate selection for the additive hazards model. This model is particularly simple to study theoretically and its practical implementation has several major advantages to the similar methodology for the proportional hazards model. One complication compared with the proportional model is, however, that there is no simple likelihood to work with. We here study a least squares criterion with desirable properties and show how this criterion can be interpreted as a prediction error. Given this criterion, we define ridge and Lasso estimators as well as an adaptive Lasso and study their large sample properties for the situation where the number of covariates p is smaller than the number of observations. We also show that the adaptive Lasso has the oracle property. In many practical situations, it is more relevant to tackle the situation with large p compared with the number of observations. We do this by studying the properties of the so-called Dantzig selector in the setting of the additive risk model. Specifically, we establish a bound on how close the solution is to a true sparse signal in the case where the number of covariates is large. In a simulation study, we also compare the Dantzig and adaptive Lasso for a moderate to small number of covariates. The methods are applied to a breast cancer data set with gene expression recordings and to the primary biliary cirrhosis clinical data.
OriginalsprogEngelsk
TidsskriftScandinavian Journal of Statistics
Vol/bind36
Udgave nummer4
Sider (fra-til)602-619
ISSN0303-6898
DOI
StatusUdgivet - 2009

ID: 16215132