Adjusting Stepwise p-Values in Generalized Linear Models
FINOS, LIVIO;SALMASO, LUIGI
2010
Abstract
Stepwise methods for variable selection are frequently used to determine the predictors of an outcome in generalized linear models. Although they are widely used within the scientific community, it is well known that the tests on the explained deviance of the selected model are biased. This arises from the fact that the traditional test statistics upon which these methods are based were intended for testing pre-specified hypotheses; instead, the tested model is selected through a data-driven procedure. A multiplicity problem therefore arises. In this work, we define and discuss a nonparametric procedure to adjust the p-value of the selected model of any stepwise selection method. The unbiasedness and consistency of the method are also proved. A simulation study shows the validity of this procedure. Theoretical differences from previous work in the same field are also discussed.
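To make the idea of adjusting the p-value of a data-selected model concrete, the following is a minimal sketch of one generic permutation scheme. It is not taken from the paper and should not be read as the authors' procedure. It assumes a Gaussian linear model (where the explained deviance fraction equals R^2), a simple fixed-size forward selection, and a permutation of the response under the global null. The function names `rss`, `forward_select_stat`, and `adjusted_p_value` and the parameters `max_vars` and `n_perm` are hypothetical, chosen only for illustration.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of a least-squares fit with an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X]) if X.shape[1] else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def forward_select_stat(X, y, max_vars=3):
    """Greedy forward selection of max_vars predictors (illustrative rule).

    Returns the explained deviance fraction (R^2 in the Gaussian case)
    of the model that the selection ends up with.
    """
    p = X.shape[1]
    selected = []
    rss_null = rss(X[:, []], y)  # intercept-only model
    for _ in range(min(max_vars, p)):
        candidates = [j for j in range(p) if j not in selected]
        # Add the predictor that reduces the residual sum of squares most.
        best = min(candidates, key=lambda j: rss(X[:, selected + [j]], y))
        selected.append(best)
    return 1.0 - rss(X[:, selected], y) / rss_null

def adjusted_p_value(X, y, n_perm=200, max_vars=3, seed=0):
    """Permutation p-value for the selected model.

    The whole selection is rerun on every permuted response, so the
    reference distribution accounts for the data-driven search.
    """
    rng = np.random.default_rng(seed)
    observed = forward_select_stat(X, y, max_vars)
    exceed = 0
    for _ in range(n_perm):
        y_perm = rng.permutation(y)
        if forward_select_stat(X, y_perm, max_vars) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

The key design choice in this sketch is that the selection step is repeated inside the permutation loop; comparing the observed statistic with a distribution built for a pre-specified model would reproduce the bias the abstract describes.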