11 results for "Sauerbrei, Willi"
Search Results
2. The Use of Resampling Methods to Simplify Regression Models in Medical Statistics
- Author
- Sauerbrei, Willi
- Published
- 1999
3. Investigation about a screening step in model selection
- Author
- Sauerbrei, Willi, Holländer, Norbert, and Buchholz, Anika
- Published
- 2008
- Full Text
- View/download PDF
4. Subsampling versus bootstrapping in resampling-based model selection for multivariable regression
- Author
- De Bin, Riccardo, Janitza, Silke, Sauerbrei, Willi, and Boulesteix, Anne-Laure
- Subjects
- bootstrap, model selection, model stability, subsampling
- Abstract
In the last few years, increasing attention has been devoted to the problem of the stability of multivariable regression models, understood as the resistance of the model to small changes in the data on which it has been fitted. Resampling techniques, mainly based on the bootstrap, have been developed to address this issue. In particular, the approaches based on the idea of "inclusion frequency" consider the repeated implementation of a variable selection procedure, for example backward elimination, on several bootstrap samples. The analysis of the variables selected in each iteration provides useful information on the model stability and on the variables' importance. Recent findings, nevertheless, show possible pitfalls in the use of the bootstrap, and alternatives such as subsampling have started to be taken into consideration in the literature. Based on model selection frequencies and variable inclusion frequencies, we aim to empirically compare these two different resampling techniques, investigating the effect of their use in a model selection procedure for multivariable regression. We conduct our investigations by analyzing two real data examples and by performing a simulation study. Our results reveal some advantages in using a subsampling technique rather than the bootstrap in this context.
- Published
- 2014
- Full Text
- View/download PDF
5. Handling co-dependence issues in resampling-based variable selection procedures: a simulation study.
- Author
- De Bin, Riccardo and Sauerbrei, Willi
- Subjects
- *PERTURBATION theory, *ERROR probability, *COMPUTER simulation, *BIG data, *GAUSSIAN distribution, *MANAGEMENT
- Abstract
If a number of candidate variables are available, variable selection is a key task aiming to identify those candidates which influence the outcome of interest. Methods such as backward elimination, forward selection, etc. are often implemented, despite their drawbacks. One of these drawbacks is the instability of their results with respect to small perturbations in the data. To handle this issue, resampling-based procedures have been introduced; using a resampling technique, e.g. bootstrap, these procedures generate several pseudo-samples that are used to compute the inclusion frequency of each variable, i.e. the proportion of pseudo-samples in which the variable is selected. Based on the inclusion frequencies, it is possible to discriminate between relevant and irrelevant variables. These procedures may fail in the case of correlated variables. To deal with this issue, two procedures based on 2×2 tables of inclusion frequencies have been developed in the literature. In this paper we analyse the behaviours of these two procedures and the role of their tuning parameters in an extensive simulation study. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
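The abstract of result 5 refers to per-variable inclusion frequencies and to 2×2 tables of inclusion frequencies for pairs of variables. The following is a minimal sketch of how both could be tabulated, assuming the selection results are already available as one set of selected variable names per pseudo-sample. The function names and the toy data are hypothetical, and this is not the procedure evaluated in the paper.

```python
from typing import Dict, List, Set


def inclusion_frequencies(selections: List[Set[str]],
                          variables: List[str]) -> Dict[str, float]:
    """Proportion of pseudo-samples in which each variable was selected."""
    n = len(selections)
    return {v: sum(v in s for s in selections) / n for v in variables}


def two_by_two(selections: List[Set[str]], a: str, b: str) -> List[List[int]]:
    """Count pseudo-samples by joint inclusion of variables a and b.

    Rows: a included, a excluded. Columns: b included, b excluded.
    """
    table = [[0, 0], [0, 0]]
    for s in selections:
        row = 0 if a in s else 1
        col = 0 if b in s else 1
        table[row][col] += 1
    return table


# Hypothetical toy data: six pseudo-samples in which x2 and x3 compete.
selections = [
    {"x1", "x2"}, {"x1", "x3"}, {"x1", "x2"},
    {"x1", "x3"}, {"x1", "x2"}, {"x1", "x3"},
]

print(inclusion_frequencies(selections, ["x1", "x2", "x3"]))
# {'x1': 1.0, 'x2': 0.5, 'x3': 0.5}
print(two_by_two(selections, "x2", "x3"))
# [[0, 3], [3, 0]]
```

In this toy case x2 and x3 each have an inclusion frequency of only 0.5, yet the 2×2 table shows that exactly one of them is selected in every pseudo-sample, which is the kind of pattern correlated variables can produce.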
6. Bootstrap assessment of the stability of multivariable models
- Author
- Royston, Patrick and Sauerbrei, Willi
- Subjects
- mfpboot, continuous covariates, mfpboot_bif, fractional polynomials, mfp, stability, pmbevalfn, multivariable modeling, bagging, pmbstabil, pmbeval, bootstrap, Research Methods/Statistical Methods, fracpoly
- Abstract
Assessing the instability of a multivariable model is important but is rarely done in practice. Model instability occurs when selected predictors—and for multivariable fractional polynomial modeling, selected functions of continuous predictors—are sensitive to small changes in the data. Bootstrap analysis is a useful technique for investigating variations among selected models in samples drawn at random with replacement. Such samples mimic datasets that are structurally similar to that under study and that could plausibly have arisen instead. The bootstrap inclusion fraction of a candidate variable usefully indicates the importance of the variable. We describe Stata tools for stability analysis in the context of the mfp command for multivariable model building. We offer practical guidance and illustrate the application of the tools to a study in prostate cancer.
- Published
- 2009
- Full Text
- View/download PDF
7. Subsampling versus Bootstrapping in Resampling-Based Model Selection for Multivariable Regression.
- Author
- De Bin, Riccardo, Janitza, Silke, Sauerbrei, Willi, and Boulesteix, Anne-Laure
- Subjects
- STATISTICAL bootstrapping, MULTIVARIATE analysis, SIMULATION methods & models, PATHOGENIC microorganisms, BAYESIAN analysis
- Abstract
In recent years, increasing attention has been devoted to the problem of the stability of multivariable regression models, understood as the resistance of the model to small changes in the data on which it has been fitted. Resampling techniques, mainly based on the bootstrap, have been developed to address this issue. In particular, the approaches based on the idea of "inclusion frequency" consider the repeated implementation of a variable selection procedure, for example backward elimination, on several bootstrap samples. The analysis of the variables selected in each iteration provides useful information on the model stability and on the variables' importance. Recent findings, nevertheless, show possible pitfalls in the use of the bootstrap, and alternatives such as subsampling have begun to be taken into consideration in the literature. Using model selection frequencies and variable inclusion frequencies, we empirically compare these two different resampling techniques, investigating the effect of their use in selected classical model selection procedures for multivariable regression. We conduct our investigations by analyzing two real data examples and by performing a simulation study. Our results reveal some advantages in using a subsampling technique rather than the bootstrap in this context. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
8. The practical utility of incorporating model selection uncertainty into prognostic models for survival data.
- Author
- Augustin, Nicole, Sauerbrei, Willi, and Schumacher, Martin
- Subjects
- *PROGNOSIS, *DIAGNOSIS, *STATISTICAL sampling, *SURVIVAL, *CONFIDENCE intervals, *STATISTICAL hypothesis testing
- Abstract
Predictions of disease outcome in prognostic factor models are usually based on one selected model. However, often several models fit the data equally well, but these models might differ substantially in terms of included explanatory variables and might lead to different predictions for individual patients. For survival data, we discuss two approaches to account for model selection uncertainty in two data examples, with the main emphasis on variable selection in a proportional hazard Cox model. The main aim of our investigation is to establish the ways in which either of the two approaches is useful in such prognostic models. The first approach is Bayesian model averaging (BMA) adapted for the proportional hazard model, termed 'approx. BMA' here. As a new approach, we propose a method which averages over a set of possible models using weights estimated from bootstrap resampling as proposed by Buckland et al., but in addition, we perform an initial screening of variables based on the inclusion frequency of each variable to reduce the set of variables and corresponding models. For some necessary parameters of the procedure, investigations concerning sensible choices are still required. The main objective of prognostic models is prediction, but the interpretation of single effects is also important and models should be general enough to ensure transportability to other clinical centres. In the data examples, we compare predictions of our new approach with approx. BMA, with 'conventional' predictions from one selected model and with predictions from the full model. Confidence intervals are compared in one example. Comparisons are based on the partial predictive score and the Brier score. We conclude that the two model averaging methods yield similar results and are especially useful when there is a high number of potential prognostic factors, most likely some of them without influence in a multivariable context. 
Although the method based on bootstrap resampling lacks formal justification and requires some ad hoc decisions, it has the additional positive effect of achieving model parsimony by reducing the number of explanatory variables and dealing with correlated variables in an automatic fashion. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
9. Detecting an interaction between treatment and a continuous covariate: A comparison of two approaches
- Author
- Sauerbrei, Willi, Royston, Patrick, and Zapien, Karina
- Subjects
- *CLINICAL trials, *CLINICAL medicine, *MEDICAL experimentation on humans, *CANCER patients
- Abstract
In clinical trials, there is considerable interest in investigating whether a treatment effect is similar in all patients, or whether some prognostic variable indicates a differential response to treatment. To examine this, a continuous predictor is usually categorized into groups according to one or more cutpoints. The treatment/covariate interaction is then analyzed in factorial fashion using multiplicative terms. The use of cutpoints raises several difficult issues for the analyst. It is preferable to keep continuous variables continuous in such a model. To achieve this, the MFP algorithm for multivariable model-building with fractional polynomials was recently extended to a new algorithm called multivariable fractional polynomial interaction (MFPI). With the latter, covariates may be binary, categorical or continuous, and cutpoints are avoided. MFPI is compared with a graphical technique, the subpopulation treatment effect pattern plot (STEPP). Differences between MFPI and STEPP are illustrated by re-analysis of a randomized trial in kidney cancer. The stability of the two procedures is investigated by using the bootstrap. The Type I error probability of MFPI to ‘detect’ spurious interactions is estimated by simulation. MFPI and STEPP are found to exhibit similar treatment/covariate interactions. The tail-oriented variant of STEPP is found to give more stable and interpretable results than the sliding window variant. The Type I error probability of MFPI is found to be close to its nominal value. [Copyright Elsevier]
- Published
- 2007
- Full Text
- View/download PDF
10. Detection of influential points as a byproduct of resampling-based variable selection procedures.
- Author
- De Bin, Riccardo, Boulesteix, Anne-Laure, and Sauerbrei, Willi
- Subjects
- *RESAMPLING (Statistics), *MATHEMATICAL variables, *WASTE products, *STATISTICAL bootstrapping, *OUTLIERS (Statistics)
- Abstract
Influential points can cause severe problems when deriving a multivariable regression model. A novel approach to check for such points is proposed, based on the variable inclusion matrix, a simple way to summarize results from resampling-based variable selection procedures. The variable inclusion matrix reports whether a variable (column) is included in a regression model fitted on a pseudo-sample (row) generated from the original data (e.g., bootstrap sample or subsample). It is used to study the variable selection stability, to derive weights for model averaged predictors and in other investigations. Concentrating on variable selection, it also allows understanding whether the presence of a specific observation has an influence on the selection of a variable. From the variable inclusion matrix, indeed, the inclusion frequency (I-frequency) of each variable can be computed only in the pseudo-samples (i.e., rows) which contain the specific observation. When the procedure is repeated for each observation, it is possible to check for influential points through the distribution of the I-frequencies, visualized in a boxplot, or through a Grubbs’ test. Outlying values in the former case and significant results in the latter point to observations having an influence on the selection of a specific variable and therefore on the finally selected model. This novel approach is illustrated in two real data examples. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
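The abstract of result 10 describes computing, for each observation, the inclusion frequency (I-frequency) of a variable using only the pseudo-samples that contain that observation. A minimal sketch of that calculation follows, assuming the variable inclusion matrix is given as a list of 0/1 rows and the composition of each pseudo-sample as a set of observation indices. The function name, argument layout, and toy data are hypothetical, and the boxplot and Grubbs' test steps are not shown.

```python
from typing import List, Optional, Set


def conditional_i_frequencies(inclusion: List[List[int]],
                              membership: List[Set[int]],
                              n_obs: int,
                              var: int) -> List[Optional[float]]:
    """I-frequency of variable `var`, computed separately for each observation.

    inclusion[r][j] is 1 if variable j was selected in pseudo-sample r.
    membership[r] is the set of observation indices contained in pseudo-sample r.
    Returns None for an observation that appears in no pseudo-sample.
    """
    freqs: List[Optional[float]] = []
    for i in range(n_obs):
        # Rows (pseudo-samples) that contain observation i.
        rows = [r for r, members in enumerate(membership) if i in members]
        if not rows:
            freqs.append(None)
        else:
            freqs.append(sum(inclusion[r][var] for r in rows) / len(rows))
    return freqs


# Hypothetical toy data: 4 pseudo-samples, 3 observations, 2 variables.
membership = [{0, 1}, {0, 2}, {1, 2}, {0, 1}]
inclusion = [[1, 0], [1, 1], [0, 1], [1, 0]]

result = conditional_i_frequencies(inclusion, membership, n_obs=3, var=0)
print(result)
```

In the toy data, variable 0 is selected in every pseudo-sample that contains observation 0 and less often otherwise; an observation whose I-frequency stands apart from the others in this way is the kind of candidate the abstract proposes to inspect.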
11. On properties of predictors derived with a two-step bootstrap model averaging approach—A simulation study in the linear regression model
- Author
- Buchholz, Anika, Holländer, Norbert, and Sauerbrei, Willi
- Subjects
- *REGRESSION analysis, *MATHEMATICAL statistics, *MULTIVARIATE analysis, *STATISTICS
- Abstract
In many applications of model selection there is a large number of explanatory variables and thus a large set of candidate models. Selecting one single model for further inference ignores model selection uncertainty. Often several models fit the data equally well. However, these models may differ in terms of the variables included and might lead to different predictions. To account for model selection uncertainty, model averaging procedures have been proposed. Recently, an extended two-step bootstrap model averaging approach has been proposed. The first step of this approach is a screening step. It aims to eliminate variables with negligible effect on the outcome. In the second step the remaining variables are considered in bootstrap model averaging. A large simulation study is performed to compare the MSE and coverage rate of models derived with bootstrap model averaging, the full model, backward elimination using the Akaike and Bayes information criteria and the model with the highest selection probability in bootstrap samples. In a data example, these approaches are also compared with Bayesian model averaging. Finally, some recommendations for the development of predictive models are given. [Copyright Elsevier]
- Published
- 2008
- Full Text
- View/download PDF