5 results for "van Nee, Mirrelijn"
Search Results
2. Co-data learning
- Author
-
van Nee, Mirrelijn Meander, van de Wiel, Mark A., and Wessels, Lodewyk (VU University Medical Center)
- Subjects
High-dimensional data, omics, clinical prediction, penalised generalised linear models, empirical Bayes, prior information, R
- Abstract
Clinicians often research complex traits in which many variables may be involved in the process underlying the disease. Nowadays, with the advent of DNA sequencing techniques, clinical studies regularly involve high‐dimensional data, in which the number of variables far exceeds the number of samples. Generic research goals are to predict certain outcomes or to select variables, using the high-dimensional data possibly in addition to clinical variables like age and sex. Examples in cancer genomics are to diagnose cancer, classify cancer type and predict survival time based on gene expression data, or to find a few genes that may predict these outcomes well. While the human genome contains around 20,000 genes, clinical studies usually only include measurements for around 100 patients. This limited amount of information makes it hard to find the “right” selection or combination of variables from the vast space of options. Luckily, more prior information on the variables is often available in the form of complementary data, or co-data, e.g. from public repositories or derived from domain knowledge. Co‐data may vary in type: genes, for example, may be grouped in non‐overlapping groups for chromosomes, overlapping groups for pathways, or hierarchical groups for gene ontology, or assigned a continuous summary statistic derived from a similar study. We would like to learn from co‐data to improve prediction and variable selection. This dissertation presents three statistical methods and accompanying software for co‐data learning. We consider co‐data-learnt penalised generalised linear and Cox survival models for the outcome. The penalties on the variables are informed by the co‐data, such that variables with more important co‐data are penalised less. For example, some biological functions may be more important than others, so that genes corresponding to these biological functions are ideally penalised relatively less.
The penalty parameters are related to prior parameters for a prior distribution on the variables, which are estimated with an empirical Bayes approach. The presented methods differ in the type of co‐data and penalty or prior that may be used.
- Published
- 2023
- Full Text
- View/download PDF
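The core idea of this dissertation, co-data-informed differential penalization, can be sketched in a few lines. Below is a minimal illustration (not the authors' released software): a group-wise ridge fit on simulated data, where each variable's penalty is taken from its co-data group, so variables in a more important group are penalised less. All names and data here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy high-dimensional setting: p >> n, with two co-data groups of variables.
n, p = 50, 200
X = rng.standard_normal((n, p))
groups = np.repeat([0, 1], p // 2)          # co-data: group label per variable
beta_true = np.where(groups == 0, 0.5, 0.0) * rng.standard_normal(p)  # only group 0 is active
y = X @ beta_true + rng.standard_normal(n)

def group_ridge(X, y, groups, group_penalties):
    """Ridge fit with a separate penalty per co-data group:
    beta_hat = (X'X + Lambda)^{-1} X'y, with Lambda = diag(lambda_{g(j)})."""
    lam = np.asarray(group_penalties)[groups]   # per-variable penalty from its group
    A = X.T @ X + np.diag(lam)
    return np.linalg.solve(A, X.T @ y)

# Informative co-data: penalise the (truly inactive) second group much more.
beta_hat = group_ridge(X, y, groups, group_penalties=[10.0, 1000.0])
```

The heavier penalty on the uninformative group shrinks its coefficients toward zero while leaving the informative group's estimates comparatively intact, which is exactly the behaviour the abstract describes.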
3. Fast Marginal Likelihood Estimation of Penalties for Group-Adaptive Elastic Net.
- Author
-
van Nee, Mirrelijn M., van de Brug, Tim, and van de Wiel, Mark A.
- Subjects
- ASYMPTOTIC normality, GENOMICS
- Abstract
Elastic net penalization is widely used in high-dimensional prediction and variable selection settings. Auxiliary information on the variables, for example, groups of variables, is often available. Group-adaptive elastic net penalization exploits this information to potentially improve performance by estimating group penalties, thereby penalizing important groups of variables less than other groups. Estimating these group penalties is, however, hard due to the high dimension of the data. Existing methods are computationally expensive or not generic in the type of response. Here we present a fast method for estimation of group-adaptive elastic net penalties for generalized linear models. We first derive a low-dimensional representation of the Taylor approximation of the marginal likelihood for group-adaptive ridge penalties, to efficiently estimate these penalties. Then we show by using asymptotic normality of the linear predictors that this marginal likelihood approximates that of elastic net models. The ridge group penalties are then transformed to elastic net group penalties by matching the ridge prior variance to the elastic net prior variance as a function of the group penalties. The method allows for overlapping groups and unpenalized variables, and is easily extended to other penalties. For a model-based simulation study and two cancer genomics applications we demonstrate a substantially decreased computation time and improved or matching performance compared to other methods. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
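The variance-matching step mentioned in this abstract can be illustrated numerically. The sketch below, for a single penalty rather than per group, uses the standard elastic net prior density proportional to exp(−λ₁|β| − λ₂β²/2) and solves for the overall elastic net penalty whose prior variance equals that of a ridge (Gaussian) prior. The helpers `en_prior_variance` and `match_ridge_to_en` are hypothetical names, and the paper's actual transformation is analytic and group-wise; this is only a rough numerical analogue.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def en_prior_variance(lam1, lam2):
    """Variance of the elastic net prior, density proportional to
    exp(-lam1*|b| - lam2*b^2/2); the prior is symmetric, so the mean is 0."""
    unnorm = lambda b: np.exp(-lam1 * abs(b) - lam2 * b**2 / 2.0)
    Z, _ = quad(unnorm, -np.inf, np.inf)                    # normalising constant
    m2, _ = quad(lambda b: b**2 * unnorm(b), -np.inf, np.inf)
    return m2 / Z

def match_ridge_to_en(lam_ridge, alpha=0.5):
    """Find the overall elastic net penalty lam, split as lam1 = alpha*lam and
    lam2 = (1-alpha)*lam, whose prior variance matches the ridge prior
    N(0, 1/lam_ridge). Solved by root-finding on the variance gap."""
    target = 1.0 / lam_ridge
    f = lambda lam: en_prior_variance(alpha * lam, (1 - alpha) * lam) - target
    return brentq(f, 1e-3, 1e3)   # variance is decreasing in lam, so one root

lam_en = match_ridge_to_en(lam_ridge=4.0, alpha=0.5)
```

A sanity check on the sketch: with alpha = 0 the elastic net prior reduces to the ridge prior, so the matched penalty should equal the ridge penalty itself.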
4. Flexible co‐data learning for high‐dimensional prediction.
- Author
-
van Nee, Mirrelijn M., Wessels, Lodewyk F.A., and van de Wiel, Mark A.
- Subjects
- EMPIRICAL Bayes methods, BAYES' estimation, MEDICAL research, FORECASTING
- Abstract
Clinical research often focuses on complex traits in which many variables play a role in mechanisms driving, or curing, diseases. Clinical prediction is hard when data is high‐dimensional, but additional information, like domain knowledge and previously published studies, may be helpful to improve predictions. Such complementary data, or co‐data, provide information on the covariates, such as genomic location or P‐values from external studies. We use multiple and various co‐data to define possibly overlapping or hierarchically structured groups of covariates. These are then used to estimate adaptive multi‐group ridge penalties for generalized linear and Cox models. Available group-adaptive methods primarily target settings with few groups, and therefore likely overfit for non‐informative, correlated or many groups, and do not account for known structure on the group level. To handle these issues, our method combines empirical Bayes estimation of the hyperparameters with an extra level of flexible shrinkage. This renders a uniquely flexible framework, as any type of shrinkage can be used on the group level. We describe various types of co‐data and propose suitable forms of hypershrinkage. The method is very versatile, as it allows for integration and weighting of multiple co‐data sets, inclusion of unpenalized covariates and posterior variable selection. For three cancer genomics applications we demonstrate improvements compared to other models in terms of performance, variable selection stability and validation. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
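The empirical Bayes estimation of group hyperparameters described in this abstract can be sketched for the linear model, where the marginal likelihood is available in closed form: y ~ N(0, X D X' + σ²I), with D diagonal and a prior variance τ²_g per co-data group. The toy below maximizes this marginal likelihood over the group variances. It is a simplified sketch under these assumptions, not the paper's algorithm, which additionally applies hypershrinkage and covers GLM and Cox outcomes; all names and data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
n, p = 60, 120
groups = np.repeat([0, 1], p // 2)          # two co-data groups
X = rng.standard_normal((n, p))
beta = np.where(groups == 0, 1.0, 0.05) * rng.standard_normal(p)  # group 0 carries the signal
y = X @ beta + rng.standard_normal(n)

def neg_log_marginal(log_tau2, X, y, groups, sigma2=1.0):
    """Negative log marginal likelihood of y ~ N(0, X D X' + sigma2*I),
    with D diagonal holding group-specific prior variances tau_g^2
    (a Gaussian prior on beta corresponds to group-adaptive ridge)."""
    tau2 = np.exp(log_tau2)[groups]                      # per-variable prior variance
    V = (X * tau2) @ X.T + sigma2 * np.eye(len(y))       # marginal covariance of y
    sign, logdet = np.linalg.slogdet(V)
    return 0.5 * (logdet + y @ np.linalg.solve(V, y))

res = minimize(neg_log_marginal, x0=np.zeros(2), args=(X, y, groups),
               method="Nelder-Mead")
tau2_hat = np.exp(res.x)   # larger tau_g^2 means a smaller ridge penalty for group g
```

On this simulated data the estimated prior variance for the signal-carrying group should come out larger, i.e. that group is penalised less, mirroring the adaptive multi-group ridge idea.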
5. Flexible co-data learning for high-dimensional prediction
- Author
-
van Nee, Mirrelijn M. [0000-0001-7715-1446], Wessels, Lodewyk F.A., and van de Wiel, Mark A. (Apollo - University of Cambridge Repository)
- Subjects
Statistics and Probability, FOS: Computer and information sciences, clinical prediction, Epidemiology, Computer science, Stability (learning theory), Feature selection, Machine Learning (stat.ML), Overfitting, Machine learning, Methodology (stat.ME), Bayes Theorem, Statistics - Machine Learning, Covariate, Humans, Statistics - Methodology, Proportional Hazards Models, Hyperparameter, Genomics, Weighting, omics, prior information, Domain knowledge, Artificial intelligence, penalized generalized linear models, empirical Bayes
- Abstract
Clinical research often focuses on complex traits in which many variables play a role in mechanisms driving, or curing, diseases. Clinical prediction is hard when data is high-dimensional, but additional information, like domain knowledge and previously published studies, may be helpful to improve predictions. Such complementary data, or co-data, provide information on the covariates, such as genomic location or p-values from external studies. Our method enables exploiting multiple and various co-data sources to improve predictions. We use discrete or continuous co-data to define possibly overlapping or hierarchically structured groups of covariates. These are then used to estimate adaptive multi-group ridge penalties for generalised linear and Cox models. We combine empirical Bayes estimation of group penalty hyperparameters with an extra level of shrinkage. This renders a uniquely flexible framework as any type of shrinkage can be used on the group level. The hyperparameter shrinkage learns how relevant a specific co-data source is, counters overfitting of hyperparameters for many groups, and accounts for structured co-data. We describe various types of co-data and propose suitable forms of hypershrinkage. The method is very versatile, as it allows for integration and weighting of multiple co-data sets, inclusion of unpenalised covariates and posterior variable selection. We demonstrate it on two cancer genomics applications and show that it may improve the performance of other dense and parsimonious prognostic models substantially, and stabilises variable selection.
Comments: Document consists of main content (20 pages, 10 figures) and supplementary material (14 pages, 13 figures).
- Published
- 2020
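The "extra level of shrinkage" on the hyperparameters described in this preprint version can be illustrated in its simplest form: shrinking the estimated group penalty weights toward the neutral value 1 on the log scale, so that many or uninformative groups cannot overfit. The paper's hypershrinkage is far more general (any type of shrinkage on the group level); `hypershrink`, its argument `s`, and the weights below are all hypothetical.

```python
import numpy as np

def hypershrink(group_weights, s):
    """Shrink estimated group penalty weights toward the neutral value 1
    on the log scale: log w -> (1 - s) * log w. With s = 1 the co-data is
    ignored (all weights become 1); with s = 0 the raw empirical Bayes
    estimates are kept unchanged."""
    w = np.asarray(group_weights, dtype=float)
    return np.exp((1.0 - s) * np.log(w))

raw = np.array([0.2, 1.0, 5.0])     # hypothetical raw group weights
half = hypershrink(raw, s=0.5)      # partially shrunk toward 1
```

Choosing the shrinkage level by how relevant a co-data source appears to be is what lets the framework down-weight uninformative co-data automatically.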
Discovery Service for Jio Institute Digital Library