29 results on '"Vink, Gerko"'
Search Results
2. Roderick J. Little and Donald B. Rubin: Statistical Analysis with Missing Data: Wiley, Hoboken, NJ, 2020, 464 pp, $98.95 (hardcover), $79.00 (eBook), Print ISBN: 9780470526798, Online ISBN: 9781119482260
- Author
- Vink, Gerko
- Published
- 2022
3. A note on imputing squares via polynomial combination approach
- Author
- Cai, Mingyang and Vink, Gerko
- Published
- 2022
4. Graphical and numerical diagnostic tools to assess multiple imputation models by posterior predictive checking
- Author
- Cai, Mingyang, van Buuren, Stef, and Vink, Gerko
- Published
- 2023
5. The Dance of the Mechanisms: How Observed Information Influences the Validity of Missingness Assumptions
- Author
- Schouten, Rianne Margaretha and Vink, Gerko
- Abstract
Missing data in scientific research go hand in hand with assumptions about the nature of the missingness. When dealing with missing values, a set of beliefs has to be formulated about the extent to which the observed data may also hold for the missing parts of the data. It is vital that the validity of these missingness assumptions is verified, tested, and that assumptions are adjusted when necessary. In this article, we demonstrate how observed data structures could a priori indicate whether it is likely that our beliefs about the missingness can be trusted. To this end, we simulate complete data and generate missing values according to several types of MCAR, MAR, and MNAR mechanisms. We demonstrate that in scenarios where the data correlations are either low or very substantial, strictly different mechanisms yield equivalent statistical inferences. In addition, we show that the choice of quantity of scientific interest together with the distribution of the nonresponse govern the validity of the missingness assumptions.
- Published
- 2021
6. Toward a standardized evaluation of imputation methodology.
- Author
- Oberman, Hanne I. and Vink, Gerko
- Abstract
Developing new imputation methodology has become a very active field. Unfortunately, there is no consensus on how to perform simulation studies to evaluate the properties of imputation methods. In part, this may be due to different aims between fields and studies. For example, when evaluating imputation techniques aimed at prediction, different aims may be formulated than when statistical inference is of interest. The lack of consensus may also stem from different personal preferences or scientific backgrounds. All in all, the lack of common ground in evaluating imputation methodology may lead to suboptimal use in practice. In this paper, we propose a move toward a standardized evaluation of imputation methodology. To demonstrate the need for standardization, we highlight a set of possible pitfalls that bring forth a chain of potential problems in the objective assessment of the performance of imputation routines. Additionally, we suggest a course of action for simulating and evaluating missing data problems. Our suggested course of action is by no means meant to serve as a complete cookbook, but rather meant to incite critical thinking and a move to objective and fair evaluations of imputation methodology. We invite the readers of this paper to contribute to the suggested course of action. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Effectiveness and moderators of individual cognitive behavioral therapy versus treatment as usual in clinically depressed adolescents: a randomized controlled trial
- Author
- Stikkelbroek, Yvonne, Vink, Gerko, Nauta, Maaike H., Bottelier, Marco A., Vet, Leonieke J. J., Lont, Cathelijne M., van Baar, Anneloes L., and Bodden, Denise H. M.
- Published
- 2020
8. The Effectiveness of Parent Management Training—Oregon Model in Clinically Referred Children with Externalizing Behavior Problems in The Netherlands
- Author
- Thijssen, Jill, Vink, Gerko, Muris, Peter, and de Ruiter, Corine
- Published
- 2017
9. A blended distance to define 'people-like-me'
- Author
- Fopma, Anaïs, Cai, Mingyang, van Buuren, Stef, and Vink, Gerko
- Subjects
Methodology (stat.ME); FOS: Computer and information sciences; stat.ME; Statistics - Methodology
- Abstract
Curve matching is a prediction technique that relies on predictive mean matching, which matches donors that are most similar to a target based on the predictive distance. Even though this approach leads to high prediction accuracy, the predictive distance may make matches look unconvincing, as the profiles of the matched donors can substantially differ from the profile of the target. To counterbalance this, similarity between the curves of the donors and the target can be taken into account by combining the predictive distance with the Mahalanobis distance into a `blended distance' measure. The properties of this measure are evaluated in two simulation studies. Simulation study I evaluates the performance of the blended distance under different data-generating conditions. The results show that blending towards the Mahalanobis distance leads to worse performance in terms of bias, coverage, and predictive power. Simulation study II evaluates the blended metric in a setting where a single value is imputed. The results show that a property of blending is the bias-variance trade off. Giving more weight to the Mahalanobis distance leads to less variance in the imputations, but less accuracy as well. The main conclusion is that the high prediction accuracy achieved with the predictive distance necessitates the variability in the profiles of donors.
- Published
- 2022
10. Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in the Netherlands
- Author
- Gopalakrishna, Gowri, ter Riet, Gerben, Vink, Gerko, Stoop, Ineke, Wicherts, Jelte M., Bouter, Lex M., Leerstoel van Buuren, Methodology and statistics for the behavioural and social sciences, Cardiology, ACS - Diabetes & metabolism, APH - Aging & Later Life, APH - Personalized Medicine, Department of Methodology and Statistics, Epidemiology and Data Science, APH - Methodology, ACS - Atherosclerosis & ischemic syndromes, and Research integrity
- Subjects
Male; bepress|Physical Sciences and Mathematics; Biomedical Research; Science; Scientific Misconduct; Ethics, Research; Biomedical Research/ethics; Surveys and Questionnaires; Prevalence; Humans; Scientific Misconduct/ethics; General; Research Design/standards; Ethics; MetaArXiv|Social and Behavioral Sciences; Multidisciplinary; Research; bepress|Medicine and Health Sciences; MetaArXiv|Medicine and Health Sciences; Research Personnel; MetaArXiv|Physical Sciences and Mathematics; Cross-Sectional Studies; Research Design; Research Personnel/ethics; bepress|Social and Behavioral Sciences; Medicine; Female
- Abstract
Background: Prevalence of research misconduct, questionable research practices (QRPs) and their associations with a range of explanatory factors has not been studied sufficiently among academic researchers. Methods: The National Survey on Research Integrity was aimed at all disciplinary fields and academic ranks in the Netherlands. The survey enquired about engagement in fabrication, falsification and 11 QRPs over the previous three years, and 12 explanatory factor scales. We ensured strict identity protection and used a randomized response method for questions on research misconduct. Results: 6,813 respondents completed the survey. Prevalence of fabrication was 4.3% (95% CI: 2.9, 5.7) and falsification 4.2% (95% CI: 2.8, 5.6). Prevalence of QRPs ranged from 0.6% (95% CI: 0.5, 0.9) to 17.5% (95% CI: 16.4, 18.7) with 51.3% (95% CI: 50.1, 52.5) of respondents engaging frequently in ≥ 1 QRP. Being a PhD candidate or junior researcher increased the odds of frequently engaging in ≥ 1 QRP, as did being male. Scientific norm subscription (odds ratio (OR) 0.79; 95% CI: 0.63, 1.00) and perceived likelihood of detection by reviewers (OR 0.62, 95% CI: 0.44, 0.88) were associated with lower odds of research misconduct. Publication pressure was associated with higher odds of engaging frequently in ≥ 1 QRP (OR 1.22, 95% CI: 1.14, 1.30). Conclusions: We found higher prevalence of misconduct than earlier surveys. Our results suggest that greater emphasis on scientific norm subscription, strengthening reviewers in their role as gatekeepers of research quality and curbing the “publish or perish” incentive system can promote research integrity.
- Published
- 2022
11. Generalizing Univariate Predictive Mean Matching to Impute Multiple Variables Simultaneously
- Author
- Cai, Mingyang, van Buuren, Stef, Vink, Gerko, Arai, Kohei, Leerstoel van Buuren, and Methodology and statistics for the behavioural and social sciences
- Subjects
Predictive mean matching; Multivariate analysis; Control and Systems Engineering; Computer Networks and Communications; Missing data; Signal Processing; Multiple imputation; Canonical regression analysis; Block imputation
- Abstract
Predictive mean matching (PMM) is an easy-to-use and versatile univariate imputation approach. It is robust against transformations of the incomplete variable and violation of the normal model. However, univariate imputation methods cannot directly preserve multivariate relations in the imputed data. We wish to extend PMM to a multivariate method to produce imputations that are consistent with the knowledge of derived data (e.g., data transformations, interactions, sum restrictions, range restrictions, and polynomials). This paper proposes multivariate predictive mean matching (MPMM), which can impute incomplete variables simultaneously. Instead of the normal linear model, we apply canonical regression analysis to calculate the predicted value used for donor selection. To evaluate the performance of MPMM, we compared it with other imputation approaches under four scenarios: 1) multivariate normal distributed data, 2) linear regression with quadratic terms; 3) linear regression with interaction terms; 4) incomplete data with inequality restrictions. The simulation study shows that with moderate missingness patterns, MPMM provides plausible imputations at the univariate level and preserves relations in the data.
- Published
- 2022
12. Prevalence of responsible research practices among academics in The Netherlands [version 2; peer review: 2 approved]
- Author
- Gopalakrishna, Gowri, Wicherts, Jelte M., Vink, Gerko, Stoop, Ineke, van den Akker, Olmo R., ter Riet, Gerben, Bouter, Lex M., Faculteit Gezondheid, and Lectoraat Voeding en Beweging
- Abstract
Background: Traditionally, research integrity studies have focused on research misbehaviors and their explanations. Over time, attention has shifted towards preventing questionable research practices and promoting responsible ones. However, data on the prevalence of responsible research practices, especially open methods, open codes and open data and their underlying associative factors, remains scarce. Methods: We conducted a web-based anonymized questionnaire, targeting all academic researchers working at or affiliated to a university or university medical center in The Netherlands, to investigate the prevalence and potential explanatory factors of 11 responsible research practices. Results: A total of 6,813 academics completed the survey, the results of which show that prevalence of responsible practices differs substantially across disciplines and ranks, with 99 percent avoiding plagiarism in their work but less than 50 percent pre-registering a research protocol. Arts and humanities scholars as well as PhD candidates and junior researchers engaged less often in responsible research practices. Publication pressure negatively affected responsible practices, while mentoring, scientific norms subscription and funding pressure stimulated them. Conclusions: Understanding the prevalence of responsible research practices across disciplines and ranks, as well as their associated explanatory factors, can help to systematically address disciplinary- and academic rank-specific obstacles, and thereby facilitate responsible conduct of research.
- Published
- 2022
13. How to relate potential outcomes: Estimating individual treatment effects under a given specified partial correlation
- Author
- Cai, Mingyang, van Buuren, Stef, and Vink, Gerko
- Subjects
FOS: Computer and information sciences; multivariate data analysis; Multiple imputation; joint modeling imputation; Statistics - Computation; Computation (stat.CO); iterative imputation
- Abstract
In most medical research, the average treatment effect is used to evaluate a treatment's performance. However, precision medicine requires knowledge of individual treatment effects: What is the difference between a unit's measurement under treatment and control conditions? In most treatment effect studies, such answers are not possible because the outcomes under both experimental conditions are not jointly observed. This makes the problem of causal inference a missing data problem. We propose to solve this problem by imputing the individual potential outcomes under a specified partial correlation (SPC), thereby allowing for heterogeneous treatment effects. We demonstrate in simulation that our proposed methodology yields valid inferences for the marginal distribution of potential outcomes. We highlight that the posterior distribution of individual treatment effects varies with different specified partial correlations. This property can be used to study the sensitivity of optimal treatment outcomes under different correlation specifications. In a practical example on HIV-1 treatment data, we demonstrate that the proposed methodology generalises to real-world data. Imputing under the SPC, therefore, opens up a wealth of possibilities for studying heterogeneous treatment effects on incomplete data and the further adaptation of individual treatment effects.
- Published
- 2022
14. Anonymiced Shareable Data: Using mice to Create and Analyze Multiply Imputed Synthetic Datasets
- Author
- Volker, Thom Benjamin, Vink, Gerko, Leerstoel Oberski, Leerstoel van Buuren, and Methodology and statistics for the behavioural and social sciences
- Subjects
Data records; mice; multiple imputation; synthetic data; statistical disclosure control; privacy; Statistical disclosure control; business.industry; Computer science; Usability; computer.software_genre; Pipeline (software); Synthetic data; BF1-990; Software; Psychology; Pharmacology (medical); Data mining; business; Dissemination; computer; Research data
- Abstract
Synthetic datasets simultaneously allow for the dissemination of research data while protecting the privacy and confidentiality of respondents. Generating and analyzing synthetic datasets is straightforward, yet, a synthetic data analysis pipeline is seldom adopted by applied researchers. We outline a simple procedure for generating and analyzing synthetic datasets with the multiple imputation software mice (Version 3.13.15) in R. We demonstrate through simulations that the analysis results obtained on synthetic data yield unbiased and valid inferences and lead to synthetic records that cannot be distinguished from the true data records. The ease of use when synthesizing data with mice along with the validity of inferences obtained through this procedure opens up a wealth of possibilities for data dissemination and further research on initially private data.
- Published
- 2021
15. Nationwide epidemiological approach to identify associations between keratoconus and immune-mediated diseases
- Author
- Claessens, Janneau L.J., Godefrooij, Daniel A., Vink, Gerko, Frank, Laurence E., Wisse, Robert P.L., Leerstoel van Buuren, Methodology and statistics for the behavioural and social sciences, Leerstoel Heijden, and Afd methoden en statistieken
- Subjects
medicine.medical_specialty; Population; Keratoconus; Thyroiditis; Dermatitis, Atopic; Pathogenesis; immunology; Cellular and Molecular Neuroscience; Degenerative disease; cornea; Epidemiology; medicine; Ethnicity; Humans; education; Asthma; education.field_of_study; business.industry; medicine.disease; Sensory Systems; Ophthalmology; Logistic Models; Bronchial hyperresponsiveness; inflammation; Immunology; Etiology; epidemiology; business
- Abstract
Background: The aetiology of keratoconus (KC) remains poorly understood. KC has typically been described as a non-inflammatory disorder of the cornea. Nonetheless, there is increasing presumptive evidence for the role of the immune system in the pathogenesis of KC. Aim: To evaluate the association between KC and immune-mediated diseases on a population level. We hypothesise that KC is immune-mediated rather than a predominantly degenerative disease. Methods: Data were obtained from the largest health insurance provider in the Netherlands. Dutch residents are obligatorily insured. The data contained all medical claims and sociodemographic characteristics from all KC patients plus all those data from a 1:6 age-matched and sex-matched control group. The primary outcome was the association between KC and immune-mediated diseases, as assessed by conditional logistic regression. Results: Based on our analysis of 2051 KC cases and 12 306 matched controls, we identified novel associations between KC and Hashimoto’s thyroiditis (OR=2.89; 95% CI: 1.41 to 5.94) and inflammatory skin conditions (OR=2.20; 95% CI: 1.37 to 3.53). We confirmed known associations between KC and atopic conditions, including allergic rash (OR=3.00; 95% CI: 1.03 to 8.79), asthma and bronchial hyperresponsiveness (OR=2.51; 95% CI: 1.63 to 3.84), and allergic rhinitis (OR=2.20; 95% CI: 1.39 to 3.49). Conclusion: Keratoconus appears positively associated with multiple immune-mediated diseases, which provides a population-based argument that systemic inflammatory responses may influence its onset. The identification of these particular diseases might shed light on potential comparable pathways through which this proinflammatory state is achieved, paving the way for pharmacological treatment strategies.
- Published
- 2022
16. The Dance of the Mechanisms: How Observed Information Influences the Validity of Missingness Assumptions
- Author
- Schouten, Rianne Margaretha, Vink, Gerko, Methodology and statistics for the behavioural and social sciences, and Leerstoel van Buuren
- Subjects
Dance; Sociology and Political Science; Computer science; multivariate amputation; 05 social sciences; 050401 social sciences methods; Missing data; 01 natural sciences; 010104 statistics & probability; 0504 sociology; Econometrics; missing data methodology; 0101 mathematics; missingness assumptions; Set (psychology); Social Sciences (miscellaneous)
- Abstract
Missing data in scientific research go hand in hand with assumptions about the nature of the missingness. When dealing with missing values, a set of beliefs has to be formulated about the extent to which the observed data may also hold for the missing parts of the data. It is vital that the validity of these missingness assumptions is verified, tested, and that assumptions are adjusted when necessary. In this article, we demonstrate how observed data structures could a priori indicate whether it is likely that our beliefs about the missingness can be trusted. To this end, we simulate complete data and generate missing values according to several types of MCAR, MAR, and MNAR mechanisms. We demonstrate that in scenarios where the data correlations are either low or very substantial, strictly different mechanisms yield equivalent statistical inferences. In addition, we show that the choice of quantity of scientific interest together with the distribution of the nonresponse govern the validity of the missingness assumptions.
- Published
- 2021
17. Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in The Netherlands.
- Author
- Gopalakrishna, Gowri, ter Riet, Gerben, Vink, Gerko, Stoop, Ineke, Wicherts, Jelte M., and Bouter, Lex M.
- Subjects
RANDOMIZED response; ODDS ratio; FALSIFICATION
- Abstract
Prevalence of research misconduct, questionable research practices (QRPs) and their associations with a range of explanatory factors has not been studied sufficiently among academic researchers. The National Survey on Research Integrity targeted all disciplinary fields and academic ranks in the Netherlands. It included questions about engagement in fabrication, falsification and 11 QRPs over the previous three years, and 12 explanatory factor scales. We ensured strict identity protection and used the randomized response method for questions on research misconduct. 6,813 respondents completed the survey. Prevalence of fabrication was 4.3% (95% CI: 2.9, 5.7) and of falsification 4.2% (95% CI: 2.8, 5.6). Prevalence of QRPs ranged from 0.6% (95% CI: 0.5, 0.9) to 17.5% (95% CI: 16.4, 18.7) with 51.3% (95% CI: 50.1, 52.5) of respondents engaging frequently in at least one QRP. Being a PhD candidate or junior researcher increased the odds of frequently engaging in at least one QRP, as did being male. Scientific norm subscription (odds ratio (OR) 0.79; 95% CI: 0.63, 1.00) and perceived likelihood of detection by reviewers (OR 0.62, 95% CI: 0.44, 0.88) were associated with engaging in less research misconduct. Publication pressure was associated with more often engaging in one or more QRPs frequently (OR 1.22, 95% CI: 1.14, 1.30). We found higher prevalence of misconduct than earlier surveys. Our results suggest that greater emphasis on scientific norm subscription, strengthening reviewers in their role as gatekeepers of research quality and curbing the "publish or perish" incentive system promotes research integrity. [ABSTRACT FROM AUTHOR]
- Published
- 2022
18. Differences between resource control types revisited: A short term longitudinal study
- Author
- Reijntjes, Albert, Vermande, Marjolijn, Olthof, Tjeert, Goossens, Frits A, Vink, Gerko, Aleva, Liesbeth, van der Meulen, Matty, Leerstoel Orobio de Castro, Leerstoel Baar, Leerstoel Heijden, Leerstoel Aken, Methodology and statistics for the behavioural and social sciences, Development and Treatment of Psychosocial Problems, Social and personality development: A transactional approach, Clinical Developmental Psychology, Educational Studies, and Developmental Psychology
- Subjects
Agreeableness; Control theory (sociology); Longitudinal study; STRATEGIES; Sociology and Political Science; social dominance; CHILDREN; Social preferences; resource control; EVOLUTIONARY PERSPECTIVE; 050105 experimental psychology; Developmental psychology; coercive strategies; Resource (project management); Developmental and Educational Psychology; bistrategic controllers; 0501 psychology and cognitive sciences; resource control theory; 05 social sciences; Peer group; prosocial strategies; Popularity; perceived popularity; SITUATIONS; SOCIAL-DOMINANCE; Prosocial behavior; PRESCHOOLERS; SCHOOL; SKILLS; AGGRESSION; Psychology; social preference; Social psychology; BEHAVIOR; Social Sciences (miscellaneous); 050104 developmental & child psychology
- Abstract
Hawley's influential resource control theory (RCT) posits that both coercive and prosocial strategies may yield social dominance, as indexed by resource control. Based on differences in youths’ relative use of these strategies, RCT a priori defines five distinct subtypes. Several studies by Hawley and colleagues have revealed substantial differences between subtypes in terms of obtained resource control and various social characteristics (e.g., agreeableness). The present longitudinal study (N = 394; Mage = 10.3; SD = 0.5) expands on previous work. Firstly, because several items used to assess strategies in RCT appear to confound strategy use with the resulting benefits (resource control), we disentangled strategy use as such from obtained resource control. Secondly, whereas previous work has been exclusively cross-sectional, the present study was longitudinal. ANOVAs comparing subgroups provided support for some core tenets of RCT, but not for others. For instance, bistrategic children scored high on both resource control and perceived popularity. However, bistrategics engaged in elevated bullying, and whereas Hawley asserts that they are proficient in balancing ‘getting ahead’ with ‘getting along’, their behavior appeared to evoke clear negative reactions in the peer group at large. Findings also showed that non-controllers did not experience more negative outcomes than their peers across all domains.
- Published
- 2018
19. How to Obtain Valid Inference under Unit Nonresponse?
- Author
- Boeschoten, Laura, Vink, Gerko, Hox, Joop J.C.M., Leerstoel Heijden, Leerstoel Hox, Methodology and statistics for the behavioural and social sciences, and Department of Methodology and Statistics
- Subjects
Statistics and Probability; Computer science; Inference; coverage; mass imputation; computer.software_genre; Weighting; 01 natural sciences; Auxiliary variables; 010104 statistics & probability; Statistics; 050602 political science & public administration; Statistics::Methodology; Imputation (statistics); 0101 mathematics; sample imputation; Statistics::Applications; business.industry; 05 social sciences; Usability; Probability and statistics; Data structure; HA1-4737; 0506 political science; Register data; Data mining; business; computer
- Abstract
Weighting methods are commonly used in situations of unit nonresponse with linked register data. However, several arguments in terms of valid inference and practical usability can be made against the use of weighting methods in these situations. Imputation methods such as sample and mass imputation may be suitable alternatives, as they lead to valid inference in situations of item nonresponse and have some practical advantages. In a simulation study, sample and mass imputation were compared to traditional weighting when dealing with unit nonresponse in linked register data. Methods were compared on their bias and coverage in different scenarios. Both sample and mass imputation had better coverage than traditional weighting in all scenarios. Imputation methods can therefore be recommended over weighting, as they also have practical advantages: estimates outside the observed data distribution can be created and many auxiliary variables can be taken into account. The choice between sample and mass imputation depends on the specific data structure.
- Published
- 2017
20. Partitioned predictive mean matching as a large data multilevel imputation technique
- Author
- Vink, Gerko, Lazendic, Goran, van Buuren, Stef, Leerstoel Heijden, and Methodology and statistics for the behavioural and social sciences
- Abstract
Large scale assessment data often has a multilevel structure. When dealing with missing values, such structures need to be taken into account to prevent underestimation of the intraclass correlation. We evaluate predictive mean matching (PMM) as a multilevel imputation technique and compare it to other imputation approaches for multilevel data. We propose partitioned predictive mean matching (PPMM) as an extension to the PMM algorithm to divide the big data multilevel problem into manageable parts that can be solved by standard predictive mean matching. We show that PPMM can be a very effective imputation approach for large multilevel datasets and that both PPMM and PMM yield plausible inference for continuous, ordered categorical, or even dichotomous multilevel data. We conclude that both the performance of PMM and PPMM is often comparable to dedicated methods for multilevel data.
- Published
- 2015
21. Generating missing values for simulation purposes: a multivariate amputation procedure
- Author
- Schouten, Rianne Margaretha, Lugtig, Peter, Vink, Gerko, Methodology and statistics for the behavioural and social sciences, Leerstoel van Buuren, and Leerstoel Heijden
- Subjects
Statistics and Probability; Complete data; Multivariate statistics; multiple imputation; Missing data; multivariate amputation; Machine learning; computer.software_genre; 01 natural sciences; 010104 statistics & probability; 0504 sociology; Statistical analyses; Modelling and Simulation; 0101 mathematics; Mathematics; evaluation; business.industry; Applied Mathematics; 05 social sciences; Statistics; 050401 social sciences methods; Probability and statistics; Modeling and Simulation; Amputation procedure; Probability and Uncertainty; Artificial intelligence; Statistics, Probability and Uncertainty; business; computer
- Abstract
Missing data form a ubiquitous problem in scientific research, especially since most statistical analyses require complete data. To evaluate the performance of methods dealing with missing data, researchers perform simulation studies. An important aspect of these studies is the generation of missing values in a simulated, complete data set: the amputation procedure. We investigated the methodological validity and statistical nature of both the current amputation practice and a newly developed and implemented multivariate amputation procedure. We found that the current way of practice may not be appropriate for the generation of intuitive and reliable missing data problems. The multivariate amputation procedure, on the other hand, generates reliable amputations and allows for a proper regulation of missing data problems. The procedure has additional features to generate any missing data scenario precisely as intended. Hence, the multivariate amputation procedure is an efficient method to accurately evaluate missing data methodology.
- Published
- 2018
22. How to handle missing data: A comparison of different approaches
- Author
- Peeters, Margot, Zondervan-Zwijnenburg, M. A. J., Vink, Gerko, van de Schoot, Rens, Youth in Changing Cultural Contexts, Leerstoel Vollebergh, Leerstoel Hoijtink, Leerstoel Heijden, and 25959565 - Van de Schoot, Adrianus Gerardus Joanes
- Subjects
multiple imputation; Social Psychology; Listwise deletion; high risk sample; longitudinal research; Missing data; Confidence interval; missing data; Outcome variable; Sample size determination; Statistics; Taverne; Developmental and Educational Psychology; Imputation (statistics); Psychology
- Abstract
Many researchers face the problem of missing data in longitudinal research. Especially, high risk samples are characterized by missing data which can complicate analyses and the interpretation of results. In the current study, our aim was to find the most optimal and best method to deal with the missing data in a specific study with many missing data on the outcome variable. Therefore, different techniques to handle missing data were evaluated, and a solution to efficiently handle substantial amounts of missing data was provided. A simulation study was conducted to determine the most optimal method to deal with the missing data. Results revealed that multiple imputation (MI) using predictive mean matching was the most optimal method with respect to lowest bias and the smallest confidence interval (CI) while maintaining power. Listwise deletion and last observation carried backward also scored acceptable with respect to bias; however, CIs were much larger and sample size almost halved using these methods. Longitudinal research in high risk samples could benefit from using MI in future research to handle missing data. The paper ends with a checklist for handling missing data. http://www.tandfonline.com/toc/pedp20/12/4?nav=tocList http://dx.doi.org/10.1080/17405629.2015.1049526 http://www.tandfonline.com/doi/full/10.1080/17405629.2015.1049526
- Published
- 2015
23. Pooling multiple imputations when the sample happens to be the population
- Author
-
Vink, Gerko, Buuren, Stef van, Leerstoel Heijden, and Methodology and statistics for the behavioural and social sciences
- Subjects
stat.CO ,stat.TH ,math.ST
- Abstract
Current pooling rules for multiply imputed data assume infinite populations. In some situations this assumption is not feasible as every unit in the population has been observed, potentially leading to over-covered population estimates. We simplify the existing pooling rules for situations where the sampling variance is not of interest. We compare these rules to the conventional pooling rules and demonstrate their use in a situation where there is no sampling variance. Using the standard pooling rules in situations where sampling variance should not be considered leads to overestimation of the variance of the estimates of interest, especially when the amount of missingness is not very large. As a result, population estimates are over-covered, which may lead to a loss of statistical power. We conclude that the theory of multiple imputation can be extended to the situation where the sample happens to be the population. The simplified pooling rules can be easily implemented to obtain valid inference in cases where we have observed essentially all units and in simulation studies addressing the missingness mechanism only.
- Published
- 2014
24. Predictive mean matching imputation of semicontinuous variables
- Author
-
Vink, Gerko, Frank, Laurence E., Pannekoek, Jeroen, van Buuren, Stef, Leerstoel Heijden, and Methodology and statistics for the behavioural and social sciences
- Subjects
Statistics and Probability ,Point mass ,Predictive mean matching ,Statistics::Applications ,Semicontinuous data ,Multiple imputation ,Statistics::Methodology ,Statistics, Probability and Uncertainty ,Skewed data
- Abstract
Multiple imputation methods properly account for the uncertainty of missing data. One of those methods for creating multiple imputations is predictive mean matching (PMM), a general purpose method. Little is known about the performance of PMM in imputing non-normal semicontinuous data (skewed data with a point mass at a certain value and otherwise continuously distributed). We investigate the performance of PMM as well as dedicated methods for imputing semicontinuous data by performing simulation studies under univariate and multivariate missingness mechanisms. We also investigate the performance on real-life datasets. We conclude that PMM performance is at least as good as the investigated dedicated methods for imputing semicontinuous data and, in contrast to other methods, is the only method that yields plausible imputations and preserves the original data distributions.
- Published
- 2014
25. Multiple Imputation of Squared Terms.
- Author
-
Vink, Gerko and van Buuren, Stef
- Subjects
*MULTIPLE imputation (Statistics) , *REGRESSION analysis , *LATENT class analysis (Statistics) , *STATISTICAL bias , *STATISTICS methodology , *PREVENTION
- Abstract
We propose a new multiple imputation technique for imputing squares. Current methods either yield unbiased regression estimates or preserve data relations. No method, however, seems to deliver both, which limits researchers in the implementation of regression analysis in the presence of missing data. Besides, current methods only work under a missing completely at random (MCAR) mechanism. Our method for imputing squares uses a polynomial combination. The proposed method yields unbiased regression estimates while preserving the quadratic relations in the data for both missing at random and MCAR mechanisms. [ABSTRACT FROM PUBLISHER]
- Published
- 2013
26. Metal-doping of nanoplastics enables accurate assessment of uptake and effects on Gammarus pulex
- Author
-
Redondo-Hasselerharm, Paulo E., Vink, Gerko, Mitrano, Denise, and Koelmans, Albert A.
- Subjects
13. Climate action ,6. Clean water
- Abstract
Because of the difficulty of measuring nanoplastics (NP), the use of NPs doped with trace metals has been proposed as a promising approach to detect NP in environmental media and biota. In the present study, the freshwater amphipod Gammarus pulex was exposed to palladium (Pd)-doped NP via natural sediment at six spiking concentrations (0, 0.3, 1, 3, 10 and 30 g plastic per kg of sediment dry weight) with the aim of assessing their uptake and chronic effects using 28-day standardized single-species toxicity tests. NP concentrations were quantified based on Pd concentrations measured by ICP-MS on digests of the exposed organisms and faecal pellets excreted during a post-exposure 24 hour depuration period. Additionally, NP concentrations were measured in sediments and water to demonstrate accuracy of NP dosing and to quantify the resuspension of NP from the sediment caused by the organisms. A significant positive linear relationship between the uptake of NP by G. pulex and the concentration of NP in the sediments was observed, yet no statistically significant effects were found on the survival or growth of G. pulex. A biodynamic model fitted well to the data and suggested bioaccumulation would occur in two kinetic compartments, the major one being reversible with rapid depuration to clean medium. Model fitting yielded a mass based trophic transfer factor (TTF), conceptually similar to the traditional biota sediment accumulation factor, for NP in the gut of 0.031. This value is close to a TTF value of 0.025 that was obtained for much larger microplastic particles in a similar experiment performed previously. Mechanistically, this suggests that ingestion of plastic is limited by the total volume of ingested particles.
We demonstrated that using metal-doped plastics provides opportunities for precise quantification of NP accumulation and exposure in fate and effect studies, which can be a clear benefit for NP risk assessment. Environmental Science: Nano, 8 (6), ISSN:2051-8153, ISSN:2051-8161
27. Informed strategies for multivariate missing data
- Author
-
Cai, Mingyang, Methodology and statistics for the behavioural and social sciences, Leerstoel van Buuren, van Buuren, Stef, Vink, Gerko, and University Utrecht
- Subjects
Gezamenlijke modellering ,Volledig voorwaardelijke specificatie ,Hybrid imputation ,Missing data ,Multiple imputation ,Joint modeling ,Hybride toerekening ,Ontbrekende gegevens ,Meervoudige toerekening ,Fully conditional specification
- Abstract
Joint modelling (JM) and fully conditional specification (FCS) are two widely used strategies for imputing multivariate missing data. JM involves specifying a multivariate distribution for the missing data and drawing imputations from their conditional distributions. The FCS approach specifies the distribution for each partially observed variable conditional on all other variables. The main advantage of FCS over JM is that FCS allows for tremendous flexibility in multivariate model design. However, there are often extra structures in the missing data that FCS cannot model properly in practice. Moreover, it is challenging to preserve the relations among multiple variables when performing the imputation on a variable-by-variable basis. This thesis aims to develop hybrid imputation that provides a strategy to specify hybrids of JM and FCS. To achieve this goal, I propose different solutions to missing data problems when applying FCS is not optimal. In chapter 2, I first discuss some general methods to impute squares. I improve the polynomial combination method and compare it with the substantive model compatible fully conditional specification method. Finally, I summarise the properties of both approaches. In chapter 3, I develop multivariate predictive mean matching, which allows simultaneous imputation of multiple missing variables. I combine the methodology of univariate predictive mean matching and canonical regression analysis. The advantage of this imputation method is the preservation of relations among a set of missing variables. Finally, I show the potential scenarios where multivariate predictive mean matching could be used and discuss the limitations. In chapter 4, I develop the hybrid imputation method to estimate individual treatment effects. The idea is that by imputing unobserved outcomes, we could calculate the differences between potential outcomes under different treatment conditions. 
However, there is a problem: the data have no information about the correlation between potential outcomes. The proposed hybrid imputation method specifies the partial correlation and performs a sensitivity analysis to overcome this problem. Finally, I demonstrate the validity of the proposed hybrid imputation method and show how to apply it in practice. In chapter 5, I investigate the compatibility of FCS when the priors for the conditional models are informative. Many authors illustrated the compatibility property of FCS when the prior for conditional models is non-informative. However, the compatibility property in the case of informative priors has not received much attention. I demonstrate that FCS under the normal linear model with an informative inverse-gamma prior is compatible with a joint distribution and provide the corresponding normal inverse-Wishart prior distribution for the joint distribution. In chapter 6, I develop a novel strategy to diagnose multiple imputation models based on posterior predictive checking. The general idea is that if the imputation model is congenial to the substantive model, the expected value of the observed data is in the centre of corresponding predictive posterior distributions. By applying the proposed diagnosis method, the researcher could compare the 'over-imputed' data with the observed data and evaluate the fitness of the imputation model.
- Published
- 2022
28. Alternative Information: Bayesian Statistics, Expert Elicitation and Information Theory in the Social Sciences
- Author
-
Veen, Duco, Leerstoel Schoot, Methodology and statistics for the behavioural and social sciences, van de Schoot, Rens, Vink, Gerko, Van Loey, N.E.E., and University Utrecht
- Subjects
Structural Equation Modelling ,Prior-Data (dis)agreement ,Expert Elicitation ,Kullback-Leibler Divergence ,Prior Information ,Bayesian Statistics ,Hierarchical modelling
- Abstract
In this dissertation it is discussed how one can capture and utilize alternative sources of (prior) information compared to traditional methods in the social sciences such as survey research. Specific attention is paid to expert knowledge. In Chapter 2 we propose an elicitation methodology for a single parameter that does not rely on specifying quantiles of a distribution. The proposed method is evaluated using a user feasibility study, a partial validation study and an empirical example of the full elicitation method. In Chapter 3 it is investigated how experts’ knowledge, as an alternative source of information, can be contrasted with traditional data collection methods. At the same time, we explore how experts can be assessed and ranked borrowing techniques from information theory. We use the information theoretical concept of relative entropy or Kullback-Leibler divergence which assesses a loss of information when approximating one distribution by another. For those familiar with the concept of model selection, Akaike’s Information Criterion is an approximation of this (Burnham & Anderson, 2002, Chapter 2). In Chapter 4 an alternative way of enhancing the amount of information in a model is proposed. We introduce Bayesian hierarchical modelling to the field of infants’ speech discrimination analysis. This technique is not new on its own but was not applied to this field. Implementing this type of modelling enables individual analyses within a group structure. By taking the hierarchical structure of the data into account we can make the most of the, on individual level, small noisy data sets. In Chapter 5 we reflect on issues that come along with the estimation of increasingly complicated models. We show how even with weakly informative priors, adding the information that is available to us, sometimes we do not get a solution with our analysis plan. We guide the reader on what to do when this occurs and where to look for clues and possible causes.
We provide some guidance and a textbook example that for once shows things not working out the way you would like. We believe this is important as there are few examples of this. In Chapter 6 we combine the previous chapters. We take more complex models and get experts to specify beliefs with respect to these models. We extend the method developed in Chapter 2 to elicit experts’ beliefs with respect to a hierarchical model, which is used in Chapters 4 and 5. Specifically, we concern ourselves with a Latent Growth Curve model and utilize the information theoretical measures from Chapter 3 to compare the (groups of) experts to one another and to data collected in a traditional way. We do this in the context of Posttraumatic Stress Symptoms development in children with burn injuries. In Chapter 7 I reflect on the work and explanations provided within the chapters of this dissertation, including this introduction.
- Published
- 2020
29. Prevalence of responsible research practices among academics in The Netherlands.
- Author
-
Gopalakrishna G, Wicherts JM, Vink G, Stoop I, van den Akker OR, Ter Riet G, and Bouter LM
- Subjects
- Biomedical Research, Ethics, Research, Humans, Netherlands, Publishing ethics, Publishing standards, Scientific Misconduct ethics, Universities, Humanities, Research Personnel
- Abstract
Background: Traditionally, research integrity studies have focused on research misbehaviors and their explanations. Over time, attention has shifted towards preventing questionable research practices and promoting responsible ones. However, data on the prevalence of responsible research practices, especially open methods, open codes and open data and their underlying associative factors, remains scarce. Methods: We conducted a web-based anonymized questionnaire, targeting all academic researchers working at or affiliated to a university or university medical center in The Netherlands, to investigate the prevalence and potential explanatory factors of 11 responsible research practices. Results: A total of 6,813 academics completed the survey, the results of which show that prevalence of responsible practices differs substantially across disciplines and ranks, with 99 percent avoiding plagiarism in their work but less than 50 percent pre-registering a research protocol. Arts and humanities scholars as well as PhD candidates and junior researchers engaged less often in responsible research practices. Publication pressure negatively affected responsible practices, while mentoring, scientific norms subscription and funding pressure stimulated them. Conclusions: Understanding the prevalence of responsible research practices across disciplines and ranks, as well as their associated explanatory factors, can help to systematically address disciplinary- and academic rank-specific obstacles, and thereby facilitate responsible conduct of research. Competing Interests: No competing interests were disclosed. (Copyright: © 2022 Gopalakrishna G et al.)
- Published
- 2022