Bayesian Federated Inference for regression models based on non-shared multicenter data sets from heterogeneous populations
- Publication Year :
- 2024
Abstract
- To estimate the parameters of a regression model accurately, the sample size must be large enough relative to the number of possible predictors for the model. In practice, sufficient data is often lacking, which can lead to overfitting of the model and, as a consequence, unreliable predictions of the outcome for new patients. Pooling data from different data sets collected in different (medical) centers would alleviate this problem, but is often not feasible due to privacy regulations or logistical problems. An alternative route would be to analyze the local data in the centers separately and combine the statistical inference results with the Bayesian Federated Inference (BFI) methodology. The aim of this approach is to compute from the inference results in separate centers what would have been found if the statistical analysis had been performed on the combined data. We explain the methodology under homogeneity and heterogeneity across the populations in the separate centers, and give real-life examples for better understanding. Excellent performance of the proposed methodology is shown. An R package to do all the calculations has been developed and is illustrated in this paper. The mathematical details are given in the Appendix.
- Comment: 33 pages, 1 figure, 7 tables
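The idea of recovering the combined-data result from per-center inference results can be illustrated with a minimal sketch. It is not taken from the paper or its R package: it assumes the simplest homogeneous setting, a Gaussian linear model with known noise variance and zero-mean Gaussian priors, in which each center shares only its MAP estimate and the curvature (negative Hessian) of its log posterior. The function names `local_fit` and `combine`, the simulated data, and the prior precision are all hypothetical choices made for illustration. In this special case, adding the local curvatures and correcting for the prior being counted once per center reproduces the pooled fit exactly; for other regression models the same combination would only be an approximation.

```python
import numpy as np


def local_fit(X, y, prior_precision, noise_var=1.0):
    """MAP estimate and log-posterior curvature for one center.

    Assumes y = X @ beta + noise with known noise variance and a
    zero-mean Gaussian prior on beta with the given precision matrix.
    """
    A = X.T @ X / noise_var + prior_precision       # negative Hessian at the MAP
    beta_hat = np.linalg.solve(A, X.T @ y / noise_var)
    return beta_hat, A


def combine(local_results, local_precisions, combined_precision):
    """Combine per-center (beta_hat, A) pairs without access to raw data."""
    # Sum the curvatures, remove the local priors, add the combined prior once.
    A = sum(A_l for _, A_l in local_results) - sum(local_precisions) + combined_precision
    rhs = sum(A_l @ b_l for b_l, A_l in local_results)
    return np.linalg.solve(A, rhs), A


# Simulated data for three centers (illustrative only).
rng = np.random.default_rng(0)
p = 3
true_beta = np.array([1.0, -2.0, 0.5])
centers = []
for n in (40, 60, 25):
    X = rng.normal(size=(n, p))
    y = X @ true_beta + rng.normal(size=n)
    centers.append((X, y))

prior = 0.1 * np.eye(p)                             # assumed prior precision
local_results = [local_fit(X, y, prior) for X, y in centers]
beta_bfi, A_bfi = combine(local_results, [prior] * len(centers), prior)

# Reference: the fit that would be obtained if the data could be pooled.
X_all = np.vstack([X for X, _ in centers])
y_all = np.concatenate([y for _, y in centers])
beta_pooled, A_pooled = local_fit(X_all, y_all, prior)

agree = np.allclose(beta_bfi, beta_pooled)
print("combined estimate matches pooled fit:", agree)
```

Only the estimates and curvature matrices leave each center in this sketch, which is the property that makes the approach attractive when raw data cannot be shared.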
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2402.02898
- Document Type :
- Working Paper