1. Propensity score matching in semaglutide retrospective studies
- Author
- Mohney, Elizabeth and Shvets, Alexey
- Subjects
- Statistics - Applications
- Abstract
- Propensity score matching (PSM) is a causal inference technique used as a substitute for experimental methods when logistical or ethical concerns make them impossible to implement. A logistic classifier estimates each data point's probability of assignment to the experimental group rather than the control group, and the corresponding log odds value, or 'logit' score, is assigned to that data point. Once logit scores are assigned, every data point in the treatment group is paired with a comparable control in order to balance the potential confounding variables of the experiment. Although PSM is a viable inference technique, many implementations fail to outline their methodology properly, for example by not explaining the feature selection and matching techniques used. This paper outlines several techniques for both feature selection and matching and compares them on their efficiency. Three quantitative feature selection methods were used: random removal, feature importance calculation, and individual removal. Individual removal was the most efficient at consolidating the overlap between the treatment and control groups. The matching techniques used were bisect, binary insertion, nearest neighbors, and nearest neighbor with a caliper, the last of which was the most efficient, with the aim of limiting the error percentage and standard mean deviation. Because these techniques were tested only on a data set of patients treated with semaglutide, it is not possible to state definitively which technique is best. However, this paper explores the influence of methodology on the outcome of an experiment and provides ways to test the efficiency of techniques. It is important for researchers not only to document their methodology properly but also to explore different techniques to maximize results.
- Comment: 9 pages, 6 figures, 1 table, 1 algorithm
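The matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the propensity scores have already been estimated by some logistic classifier, and the names `logit` and `caliper_match`, the unit ids, the scores, and the caliper value of 0.25 are all hypothetical choices made for illustration. The sketch performs greedy 1:1 nearest-neighbor matching on the logit scale, without replacement, and uses the standard-library `bisect` module to locate neighbors in a sorted pool of controls.

```python
import bisect
import math


def logit(p):
    """Log odds of a propensity score p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def caliper_match(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbor matching on the logit, without replacement.

    treated, controls: dicts mapping a unit id to its propensity score
        (assumed to come from a previously fitted logistic classifier).
    caliper: maximum allowed absolute logit distance for a match.
    Returns {treated_id: control_id}; unmatched treated units are omitted.
    """
    # Sorted pool of (logit, id) pairs so bisect can find neighbors quickly.
    pool = sorted((logit(p), cid) for cid, p in controls.items())
    matches = {}
    for tid, p in treated.items():
        if not pool:
            break  # no controls left to match against
        score = logit(p)
        i = bisect.bisect_left(pool, (score,))
        # The nearest control sits just left or just right of position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(pool)]
        best = min(candidates, key=lambda j: abs(pool[j][0] - score))
        # Accept the pair only if it falls inside the caliper.
        if abs(pool[best][0] - score) <= caliper:
            matches[tid] = pool.pop(best)[1]
    return matches


# Hypothetical propensity scores, for illustration only.
treated = {"t1": 0.80, "t2": 0.30, "t3": 0.95}
controls = {"c1": 0.78, "c2": 0.33, "c3": 0.50, "c4": 0.10}
pairs = caliper_match(treated, controls, caliper=0.25)
print(pairs)
```

With these made-up scores, `t1` is paired with `c1` and `t2` with `c2`, while `t3` stays unmatched because no remaining control lies within the caliper. Dropping such units is the trade-off a caliper makes: fewer matched pairs in exchange for closer ones. Greedy matching also depends on the order in which treated units are processed, so a different ordering can yield different pairs.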
- Published
- 2025