1. Optimizing Bayesian Knowledge Tracing with Neural Network Parameter Generation
- Author
- Anirudhan Badrinath and Zachary Pardos
- Abstract
Bayesian Knowledge Tracing (BKT) is a well-established model for formative assessment, with optimization typically performed using expectation maximization, conjugate gradient descent, or brute-force search. However, one flaw of existing optimization techniques for BKT models is convergence to undesirable local minima that negatively impact the performance and interpretability of the BKT parameters (i.e., parameter degeneracy). Recently, deep knowledge tracing methods such as context-aware attentive knowledge tracing have proven to be state-of-the-art in performance; however, these methods often lack the inherent interpretability provided by BKT's skill-level parameter estimates and student-level mastery probability estimates. We propose a novel optimization technique for BKT models using a neural network-based parameter generation approach, OptimNN, that leverages hypernetworks and stochastic gradient descent for training BKT parameters. We extend this approach and propose BK Transformer, a transformer-based sequence modeling technique that generates temporally evolving BKT parameters for student response correctness prediction. With both approaches, we demonstrate improved performance compared to BKT and deep knowledge tracing baselines, with minimal hyperparameter tuning. Importantly, we demonstrate that these techniques, despite their state-of-the-art expressive capability, retain the interpretability of skill-level BKT parameter estimates and student-level estimates of mastery and correctness probabilities.
- Note
The page range printed on the PDF (p. 1-25) is incorrect; the correct page range is p. 41-65.
- Published
- 2025
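The abstract refers to BKT's skill-level parameters and student-level mastery probabilities. For context, a minimal sketch of the classic BKT forward update that those parameters drive (the standard textbook formulation, not the paper's OptimNN or BK Transformer code; parameter values below are illustrative) might look like:

```python
# Classic BKT: each skill has four parameters — prior P(L0), learn P(T),
# guess P(G), slip P(S). OptimNN / BK Transformer (per the abstract) generate
# such parameters with neural networks; the update equations stay the same.

def bkt_predict(mastery, guess, slip):
    """P(correct response) given the current mastery probability."""
    return mastery * (1.0 - slip) + (1.0 - mastery) * guess

def bkt_update(mastery, correct, learn, guess, slip):
    """Bayesian posterior on mastery after one observed response,
    followed by the learning transition P(T)."""
    if correct:
        post = (mastery * (1.0 - slip)) / (
            mastery * (1.0 - slip) + (1.0 - mastery) * guess
        )
    else:
        post = (mastery * slip) / (
            mastery * slip + (1.0 - mastery) * (1.0 - guess)
        )
    return post + (1.0 - post) * learn

# Example: one skill, illustrative parameters, three correct responses.
mastery = 0.3  # prior P(L0)
for obs in [1, 1, 1]:
    p_correct = bkt_predict(mastery, guess=0.2, slip=0.1)
    mastery = bkt_update(mastery, obs, learn=0.15, guess=0.2, slip=0.1)
# mastery rises toward 1 as correct responses accumulate
```

The interpretability claim in the abstract rests on exactly these quantities: the four per-skill parameters are human-readable, and `mastery` is a per-student probability of skill mastery at each step.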