348 results for "Rare Event"
Search Results
2. Accelerated High-Index Saddle Dynamics Method for Searching High-Index Saddle Points.
- Author
Luo, Yue, Zhang, Lei, and Zheng, Xiangcheng
- Abstract
The high-index saddle dynamics (HiSD) method (SIAM J Sci Comput 41:A3576–A3595, 2019) serves as an efficient tool for computing index-k saddle points and constructing solution landscapes. Nevertheless, the conventional HiSD method often encounters slow convergence rates on ill-conditioned problems. To address this challenge, we propose an accelerated high-index saddle dynamics (A-HiSD) method by incorporating the heavy ball method. We prove linear stability of the continuous A-HiSD and subsequently estimate the local convergence rate of the discrete A-HiSD. Our analysis demonstrates that the A-HiSD method exhibits a faster convergence rate compared to the conventional HiSD method, especially when dealing with ill-conditioned problems. We also perform various numerical experiments, including on the loss function of a neural network, to substantiate the effectiveness and acceleration of the A-HiSD method. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
3. A limit formula and recursive algorithm for multivariate Normal tail probability.
- Author
Au, Siu-Kui
- Abstract
This work develops a formula for the large threshold limit of the multivariate Normal tail probability when at least one of the normalised thresholds grows indefinitely. Derived using integration by parts, the formula expresses the tail probability in terms of conditional probabilities involving one less variate, thereby reducing the problem dimension by 1. The formula is asymptotic to Ruben's formula under Savage's condition. It satisfies Plackett's identity exactly or approximately, depending on the correlation parameter being differentiated. A recursive algorithm is proposed that allows the tail probability limit to be calculated in terms of univariate Normal probabilities only. The algorithm shows promise in numerical examples to offer a semi-analytical approximation under non-asymptotic situations to within an order of magnitude. The number of univariate Normal probability evaluations is at least n!, however, and in this sense the algorithm suffers from the curse of dimension. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
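A minimal numerical reference for the quantity studied in this entry: for a centred multivariate Normal, the tail probability P(X1 > t1, X2 > t2) equals the CDF evaluated at (-t1, -t2) by central symmetry, which SciPy computes with a Genz-type algorithm. The covariance and thresholds below are illustrative choices, not values from the paper.

```python
# Bivariate Normal tail probability via central symmetry:
# P(X1 > t1, X2 > t2) = CDF of X at (-t1, -t2) for a centred Gaussian.
import numpy as np
from scipy.stats import multivariate_normal

cov = np.array([[1.0, 0.6],
                [0.6, 1.0]])
mvn = multivariate_normal(mean=[0.0, 0.0], cov=cov)

for t in (1.0, 3.0, 5.0):
    print(t, mvn.cdf([-t, -t]))  # tail probability shrinks rapidly with t
```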
4. Binomial Confidence Intervals for Rare Events: Importance of Defining Margin of Error Relative to Magnitude of Proportion.
- Author
McGrath, Owen and Burke, Kevin
- Subjects
SAMPLE size (Statistics), PROBABILITY theory, CONFIDENCE intervals
- Abstract
Confidence interval performance is typically assessed in terms of two criteria: coverage probability and interval width (or margin of error). In this article, we assess the performance of four common proportion interval estimators: the Wald, Clopper-Pearson (exact), Wilson and Agresti-Coull, in the context of rare-event probabilities. We define the interval precision in terms of a relative margin of error which ensures consistency with the magnitude of the proportion. Thus, confidence interval estimators are assessed in terms of achieving a desired coverage probability whilst simultaneously satisfying the specified relative margin of error. We illustrate the importance of considering both coverage probability and relative margin of error when estimating rare-event proportions, and show that within this framework, all four interval estimators perform somewhat similarly for a given sample size and confidence level. We identify relative margin of error values that result in satisfactory coverage while being conservative in terms of sample size requirements, and hence suggest a range of values that can be adopted in practice. The proposed relative margin of error scheme is evaluated analytically, by simulation, and by application to a number of recent studies from the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
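The four interval estimators compared in this entry are all available in statsmodels; the sketch below computes each for a rare-event proportion, together with a relative margin of error taken as half-width divided by the point estimate (our reading of the abstract; the paper's exact definition may differ).

```python
# Wald, Clopper-Pearson, Wilson, and Agresti-Coull intervals for a rare
# proportion, plus a relative margin of error (half-width / p-hat).
from statsmodels.stats.proportion import proportion_confint

count, nobs = 12, 10_000                  # 12 events in 10,000 trials
phat = count / nobs

for method, label in [("normal", "Wald"),
                      ("beta", "Clopper-Pearson"),
                      ("wilson", "Wilson"),
                      ("agresti_coull", "Agresti-Coull")]:
    lo, hi = proportion_confint(count, nobs, alpha=0.05, method=method)
    rel_moe = (hi - lo) / (2 * phat)
    print(f"{label:15s} [{lo:.5f}, {hi:.5f}]  relative MoE = {rel_moe:.2f}")
```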
5. Sum of Poisson-Distributed Random Variables: A Convolution Method Approach
- Author
A. A. Ayenigba, O. M. Ajao, and F. A. Okolie
- Subjects
Rare event, convolution method, Poisson distribution, Skewness, Kurtosis, Science
- Abstract
This paper presents a two-parameter extension of the classical Poisson distribution, specifically tailored for rare event modeling. The proposed model is constructed as the sum of two independent Poisson random variables, using a convolution method. Some properties of the distribution, including the probability mass function (PMF), moment-generating function (MGF), mean, variance, higher-order moments, skewness, and kurtosis, are derived.
- Published
- 2025
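The construction in this entry, a sum of two independent Poisson variables, can be checked numerically: convolving the two PMFs must reproduce the classical result that the sum is again Poisson with rate lam1 + lam2. The rates below are arbitrary.

```python
# PMF of X1 + X2 (independent Poissons) by direct convolution, verified
# against the closed form Poisson(lam1 + lam2).
import numpy as np
from scipy.stats import poisson

lam1, lam2, kmax = 1.3, 0.4, 30
k = np.arange(kmax + 1)

pmf_sum = np.convolve(poisson.pmf(k, lam1), poisson.pmf(k, lam2))[: kmax + 1]
closed = poisson.pmf(k, lam1 + lam2)
print(np.max(np.abs(pmf_sum - closed)))   # ~1e-17: the two agree
```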
6. A Unified Approach for Hitting Time of Jump Markov Type Processes.
- Author
Limnios, Nikolaos and Wu, Bei
- Abstract
This paper investigates the asymptotic analysis of the hitting time of Markov-type jump processes (i.e., semi-Markov or Markov, in continuous or discrete time) with a small probability of entering a non-empty terminal subset. This means that absorption is a rare event. The mean hitting time functions of all four types of processes obey the same equation. We obtain unified asymptotic approximation results, in a series scheme or, equivalently, of functional type, for the mean hitting time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
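The equation referred to in this abstract is, for a continuous-time Markov chain, the textbook linear system for mean hitting times: with generator Q and terminal set B, the vector m of mean hitting times solves Q_AA m = -1 on the complement A of B and vanishes on B. A toy numerical illustration of that identity (not the paper's asymptotic result):

```python
# Mean hitting time of a terminal state for a 3-state CTMC, via the linear
# system Q_AA m = -1 on the non-terminal states A.
import numpy as np

Q = np.array([[-1.0,  0.9,  0.1],   # rows sum to zero; state 2 is terminal
              [ 0.5, -0.6,  0.1],   # small rates into state 2: a rare event
              [ 0.0,  0.0,  0.0]])  # terminal state made absorbing

A = [0, 1]                          # non-terminal states
m = np.linalg.solve(Q[np.ix_(A, A)], -np.ones(len(A)))
print(m)                            # mean hitting times from states 0 and 1
```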
7. PyRETIS 3: Conquering rare and slow events without boundaries.
- Author
Vervust, Wouter, Zhang, Daniel T., Ghysels, An, Roet, Sander, van Erp, Titus S., and Riccardi, Enrico
- Subjects
METASTABLE states, MODULAR construction, MACHINE learning, PYTHON programming language, ALGORITHMS, BIOCHEMICAL substrates
- Abstract
We present and discuss the advancements made in PyRETIS 3, the third instalment of our Python library for efficient and user-friendly rare event simulation, focused on executing molecular simulations with replica exchange transition interface sampling (RETIS) and its variations. Apart from a general rewiring of the internal code towards a more modular structure, several recently developed sampling strategies have been implemented. These include new Monte Carlo moves to increase path decorrelation and convergence rate, and new ensemble definitions to handle the challenges of long-lived metastable states and transitions with unbounded reactant and product states. Additionally, the post-analysis software PyVisA is now embedded in the main code, allowing fast use of machine-learning algorithms for clustering and visualising collective variables in the simulation data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Modeling multivariate extreme value distributions via Markov trees.
- Author
Hu, Shuang, Peng, Zuoxiang, and Segers, Johan
- Subjects
MARKOV random fields, PARETO distribution, PARSIMONIOUS models, TREES, DISTRIBUTION (Probability theory)
- Abstract
Multivariate extreme value distributions are a common choice for modeling multivariate extremes. In high dimensions, however, the construction of flexible and parsimonious models is challenging. We propose to combine bivariate max‐stable distributions into a Markov random field with respect to a tree. Although in general not max‐stable itself, this Markov tree is attracted by a multivariate max‐stable distribution. The latter serves as a tree‐based approximation to an unknown max‐stable distribution with the given bivariate distributions as margins. Given data, we learn an appropriate tree structure by Prim's algorithm with estimated pairwise upper tail dependence coefficients as edge weights. The distributions of pairs of connected variables can be fitted in various ways. The resulting tree‐structured max‐stable distribution allows for inference on rare event probabilities, as illustrated on river discharge data from the upper Danube basin. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
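The tree-learning step described in this entry can be sketched with standard tools: estimate pairwise upper tail dependence on rank-transformed data and extract a maximum spanning tree (Prim/Kruskal on negated weights). The finite-level estimator lambda(u) = P(U > u, V > u) / (1 - u) and the simulated Gaussian data below are simple stand-ins; the paper's estimator and data differ.

```python
# Learn a tree from empirical pairwise upper tail dependence coefficients.
import numpy as np
from scipy.stats import rankdata
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(0)
z = rng.standard_normal((5000, 3))
X = np.column_stack([z[:, 0],
                     z[:, 0] + 0.7 * z[:, 1],
                     z[:, 1] + 0.7 * z[:, 2]])
U = np.column_stack([rankdata(c) / (len(c) + 1) for c in X.T])  # pseudo-obs

u = 0.95
d = U.shape[1]
lam = np.zeros((d, d))
for i in range(d):
    for j in range(i + 1, d):
        lam[i, j] = np.mean((U[:, i] > u) & (U[:, j] > u)) / (1 - u)

tree = minimum_spanning_tree(-lam)       # negate -> maximum spanning tree
print(np.argwhere(tree.toarray() != 0))  # edges of the learned tree
```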
9. Bias reduction for semi-competing risks frailty model with rare events: application to a chronic kidney disease cohort study in South Korea.
- Author
Kim, Jayoun, Jeong, Boram, Ha, Il Do, Oh, Kook-Hwan, Jung, Ji Yong, Jeong, Jong Cheol, and Lee, Donghwan
- Subjects
DISEASE risk factors, CHRONIC kidney failure, CENSORING (Statistics), FRAILTY, COHORT analysis
- Abstract
In a semi-competing risks model, in which a terminal event censors a non-terminal event but not vice versa, the conventional method can predict clinical outcomes by maximum likelihood estimation. However, this method can produce unreliable or biased estimators when the number of events in the dataset is small. Specifically, parameter estimates may converge to infinity, or their standard errors can be very large. Moreover, terminal and non-terminal event times may be correlated, which can be accounted for by a frailty term. Here, we adapt the penalized likelihood with Firth's correction method for gamma frailty models with semi-competing risks data to reduce the bias caused by rare events. The proposed method is evaluated in terms of relative bias, mean squared error, standard error, and standard deviation compared to the conventional methods through simulation studies. The results of the proposed method are stable and robust even when data contain only a few events with misspecification of the baseline hazard function. We also illustrate a real example with a multi-centre, patient-based cohort study to identify risk factors for chronic kidney disease progression or adverse clinical outcomes. This study will provide a better understanding of semi-competing risks data in which the number of specific diseases or events of interest is rare. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Something Out of Nothing? The Influence of Double-Zero Studies in Meta-analysis of Adverse Events in Clinical Trials
- Author
Fan, Zhaohu, Liu, Dungang, Chen, Yuejie, and Zhang, Nanhua
- Published
- 2024
- Full Text
- View/download PDF
11. Learning-based importance sampling via stochastic optimal control for stochastic reaction networks.
- Author
Ben Hammouda, Chiheb, Ben Rached, Nadhir, Tempone, Raúl, and Wiechert, Sophia
- Abstract
We explore efficient estimation of statistical quantities, particularly rare event probabilities, for stochastic reaction networks. Consequently, we propose an importance sampling (IS) approach to improve the Monte Carlo (MC) estimator efficiency based on an approximate tau-leap scheme. The crucial step in the IS framework is choosing an appropriate change of probability measure to achieve substantial variance reduction. This task is typically challenging and often requires insights into the underlying problem. Therefore, we propose an automated approach to obtain a highly efficient path-dependent measure change based on an original connection in the stochastic reaction network context between finding optimal IS parameters within a class of probability measures and a stochastic optimal control formulation. Optimal IS parameters are obtained by solving a variance minimization problem. First, we derive an associated dynamic programming equation. Analytically solving this backward equation is challenging, hence we propose an approximate dynamic programming formulation to find near-optimal control parameters. To mitigate the curse of dimensionality, we propose a learning-based method to approximate the value function using a neural network, where the parameters are determined via a stochastic optimization algorithm. Our analysis and numerical experiments verify that the proposed learning-based IS approach substantially reduces MC estimator variance, resulting in a lower computational complexity in the rare event regime, compared with standard tau-leap MC estimators. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
12. Environmental, climatic, and situational factors influencing the probability of fatality or injury occurrence in flash flooding: a rare event logistic regression predictive model.
- Author
Chang, Shi, Wilkho, Rohan Singh, Gharaibeh, Nasir, Sansom, Garett, Meyer, Michelle, Olivera, Francisco, and Zou, Lei
- Subjects
LOGISTIC regression analysis, REGRESSION analysis, PREDICTION models, FLOOD warning systems, HAZARD mitigation, FLOODS, COMMUNITIES, PROBABILITY theory
- Abstract
Flash flooding is considered one of the most lethal natural hazards in the USA as measured by the ratio of fatalities to people affected. However, the occurrence of injuries and fatalities during flash flooding was found to be rare (about 2% occurrence rate) based on our analysis of 6,065 flash flood events that occurred in Texas over a 15-year period (2005 to 2019). This article identifies climatic, environmental, and situational factors that affect the occurrence of fatalities and injuries in flash flood events and provides a predictive model to estimate the likelihood of these occurrences. Due to the highly imbalanced dataset, three forms of logit models were investigated to achieve unbiased estimations of the model coefficients. The rare event logistic regression (Relogit) model was found to be the most suitable model. The model considers ten independent situational, climatic, and environmental variables that could affect human safety in flash flood events. Vehicle-related activities during flash flooding exhibited the greatest effect on the probability of human harm occurrence, followed by the event's time (daytime vs. nighttime), precipitation amount, location with respect to the flash flood alley, median age of structures in the community, low water crossing density, and event duration. The application of the developed model as a simulation tool for informing flash flood mitigation planning was demonstrated in two study cases in Texas. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
13. State-dependent importance sampling for estimating expectations of functionals of sums of independent random variables.
- Author
Ben Amar, Eya, Ben Rached, Nadhir, Haji-Ali, Abdul-Lateef, and Tempone, Raúl
- Abstract
Estimating the expectations of functionals applied to sums of random variables (RVs) is a well-known problem encountered in many challenging applications. Generally, closed-form expressions of these quantities are out of reach. A naive Monte Carlo simulation is an alternative approach. However, this method requires numerous samples for rare event problems. Therefore, it is paramount to use variance reduction techniques to develop fast and efficient estimation methods. In this work, we use importance sampling (IS), known for requiring fewer computations to achieve the same accuracy. We propose a state-dependent IS scheme based on a stochastic optimal control formulation, where the control depends on state and time. We aim to calculate rare event quantities that can be written as the expectation of a functional of a sum of independent RVs. The proposed algorithm is generic and can be applied without restrictions on the univariate distributions of the RVs or the functional applied to the sum. We apply this approach to the log-normal distribution to compute the left tail and cumulative distribution of the ratio of independent RVs. For each case, we numerically demonstrate that the proposed state-dependent IS algorithm compares favorably to most well-known estimators dealing with similar problems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
14. Predicting the Length of Stay in Hospital Emergency Rooms in Rhode Island
- Author
Lamere, Alicia T., Nguyen, Son, Niu, Gao, Olinsky, Alan, and Quinn, John
- Published
- 2021
- Full Text
- View/download PDF
15. An efficient SRAM yield analysis method based on scaled-sigma adaptive importance sampling with meta-model accelerated.
- Author
Pang, Liang, Wang, Ziqi, Shi, Rui, Yao, Mengyun, Shi, Xiao, Yan, Hao, and Shi, Longxin
- Subjects
STATIC random access memory, MONTE Carlo method, ONLINE education
- Abstract
SRAM yield analysis is critical to robust SRAM design. However, it is quite difficult to estimate the SRAM yield because circuit failure is a "rare event". Existing methods are still not efficient enough to solve the problem, especially in high-dimensional circuit scenarios. In this paper, we present an upgraded version of our conference work, scaled-sigma adaptive importance sampling (SSAIS), improved by adapting projection pursuit regression (PPR). SSAIS updates not only the location parameters but also the scale parameters by searching the failure region iteratively. To further reduce the cost of the estimation, we construct a PPR model to replace the expensive transistor-level simulation. The model and modeling procedure are integrated into SSAIS by a re-simulation technique. Our method was first validated on a SMIC 40 nm SRAM cell; it outperforms the Monte Carlo method by over 2534X and is 3.2X to 7.3X faster than state-of-the-art methods with sufficient accuracy. The comparisons on a sense amplifier show our method achieves a 1811X speedup over the Monte Carlo method and a 2X to 11X speedup over the other methods. • An efficient reliability analysis method based on importance sampling is proposed to predict SRAM yield. • The sampling function is constructed and updated adaptively by our statistical methods to guarantee prediction accuracy. • The circuit simulation cost is greatly reduced by our meta-model with online training. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
16. Performance of prior and weighting bias correction methods for rare event logistic regression under the influence of sampling bias.
- Author
Alpay, Olcay and Çankaya, Emel
- Subjects
LOGISTIC regression analysis, COMMUNICABLE diseases, FINANCIAL crises
- Abstract
The problem of classifying events into binary classes has been popularly addressed by logistic regression analysis. However, there may be situations where the class of most interest is rare, such as an infectious disease, an earthquake, or a financial crisis. The model of such events tends to focus on the majority class, resulting in the underestimation of probabilities for the rare class. Additionally, the model may incorporate sampling bias if the rare class of the sample is not representative of its population. It is therefore important to investigate whether such rareness is genuine or caused by an improperly drawn sample. We conducted a simulation study by creating three populations with different rarity levels and drawing samples from each which are either compatible or incompatible with the actual rare classes of the population. Then, the effect of sampling bias is discussed under the two correction methods for bias due to rareness suggested by King and Zeng. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
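The weighting correction of King and Zeng mentioned in this abstract reweights a case-control style sample back to the population: events get weight tau/ybar and non-events (1 - tau)/(1 - ybar), where tau is the population event fraction and ybar the sample fraction. A minimal sketch on simulated data, assuming tau is known:

```python
# King-Zeng weighting for rare-event logistic regression: oversample events,
# then reweight so the intercept is estimated on the population scale.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.standard_normal((200_000, 3))
p = 1 / (1 + np.exp(-(-6.0 + X @ np.array([1.0, 0.5, -0.5]))))
y = rng.binomial(1, p)
tau = y.mean()                                  # population event fraction

keep = (y == 1) | (rng.random(len(y)) < 0.01)   # all events, 1% of non-events
Xs, ys = X[keep], y[keep]
w = np.where(ys == 1, tau / ys.mean(), (1 - tau) / (1 - ys.mean()))

fit = LogisticRegression(C=1e6).fit(Xs, ys, sample_weight=w)  # ~no penalty
print(fit.intercept_, fit.coef_)                # intercept back near -6
```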
17. The Role of Double-Zero-Event Studies in Evidence Synthesis: Evaluating Robustness Using the Fragility Index.
- Author
Wang Z, Xing X, Mun EY, Wu C, and Lin L
- Subjects
Humans, Research Design, Data Interpretation (Statistical), Clinical Trials as Topic (methods), Sample Size, Meta-Analysis as Topic
- Abstract
Rationale: Zero-event counts are common in clinical studies, particularly when assessing rare adverse events. These occurrences can result from low event rates, short follow-up periods, and small sample sizes. When both intervention and control groups report zero events in a clinical trial, the study is referred to as a double-zero-event study, which presents methodological challenges for evidence synthesis. There has been ongoing debate about whether these studies should be excluded from evidence synthesis, as traditional two-stage meta-analysis methods may not estimate an effect size for them. Recent research suggests that these studies may still contain valuable clinical and statistical information. Aims and Objectives: This study examines the role of double-zero-event studies from the perspective of the fragility index (FI), a popular metric for assessing the robustness of clinical results. We aim to determine how including or excluding double-zero-event studies affects FI derivations in meta-analyses. Methods: We conducted an illustrative case study to demonstrate how double-zero-event studies can impact FI derivations. Additionally, we performed a large-scale analysis of 12,184 Cochrane meta-analyses involving zero-event studies to assess the prevalence and effect of double-zero-event studies on FI calculations. Results: Our analysis revealed that FI derivations in 6608 (54.2%) of these meta-analyses involved double-zero-event studies. Excluding double-zero-event studies could lead to artificially inflated FI values, potentially misrepresenting the results as more robust than they are. Conclusions: We advocate for retaining double-zero-event studies in meta-analyses and emphasise the importance of carefully considering their role in FI assessments. Including these studies ensures a more accurate evaluation of the robustness of clinical results in evidence synthesis. (© 2025 John Wiley & Sons Ltd.)
- Published
- 2025
- Full Text
- View/download PDF
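A minimal version of the fragility index used in this entry, for a single two-arm trial: the smallest number of event-status flips in one arm that changes the significance of Fisher's exact test at alpha = 0.05. The paper's FI for meta-analyses generalises this, and the counts below are invented for illustration.

```python
# Fragility index of one 2x2 table: flip event statuses in arm 1 until the
# two-sided Fisher exact p-value crosses alpha.
from scipy.stats import fisher_exact

def fragility_index(e1, n1, e2, n2, alpha=0.05):
    _, p0 = fisher_exact([[e1, n1 - e1], [e2, n2 - e2]])
    base_sig = p0 < alpha
    for k in range(1, n1 - e1 + 1):
        _, p = fisher_exact([[e1 + k, n1 - e1 - k], [e2, n2 - e2]])
        if (p < alpha) != base_sig:
            return k                 # modifications needed to flip
    return None                      # never flips within this arm

print(fragility_index(e1=1, n1=100, e2=9, n2=100))
```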
18. Mountains and rivers: rare events in noisy systems and the forces that shape them
- Author
Kuznets-Speck, Benjamin
- Subjects
Biophysics, Physical chemistry, first-passage, rare event, rate, stochastic, trajectory reweighting
- Abstract
Rare events are ubiquitous in noisy complex systems throughout the physical sciences and to a large extent determine their function and regulation. Dissipative outside forces often work hand in hand with equilibrium structure to shape the mechanism and frequency of such improbable fluctuations, but little is known about how to codify the influence of non-equilibrium on reaction rates and their mechanisms. In the last quarter century we have seen paradigm-shifting breakthroughs in reaction rate theory that have allowed for the study of rare transitions in complex many-particle systems. At the same time, the development of the statistical mechanics of trajectories has revolutionized how we study the behavior, response, and functional limits of systems away from equilibrium. Here, we develop a trajectory theory of how reaction rates respond to nonequilibrium forces, allowing us both to probe how non-equilibrium systems regulate their function and to leverage optimally designed forces to sample reaction rates from finite-time driven trajectories for the first time.
- Published
- 2023
19. Active Learning for Saddle Point Calculation.
- Author
Gu, Shuting, Wang, Hongqiao, and Zhou, Xiang
- Abstract
The saddle point (SP) calculation is a grand challenge for computationally intensive energy functions in computational chemistry, where a saddle point may represent a transition state. Traditional methods need to evaluate the gradients of the energy function at a very large number of locations. To reduce the number of expensive computations of the true gradients, we propose an active learning framework consisting of a statistical surrogate model, Gaussian process regression (GPR) for the energy function, and a single-walker dynamics method, gentle ascent dynamics (GAD), for the saddle-type transition states. The SP is detected by GAD applied to the GPR surrogate for the gradient vector and the Hessian matrix. Our key ingredient for efficiency improvements is an active learning method which sequentially designs the most informative locations and takes evaluations of the original model at these locations to train the GPR. We formulate this active learning task as an optimal experimental design problem and propose a very efficient sample-based sub-optimal criterion to construct the optimal locations. We show that the new method significantly decreases the required number of energy or force evaluations of the original model. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
20. Small failure probability: principles, progress and perspectives.
- Author
Lee, Ikjin, Lee, Ungki, Ramu, Palaniappan, Yadav, Deepanshu, Bayrak, Gamze, and Acar, Erdem
- Abstract
Design of structural and multidisciplinary systems under uncertainties requires estimation of their reliability or, equivalently, the probability of failure under the given operating conditions. Various high-technology systems, including aircraft and nuclear power plants, are designed for very small probabilities of failure, and estimation of these small probabilities is computationally challenging. Even though a substantial number of approaches have been proposed to reduce the computational burden, there is no established guideline for deciding which approach is the best choice for a given problem. This paper provides a review of the approaches developed for small probability estimation of structural or multidisciplinary systems and enlists the criteria/metrics to choose the preferred approach amongst the existing ones for a given problem. First, the existing approaches are categorized into sampling-based, surrogate-based, and statistics-of-extremes-based approaches. Next, the small probability estimation methods developed for time-independent systems and the ones tailored for time-dependent systems are discussed, respectively. Then, some real-life engineering applications in structural and multidisciplinary design studies are summarized. Finally, concluding remarks are provided, and areas for future research are suggested. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. Assessment of a Modified Sandwich Estimator for Generalized Estimating Equations with Application to Opioid Poisoning in MIMIC-IV ICU Patients
- Author
Paul Rogers and Julie Stoner
- Subjects
sandwich estimator, generalized estimating equation, rare event, finite sample, binary outcome, correlated outcome, Statistics, HA1-4737
- Abstract
Longitudinal data are encountered frequently in many healthcare research areas, including the critical care environment. Repeated measures from the same subject are expected to correlate with each other. Models with binary outcomes are commonly used in this setting. Regression models for correlated binary outcomes are frequently fit using generalized estimating equations (GEE). The Liang and Zeger sandwich estimator is often used in GEE to produce unbiased standard error estimates for regression coefficients in large-sample settings, even when the covariance structure is misspecified. The sandwich estimator performs optimally in balanced designs when the number of participants is large with few repeated measurements. The sandwich estimator's asymptotic properties do not hold in small-sample and rare-event settings. Under these conditions, the sandwich estimator underestimates the variances and is biased downwards. Here, the performance of a modified sandwich estimator is compared to the traditional Liang-Zeger estimator and to alternative forms proposed by Morel, Pan, and Mancl-DeRouen. Each estimator's performance was assessed with 95% coverage probabilities for the regression coefficients using simulated data under various combinations of sample sizes and outcome prevalence values with independence and autoregressive correlation structures. This research was motivated by investigations involving rare-event outcomes in intensive care unit settings.
- Published
- 2021
- Full Text
- View/download PDF
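Two of the estimators compared in this entry are exposed by statsmodels' GEE results object: robust (Liang-Zeger) and bias-reduced (Mancl-DeRouen) sandwich standard errors. A sketch on simulated clustered binary data; the cov_type names follow the statsmodels documentation for GEEResults.standard_errors and should be treated as an assumption to verify against your installed version, and the Morel and Pan variants are not included.

```python
# Robust vs bias-reduced sandwich standard errors for a GEE logistic model
# on small clusters with a moderately rare binary outcome.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_clusters, m = 40, 5
groups = np.repeat(np.arange(n_clusters), m)
x = rng.standard_normal(n_clusters * m)
b = 0.5 * rng.standard_normal(n_clusters)[groups]        # cluster effect
y = rng.binomial(1, 1 / (1 + np.exp(-(-2.0 + 0.8 * x + b))))

res = sm.GEE(y, sm.add_constant(x), groups=groups,
             family=sm.families.Binomial(),
             cov_struct=sm.cov_struct.Exchangeable()).fit()
print("robust SE:      ", res.standard_errors(cov_type="robust"))
print("bias-reduced SE:", res.standard_errors(cov_type="bias_reduced"))
```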
22. Rare Events in Random Geometric Graphs.
- Author
Hirsch, Christian, Moka, Sarat B., Taimre, Thomas, and Kroese, Dirk P.
- Subjects
RANDOM numbers, POISSON processes, CONDITIONAL probability, POINT processes, PROBABILITY theory
- Abstract
This work introduces and compares approaches for estimating rare-event probabilities related to the number of edges in the random geometric graph on a Poisson point process. In the one-dimensional setting, we derive closed-form expressions for a variety of conditional probabilities related to the number of edges in the random geometric graph and develop conditional Monte Carlo algorithms for estimating rare-event probabilities on this basis. We prove rigorously a reduction in variance when compared to the crude Monte Carlo estimators and illustrate the magnitude of the improvements in a simulation study. In higher dimensions, we use conditional Monte Carlo to remove the fluctuations in the estimator coming from the randomness in the Poisson number of nodes. Finally, building on conceptual insights from large-deviations theory, we illustrate that importance sampling using a Gibbsian point process can further substantially reduce the estimation variance. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
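For orientation, the crude Monte Carlo baseline that this entry improves upon: sample a Poisson number of uniform points on [0, 1], count edges (pairs within distance r), and average the indicator of the rare event. The parameters below are arbitrary; deep in the tail this estimator returns mostly zeros, which is exactly the motivation for the paper's conditional MC and importance sampling estimators.

```python
# Crude MC estimate of P(edge count >= k) for a 1-d random geometric graph
# on a Poisson point process.
import numpy as np

rng = np.random.default_rng(0)
lam, r, k, reps = 50, 0.02, 90, 20_000    # intensity, radius, threshold

hits = 0
for _ in range(reps):
    n = rng.poisson(lam)
    pts = rng.random(n)
    within = np.abs(pts[:, None] - pts[None, :]) <= r
    edges = (within.sum() - n) // 2        # unordered pairs within r
    hits += edges >= k
print(hits / reps)                         # noisy or zero for rare events
```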
23. Non-Markov-Type Analysis and Diffusion Map Analysis for Molecular Dynamics Trajectory of Chignolin at a High Temperature.
- Author
Fujisaki, Hiroshi, Suetani, Hiromichi, Maragliano, Luca, and Mitsutake, Ayori
- Subjects
HIGH temperatures, SYNTHETIC proteins, MOLECULAR dynamics, CONFORMATIONAL analysis, EIGENVECTORS
- Abstract
We apply the non-Markov-type analysis of state-to-state transitions to nearly microsecond molecular dynamics (MD) simulation data at the folding temperature of a small artificial protein, chignolin, and we find that the time scales obtained are consistent with our previous results using weighted ensemble simulations, a general path-sampling method for extracting the kinetic properties of molecules. Previously, we also applied diffusion map (DM) analysis, a manifold learning technique, to the same chignolin trajectory in order to cluster the conformational states, and found that DM and relaxation mode analysis give similar results for the eigenvectors. In this paper, we divide the same trajectory into shorter pieces and further apply DM to such short-length trajectories to investigate how the obtained eigenvectors are useful for characterizing the conformational change of chignolin. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
24. Comparison of penalized logistic regression models for rare event case.
- Author
Olmuş, Hülya, Nazman, Ezgi, and Erbaş, Semra
- Subjects
REGRESSION analysis, MONTE Carlo method, STANDARD deviations, PARAMETER estimation, ESTIMATION bias, BINARY codes, PROBABILITY theory, LOGISTIC regression analysis
- Abstract
The occurrence rate of the event of interest might be quite small (rare) in some cases, even when the sample size is large enough for a binary logistic regression (LR) model. In studies where the sample size is not large enough, the parameters to be estimated might be biased because of the rare event problem. Parameter estimates of the LR model are usually obtained using the Newton–Raphson (NR) algorithm for maximum likelihood estimation (MLE). It is known that these estimates are usually biased in small samples but asymptotically unbiased. On the other hand, parameter estimation with NR for MLE is sensitive to the initial parameter values. The aim of our study is to present an approach to parameter estimation bias using inverse conditional distributions, based on a distributional assumption giving true parameter values, and to compare this approach across different penalized LR methods. With this aim, the LR, Firth LR, FLIC, and FLAC methods were compared in terms of parameter estimation bias, predicted probability bias, and root mean squared error (RMSE) for different sample sizes, event rates, and correlation rates in a detailed Monte Carlo simulation study. Findings suggest that the FLIC method should be preferred in rare event and small sample cases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
25. Economic and financial determinants of bankruptcy: evidence from Ecuador's private banks and the impact of dollarization on financial fragility
- Author
Adriana Uquillas and Francis Flores
- Subjects
bankruptcy, dollarization, rare event, small sample, financial crisis, Commerce, HF1-6182, Business, HF5001-6182
- Abstract
Purpose – An econometric model is established to explain bankruptcy in Ecuadorian banks. The utility of combining macroeconomic, financial, and idiosyncratic determinants to explain bankruptcy is empirically demonstrated. Design/methodology/approach – The cross-sectional analysis includes 24 banks between 1996 and 2016. Bankruptcy is considered as a rare event. Findings – Even in adverse macroeconomic conditions, the main factor explaining bankruptcy is lax administration. Also, those banks with higher levels of indebtedness with respect to their capital levels are more susceptible to bankruptcy. Higher levels of spread and lower inflation are associated with lower levels of bankruptcy. Ceteris paribus, after dollarization the bankruptcy probability decreases and the effective management of each bank becomes a relevant factor to explain bankruptcy. Originality/value – Different determinants are combined in order to produce predictive models with practical value and macro-dependent dynamics that are relevant for stress tests. There is empirical evidence that the change in the monetary system has helped to stabilize the financial system. The problem of having a small sample and rare events is evident and adequately addressed.
- Published
- 2020
- Full Text
- View/download PDF
26. Survival Modeling of Suicide Risk with Rare and Uncertain Diagnoses
- Author
Wang, Wenjie, Luo, Chongliang, Aseltine, Robert H., Wang, Fei, Yan, Jun, and Chen, Kun
- Published
- 2023
- Full Text
- View/download PDF
27. Weighting Methods for Rare Event Identification From Imbalanced Datasets
- Author
Jia He and Maggie X. Cheng
- Subjects
imbalanced dataset, bias, classification, machine learning, rare event, Information technology, T58.5-58.64
- Abstract
In machine learning, we often face the situation where the event of interest has very few data points buried in a massive amount of data. This is typical in network monitoring, where data are streamed from sensing or measuring units continuously but most data do not correspond to events. With imbalanced datasets, classifiers tend to be biased in favor of the main class. Rare event detection has received much attention in machine learning, and yet it is still a challenging problem. In this paper, we propose a remedy for this long-standing problem. Weighting and sampling are two fundamental approaches to address the problem; we focus on the weighting method in this paper. We first propose a boosting-style algorithm to compute class weights, which is proved to have excellent theoretical properties. Then we propose an adaptive algorithm, which is suitable for real-time applications. The adaptive nature of the two algorithms allows a controlled tradeoff between the true positive rate and the false positive rate and avoids excessive weight on the rare class, which would lead to poor performance on the main class. Experiments on power grid data and some public datasets show that the proposed algorithms outperform the existing weighting and boosting methods, and that their superiority is more noticeable with noisy data.
- Published
- 2021
- Full Text
- View/download PDF
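The simplest library form of the weighting idea in this entry is inverse-frequency class weights in a linear classifier, which shift the operating point towards a higher true positive rate at the cost of more false positives. The paper's boosting-style and adaptive weight algorithms go well beyond this sketch.

```python
# Effect of class weighting on TPR/FPR for a heavily imbalanced dataset.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, weights=[0.99, 0.01],
                           n_informative=5, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=0)

for cw in (None, "balanced"):
    clf = LogisticRegression(class_weight=cw, max_iter=1000).fit(Xtr, ytr)
    tn, fp, fn, tp = confusion_matrix(yte, clf.predict(Xte)).ravel()
    print(f"class_weight={cw}: TPR={tp/(tp+fn):.2f} FPR={fp/(fp+tn):.2f}")
```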
28. Efficient importance sampling for large sums of independent and identically distributed random variables.
- Author
Ben Rached, Nadhir, Haji-Ali, Abdul-Lateef, Rubino, Gerardo, and Tempone, Raúl
- Abstract
We discuss estimating the probability that the sum of nonnegative independent and identically distributed random variables falls below a given threshold, i.e., P(∑_{i=1}^{N} X_i ≤ γ), via importance sampling (IS). We are particularly interested in the rare event regime when N is large and/or γ is small. Exponential twisting is a popular technique for similar problems that, in most cases, compares favorably to other estimators. However, it has some limitations: (i) it assumes knowledge of the moment-generating function of X_i, and (ii) sampling under the new IS PDF is not straightforward and might be expensive. The aim of this work is to propose an alternative IS PDF that approximately yields, for certain classes of distributions and in the rare event regime, at least the same performance as the exponential twisting technique and, at the same time, does not introduce serious limitations. The first class includes distributions whose probability density functions (PDFs) are asymptotically equivalent, as x → 0, to b x^p, for p > -1 and b > 0. For this class of distributions, the Gamma IS PDF with appropriately chosen parameters retrieves approximately, in the rare event regime corresponding to small values of γ and/or large values of N, the same performance as the estimator based on the exponential twisting technique. In the second class, we consider the log-normal setting, whose PDF at zero vanishes faster than any polynomial, and we show numerically that a Gamma IS PDF with optimized parameters clearly outperforms the exponential twisting IS PDF. Numerical experiments validate the efficiency of the proposed estimator in delivering a highly accurate estimate in the regime of large N and/or small γ. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
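A minimal sketch of the Gamma importance-sampling idea from this entry for the left tail P(∑ X_i ≤ γ) with log-normal summands: each X_i is proposed from a Gamma density concentrated near γ/N. The shape and scale used here are an ad-hoc illustrative choice, not the paper's optimized parameters.

```python
# Gamma-proposal importance sampling for P(sum of N lognormals <= gamma),
# compared with crude MC (which typically returns 0 at this sample size).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N, gamma, M = 10, 0.8, 100_000

f = stats.lognorm(s=1.0)                     # X_i ~ Lognormal(0, 1)
g = stats.gamma(a=1.0, scale=gamma / N)      # proposal concentrated near 0

X = g.rvs(size=(M, N), random_state=rng)
logw = f.logpdf(X).sum(axis=1) - g.logpdf(X).sum(axis=1)
print("IS estimate:", np.mean(np.exp(logw) * (X.sum(axis=1) <= gamma)))

Y = f.rvs(size=(M, N), random_state=rng)
print("crude MC   :", np.mean(Y.sum(axis=1) <= gamma))
```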
29. Automated importance sampling via optimal control for stochastic reaction networks: A Markovian projection–based approach.
- Author
Ben Hammouda, Chiheb, Ben Rached, Nadhir, Tempone, Raúl, and Wiechert, Sophia
- Subjects
PROBABILITY measures, STOCHASTIC control theory, MARGINAL distributions, HAMILTON-Jacobi-Bellman equation, NUMERICAL analysis, COMPUTATIONAL complexity, OPTIMAL control theory, PROBABILITY theory
- Abstract
We propose a novel alternative approach to our previous work (Ben Hammouda et al., 2023) to improve the efficiency of Monte Carlo (MC) estimators for rare event probabilities of stochastic reaction networks (SRNs). In the same spirit as Ben Hammouda et al. (2023), an efficient path-dependent measure change is derived based on a connection between determining optimal importance sampling (IS) parameters within a class of probability measures and a stochastic optimal control formulation, corresponding to solving a variance minimization problem. In this work, we propose a novel approach to address the encountered curse of dimensionality by mapping the problem to a significantly lower-dimensional space via a Markovian projection (MP) idea. The output of this model reduction technique is a low-dimensional SRN (potentially even one-dimensional) that preserves the marginal distribution of the original high-dimensional SRN system. The dynamics of the projected process are obtained by solving a related optimization problem via a discrete L² regression. By solving the resulting projected Hamilton–Jacobi–Bellman (HJB) equations for the reduced-dimensional SRN, we obtain projected IS parameters, which are then mapped back to the original full-dimensional SRN system, resulting in an efficient IS-MC estimator for rare event probabilities of the full-dimensional SRN. Our analysis and numerical experiments reveal that the proposed MP-HJB-IS approach substantially reduces the MC estimator variance, resulting in a lower computational complexity in the rare event regime than standard MC estimators. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Predicting future changes in the work ability of individuals receiving a work disability benefit: weighted analysis of longitudinal data
- Author
Ilse Louwerse, Maaike A Huysmans, Jolanda HJ van Rijssen, Frederieke G Schaafsma, Kristel HN Weerdesteijn, Allard J van der Beek, and Johannes R Anema
- Subjects
work ability, prognosis, work disability, longitudinal data, disability benefit, work disability benefit, weighted analysis, work disability allowance, weighted multinomial logit model, rare event, Public aspects of medicine, RA1-1270
- Abstract
OBJECTIVES: Weighted regression procedures can be an efficient solution for cohort studies that involve rare events or diseases, which can be difficult to predict, allowing for more accurate prediction of cases of interest. The aims of this study were to (i) predict changes in work ability at one year after approval of the work disability benefit and (ii) explore whether weighted regression procedures could improve the accuracy of predicting claimants with the highest probability of experiencing a relevant change in work ability. METHODS: The study population consisted of 944 individuals who were granted a work disability benefit. Self-reported questionnaire data measured at baseline were linked with administrative data from Dutch Social Security Institute databases. Standard and weighted multinomial logit models were fitted to predict changes in the work ability score (WAS) at one-year follow-up. McNemar’s test was used to assess the difference between these models. RESULTS: A total of 208 (22%) claimants experienced an improvement in WAS. The standard multinomial logit model predicted a relevant improvement in WAS for only 9% of the claimants [positive predictive value (PPV) 62%]. The weighted model predicted significantly more cases, 14% (PPV 63%). Predictive variables were several physical and mental functioning factors, work status, wage loss, and WAS at baseline. CONCLUSION: This study showed that there are indications that weighted regression procedures can correctly identify more individuals who experience a relevant change in WAS compared to standard multinomial logit models. Our findings suggest that weighted analysis could be an effective method in epidemiology when predicting rare events or diseases.
- Published
- 2020
- Full Text
- View/download PDF
31. Computer simulation of the homogeneous nucleation of ice
- Author
Reinhardt, Aleks and Doye, Jonathan P. K.
- Subjects
547.214, Chemical kinetics, Computational chemistry, Physical & theoretical chemistry, Theoretical chemistry, Materials modelling, Condensed matter theory, homogeneous nucleation, ice, water, simulation, rare event, Monte Carlo
- Abstract
In this work, we wish to determine the free energy landscape and the nucleation rate associated with the process of homogeneous ice nucleation. To do this, we simulate the homogeneous nucleation of ice with the mW monatomic model of water and with all-atom models of water using primarily the umbrella sampling rare event method. We find that the use of the mW model of water, which has simpler dynamics compared to all-atom models of water, but is nevertheless surprisingly good at reproducing experimental data, results in very reasonable agreement with classical nucleation theory, in contrast to some previous simulations of homogeneous ice nucleation. We suggest that previous simulations did not observe the lowest free energy pathway in order parameter space because of their use of global order parameters, leading to a deviation from classical nucleation theory predictions. Whilst monatomic water can nucleate reasonably quickly, all-atom models of water are considerably more difficult to simulate, primarily because of their slow dynamics of ice growth and the fact that standard order parameters do not work well in driving nucleation when such models are being used. In this thesis, we describe a local, rotationally invariant order parameter that is capable of growing ice homogeneously in a biassed simulation without the unnatural effects introduced by global order parameters, and without leading to non-physical chain-like growth of 'ice' clusters that results from a naïve implementation of the standard Steinhardt-Ten Wolde order parameter. We have successfully used this order parameter to force the growth of ice clusters in simulations of all-atom models of water. However, although ice growth can be achieved, equilibrating simulations with all-atom models of water is extremely difficult. We describe several approaches to speeding up the equilibration in all-atom models of water to enable the computation of free energy profiles for homogeneous ice nucleation.
- Published
- 2013
32. Transformative rare events: Leveraging digital affordance actualisation.
- Author
Henningsson, Stefan, Kettinger, William J., Zhang, Chen, and Vaidyanathan, Nageswaran
- Abstract
This paper conceptualises the COVID-19 pandemic as a "rare event." Rare events channel managerial attention to magnified issues and foster resource mobilisation and learning. We draw on a case study of a US consumer lender to develop a model explaining how organisations actualise digital affordances as part of their rare event response and, in doing so, leverage the transformative experience towards establishing a "new normal." The model and its instantiation contribute conceptual understanding and advice for how IS managers may effectively address rare events and, in particular, the COVID-19 pandemic, including the aftermath of its lockdown and the transition to the new business status quo. The model emphasises the importance of understanding the evolution of digital affordances as possessing teleological paths where affordances are developed in steps corresponding to where an organisation focuses its managerial attention, with indirect consequences of possibilities to attend to other objectives enabled by digital technologies. Overall, the model contributes to theory by explaining the role of rare events in the evolution of affordances, including some that can be transformative and introducing the rare events literature into the IS discipline. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
33. Individual-Level Fatality Prediction of COVID-19 Patients Using AI Methods
- Author
Yun Li, Melanie Alfonzo Horowitz, Jiakang Liu, Aaron Chew, Hai Lan, Qian Liu, Dexuan Sha, and Chaowei Yang
- Subjects
COVID-19, machine learning, deep learning, pandemic, rare event, fatality prediction, Public aspects of medicine, RA1-1270
- Abstract
The global COVID-19 pandemic puts great pressure on medical resources worldwide and leads healthcare professionals to question which individuals are in imminent need of care. With appropriate data on each patient, hospitals can heuristically predict whether or not a patient requires immediate care. We adopted a deep learning model to predict the fatality of individuals who tested positive, given the patient's underlying health conditions, age, sex, and other factors. As the allocation of resources toward a vulnerable patient could mean the difference between life and death, a fatality prediction model serves as a valuable tool for healthcare workers in prioritizing resources and hospital space. The models adopted were evaluated and refined using the metrics of accuracy, specificity, and sensitivity. After data preprocessing and training, our model is able to predict whether a COVID-19-confirmed patient is likely to die, given their information and disposition. The metrics of the different models are compared. Results indicate that the deep learning model outperforms the other machine learning models on this rare event prediction problem.
- Published
- 2020
- Full Text
- View/download PDF
34. PyVisA: Visualization and Analysis of path sampling trajectories.
- Author
Aarøen, Ola, Kiær, Henrik, and Riccardi, Enrico
- Subjects
PATH analysis (Statistics), MOLECULAR dynamics, PROTON transfer reactions, VISUALIZATION, LATENT variables, PYTHON programming language
- Abstract
Rare event methods applied to molecular simulations are growing in popularity; accessible and customizable software solutions have thus been developed and released. One of the most recent is PyRETIS, an open Python library for performing path sampling simulations. Here, we introduce PyVisA, a post-processing package for path sampling simulations, which includes visualization and analysis tools for interpreting path sampling outputs. PyVisA integrates PyRETIS functionalities and aims to facilitate the determination of (a) the correlation of the order parameter with other descriptors; (b) the presence of latent variables; and (c) intermediate metastable states. To illustrate some of the main PyVisA features, we investigate the proton transfer reaction in a protonated water trimer simulated via a simple polarizable model (Stillinger–David). [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
35. Firth's penalized method in Cox proportional hazard framework for developing predictive models for sparse or heavily censored survival data.
- Author
Adhikary, Avizit C. and Shafiqur Rahman, M.
- Subjects
SURVIVAL analysis (Biometry), PREDICTION models, ESTUARIES, CENSORING (Statistics), PROPORTIONAL hazards models, BREAST cancer
- Abstract
This paper explores the use of Firth's penalized method in the Cox PH framework, originally proposed for solving the problem of separation, for developing prediction models for sparse or heavily censored survival data. An extensive simulation study, based on both breast cancer data and simulated data, was conducted to evaluate the predictive performance of Firth's penalized model over the standard Cox model. The predictive performance of the models developed on training data was assessed on test data by estimating some well-known performance measures, such as the calibration slope and concordance statistic, for both models and comparing their results. The results revealed that Firth's penalized model showed substantial improvement over the MLE-based standard Cox model by providing an accurate estimate of the true predictive (discriminative) performance and removing overfitting to some extent. The methods were further illustrated using birth-interval data with a high percentage of censoring, and the results support the simulation findings. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
36. Meta-analysis of rare adverse events in randomized clinical trials: Bayesian and frequentist methods.
- Author
Hong, Hwanhee, Wang, Chenguang, and Rosner, Gary L
- Subjects
CONFIDENCE intervals, META-analysis, PROBABILITY theory, RANDOMIZED controlled trials, ROSIGLITAZONE, ADVERSE health care events, STATISTICAL models, DESCRIPTIVE statistics
- Abstract
Background/aims: Regulatory approval of a drug or device involves an assessment of not only the benefits but also the risks of adverse events associated with the therapeutic agent. Although randomized controlled trials (RCTs) are the gold standard for evaluating effectiveness, the number of treated patients in a single RCT may not be enough to detect a rare but serious side effect of the treatment. Meta-analysis plays an important role in the evaluation of the safety of medical products and has an advantage over analyzing a single RCT when estimating the rate of adverse events. Methods: In this article, we compare 15 widely used meta-analysis models under both Bayesian and frequentist frameworks when outcomes are extremely infrequent or rare. We present extensive simulation study results and then apply these methods to a real meta-analysis that considers RCTs investigating the effect of rosiglitazone on the risks of myocardial infarction and of death from cardiovascular causes. Results: Our simulation studies suggest that the beta hyperprior method, modeling treatment group-specific parameters and accounting for heterogeneity, performs the best. Most models ignoring between-study heterogeneity give poor coverage probability when such heterogeneity exists. In the data analysis, different methods provide a wide range of log odds ratio estimates between rosiglitazone and control treatments, with a mixed conclusion on their statistical significance based on 95% confidence (or credible) intervals. Conclusion: In the rare event setting, treatment effect estimates obtained from traditional meta-analytic methods may be biased and provide poor coverage probability. This trend worsens when the data have large between-study heterogeneity. In general, we recommend methods that first estimate the summaries of treatment-specific risks across studies and then relative treatment effects based on the summaries when appropriate. Furthermore, we recommend fitting various methods, comparing the results and model performance, and investigating any significant discrepancies among them. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
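One standard frequentist baseline in this setting is the fixed-effect Mantel-Haenszel pooled odds ratio, which needs no continuity correction for single-zero tables. Note that a double-zero trial contributes zero to both sums, i.e., it is effectively ignored, which is the phenomenon debated in entries 10 and 17. The counts below are invented for illustration.

```python
# Mantel-Haenszel pooled odds ratio over sparse 2x2 tables:
# OR_MH = sum(a_i * d_i / n_i) / sum(b_i * c_i / n_i)
trials = [  # (events_trt, n_trt, events_ctl, n_ctl)
    (2, 150, 0, 148),
    (1, 210, 1, 205),
    (0, 95, 2, 99),
]

num = den = 0.0
for e1, n1, e0, n0 in trials:
    n = n1 + n0
    num += e1 * (n0 - e0) / n       # a*d/n
    den += (n1 - e1) * e0 / n       # b*c/n
print("MH pooled OR:", num / den)
```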
37. Global optimization-based dimer method for finding saddle points.
- Author
Yu, Bing and Zhang, Lei
- Subjects
ANT algorithms, POTENTIAL energy surfaces, GLOBAL optimization, SADDLERY, MATHEMATICAL optimization
- Abstract
Searching for saddle points on a potential energy surface is a challenging problem in rare event studies. When there exist multiple saddle points, sampling of different initial guesses is needed in local search methods in order to find distinct saddle points. In this paper, we present a novel global optimization-based dimer method (GOD) to efficiently search for saddle points by coupling the ant colony optimization (ACO) algorithm with the optimization-based shrinking dimer (OSD) method. In particular, we apply the OSD method as a local search algorithm for saddle points and construct a pheromone function in ACO to update the global population. By applying a two-dimensional example and a benchmark problem of a seven-atom island on the (111) surface of an FCC crystal, we demonstrate that GOD shows a significant improvement in computational efficiency compared with the OSD method. Our algorithm is the first attempt to apply a global optimization technique to searching for saddle points, and it offers a new framework that opens up possibilities of adopting other global optimization methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
38. Sampling Conditionally on a Rare Event via Generalized Splitting.
- Author
Botev, Zdravko I. and L'Ecuyer, Pierre
- Subjects
MONTE Carlo method, COMPUTATIONAL statistics, APPROXIMATION error, MARKOV chain Monte Carlo, OPERATIONS research, ALGORITHMS
- Abstract
We propose and analyze a generalized splitting method to sample approximately from a distribution conditional on the occurrence of a rare event. This has important applications in a variety of contexts in operations research, engineering, and computational statistics. The method uses independent trials starting from a single particle. We exploit this independence to obtain asymptotic and nonasymptotic bounds on the total variation error of the sampler. Our main finding is that the approximation error depends crucially on the relative variability of the number of points produced by the splitting algorithm in one run and that this relative variability can be readily estimated via simulation. We illustrate the relevance of the proposed method on an application in which one needs to sample (approximately) from an intractable posterior density in Bayesian inference. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
39. A Universal Splitting Estimator for the Performance Evaluation of Wireless Communications Systems.
- Author
Rached, Nadhir Ben, Mackinlay, Daniel, Botev, Zdravko, Tempone, Raul, and Alouini, Mohamed-Slim
- Abstract
We propose a unified rare-event estimator for the performance evaluation of wireless communication systems. The estimator is derived from the well-known multilevel splitting algorithm. In its original form, the splitting algorithm cannot be applied to the simulation and estimation of time-independent problems, because splitting requires an underlying continuous-time Markov process whose trajectories can be split. We tackle this problem by embedding the static problem of interest within a continuous-time Markov process, so that the target time-independent distribution becomes the distribution of the Markov process at a given time instant. The main feature of the proposed multilevel splitting algorithm is its large scope of applicability. For illustration, we show how the same algorithm can be applied to the problem of estimating the cumulative distribution function (CDF) of sums of random variables (RVs), the CDF of partial sums of ordered RVs, the CDF of ratios of RVs, and the CDF of weighted sums of Poisson RVs. We investigate the computational efficiency of the proposed estimator via a number of simulation studies and find that it compares favorably with existing estimators. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
40. Variational phase field formulations of polarization and phase transition in ferroelectric thin films.
- Author
Qiang Du, Ruotai Li, and Lei Zhang
- Subjects
FERROELECTRIC thin films, FERROELECTRIC transitions, PHASE transitions, ELECTROSTATIC fields, ELECTROSTATIC interaction, LATTICE dynamics
- Abstract
The electric field plays an important role in ferroelectric phase transitions. Numerous phase field formulations have attempted to account for electrostatic interactions subject to different boundary conditions. In this paper, we develop new variational forms of the phase field electrostatic energy and of the relaxation dynamics of the polarization vector that involve a hybrid representation in both real and Fourier variables. The new formulations avoid ambiguities appearing in earlier studies and lead to much more effective ways to perform variational studies and numerical simulations. Computations of phase transition and polarization switching in a single domain using the new formulations are provided as illustrative examples. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2020
- Full Text
- View/download PDF
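One standard Fourier-space ingredient of such electrostatic energies is the depolarization energy of a periodic polarization field, which penalizes the longitudinal (curl-carrying charge) part of P. The Python sketch below computes it for a 2D periodic field in reduced units; the paper's actual hybrid real/Fourier treatment of thin-film boundary conditions is more involved, and the normalization conventions here are our assumptions:

import numpy as np

def depolarization_energy(Px, Py, dx=1.0):
    # Energy of the longitudinal part of a periodic 2D polarization field
    # in reduced units (eps0 = 1): E = (1/2) * sum_k |k_hat . P_hat(k)|^2 / N^2,
    # which by Parseval equals half the squared L2 norm of the longitudinal
    # component of P on the grid.
    n = Px.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=dx)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    k2 = kx**2 + ky**2
    k2[0, 0] = 1.0                                # guard the k = 0 mode
    Phx, Phy = np.fft.fft2(Px), np.fft.fft2(Py)
    longit = np.abs(kx * Phx + ky * Phy)**2 / k2  # |k_hat . P_hat|^2
    longit[0, 0] = 0.0                            # uniform mode handled separately
    return 0.5 * longit.sum() / n**2

# A uniform (divergence-free) field carries no depolarization energy.
n = 64
print(depolarization_energy(np.ones((n, n)), np.zeros((n, n))))  # -> 0.0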
41. Efficient estimation of extreme quantiles using adaptive kriging and importance sampling.
- Author
-
Razaaly, Nassim, Crommelin, Daan, and Congedo, Pietro Marco
- Subjects
QUANTILES , PROBABILITY measures , QUANTILE regression , KRIGING , GAUSSIAN processes , STANDARD deviations - Abstract
SUMMARY: This study considers an efficient method for the estimation of quantiles associated with very small probability levels (down to O(10^−9)), where the scalar performance function J is complex (e.g., the output of an expensive-to-run finite element model), under a probability measure that can be recast as a multivariate standard Gaussian law using an isoprobabilistic transformation. A surrogate-based approach (Gaussian processes) combined with adaptive experimental designs allows the accuracy of the surrogate to be increased iteratively while keeping the overall number of evaluations of J low. Since direct Monte Carlo simulation is too expensive even on the surrogate model, the key idea is an importance sampling method based on an isotropic, zero-centered Gaussian with large standard deviation, permitting a cheap estimation of small quantiles from the surrogate model. Similar to AK-MCS as presented in the work of Schöbi et al. (2016), the surrogate is adaptively refined using a parallel infill criterion suitable for very small failure probability estimation. Additionally, a multi-quantile selection approach is developed, allowing further exploitation of high-performance computing architectures. We illustrate the performance of the proposed method on several two- to eight-dimensional cases. Accurate results are obtained with fewer than 100 evaluations of J on the considered benchmark cases. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2020
- Full Text
- View/download PDF
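The importance sampling idea above, a wide isotropic Gaussian proposal reweighted back to the standard Gaussian, is easy to state in a few lines. The Python sketch below estimates a small tail probability this way for a toy performance function with a known answer; the kriging surrogate, adaptive refinement, and multi-quantile selection are omitted, and all names and parameters are illustrative assumptions:

import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(2)

def tail_prob_is(J, t, dim, n=10**5, s=3.0):
    # Importance sampling with an isotropic, zero-mean Gaussian proposal of
    # large standard deviation s; here J is called directly, standing in
    # for the cheap-to-evaluate surrogate.
    x = s * rng.standard_normal((n, dim))
    # log weight of N(0, I) relative to N(0, s^2 I), computed stably.
    logw = dim * np.log(s) - 0.5 * np.sum(x**2, axis=1) * (1 - 1 / s**2)
    return np.mean(np.exp(logw) * (J(x) > t))

# Toy case: J(x) = sum(x), so J(X) ~ N(0, dim) and the tail is known exactly.
dim, t = 4, 10.0
print(tail_prob_is(lambda x: x.sum(axis=1), t, dim),
      0.5 * erfc(t / sqrt(2 * dim)))             # both about 2.9e-7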
42. Knowledge-inspired data-driven prediction of overheating risks in flexible thermal-power plants.
- Author
-
Wang, Zhimin, Huang, Qian, Liu, Guanqing, Wang, Kexuan, Lyu, Junfu, and Li, Shuiqing
- Subjects
- *
EXTREME value theory , *FALSE alarms , *RENEWABLE energy sources , *COAL-fired power plants , *PREDICTION models , *FORECASTING , *TUBES , *POWER plants - Abstract
Mechanism-data-integrated methods are promising technologies for the safe and flexible operation of power stations, which play an important role in compensating for the intermittency and fluctuation of renewable energy. As a step in this direction, this work addresses the tube overheating problem in boilers with the aim of developing effective predictive methods. First, we obtain insights from real incidents involving excessive metal temperatures. Using data collected over a six-month period from a 350-MW cogeneration unit operating in a flexible mode, we quantified 230 overheat events using steam-pressure-dependent permissible limits. Power-law distributions with tails are revealed for the severity of overheating, and three types of events can be classified. Among them, 'moderate-type' overheating can be accompanied by simultaneous overheating of multiple neighboring tubes, and is thus recognized as the primary target for real-time prediction. Our data-driven model rests on a long short-term memory neural network and gives satisfactory outputs under normal operating conditions, with a mean absolute error of 3.40 °C. However, the original model fails to reach the extreme values during tube overheating owing to severe dataset imbalance, as only 0.012% of the data correspond to overheating. Finally, we devise an additional strategy that makes use of the change in the predictive capability of the model. The integrated method successfully predicts all overheat events of a tube more than two minutes in advance, and false alerts are kept to a minimum. • Simultaneous overheating of multiple adjacent tubes is found in operating data. • The severity of tube overheating exhibits power-law frequency distributions. • The data-driven model fails to predict tube overheating owing to a lack of training samples. • A new strategy is proposed that exploits the rarity of tube overheating. • Our method predicts tube overheating two minutes in advance with a minimal false-alarm rate. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2024
- Full Text
- View/download PDF
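For concreteness, a minimal PyTorch sketch of the kind of LSTM regressor the record describes, predicting the next metal temperature from a window of sensor readings. Feature count, window length, and layer sizes are illustrative assumptions, and the paper's rarity-aware alarm strategy built on top of the model is not shown:

import torch
import torch.nn as nn

class TempForecaster(nn.Module):
    # Minimal LSTM regressor for one-step-ahead temperature prediction.
    def __init__(self, n_features=8, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])       # temperature at the next step

model = TempForecaster()
x = torch.randn(32, 120, 8)                # 32 windows of 120 time steps
y = torch.randn(32, 1)                     # next-step temperatures (dummy data)
loss = nn.functional.l1_loss(model(x), y)  # MAE loss, cf. the 3.40 °C reported
loss.backward()                            # gradients for one training step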
43. A hyper-spherical adaptive sparse-grid method for high-dimensional discontinuity detection.
- Author
-
Burkardt, John [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)]
- Published
- 2014
- Full Text
- View/download PDF
44. Predicting future changes in the work ability of individuals receiving a work disability benefit: weighted analysis of longitudinal data.
- Author
-
Louwerse, Ilse, Huysmans, Maaike A., van Rijssen, Jolanda H. J., Schaafsma, Frederieke G., Weerdesteijn, Kristel H. N., van der Beek, Allard J., and Anema, Johannes R.
- Subjects
DATA analysis , DISABILITIES , SOCIAL security , DATABASE security , RARE diseases - Abstract
Objectives: Weighted regression procedures can be an efficient solution for cohort studies involving rare events or diseases, which are difficult to predict, allowing more accurate prediction of the cases of interest. The aims of this study were to (i) predict changes in work ability at one year after approval of a work disability benefit and (ii) explore whether weighted regression procedures could improve the accuracy of identifying claimants with the highest probability of experiencing a relevant change in work ability. Methods: The study population consisted of 944 individuals who were granted a work disability benefit. Self-reported questionnaire data measured at baseline were linked with administrative data from Dutch Social Security Institute databases. Standard and weighted multinomial logit models were fitted to predict changes in the work ability score (WAS) at one-year follow-up. McNemar's test was used to assess the difference between these models. Results: A total of 208 (22%) claimants experienced an improvement in WAS. The standard multinomial logit model predicted a relevant improvement in WAS for only 9% of the claimants [positive predictive value (PPV) 62%]; the weighted model predicted significantly more cases, 14% (PPV 63%). Predictive variables included several physical and mental functioning factors, work status, wage loss, and WAS at baseline. Conclusion: This study indicates that weighted regression procedures can correctly identify more individuals who experience a relevant change in WAS than standard multinomial logit models. Our findings suggest that weighted analysis can be an effective method in epidemiology for predicting rare events or diseases. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2020
- Full Text
- View/download PDF
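A compact way to see the effect of weighting in a multinomial logit is scikit-learn's class_weight option, which upweights rare outcomes inversely to their frequency. This is one common weighting scheme, not necessarily the authors' exact procedure, and the data below are a synthetic stand-in, not the claimant data:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)

# Synthetic three-class outcome (change in work ability) with rare classes.
n = 2000
X = rng.standard_normal((n, 5))
eta = np.stack([0.0 * X[:, 0], X[:, 0] - 2.0, -X[:, 1] - 2.5], axis=1)
prob = np.exp(eta) / np.exp(eta).sum(axis=1, keepdims=True)
y = np.array([rng.choice(3, p=p) for p in prob])

std = LogisticRegression(max_iter=1000).fit(X, y)
wtd = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X, y)
for name, m in [("standard", std), ("weighted", wtd)]:
    pred = m.predict(X)
    print(name, "predicted counts per class:", np.bincount(pred, minlength=3))

The weighted fit typically assigns far more predictions to the rare classes, mirroring the paper's finding that weighting flags more claimants at similar positive predictive value.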
45. PyRETIS 2: An improbability drive for rare events.
- Author
-
Riccardi, Enrico, Lervik, Anders, Roet, Sander, Aarøen, Ola, and van Erp, Titus S.
- Subjects
- *
USER interfaces , *COMPUTATIONAL chemistry , *MOLECULAR dynamics , *PYTHON programming language , *SAMPLING methods , *PERIODICAL publishing - Abstract
The algorithmic development in the field of path sampling has made tremendous progress in recent years. Although the original transition path sampling method was mostly used as a qualitative tool to sample reaction paths, the more recent family of interface-based path sampling methods has paved the way for more quantitative rate-calculation studies. Of the exact methods, the replica exchange transition interface sampling (RETIS) method is the most efficient, but it is rather difficult to implement. This was the main motivation for developing the open-source Python-based computer library PyRETIS, released in 2017. PyRETIS is designed to be easily interfaced with any molecular dynamics (MD) package using either classical or ab initio MD. In this study, we report on the principles and software enhancements now included in PyRETIS 2, as well as recent developments in the user interface, efficiency improvements via the implementation of new shooting moves, easier initialization procedures, analysis methods, and supported interfaced software. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2020
- Full Text
- View/download PDF
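The quantities that TIS/RETIS path ensembles estimate are interface-by-interface crossing probabilities. The Python sketch below computes them naively for a 1D bistable system, restarting overdamped Langevin trajectories at each interface; this crude scheme conveys the structure but is not statistically exact TIS, and it is certainly not PyRETIS code. Potential, interfaces, and parameters are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(4)

def force(x):
    # Bistable potential V(x) = (x^2 - 1)^2; wells at -1 and +1.
    return -4.0 * x * (x * x - 1.0)

def crosses(x0, lam_next, lam_a=-1.0, dt=1e-3, beta=5.0, max_steps=100_000):
    # Run a trajectory from x0 until it reaches the next interface (success)
    # or falls back into state A (failure).
    x = x0
    for _ in range(max_steps):
        x += force(x) * dt + np.sqrt(2 * dt / beta) * rng.standard_normal()
        if x >= lam_next:
            return True
        if x <= lam_a:
            return False
    return False                                  # timeout counted as failure

interfaces = [-0.8, -0.4, 0.0, 0.4]
p = 1.0
for lam, lam_next in zip(interfaces, interfaces[1:]):
    p *= sum(crosses(lam, lam_next) for _ in range(200)) / 200
print("crossing probability estimate:", p)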
46. Markov modulated jump-diffusions for currency options when regime switching risk is priced.
- Author
-
Liu, David
- Abstract
In the current literature, regime-switching risk is not priced in Markov-modulated jump-diffusion models for currency options. We therefore develop a hidden Markov-modulated jump-diffusion model under a regime-switching economy in which the regime-switching risk is priced. In the model, the dynamics of the spot foreign exchange rate capture both rare events and time-inhomogeneity in the fluctuating currency market. In particular, the rare events are described by a compound Poisson process with log-normal jump amplitude, and the time-varying rates are formulated by a continuous-time finite-state Markov chain. Unlike previous research, the proposed model can price regime-switching risk, in addition to diffusion risk and jump risk, based on the Esscher transform conditional on a single initial regime of the economy. Numerical experiments are conducted, and their results reveal that the impact of pricing regime-switching risk on currency option prices does not appear significant, in contradiction to the findings of Siu and Yang [Siu, TK and H Yang (2009). Option pricing when the regime-switching risk is priced. Acta Mathematicae Applicatae Sinica, English Series, Vol. 25, No. 3, pp. 369–388]. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2019
- Full Text
- View/download PDF
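To make the model class concrete, the Python sketch below simulates a spot FX path under a two-regime Markov-modulated jump-diffusion: regime-dependent drift and volatility plus compound-Poisson jumps with normal log-amplitude. All parameter values are illustrative assumptions, and the Esscher-transform pricing step of the paper is not shown:

import numpy as np

rng = np.random.default_rng(5)

def simulate_fx(T=1.0, n=252, s0=1.0):
    dt = T / n
    mu, sigma = np.array([0.02, -0.03]), np.array([0.08, 0.20])
    Q = np.array([[-0.5, 0.5], [1.0, -1.0]])     # generator of the Markov chain
    lam, jump_mu, jump_sig = 3.0, -0.01, 0.05    # jump intensity and log-size
    s, regime, path = s0, 0, [s0]
    for _ in range(n):
        if rng.random() < -Q[regime, regime] * dt:   # regime switch
            regime = 1 - regime
        n_jumps = rng.poisson(lam * dt)              # compound Poisson jumps
        jump_term = rng.normal(jump_mu, jump_sig, n_jumps).sum()
        ds = (mu[regime] - 0.5 * sigma[regime]**2) * dt \
             + sigma[regime] * np.sqrt(dt) * rng.standard_normal() + jump_term
        s *= np.exp(ds)
        path.append(s)
    return np.array(path)

print(simulate_fx()[-5:])   # last few points of one simulated FX path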
47. Transition pathways between defect patterns in confined nematic liquid crystals.
- Author
-
Han, Yucen, Hu, Yucheng, Zhang, Pingwen, and Zhang, Lei
- Subjects
- *
LYOTROPIC liquid crystals , *ACTIVATION energy , *NEMATIC liquid crystals , *ENERGY level transitions - Abstract
Confined nematic liquid crystals can admit multiple stable equilibria. These equilibrium states usually exhibit peculiar defect patterns in order to satisfy the topological constraints imposed by the boundary conditions. In this work, we systematically investigate transition pathways between different stable equilibria of nematic liquid crystals confined in a cylinder with homeotropic boundary conditions. To do so, we implement a spectral method to minimize the Landau-de Gennes free energy and develop a multiscale string method to accurately compute both the minimal energy path and the transition state. We first compute the transition pathways under the assumption that the system state is invariant along the axial direction of the cylinder. This axially invariant assumption allows us to reduce the simulation domain to a two-dimensional disk. We subsequently remove the assumption and search for transition pathways in the three-dimensional cylinder. Numerical simulations demonstrate that there exists a threshold cylinder height above which a domino-like transition pathway is energetically favored over the axially invariant transition pathway. Moreover, we show that the transition pathway from escape radial with ring defects (ERRD) to escape radial has a lower energy barrier than the one from ERRD to planar polar, even though the free energy of the planar polar state is lower than that of the escape radial state. Our approach provides an accurate and efficient method to compute the minimal energy path and transition state of confined nematic liquid crystals. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2019
- Full Text
- View/download PDF
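The string method at the heart of this record alternates gradient descent on a chain of images with equal-arc-length reparameterization. The Python sketch below shows the simplified zero-temperature version on a toy 2D double well; the paper's multiscale variant and Landau-de Gennes energy are far richer, and all names and parameters here are illustrative:

import numpy as np

def grad(x):
    # 2D double well V = (x^2 - 1)^2 + 2*y^2; minima at (+-1, 0), saddle at (0, 0).
    return np.array([4 * x[0] * (x[0]**2 - 1), 4 * x[1]])

def string_method(a, b, n_img=20, dt=1e-2, n_iter=3000):
    s = np.linspace(0, 1, n_img)[:, None]
    path = (1 - s) * a + s * b                    # straight initial string
    for _ in range(n_iter):
        path -= dt * np.array([grad(p) for p in path])   # descend each image
        path[0], path[-1] = a, b                  # pin the endpoints
        seg = np.linalg.norm(np.diff(path, axis=0), axis=1)
        arc = np.insert(np.cumsum(seg), 0, 0.0) / seg.sum()
        # Redistribute images to equal arc length along the string.
        path = np.stack([np.interp(s[:, 0], arc, path[:, d]) for d in range(2)],
                        axis=1)
    return path

mep = string_method(np.array([-1.0, 0.0]), np.array([1.0, 0.0]))
print(mep[len(mep) // 2])   # image near the transition state, roughly (0, 0)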
48. HIGH-INDEX OPTIMIZATION-BASED SHRINKING DIMER METHOD FOR FINDING HIGH-INDEX SADDLE POINTS.
- Author
-
JIANYUAN YIN, LEI ZHANG, and PINGWEN ZHANG
- Subjects
- *
CONJUGATE gradient methods , *HESSIAN matrices , *SADDLERY - Abstract
We present a high-index optimization-based shrinking dimer (HiOSD) method to compute index-k saddle points as a generalization of the optimization-based shrinking dimer method for index-1 saddle points [L. Zhang, Q. Du, and Z. Zheng, SIAM J. Sci. Comput., 38 (2016), pp. A528–A544]. We first formulate a minimax problem for an index-k saddle point, which is a local maximum on a k-dimensional manifold and a local minimum on its orthogonal complement. The k-dimensional maximal subspace is spanned by the k eigenvectors corresponding to the smallest k eigenvalues of the Hessian, which can be constructed by the simultaneous Rayleigh-quotient minimization technique or the locally optimal block preconditioned conjugate gradient (LOBPCG) method. Under the minimax framework, we implement the Barzilai–Borwein gradient method to speed up convergence. We demonstrate the efficiency of the HiOSD method for computing high-index saddle points on finite-dimensional examples and semilinear elliptic problems. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2019
- Full Text
- View/download PDF
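The core update of an index-k method reverses the force along the k lowest-curvature eigendirections. The Python sketch below applies that rule to a quadratic toy problem whose origin is an index-2 saddle; for simplicity we diagonalize the exact Hessian, whereas HiOSD tracks the eigenvectors iteratively (e.g. via LOBPCG) with Barzilai–Borwein step sizes. Everything here is an illustrative assumption, not the authors' implementation:

import numpy as np

# Quadratic toy: V(x) = 0.5 * x^T D x, with two negative curvatures, so the
# origin is an index-2 saddle point.
D = np.diag([-2.0, -1.0, 1.0, 3.0])

def grad(x):
    return D @ x

def index_k_step(x, k=2, step=0.1):
    # Reverse the force along the k smallest-eigenvalue eigendirections.
    _, V = np.linalg.eigh(D)                 # exact Hessian of the toy problem
    Vk = V[:, :k]
    g = grad(x)
    return x - step * (g - 2.0 * Vk @ (Vk.T @ g))

x = np.array([0.5, -0.4, 0.3, 0.2])
for _ in range(300):
    x = index_k_step(x)
print(x)                                     # -> approximately the origin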
49. Machine Learning and Sampling Scheme: An Empirical Study of Money Laundering Detection.
- Author
-
Zhang, Yan and Trubey, Peter
- Subjects
MONEY laundering , MACHINE learning , ARTIFICIAL neural networks , SUPPORT vector machines , DECISION trees - Abstract
This paper studies the interplay of machine learning and sampling scheme in an empirical analysis of money laundering detection algorithms. Using actual transaction data provided by a U.S. financial institution, we study five major machine learning algorithms: Bayes logistic regression, decision tree, random forest, support vector machine, and artificial neural network. As the incidence of money laundering events is rare, we apply and compare two sampling techniques that increase the relative presence of the events. Our analysis reveals potential advantages of machine learning algorithms in modeling money laundering events. This paper provides insights into the use of machine learning and sampling schemes in money laundering detection specifically, and in the classification of rare events in general. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2019
- Full Text
- View/download PDF
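A minimal Python sketch of one such sampling scheme, random oversampling of the rare class before fitting a classifier. The data are a synthetic stand-in (not the institution's transactions), and duplicating minority rows to exact balance is just one common variant of the techniques the paper compares:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)

# Synthetic imbalanced data: roughly 3% "suspicious" cases.
n = 20000
X = rng.standard_normal((n, 10))
y = (X[:, 0] + X[:, 1] + rng.standard_normal(n) > 3.3).astype(int)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, stratify=y,
                                      random_state=0)

def random_oversample(X, y):
    # Duplicate minority-class rows until the classes are balanced.
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    extra = rng.choice(pos, size=len(neg) - len(pos), replace=True)
    idx = np.concatenate([neg, pos, extra])
    return X[idx], y[idx]

Xb, yb = random_oversample(Xtr, ytr)
for name, (Xa, ya) in {"raw": (Xtr, ytr), "oversampled": (Xb, yb)}.items():
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xa, ya)
    print(name, "test recall:", recall_score(yte, clf.predict(Xte)))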
50. Prediction of default probability by using statistical models for rare events.
- Author
-
Ogundimu, Emmanuel O.
- Subjects
STATISTICAL models , CREDIT ratings , PROBABILITY theory , LOGISTIC regression analysis , CREDIT cards , MULTICOLLINEARITY - Abstract
Summary: Prediction models in credit scoring usually involve data sets with highly imbalanced distributions of the event of interest (default). Logistic regression, which is widely used to estimate the probability of default (PD), often suffers from the problem of separation when the event of interest is rare, and consequently from poor predictive performance for the minority class in small samples. A common solution is to discard majority-class examples, to duplicate minority-class examples, or to use a combination of both to balance the data; these methods may overfit. It is unclear how penalized regression models such as Firth's estimator, which reduces bias and mean-square error relative to classical logistic regression, perform in modelling PD. We review some methods for class-imbalanced data and compare them in a simulation study using the Taiwan credit card data. We emphasize the effect of events per variable for developing an accurate model, an often neglected concept in PD modelling. The data-balancing techniques considered are the random over-sampling examples (ROSE) and synthetic minority oversampling technique (SMOTE) methods. The results indicate that the synthetic minority oversampling technique improved the predictive accuracy of PD regardless of sample size. Among the penalized regression models analysed, the log-F prior and ridge regression methods are preferred. [ABSTRACT FROM AUTHOR] (See the illustrative sketch following this record.)
- Published
- 2019
- Full Text
- View/download PDF
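Firth's estimator, one of the penalized methods compared above, adds a Jeffreys-prior penalty whose effect is an adjusted score with hat values. The Python sketch below is a minimal implementation of that standard formulation, shown on a tiny separable data set where plain maximum likelihood would diverge; it is a didactic sketch under our own naming and convergence choices, not the paper's code:

import numpy as np

def firth_logistic(X, y, n_iter=50, tol=1e-8):
    # Minimal Firth-penalized logistic regression: Newton iterations with
    # the score adjusted by hat-matrix diagonals, U*(b) = X'(y - p + h(0.5 - p)).
    X = np.column_stack([np.ones(len(y)), X])    # add intercept
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))
        W = p * (1 - p)
        XtWX = X.T @ (W[:, None] * X)
        # Hat values h_i = w_i * x_i' (X'WX)^{-1} x_i.
        h = W * np.einsum("ij,jk,ik->i", X, np.linalg.inv(XtWX), X)
        score = X.T @ (y - p + h * (0.5 - p))
        step = np.linalg.solve(XtWX, score)
        beta += step
        if np.linalg.norm(step) < tol:
            break
    return beta

# Quasi-separated toy data: the outcome is determined by the sign of x0.
rng = np.random.default_rng(7)
X = rng.standard_normal((40, 2))
y = (X[:, 0] > 0).astype(float)
print(firth_logistic(X, y))   # finite coefficients despite separation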