Descriptor: "Cross validation" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cross validation"' showing total 2,052 results

1. Improved BPNN models based on different algorithms to predict the flexural capacity of corroded RC beams

Author: Wang, Huxiang, Bao, Chao, Ma, Xiaotong, Alshaikh, Ibrahim M.H., Al-Gaboby, Ziyad, and Cao, Jixing
Published: 2025
Full Text: View/download PDF

2. Random forest with feature selection and K-fold cross validation for predicting the electrical and thermal efficiencies of air based photovoltaic-thermal systems

Author: Ait tchakoucht, Taha, Elkari, Badr, Chaibi, Yassine, and Kousksou, Tarik
Published: 2024
Full Text: View/download PDF

3. Unveiling the potential of operating time in improving machine learning models’ performance for waste biomass gasification systems

Author: Olca, Kadriye Deniz and Yücel, Özgün
Published: 2024
Full Text: View/download PDF

4. SmartScanPCOS: A feature-driven approach to cutting-edge prediction of Polycystic Ovary Syndrome using Machine Learning and Explainable Artificial Intelligence

Author: G, Umaa Mahesswari and P, Uma Maheswari
Published: 2024
Full Text: View/download PDF

5. Prediction of leaf nitrogen in sugarcane (Saccharum spp.) by Vis-NIR-SWIR spectroradiometry

Author: Fiorio, Peterson Ricardo, Silva, Carlos Augusto Alves Cardoso, Rizzo, Rodnei, Demattê, José Alexandre Melo, Luciano, Ana Cláudia dos Santos, and Silva, Marcelo Andrade da
Published: 2024
Full Text: View/download PDF

6. Comparative Analysis of Machine Learning Algorithms for Prostate Cancer

Author: Thakur, Bharti, Abhinav, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Gunjan, Vinit Kumar, editor, and Zurada, Jacek M., editor
Published: 2025
Full Text: View/download PDF

7. Intelligent Segmentation Algorithm for Medical Images Based on Deep Learning Technology

Author: Zhao, Xiaoyu, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Hung, Jason C., editor, Yen, Neil, editor, and Chang, Jia-Wei, editor
Published: 2025
Full Text: View/download PDF

8. Development and validation of QSPR models for corrosion inhibition of carbon steel by some pyridazine derivatives in acidic medium

Author: El Assiri, El Hassan, Driouch, Majid, Lazrak, Jamila, Bensouda, Zakariae, Elhaloui, Ali, Sfaira, Mouhcine, Saffaj, Taoufiq, and Taleb, Mustapha
Published: 2020
Full Text: View/download PDF

9. Nested genetic algorithm-based classifier selection and placement in multi-level ensemble framework for effective disease diagnosis.

Author: Arukonda, Srinivas and Cheruku, Ramalingaswamy
Subjects: *MACHINE learning, *EVOLUTIONARY algorithms, *CHRONIC kidney failure, *GENETIC algorithms, *RECEIVER operating characteristic curves
Abstract: Effective disease diagnosis is a critical unmet need on a global scale. The intricacies of the numerous disease mechanisms and underlying symptoms make developing a model for early diagnosis and effective treatment extremely difficult. Machine learning (ML) can help to solve some of these issues. Recently, various ensemble-based ML models have benefited clinicians in early diagnosis. However, one of the most difficult challenges in multi-level ensemble approaches is the selection of classifiers and their placement in the ensemble framework, as it improves the overall performance. If m classifiers have to be selected from n classifiers, there are $\binom{n}{m}$ ways to do so. Each of these $\binom{n}{m}$ selections can in turn be arranged in $m!$ ways. Finding the best m classifiers and their positions among the total $\binom{n}{m}\, m!$ possibilities is a hard problem. To address this challenge, a dynamic three-level ensemble framework is proposed. A nested Genetic Algorithm (GA) and an ensemble-based fitness function are employed to optimize the classifier selection and placement in a three-level ensemble framework. Our approach used eleven classifiers and chose seven of them by maximizing the fitness function. The proposed model was tested on 12 disease datasets. On the Chronic Kidney Disease (CKD) dataset, it achieved an accuracy, F1, and G-measure of 0.987, 0.988, and 0.989, respectively. It achieved an AUC of 0.998 on the Heart disease dataset (HDD) and a recall of 0.988 on the Hypothyroid disease dataset (HyDD). In addition, the superiority of the proposed model was statistically evaluated with the Wilcoxon-Signed-Rank (WSR) test against other ensemble models, namely random forest (RF), bagging classifier (BC), XGBoost (XGB), and gradient boost classifier (GBC). At a probability value of p < 0.05, the results show that all the traditional ensemble models differ from the proposed model. The effect size, evaluated using the matched-pairs rank biserial correlation coefficient wc, is large with RF and BC and medium with XGB and GBC. The proposed model outperformed State-Of-The-Art (SOTA) ensemble and non-ensemble models. Further, the proposed model outperformed in terms of the ROC curve in the majority of the disease datasets. The results suggest the usage of the proposed model for disease diagnosis applications. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF
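
The counting argument in the abstract of result 9 can be checked with a few lines of Python. This is only an illustrative sketch of the search-space size, not the authors' nested genetic algorithm; the function name `count_ordered_selections` is hypothetical, and the values 11 and 7 are the classifier counts quoted in the abstract.

```python
import math


def count_ordered_selections(n: int, m: int) -> int:
    """Ways to choose m classifiers out of n and then arrange them in order.

    This is C(n, m) * m!, which equals the number of m-permutations of n items.
    """
    return math.comb(n, m) * math.factorial(m)


# Eleven candidate classifiers, seven selected (figures taken from the abstract).
total = count_ordered_selections(11, 7)
print(total)
```

Even at this modest size the number of ordered selections is already in the millions, which is consistent with the abstract's motivation for a heuristic search rather than exhaustive enumeration.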

10. Numerical properties of solutions of LASSO regression.

Author: Lakshmi, Mayur V. and Winkler, Joab R.
Subjects: *CONSTRAINT satisfaction, *LEAST squares, *LINEAR systems, *EQUATIONS
Abstract: The determination of a concise model of a linear system when there are fewer samples m than predictors n requires the solution of the equation $Ax = b$, where $A \in \mathbb{R}^{m \times n}$ and $\operatorname{rank} A = m$, such that the selected solution from the infinite number of solutions is sparse, that is, many of its components are zero. This leads to the minimisation with respect to $x$ of $f(x, \lambda) = \|Ax - b\|_2^2 + \lambda \|x\|_1$, where $\lambda$ is the regularisation parameter. This problem, which is called LASSO regression, yields a family of functions $x_{\text{lasso}}(\lambda)$ and it is necessary to determine the optimal value of $\lambda$, that is, the value of $\lambda$ that balances the fidelity of the model, $\|A x_{\text{lasso}}(\lambda) - b\| \approx 0$, and the satisfaction of the constraint that $x_{\text{lasso}}(\lambda)$ be sparse. The aim of this paper is an investigation of the numerical properties of $x_{\text{lasso}}(\lambda)$, and the main conclusion of this investigation is the incompatibility of sparsity and stability, that is, a sparse solution $x_{\text{lasso}}(\lambda)$ that preserves the fidelity of the model exists if the least squares (LS) solution $x_{\text{ls}} = A^{\dagger} b$ is unstable. Two methods, cross validation and the L-curve, for the computation of the optimal value of $\lambda$ are compared and it is shown that the L-curve yields significantly better results. This difference between stable and unstable solutions $x_{\text{ls}}$ of the LS problem manifests itself in the very different forms of the L-curve for these two solutions. The paper includes examples of stable and unstable solutions $x_{\text{ls}}$ that demonstrate the theory. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF
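
The LASSO objective described in the abstract of result 10, the squared 2-norm of the residual Ax − b plus λ times the 1-norm of x, can be evaluated directly. The sketch below is a minimal illustration in plain Python, not the paper's method: the function name `lasso_objective` and the small 2 × 3 system are assumptions made for the example. It shows how, for an underdetermined system with several exact solutions, the penalty term favours the sparser one.

```python
def lasso_objective(A, x, b, lam):
    """Evaluate f(x, lam) = ||A x - b||_2^2 + lam * ||x||_1 for list-based inputs."""
    # Residual A x - b, one entry per row of A.
    residual = [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
        for row, b_i in zip(A, b)
    ]
    fidelity = sum(r * r for r in residual)   # squared 2-norm of the residual
    sparsity = sum(abs(x_j) for x_j in x)     # 1-norm of x
    return fidelity + lam * sparsity


# Hypothetical system with fewer samples (m = 2) than predictors (n = 3).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
b = [1.0, 2.0]

sparse_x = [0.0, 1.0, 1.0]   # solves A x = b exactly, one zero component
dense_x = [0.5, 1.5, 0.5]    # also solves A x = b exactly, no zero components

print(lasso_objective(A, sparse_x, b, lam=0.1))
print(lasso_objective(A, dense_x, b, lam=0.1))
```

Both vectors have zero residual, so the objective differs only through the 1-norm penalty, and the sparser vector scores lower.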

11. The development and validation of the Student Self-feedback Behavior Scale.

Author: Yang, Yongle, Yan, Zi, Zhu, Jinyu, Guo, Wuyuan, Wu, Junsheng, and Huang, Bingjun
Subjects: EXPLORATORY factor analysis, CONFIRMATORY factor analysis, CHINESE-speaking students, HIGH school students, STUDENT development
Abstract: Though the importance and benefits of students' active role in the feedback process have been widely discussed in the literature, an instrument for measuring students' self-feedback behavior is still lacking. This paper reports the development and validation of the Self-feedback Behavior Scale (SfBS), which comprises three dimensions (seeking, processing, and using feedback). The SfBS items were constructed in line with the self-feedback behavioral model. One thousand two hundred fifty-two high school students (Grade 10 to Grade 12) in mainland China participated in this survey. The exploratory factor analysis revealed a three-factor model reaffirmed in the confirmatory factor analysis. The multi-group CFA supported the measurement invariance of the SfBS across gender. Using the SfBS can help researchers and teachers better understand students' self-feedback behavior and optimize benefits derived from the self-feedback process. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

12. An Effective Methodology for Diabetes Prediction in the Case of Class Imbalance.

Author: Toleva, Borislava, Atanasov, Ivan, Ivanov, Ivan, and Hooper, Vincent
Subjects: *MACHINE learning, *ETIOLOGY of diabetes, *BLOOD sugar, *HUMAN body, *DATA modeling, *DEEP learning
Abstract: Diabetes causes an increase in the level of blood sugar, which leads to damage to various parts of the human body. Diabetes data are used not only for providing a deeper understanding of the treatment mechanisms but also for predicting the probability that one might become sick. This paper proposes a novel methodology to perform classification in the case of heavy class imbalance, as observed in the PIMA diabetes dataset. The proposed methodology uses two novel steps, namely resampling and random shuffling prior to defining the classification model. The methodology is tested with two versions of cross validation that are appropriate in cases of class imbalance—k-fold cross validation and stratified k-fold cross validation. Our findings suggest that when having imbalanced data, shuffling the data randomly prior to a train/test split can help improve estimation metrics. Our methodology can outperform existing machine learning algorithms and complex deep learning models. Applying our proposed methodology is a simple and fast way to predict labels with class imbalance. It does not require additional techniques to balance classes. It does not involve preselecting important variables, which saves time and makes the model easy for analysis. This makes it an effective methodology for initial and further modeling of data with class imbalance. Moreover, our methodologies show how to increase the effectiveness of the machine learning models based on the standard approaches and make them more reliable. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF
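
The abstract of result 12 contrasts k-fold and stratified k-fold cross validation on shuffled, imbalanced data. The following is a minimal sketch of the stratified variant in plain Python, assuming a simple round-robin assignment within each class; the function name `stratified_kfold` and the toy labels are hypothetical, and this is not the authors' pipeline.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k, seed=0):
    """Split sample indices into k folds that roughly preserve class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)  # random shuffling before the split
        # Deal each class's indices round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Toy imbalanced data: eight samples of class 0, four of class 1.
labels = [0] * 8 + [1] * 4
folds = stratified_kfold(labels, k=4)
for fold in folds:
    print(sorted(fold), [labels[i] for i in sorted(fold)])
```

With these toy labels every fold receives two samples of the majority class and one of the minority class, whereas an unstratified split could leave a fold with no minority samples at all.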

13. A Robust EfficientNetV2-S Classifier for Predicting Acute Lymphoblastic Leukemia Based on Cross Validation.

Author: Abd El-Aziz, A. A., Mahmood, Mahmood A., and Abd El-Ghany, Sameh
Subjects: *LEUKOCYTES, *LYMPHOBLASTIC leukemia, *NOSOLOGY, *CELL anatomy, *ACUTE leukemia
Abstract: This research addresses the challenges of early detection of Acute Lymphoblastic Leukemia (ALL), a life-threatening blood cancer particularly prevalent in children. Manual diagnosis of ALL is often error-prone, time-consuming, and reliant on expert interpretation, leading to delays in treatment. This study proposes an automated binary classification model based on the EfficientNetV2-S architecture to overcome these limitations, enhanced with 5-fold cross-validation (5KCV) for robust performance. A novel aspect of this research lies in leveraging the symmetry concepts of symmetric and asymmetric patterns within the microscopic imagery of white blood cells. Symmetry plays a critical role in distinguishing typical cellular structures (symmetric) from the abnormal morphological patterns (asymmetric) characteristic of ALL. By integrating insights from generative modeling techniques, the study explores how asymmetric distortions in cellular structures can serve as key markers for disease classification. The EfficientNetV2-S model was trained and validated using the normalized C-NMC_Leukemia dataset, achieving exceptional metrics: 97.34% accuracy, recall, precision, specificity, and F1-score. Comparative analysis showed the model outperforms recent classifiers, making it highly effective for distinguishing abnormal white blood cells. This approach accelerates diagnosis, reduces costs, and improves patient outcomes, offering a transformative tool for early ALL detection and treatment planning. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

14. Can Level-2 Firth’s Bias-reduced logistic regression be considered a robust approach for predicting landslide susceptibility?

Author: Pradhan, Ananta Man Singh, Shrestha, Suchita, Lee, Ji-Sung, and Kim, Yun-Tae
Abstract: The implementation of effective landslide mitigation strategies relies heavily on the availability of accurate and reliable landslide susceptibility maps. This study focuses on the adequacy evaluation of the Level-2 Firth’s Bias-Reduced Logistic Regression (BLR) to predict landslide susceptibility. The study was performed at Seunghak mountain, which lies in the southwestern part of Busan. A total of 57 multi-temporal landslides from 2006 to 2019 were identified and plotted in a geographic information system (GIS) environment. Although twelve spatial environmental variables were selected for the analysis, the topographic wetness index was removed to avoid a collinearity issue. The dataset was randomly divided into two non-overlapping sets: a training set (70%) and a test set (30%). In order to assess the performance of the model, two different cross-validation methods, random cross-validation (RCV) and spatial cross-validation (SCV), were applied. The overall accuracy was examined using the area under the receiver operating characteristic curve (mean AUC of RCV = 0.965, mean AUC of SCV = 0.939). The true positive and true negative values were depicted correctly, which showed the excellent adequacy of the prediction of landslide occurrences. Among the eleven environmental variables, slope played a significant role in the result of landslide prediction. The susceptibility estimation component of the BLR model outperformed a standard logistic regression (LR) model, which we used as a benchmark. LR is the most widely used classifier in landslide research, making it a key point of comparison. In our discussion, we explored the strengths and weaknesses of the new modeling framework and its potential applicability in various domains. We highlighted both the specific considerations related to hazards and geomorphology and the broad implications of its application. [ABSTRACT FROM AUTHOR]
Published: 2025
Full Text: View/download PDF

15. Cross-validation of methods for the quantitative determination of phenazepam and its active metabolite in human blood plasma at various extractions

Author: A. I. Platova, I. I. Kuzmin, D. V. Ivaschenko, and I. I. Miroshnichenko
Subjects: phenazepam, 3-oxyphenazepam, chromatography, mass spectrometry, cross validation, bland altman analysis, deming regression, accuracy, precision, Pharmaceutical industry, HD9665-9675
Abstract: Introduction. In therapeutic drug monitoring (TDM), a situation often arises where the drug concentration has been measured by different methods or in different laboratories. To combine and analyze the data obtained with different methods, it is necessary to perform a cross-validation procedure. Insufficient attention is paid to the statistical approaches used for this purpose. Aim. To perform cross-validation of different analytical methods for the quantitative determination of phenazepam (PHEN) and 3-hydroxyphenazepam (3-OH-PHEN) using the Bland-Altman analysis. Materials and methods. PHEN and 3-OH-PHEN concentrations in the blood plasma of patients (n = 100) with alcohol withdrawal syndrome were measured using high-performance liquid chromatography with tandem mass spectrometry (HPLC-MS/MS). Both analytes in each sample were quantified twice, using two different sample preparation methods: solid phase extraction (SPE) and supported liquid extraction (SLE). Both methods had been fully validated before the experiment began. Cross-validation was performed at the end of the experiment using data from study samples. The Bland-Altman analysis was used to evaluate accuracy and precision. Deming regression was also used to identify a systematic error between measurement results. Results and discussion. Regression equations were obtained between the concentrations of both analytes measured by the different sample preparation methods. The 95% confidence intervals (CI) of the regression coefficients of both equations included one, and the 95% CI of the intercepts included zero. The 95% CI of the geometric mean of the individual SLE/SPE ratios was within the acceptable range (0.87; 1.15). These results confirm that the quantitative method does not influence the measured concentration of either analyte. The 66.7% CI of the percent difference between two measurements was within acceptable limits (–0.2; 0.2), not exceeding 20% of the range of their mean value. This confirms the acceptable precision between the methods. The estimated CIs were displayed in the Bland-Altman plots. Conclusion. The statistical approaches used in the work have confirmed the reproducibility of the results of different sample preparation methods. In addition to cross-validation, the statistical algorithm from this paper using Bland-Altman analysis can be successfully employed to assess accuracy and precision during bioanalytical method validation and evaluation of the acceptance of analytical runs, as well as to determine the level of reproducibility of incurred samples.
Published: 2024
Full Text: View/download PDF

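16. Analysis and prediction of teaching performance through student surveys: a search for relationships using data mining [Análisis y predicción del desempeño docente por medio de encuestas estudiantiles. Búsqueda de relaciones desde la minería de datos].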

Author: Castrillón, Omar Danilo
Subjects: DEPENDENCY (Psychology), STUDENT surveys, DEPENDENT variables, TEACHER educators, DATA mining
Abstract: Copyright of Formación Universitaria is the property of Centro de Informacion Tecnologica (CIT) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Published: 2024
Full Text: View/download PDF

17. Predicting fungal infection sensitivity of sepals in harvested tomatoes using imaging spectroscopy and partial least squares discriminant analysis.

Author: Bertotto, Mercedes, de Villiers, Hendrik AC, Chauhan, Aneesh, Hogeveen-van Echtelt, Esther, Mensink, Manon, Grbovic, Zeljana, Stefanovic, Dimitrije, Panic, Marko, and Brdar, Sanja
Subjects: TOMATO diseases & pests, MYCOSES, TOMATO harvesting, TOMATO yields, TOMATO varieties, HYPERSPECTRAL imaging systems
Abstract: Tomatoes (Solanum lycopersicum L.) are a widely grown and globally traded vegetable, essential for both local consumption and international trade. However, approximately 30% of harvested tomato yields are lost due to fungal decay during postharvest handling. Timely disease identification is crucial to prevent such losses, but certain tomato varieties exhibit higher susceptibility to fungal infections than others. Additionally, there are variations in susceptibility among individual sepals, with unknown underlying causes. Traditional methods for assessing fungal presence in plants have limitations, such as sample destruction and a focus on symptom detection rather than evaluating susceptibility to fungal infection. Hence, there is a need for an accurate, non-destructive method capable of predicting susceptibility to fungal infection. The use of hyperspectral imaging (HSI) with chemometrics presents a pioneering approach to address this need. In this study, three tomato cultivars ('Brioso,' 'Cappricia,' and 'Provine') were studied. Hyperspectral images were captured on day-1 of harvest, followed by controlled fungal growth conditions. Ground truth assessments were conducted by three experts on day-3 and day-4, averaging severity scores assigned per sepal. The methodology involved extracting spectra from HSI images and calibrating and validating models using partial least squares discriminant analysis (PLSDA), aiming to optimize model parameters for accurate predictions. The models were categorized into those developed using data from a single variety (intravariety) and those utilizing data from multiple varieties combined (global models). The best-performing intravariety model was established using the Cappricia variety, achieving a balanced accuracy of 0.84. Conversely, a global model combining Cappricia and Provine varieties achieved a balanced accuracy of 0.70. Overall, the results suggest that distinguishing between more and less susceptible sepals is feasible under controlled conditions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. Music statistics: uncertain logistic regression models with applications in analyzing music.

Author: Lu, Jue, Zhou, Lianlian, Zeng, Wenxing, and Li, Anshui
Subjects: REGRESSION analysis, LOGISTIC regression analysis, RESEARCH personnel, DATA analysis, STATISTICS, AMBIGUITY
Abstract: In the realm of data analysis, traditional statistical methods often struggle when faced with ambiguity and uncertainty inherent in real world data. Uncertainty theory, developed to better model and interpret such data, offers a promising alternative to conventional techniques. In this paper, we establish logistic regression models to initiate music statistics based on uncertainty theory. In particular, we will classify the music into different types named Baroque, Classical, Romantic, and Impressionism based on four characteristics: harmonic complexity, rhythmic complexity, texture complexity, and formal structure, with the help of the uncertain logistic models proposed. This theoretical framework for music classification provides a nuanced understanding of how music is interpreted under conditions of ambiguity and variability. Compared with the probabilistic counterpart, our approach highlights the versatility of uncertainty theory and provides researchers with a much more feasible method to analyze the often-subjective nature of music reception, as well as broadening the potential applications of uncertainty theory. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

19. Machine Learning-Based Scrap Steel Price Forecasting for the Northeast Chinese Market.

Author: Jin, Bingzi and Xu, Xiaojie
Subjects: PRICES, KRIGING, STANDARD deviations, STEEL prices, MARKET prices, STOCK price forecasting
Abstract: Throughout history, governments and investors have relied on predictions of prices for a broad spectrum of commodities. Using time-series data covering 08/23/2013–04/15/2021, this study investigates the challenging problem of predicting scrap steel prices, which are issued daily for the northeast China market. Previous research has not sufficiently taken into account estimates for this significant commodity price measurement. In this instance, Gaussian process regression methods are created using Bayesian optimisation approaches and cross-validation processes, and the resulting price forecasts are constructed. This empirical prediction methodology provides reasonably accurate price estimates for the out-of-sample period from 09/17/2019 to 04/15/2021, with a root mean square error of 9.6951, mean absolute error of 5.4218, and correlation coefficient of 99.9122%. Governments and investors can arrive at informed decisions regarding regional scrap steel markets by using pricing research models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Cubic spline estimation for non parametric uncertain differential equation.

Author: Shi, Yuxin, Zhao, Jiangtao, and Sheng, Yuhong
Subjects: *AIR quality indexes, *DIFFERENTIAL equations, *PARAMETER estimation, *SPLINES, *COMPARATIVE studies
Abstract: Research on estimating the unknown parameters of an uncertain differential equation (UDE) has often studied the problem of parameter estimation with known functional forms. However, in practical situations, the functional form is often unknown. In order to deal with this problem, this article proposes the cubic spline method to approximate the autonomous UDE and performs non parametric estimation on it. Cross validation is introduced to determine the number of terms (J) in the approximate cubic spline. In addition, uncertain hypothesis testing is given to verify the rationality of this method. Finally, some numerical examples are given. This method is then applied to a case study of the Beijing Air Quality Index, and a comparative analysis is given to verify the practicability and superiority of the cubic spline method. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Forecasts of thermal coal prices through Gaussian process regressions.

Author: Jin, Bingzi and Xu, Xiaojie
Abstract: Given thermal coal's significance as a tactical energy source, price projections for the commodity are crucial for investors and decision-makers alike. The goal of the current work is to determine whether Gaussian process regressions are useful for this forecast problem using a dataset of closing prices of thermal coal traded on the China Zhengzhou Commodity Exchange from January 4, 2016, to December 31, 2020. This is a significant financial index that has not received enough attention in the literature in terms of price forecasting. Our forecasting exercises make use of Bayesian optimizations and cross-validation. The price from January 02, 2020, to December 31, 2020 is successfully predicted by the generated models, with the out-of-sample relative root mean square error of 0.4210%. Gaussian process regressions are shown to be useful for the thermal coal price forecast problem. The outcomes of this projection might be used as independent technical forecasts or in conjunction with other forecasts for policy research that entails developing viewpoints on price patterns. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. The development and validation of the Student Self-feedback Behavior Scale

Author: Yongle Yang, Zi Yan, Jinyu Zhu, Wuyuan Guo, Junsheng Wu, and Bingjun Huang
Subjects: self-feedback behavior, scale development and validation, Chinese student, cross validation, measurement invariance, Psychology, BF1-990
Abstract: Though the importance and benefits of students’ active role in the feedback process have been widely discussed in the literature, an instrument for measuring students’ self-feedback behavior is still lacking. This paper reports the development and validation of the Self-feedback Behavior Scale (SfBS), which comprises three dimensions (seeking, processing, and using feedback). The SfBS items were constructed in line with the self-feedback behavioral model. One thousand two hundred fifty-two high school students (Grade 10 to Grade 12) in mainland China participated in this survey. The exploratory factor analysis revealed a three-factor model reaffirmed in the confirmatory factor analysis. The multi-group CFA supported the measurement invariance of the SfBS across gender. Using the SfBS can help researchers and teachers better understand students’ self-feedback behavior and optimize benefits derived from the self-feedback process.
Published: 2025
Full Text: View/download PDF

23. Comparing simulated demand flexibility against actual performance in commercial office buildings

Author: Yin, Rongxin, Liu, Jingjing, Piette, Mary Ann, Xie, Jiarong, Pritoni, Marco, Casillas, Armando, Yu, Lili, and Schwartz, Peter
Subjects: Built Environment and Design, Architecture, Demand flexibility, Commercial office building, Cross validation, Control strategy, Global temperature adjustment, Field-testing, Prototype building model, Demand Flexibility, commercial office building, cross validation, control strategy, global temperature adjustment, prototype building model, Environmental Science and Management, Building, Building & Construction, Built environment and design, Engineering
Abstract: Commercial building energy benchmarking has been used as a mechanism to evaluate energy use of a single building over time, relative to other similar buildings, or to simulations of a reference building conforming to various energy standards. Lack of empirical demand flexibility data and consistent flexibility metrics has limited the ability to compare demand flexibility performance with estimated demand flexibility in buildings. In this study, we collected demand response performance data for a total of 831 demand response events from 192 sites as a first step to build such a building demand flexibility dataset, and propose a standard core data schema to consolidate field data from different sources. We also performed parametric simulations of a control strategy called “global temperature adjustment” using commercial office prototype building models. We then compared the simulated demand flexibility performance against the actual data for offices with global temperature adjustment strategy implemented. During demand response events with an average outside air temperature of 34 °C (range 23 °C–42 °C), the measured demand decrease intensity of the demand flexibility metrics were 6.1 watts per square meter (W/m2), 10.0 W/m2, 11.1 W/m2, 7.1 W/m2, and 4.7 W/m2 for small, small–medium, medium, medium–large, and large office buildings, respectively. Compared to the measured data in medium- and large-size buildings, the simulated demand decrease intensity was 0.7 W/m2 (17%) lower on average. The discrepancy between simulated and measured peak demand intensities fell within one standard deviation of the mean measured data. The comparison results validate the credibility of simulations in capturing real building data for assessing the technical potential of building demand flexibility.
Published: 2023

24. Predictions of residential property price indices for China via machine learning models

Author: Jin, Bingzi and Xu, Xiaojie
Published: 2025
Full Text: View/download PDF

25. A compartmental model for smoking dynamics in Italy: a pipeline for inference, validation, and forecasting under hypothetical scenarios

Author: Alessio Lachi, Cecilia Viscardi, Giulia Cereda, Giulia Carreras, and Michela Baccini
Subjects: Compartmental models, Smoking dynamics, Tobacco control policies, Global sensitivity analysis, Parametric bootstrap, Cross validation, Medicine (General), R5-920
Abstract: We propose a compartmental model for investigating smoking dynamics in an Italian region (Tuscany). Calibrating the model on local data from 1993 to 2019, we estimate the probabilities of starting and quitting smoking and the probability of smoking relapse. Then, we forecast the evolution of smoking prevalence until 2043 and assess the impact on mortality in terms of attributable deaths. We introduce elements of novelty with respect to previous studies in this field, including a formal definition of the equations governing the model dynamics and a flexible modelling of smoking probabilities based on cubic regression splines. We estimate model parameters by defining a two-step procedure and quantify the sampling variability via a parametric bootstrap. We propose the implementation of cross-validation on a rolling basis and variance-based Global Sensitivity Analysis to check the robustness of the results and support our findings. Our results suggest a decrease in smoking prevalence among males and stability among females, over the next two decades. We estimate that, in 2023, 18% of deaths among males and 8% among females are due to smoking. We test the use of the model in assessing the impact on smoking prevalence and mortality of different tobacco control policies, including the tobacco-free generation ban recently introduced in New Zealand.
Published: 2024
Full Text: View/download PDF

26. Enhancing Indoor Localization Accuracy through Multiple Access Point Deployment.

Author: Aziz, Toufiq and Insoo, Koo
Subjects: WIRELESS LANs, STANDARD deviations, RADIO frequency, MACHINE learning, LOCALIZATION (Mathematics)
Abstract: This study addresses the limitations of wireless local area networks in indoor localization by utilizing Extra-Trees Regression (ETR) to estimate locations based on received signal strength indicator (RSSI) values from a radio environment map (REM). We investigate how integrating numerous access points can enhance indoor localization accuracy. By constructing an extensive REM using RSSI data from various access points collected by a mobile robot in the intended interior setting, we evaluate several machine learning regression techniques. Our research pays special attention to an optimized ETR model, validated through 10-fold cross-validation and hyperparameter tuning. We quantitatively evaluate the efficiency of our suggested multi-access-point approach using root mean square error (RMSE) for REM evaluation and location error metrics for accurate localization. The results show that incorporating multiple access points significantly improves indoor localization accuracy, providing a substantial improvement over single-access-point systems when assessing interior radio frequency environments. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Identifying the environmental drivers of corridors and predicting connectivity between seasonal ranges in multiple populations of Alpine ibex (Capra ibex) as tools for conserving migration.

Author: Chauveau, Victor, Garel, Mathieu, Toïgo, Carole, Anderwald, Pia, Beurier, Mathieu, Bouche, Michel, Bunz, Yoann, Cagnacci, Francesca, Canut, Marie, Cavailhes, Jérôme, Champly, Ilka, Filli, Flurin, Frey‐Roos, Alfred, Gressmann, Gunther, Herfindal, Ivar, Jurgeit, Florian, Martinelli, Laura, Papet, Rodolphe, Petit, Elodie, and Ramanzin, Maurizio
Subjects: *ENVIRONMENTALISM, *MIGRATORY animals, *FRAGMENTED landscapes, *SEASONS, *SPRING, *WILDLIFE reintroduction, *CORRIDORS (Ecology), *HOME range (Animal geography), *INTERNAL migration
Abstract: Aim: Seasonal migrations, such as those of ungulates, are particularly threatened by habitat transformations and fragmentation, climate and other environmental changes caused by anthropogenic activities. Mountain ungulate migrations are neglected because they are relatively short, although traversing heterogeneous altitudinal gradients particularly exposed to anthropogenic threats. Detecting migration routes of these species and understanding their drivers are therefore of primary importance to predict connectivity and preserve ecosystem functions and services. The populations of Alpine ibex Capra ibex have all been reintroduced from the last remnant source population. Despite a general increase in abundance and overall distribution range, ibex populations are mostly disconnected but display intra‐population migrations. Therefore, its conservation is strictly linked to the interplay between external threats and related behavioural responses, including space use and migration. Location: Austria, France, Italy and Switzerland. Methods: By using 337 migratory tracks from 425 GPS‐collared individuals from 15 Alpine ibex populations distributed across their entire range, we (i) identified the environmental drivers of movement corridors in both spring and autumn and (ii) compared the ability of a connectivity modelling algorithm to predict migratory movements between seasonal ranges of the 15 populations, using either population‐specific or multipopulation datasets, and three validation procedures. Results: Steep, south‐facing, snow‐free slopes were selected while high elevation changes were avoided. This revealed the importance of favourable resources and an attempt to limit energy expenditures and perceived predation risk. The abilities of the modelling methods we compared to predict migratory connectivity from the results of those movement analyses were similar. 
Main Conclusions: The trade‐off between energy expenditure, food and cover was the major driver of migration routes and was overall consistent among populations. Based on these findings, we provided useful connectivity models to inform conservation of Alpine ibex and its habitats, and a framework for future research investigating connectivity in migratory species. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Prediction and model evaluation for space–time data.

Author: Watson, G. L., Reid, C. E., Jerrett, M., and Telesca, D.
Subjects: *PREDICTION models, *CALIFORNIA wildfires, *SPACETIME, *AIR pollution, *INTERPOLATION
Abstract: Evaluation metrics for prediction error, model selection and model averaging on space–time data are understudied and poorly understood. The absence of independent replication makes prediction ambiguous as a concept and renders evaluation procedures developed for independent data inappropriate for most space–time prediction problems. Motivated by air pollution data collected during California wildfires in 2008, this manuscript attempts a formalization of the true prediction error associated with spatial interpolation. We investigate a variety of cross-validation (CV) procedures employing both simulations and case studies to provide insight into the nature of the estimand targeted by alternative data partition strategies. Consistent with recent best practice, we find that location-based cross-validation is appropriate for estimating spatial interpolation error as in our analysis of the California wildfire data. Interestingly, commonly held notions of bias-variance trade-off of CV fold size do not trivially apply to dependent data, and we recommend leave-one-location-out (LOLO) CV as the preferred prediction error metric for spatial interpolation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. A Deep Learning-based U-Net 3+ Technique for Segmentation Blood Cell.

Author: ULUTAŞ, Hasan
Subjects: *DEEP learning, *BLOOD cells, *BAYESIAN analysis, *ROBUST statistics, *IMAGE segmentation
Abstract: Segmentation and classification of blood cells are crucial for various medical applications, including disease diagnosis, treatment monitoring, and research purposes. This process allows for accurate identification and quantification of different cell types, aiding in the detection and understanding of various blood-related disorders. The proposed U-Net 3+ architecture incorporates structural modifications, including strengthened connections between convolutional layers, increased filter numbers, and integration of Bayesian optimization for hyperparameter tuning. The model's generalization capability is optimized through the dynamic adjustment of dropout rates and learning rates. Bayesian optimization facilitates the exploration of optimal hyperparameter combinations, allowing the model to adapt effectively to diverse datasets. Advanced training strategies, such as adaptive learning rate adjustment and early stopping, are employed to mitigate overfitting and enhance training efficiency. The proposed model exhibits exceptional performance across multiple folds, achieving low training and validation losses, high accuracy metrics, and robust segmentation indices. Evaluation metrics, including mean IoU (Jaccard Index), dice score, pixel accuracy, and precision, confirm the model's proficiency in accurately delineating blood cell boundaries. The study demonstrates the effectiveness of custom architectures and optimization techniques, achieving an average IoU (Jaccard Index) of 0.9324 and a dice score of 0.9667. The proposed U-Net 3+ model stands as a promising solution for accurate and reliable blood cell segmentation, demonstrating adaptability and robust performance across various datasets. This work sets the stage for future research in the domain of medical image segmentation, emphasizing the potential for continued advancements in precise and efficient segmentation methodologies. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Optimal Latent Variables Number for the Reconstruction of Time Series with PLSR

Author: Balsa, Carlos, Dupuis, Hugo, Breve, Murilo-M., Guivarch, Ronan, Rufino, José, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Garcia, Marcelo V., editor, Gordón-Gallegos, Carlos, editor, Salazar-Ramírez, Asier, editor, and Nuñez, Carlos, editor
Published: 2024
Full Text: View/download PDF

31. Credit Card Fraud Detection Based on Machine Learning Prediction

Author: Yang, Ge, Fournier-Viger, Philippe, Series Editor, and Wang, Yulin, editor
Published: 2024
Full Text: View/download PDF

32. Cricket Forecast: Unraveling Future Matches’ Outcomes

Author: Siddharth, Vemula Vivek, Vikranth, Pulukuri Shalem, Karthik, N., Vani, V., Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Owoc, Mieczyslaw Lech, editor, Varghese Sicily, Felix Enigo, editor, Rajaram, Kanchana, editor, and Balasundaram, Prabavathy, editor
Published: 2024
Full Text: View/download PDF

33. MetroPT Predictive Maintenance Using Logistic Regression and Random Forest with Isolation Forest Preprocessing

Author: Sandhu, Jaspreet, Mahapatra, Bandana, Kulkarni, Sarang, Bhatt, Abhishek, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Pant, Millie, editor, Deep, Kusum, editor, and Nagar, Atulya, editor
Published: 2024
Full Text: View/download PDF

34. A Machine Learning Approach for Risk Prediction of Cardiovascular Disease

Author: Panda, Shovna, Palei, Shantilata, Samartha, Mullapudi Venkata Sai, Jena, Biswajit, Saxena, Sanjay, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Kaur, Harkeerat, editor, Jakhetiya, Vinit, editor, Goyal, Puneet, editor, Khanna, Pritee, editor, Raman, Balasubramanian, editor, and Kumar, Sanjeev, editor
Published: 2024
Full Text: View/download PDF

35. Breast Cancer Detection: An Evaluation of Machine Learning, Ensemble Learning, and Deep Learning Algorithms

Author: Rai, Deepak, Mishra, Tripti, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Chauhan, Naveen, editor, Yadav, Divakar, editor, Verma, Gyanendra K., editor, Soni, Badal, editor, and Lara, Jorge Morato, editor
Published: 2024
Full Text: View/download PDF

36. Android Malware Detection Using Artificial Intelligence

Author: Masele, Rebecca Kipanga, Khennou, Fadoua, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Lopata, Audrius, editor, Gudonienė, Daina, editor, and Butkienė, Rita, editor
Published: 2024
Full Text: View/download PDF

37. Machine Learning-Based Scrap Steel Price Forecasting for the Northeast Chinese Market

Author: Bingzi Jin and Xiaojie Xu
Subjects: Regional scrap steel price, time-series forecast, Gaussian process regression, Bayesian optimization, cross validation, Economics as a science, HB71-74
Abstract: Throughout history, governments and investors have relied on predictions of prices for a broad spectrum of commodities. Using time-series data covering 08/23/2013–04/15/2021, this study investigates the challenging problem of predicting scrap steel prices, which are issued daily for the northeast China market. Previous research has not sufficiently taken into account estimates for this significant commodity price measurement. In this instance, Gaussian process regression methods are created using Bayesian optimisation approaches and cross-validation processes, and the resulting price forecasts are constructed. This empirical prediction methodology provides reasonably accurate price estimates for the out-of-sample period from 09/17/2019 to 04/15/2021, with a root mean square error of 9.6951, mean absolute error of 5.4218, and correlation coefficient of 99.9122%. Governments and investors can arrive at informed decisions regarding regional scrap steel markets by using pricing research models.
Published: 2024
Full Text: View/download PDF

38. The illusion of success: Test set disproportion causes inflated accuracy in remote sensing mapping research

Author: Yuanjun Xiao, Zhen Zhao, Jingfeng Huang, Ran Huang, Wei Weng, Gerui Liang, Chang Zhou, Qi Shao, and Qiyu Tian
Subjects: Accuracy assessment, Test set, Sample size ratio, Biased accuracy, Accuracy adjustment, Cross validation, Physical geography, GB3-5030, Environmental sciences, GE1-350
Abstract: In remote sensing mapping studies, selecting an appropriate test set to accurately evaluate the results is critical. An imprecise accuracy assessment can be misleading and fail to validate the applicability of mapping products. Commencing with the WHU-Hi-HanChuan dataset, this paper revealed the impact of sample size ratios in test sets on accuracy metrics by generating a series of test sets with varying ratios of positive and negative sample size to evaluate the same map. A rigorous approach for accuracy assessment was suggested, and an example of tea plantations mapping is used to demonstrate the process and analyse potential issues in traditional approaches. A scale factor (λ) was constructed to measure the discrepancy in sample size ratios between test sets and actual conditions. Accuracy adjustment formulas were developed and applied to adjust the accuracy of 42 previous maps based on the λ. Results showed a higher ratio of positive to negative sample size in test set led to inflated user’s accuracy (UA), F1-score (F1) and overall accuracy (OA), but had little impact on producer’s accuracy. When the ratio aligned with that in the target area, the UA, F1, and OA closely matched the true values, indicating the proportion of positive and negative samples in test set should be consistent with that in actual situation. The accuracies reported by the traditional approaches including test set sampling from labelled data and 5-fold cross validation were far from the true accuracy and could not reflect the performance of the map. Among 42 previous maps, nearly 60% of the maps had UAs overestimated by 10%, and 9.5% of the maps had UAs and F1s deviations of more than 25%. The conclusions of this study provide a clear caution for future mapping research and assist in producing and identifying truly excellent maps.
Published: 2024
Full Text: View/download PDF

39. An analysis on classification models for customer churn prediction

Author: Kathi Chandra Mouli, Ch. V. Raghavendran, V. Y. Bharadwaj, G. Y. Vybhavi, C. Sravani, Khristina Maksudovna Vafaeva, Rajesh Deorari, and Laith Hussein
Subjects: Customer churn, classification models, class imbalance, accuracy metrics, cross validation, hyper parameters, Engineering (General). Civil engineering (General), TA1-2040
Abstract: The rapid expansion of technical infrastructure has brought about transformative changes in business operations. A notable consequence of this digital evolution is the proliferation of subscription-based services. With an increasing array of options for goods and services, customer churn has emerged as a significant challenge, posing a threat to businesses across sectors. The direct impact on earnings has prompted businesses to proactively develop tools for predicting potential client turnover. Identifying the underlying factors contributing to churn is crucial for implementing effective retention strategies. Our research makes a pivotal contribution by presenting a churn prediction model designed to assist businesses in identifying clients at risk of churn. The proposed model leverages machine learning classification techniques, with the customer data undergoing thorough pre-processing phases prior to model application. We systematically evaluated ten classification techniques, including Logistic Regression, Support Vector Classifier, Kernel SVM, KNN, Gaussian Naïve Bayes, Decision Tree Classifier, Random Forest, ADA Boost, XGBoost, and Gradient Boost. The assessment encompassed various evaluation metrics, such as ROC AUC Mean, ROC AUC STD, Accuracy Mean, Accuracy STD, Accuracy, Precision, Recall, F1 Score, and F2 Score. Employing 10-fold cross-validation and hyper parameter tuning through GridSearchCV and RandomizedSearchCV, we identified Random Forest as the most effective classifier, achieving an 85% Area Under the Curve (AUC) for optimal results.
Published: 2024
Full Text: View/download PDF

40. Modelling soil prokaryotic traits across environments with the trait sequence database ampliconTraits and the R package MicEnvMod

Author: Jonathan Donhauser, Anna Doménech-Pascual, Xingguo Han, Karen Jordaan, Jean-Baptiste Ramond, Aline Frossard, Anna M. Romaní, and Anders Priemé
Subjects: Trait sequence database, DNA sequencing, Microbial community, Cross validation, Weighted ensemble model, Information technology, T58.5-58.64, Ecology, QH540-549.5
Abstract: We present a comprehensive, customizable workflow for inferring prokaryotic phenotypic traits from marker gene sequences and modelling the relationships between these traits and environmental factors, thus overcoming the limited ecological interpretability of marker gene sequencing data. We created the trait sequence database ampliconTraits, constructed by cross-mapping species from a phenotypic trait database to the SILVA sequence database and formatted to enable seamless classification of environmental sequences using the SINAPS algorithm. The R package MicEnvMod enables modelling of trait – environment relationships, combining the strengths of different model types and integrating an approach to evaluate the models' predictive performance in a single framework. Traits could be accurately predicted even for sequences with low sequence identity (80 %) with the reference sequences, indicating that our approach is suitable to classify a wide range of environmental sequences. Validating our approach in a large trans-continental soil dataset, we showed that trait distributions were robust to classification settings such as the bootstrap cutoff for classification and the number of discrete intervals for continuous traits. Using functions from MicEnvMod, we revealed precipitation seasonality and land cover as the most important predictors of genome size. We found Pearson correlation coefficients between observed and predicted values up to 0.70 using repeated split sampling cross validation, corroborating the predictive ability of our models beyond the training data. Predicting genome size across the Iberian Peninsula, we found the largest genomes in the northern part. Potential limitations of our trait inference approach include dependence on the phylogenetic conservation of traits and limited database coverage of environmental prokaryotes. 
Overall, our approach enables robust inference of ecologically interpretable traits combined with environmental modelling allowing to harness traits as bioindicators of soil ecosystem functioning.
Published: 2024
Full Text: View/download PDF

41. Evaluation of four machine learning methods in predicting orthodontic extraction decision from clinical examination data and analysis of feature contribution

Author: Jialiang Huang, Ian-Tong Chan, Zhixian Wang, Xiaoyi Ding, Ying Jin, Congchong Yang, and Yichen Pan
Subjects: orthodontic treatment, tooth extraction decision, decision tree, machine learning, cross validation, Biotechnology, TP248.13-248.65
Abstract: Introduction: The study aims to predict tooth extraction decision based on four machine learning methods and analyze the feature contribution, so as to shed light on the important basis for experts of tooth extraction planning, providing reference for orthodontic treatment planning. Methods: This study collected clinical information of 192 patients with malocclusion diagnosis and treatment plans. This study used four machine learning strategies, including decision tree, random forest, support vector machine (SVM) and multilayer perceptron (MLP) to predict orthodontic extraction decisions on clinical examination data acquired during initial consultant containing Angle classification, skeletal classification, maxillary and mandibular crowding, overjet, overbite, upper and lower incisor inclination, vertical growth pattern, lateral facial profile. Among them, 30% of the samples were randomly selected as testing sets. We used five-fold cross-validation to evaluate the generalization performance of the model and avoid over-fitting. The accuracy of the four models was calculated for the training set and cross-validation set. The confusion matrix was plotted for the testing set, and 6 indicators were calculated to evaluate the performance of the model. For the decision tree and random forest models, we observed the feature contribution. Results: The accuracy of the four models in the training set ranges from 82% to 90%, and in the cross-validation set, the decision tree and random forest had higher accuracy. In the confusion matrix analysis, decision tree tops the four models with highest accuracy, specificity, precision and F1-score and the other three models tended to classify too many samples as extraction cases. In the feature contribution analysis, crowding, lateral facial profile, and lower incisor inclination ranked at the top in the decision tree model. Conclusion: Among the machine learning models that only use clinical data for tooth extraction prediction, decision tree has the best overall performance. For tooth extraction decisions, specifically, crowding, lateral facial profile, and lower incisor inclination have the greatest contribution.
Published: 2024
Full Text: View/download PDF

42. Heart Sound Processing for Early Diagnostic of Heart Abnormalities using Support Vector Machine

Author: Sebastian Michael Paschalis, Duma Kristina Yanti Hutapea, and Karel Octavianus Bachri
Subjects: support vector machine, heart sound, linear kernel, cross validation, heart disease, early diagnostic, Electrical engineering. Electronics. Nuclear engineering, TK1-9971, Information technology, T58.5-58.64
Abstract: This paper addresses the critical issue of cardiovascular disease (CVD), the leading cause of global mortality, emphasizing the imperative for effective and early detection to mitigate CVD-related deaths. The research problem underscores the urgency of developing advanced diagnostic tools to identify heart abnormalities promptly. The primary objective is to create a Support Vector Machine (SVM) algorithm for accurate classification of different heart conditions, namely Normal heart, Mitral Stenosis, and Mitral Regurgitation. To achieve this objective, the study utilizes a dataset of heart sounds available online using a 10-fold cross-validation method. The focus is on evaluating the efficacy of various kernel functions within the SVM framework for heart sound classification. The findings demonstrate that the linear kernel exhibits superior accuracy and robustness in effectively classifying heart conditions. Notably, the proposed classification method attains an impressive 96% accuracy, highlighting its potential as a reliable tool for early detection of cardiovascular diseases. This research contributes to the ongoing efforts to enhance diagnostic capabilities and ultimately reduce the global burden of CVD-related fatalities.
Published: 2024
Full Text: View/download PDF

43. Convolutional Neural Network untuk Klasifikasi Batik Tenun Ikat Bandar Berdasarkan Fitur Warna dan Tekstur

Author: Mohammad Atif Faiz Muthrofin, Danang Erwanto, and Iska Yanuartanti
Subjects: tenun ikat, cnn, glcm, ccm, cross validation, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: Tenun Ikat Bandar Kediri is a type of batik: a cloth that is woven and given patterns and motifs in its texture using a traditional wooden loom. The patterns and motifs of tenun ikat batik vary widely depending on the production house, and each production house usually has its own distinctive characteristics in its patterns and motifs. The large number of patterns and motifs makes it difficult for the public to recognize and learn the visual characteristics of Tenun Ikat, so a system that learns these patterns and motifs would be very helpful. The classification system built in this study implements the Convolutional Neural Network (CNN) algorithm, with texture extraction using Gray Level Co-occurrence Matrix (GLCM) features and color extraction using Color Co-occurrence Matrix (CCM) features. The study uses a dataset of 125 images covering 5 batik motifs from one tenun ikat production house, with a balanced proportion for each pattern. The results show that the average accuracy across the tests reached 0.94, indicating that the proposed method can perform the classification well.
Published: 2024
Full Text: View/download PDF

44. Steel price index forecasts through machine learning for northwest China

Author: Jin, Bingzi and Xu, Xiaojie
Published: 2024
Full Text: View/download PDF

45. Spatial Distribution of Soil pH Status in Forest Soils of Telangana using GIS-Based Geo-Statistical Models

Author: Patel, Ruby and Panwar, Vijender Pal
Published: 2024
Full Text: View/download PDF

46. Applicability of smell agent optimization and Tasmanian devil optimization hybridized with ANFIS and SVR as reliable solutions in estimation of cooling load in buildings

Author: Li, Shaoxu
Published: 2024
Full Text: View/download PDF

47. Predicting open interest in thermal coal futures using machine learning

Author: Jin, Bingzi and Xu, Xiaojie
Published: 2024
Full Text: View/download PDF

48. Source identification of mine water inrush based on GBDT-RS-SHAP

Author: Yang, Zhenwei, Li, Han, Wang, Xinyi, Meng, Hongwei, Xi, Tong, and Hou, Zhenhuan
Published: 2025
Full Text: View/download PDF

49. Random forest based quantile-oriented sensitivity analysis indices estimation.

Author: Elie-Dit-Cosaque, Kévin and Maume-Deschamps, Véronique
Subjects: *RANDOM forest algorithms, *SENSITIVITY analysis, *TREE size, *QUANTILE regression
Abstract: We propose a random forest based estimation procedure for Quantile-Oriented Sensitivity Analysis—QOSA. In order to be efficient, a cross-validation step on the leaf size of trees is required. Our full estimation procedure is tested on both simulated data and a real dataset. Our estimators use either the bootstrap samples or the original sample in the estimation. Also, they are either based on a quantile plug-in procedure (the R-estimators) or on a direct minimization (the Q-estimators). This leads to 8 different estimators which are compared on simulations. From these simulations, it seems that the estimation method based on a direct minimization is better than the one plugging the quantile. This is a significant result because the method with direct minimization requires only one sample and could therefore be preferred. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. A New Comparative Approach Based on Features of Subcomponents and Machine Learning Algorithms to Detect and Classify Power Quality Disturbances.

Author: Akkaya, Sıtkı, Yüksek, Emre, and Akgün, Hasan Metehan
Subjects: *POWER quality disturbances, *MACHINE learning, *COMPARATIVE method, *FEATURE extraction, *K-nearest neighbor classification
Abstract: Current measurement systems based on the IEEE-1159 standard have some limitations and robustness problems under noisy and fast-changing conditions. Besides, applying different methods for each Power Quality Disturbance (PQD) to every window is required but time-consuming and not feasible. Therefore, different kinds of two-stage methods, Detection and Classification (D&C), have been improved in many studies. Then, the required measurement can be performed to define disturbance. For this purpose, a new approach based on features of subcomponents with Machine Learning Algorithms (MLAs) to detect and classify PQDs is proposed. 21-class dataset including single and multiple PQDs under different noisy conditions was prepared randomly. Of this dataset, determined features were extracted and some of these were selected. Then, selected features were trained and tested with some MLAs in a workstation. Results obtained from comparative MLAs and the other classification methods show that the best MLA with related features is Random Forest with 96.97% while LightGBM, k-Nearest Neighbors, and XGBoost 96.85%, 96.73%, and 92.82% accuracy, respectively. Because the selected features, optimized parameters, and the related MLA were obtained by investigating for features provided from the PQDs in the whole parameter space, this approach brings the advantages of high accuracy, low D&C complexity, and computing load. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
