1,371 results for "regression tree"
Search Results
2. Forecasting the performance and emissions of a diesel engine powered by waste cooking biodiesel with carbon nano additives using tree-based, least square boost and Gaussian regression models
- Author
-
Gad, M.S. and Alenany, Ahmed
- Published
- 2025
- Full Text
- View/download PDF
3. Using regression tree analysis to examine demographic and geographic characteristics of COVID-19 vaccination trends over time, United States, May 2021–April 2022, National Immunization Survey Adult COVID Module
- Author
-
Earp, Morgan, Meng, Lu, Black, Carla L., Carter, Rosalind J., Lu, Peng-Jun, Singleton, James A., and Chorba, Terence
- Published
- 2024
- Full Text
- View/download PDF
4. Unraveling birth weight determinants: Integrating machine learning, spatial analysis, and district-level mapping
- Author
-
Rubaiya, Mansur, Mohaimen, Alam, Md. Muhitul, and Rayhan, Md. Israt
- Published
- 2024
- Full Text
- View/download PDF
5. Methodological approaches for the assessment of bisphenol A exposure
- Author
-
Costa, Sofia Almeida, Severo, Milton, Correia, Daniela, Carvalho, Catarina, Magalhães, Vânia, Vilela, Sofia, Cunha, Sara, Casal, Susana, Lopes, Carla, and Torres, Duarte
- Published
- 2023
- Full Text
- View/download PDF
6. Analysis and modeling of high-performance polymer electrolyte membrane electrolyzers by machine learning
- Author
-
Günay, M. Erdem, Tapan, N. Alper, and Akkoç, Gizem
- Published
- 2022
- Full Text
- View/download PDF
7. A Machine Learning Algorithm for the Analysis of Spatially Distributed Data
- Author
-
Cartone, Alfredo, Piras, Gianfranco, Postiglione, Paolo, Pollice, Alessio, editor, and Mariani, Paolo, editor
- Published
- 2025
- Full Text
- View/download PDF
8. Stock Open Price Prediction of Software Companies in the BSE SENSEX 50 Index
- Author
-
Sonar, Chhaya, Al Hammadi, Ahmed M., Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Weber, Gerhard-Wilhelm, editor, Martinez Trinidad, Jose Francisco, editor, Sheng, Michael, editor, Ramachand, Raghavendra, editor, Kharb, Latika, editor, and Chahal, Deepak, editor
- Published
- 2025
- Full Text
- View/download PDF
9. A comprehensive evaluation of eco-productivity of the municipal solid waste service in Chile.
- Author
-
Mocholi-Arce, Manuel, Sala-Garrido, Ramon, Molinos-Senante, Maria, and Maziotis, Alexandros
- Abstract
Moving toward a circular economy requires improvement of the economic and environmental performance of municipalities in their provision of municipal solid waste (MSW) services. Understanding performance changes over years is fundamental to support decision-making. This study employs the Luenberger-Hicks-Moorsteen productivity indicator to evaluate eco-productivity change and its drivers in the MSW sector in Chile over the years 2015–2019. The further use of decision tree and linear regression analysis allows exploration of the interaction between operating characteristics and eco-productivity estimations. The results of the eco-productivity assessment show that, although the Chilean MSW sector was still facing a transitional period, from 2015 to 2019, eco-productivity increased 1.28% per year. Gains in eco-productivity were due to technical progress and small gains in efficiency, whereas scale effect had an adverse impact. Other factors such as waste spending per inhabitant and the amount of waste collected and recycled per inhabitant had a significant impact on the eco-productivity of Chilean municipalities. [ABSTRACT FROM AUTHOR]
- Published
- 2025
- Full Text
- View/download PDF
10. Thermal performance prediction of a V-trough solar water heater with a modified twisted tape using ANFIS, G.L.R., R.T. and SVM models of machine learning
- Author
-
A. Saravanan, S. Rama Sree, M. Sreenivasa Reddy, Elumalai PV, Krishnasamy Karthik, Ashok Kumar Cheeli, and Nasim Hasan
- Subjects
Solar water heater, Adaptive neuro-fuzzy inference system, Generalised linear regression, Regression tree, Machine learning, Medicine, Science
- Abstract
Four distinct neural models were used to evaluate the efficiency of a V-trough solar water heater (VTSWH) equipped with square-cut twisted tape (SCTT) and V-cut twisted tape (VCTT) at two different twist ratios, 3 and 5. The objective of this study was the use of ANFIS (Adaptive Neuro-Fuzzy Inference System), G.L.R. (Generalised linear regression), R.T. (Regression tree), and SVM (Support Vector Machine). A total of 162 data sets were acquired for these models through a variety of trials. Outdoor experiments were done using a twist ratio of Y = 3 and Y = 5, using both SCTT and VCTT. The models included eight distinct variables: ambient temperature, water mass flow rate, water intake temperature, water exit temperature, absorber plate temperature, tube temperature, solar intensity, and twist ratio. The dependent variables in this study are the Nusselt number (Nu), friction factor (FF), and efficiency (η). 130 datasets were chosen for training purposes, while 32 were used for testing. Using the ANFIS, G.L.R., R.T., and SVM techniques, the correlation coefficient (R²) values for Nusselt number were 0.9990, 0.9961, 0.9562, and 0.9280; for friction factor, 0.9966, 0.9683, 0.9810, and 0.9560; and for efficiency, 0.9997, 0.9976, 0.9845, and 0.9614, respectively. Comparing all models shows that ANFIS is the most effective of the four strategies studied. The ANFIS model outperformed the other models regarding Nu, FF, and η, with RMSE values of 0.0805, 0.0.0004, and 0.4534. According to the above data, the VTSWH thermal performance predicted using the ANFIS approach has the highest accuracy.
- Published
- 2024
- Full Text
- View/download PDF
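Note on record 10: the abstract compares models by R² and RMSE on a held-out test set. A minimal sketch of how such scores are commonly computed is given here, assuming R² denotes the coefficient of determination (the abstract calls it a correlation coefficient, so the authors' exact definition may differ). The function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical test-set values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

A model with a higher R² and a lower RMSE on the same test set would be ranked as more accurate under this kind of comparison.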
11. Tree-based analysis of longevity predictors and their ten-year changes: a 35-Year mortality follow-up
- Author
-
Lily Nosraty, Jaakko Nevalainen, Jani Raitanen, and Linda Enroth
- Subjects
Mortality, Relative measure of longevity, Machine learning, Regression tree, Realized probability of dying, Geriatrics, RC952-954.6
- Abstract
Background: Prior studies on longevity often examine predictors in isolation and rely solely on baseline information, limiting our understanding of the most important predictors and their dynamic nature. In this study, we used an innovative regression tree model to explore the common characteristics of those who lived longer than their age and sex peers in 35-years follow-up. We identified different pathways leading to a long life, and examined how changes in characteristics over 10 years (from 1979 to 1989) affect the findings on longevity predictors. Methods: Data was obtained from the “Tampere Longitudinal Study on Ageing” (TamELSA) in Finland. Survey data was collected in 1979 from 1056 participants aged 60–89 years (49.8% men). In 1989, a second survey was conducted among 432 survivors from the 1979 cohort (40.2% men). Dates of death were provided by the Finnish Population Register until 2015. We employed an individual measure of longevity known as the realized probability of dying (RPD), which was calculated based on each participant’s age and sex, utilizing population life tables. RPD is based on a comparison of the survival time of each individual of a specific age and sex with the survival time of his/her peers in the total population. A regression tree analysis was used to examine individual-based longevity with RPD as an outcome. Results: This relative measure of longevity (RPD) provided a complex regression tree where the most important characteristics were self-rated health, years of education, history of smoking, and functional ability. We identified several pathways leading to a long life such as individuals with (1) good self-rated health (SRH), short smoking history, and higher education, (2) good SRH, short smoking history, lower education, and excellent mobility, and (3) poor SRH but able to perform less demanding functions, aged 75 or older, willing to do things, and sleeping difficulties. Changes in the characteristics over time did not change the main results. Conclusion: The simultaneous examination of a broad range of potential predictors revealed that longevity can be achieved under very different conditions and is achieved by heterogeneous groups of people.
- Published
- 2024
- Full Text
- View/download PDF
12. Classifying clinical phenotypes of functional recovery for acute traumatic spinal cord injury. An observational cohort study.
- Author
-
Mputu Mputu, Pascal, Beauséjour, Marie, Richard-Denis, Andréane, Fallah, Nader, Noonan, Vanessa K., and Mac-Thiong, Jean-Marc
- Subjects
*STATISTICAL models, *SENSES, *NEUROLOGIC examination, *HEALTH self-care, *WOUNDS & injuries, *MATHEMATICAL variables, *RANDOM forest algorithms, *RESEARCH funding, *DISABILITY evaluation, *SCIENTIFIC observation, *SEX distribution, *QUESTIONNAIRES, *MULTIPLE regression analysis, *SPINAL cord injuries, *FUNCTIONAL status, *REPORTING of diseases, *RETROSPECTIVE studies, *AGE distribution, *DISCHARGE planning, *DESCRIPTIVE statistics, *LONGITUDINAL method, *CONVALESCENCE, *EPIDEMIOLOGY, *DATA analysis software, *PHENOTYPES, *PHYSICAL mobility, *COMORBIDITY, *TIME, *HEALTH care teams, *NONPARAMETRIC statistics
- Abstract
Purpose: Identify patient subgroups with different functional outcomes after SCI and study the association between functional status and initial ISNCSCI components. Methods: Using CART, we performed an observational cohort study on data from 675 patients enrolled in the Rick-Hansen Registry (RHSCIR) between 2014 and 2019. The outcome was the Spinal Cord Independence Measure (SCIM) and predictors included AIS, NLI, UEMS, LEMS, pinprick (PPSS), and light touch (LTSS) scores. A temporal validation was performed on data from 62 patients treated between 2020 and 2021 in one of the RHSCIR participating centers. Results: The final CART resulted in four subgroups with increasing totSCIM according to PPSS, LEMS, and UEMS: 1) PPSS < 27 (totSCIM = 28.4 ± 16.3); 2) PPSS ≥ 27, LEMS < 1.5, UEMS < 45 (totSCIM = 39.5 ± 19.0); 3) PPSS ≥ 27, LEMS < 1.5, UEMS ≥ 45 (totSCIM = 57.4 ± 13.8); 4) PPSS ≥ 27, LEMS ≥ 1.5 (totSCIM = 66.3 ± 21.7). The validation model performed similarly to the original model. The adjusted R-squared and F-test were respectively 0.556 and 62.2 (P-value < 0.001) in the development cohort and 0.520 and 31.9 (P-value < 0.001) in the validation cohort. Conclusion: Acknowledging the presence of four characteristic subgroups of patients with distinct phenotypes of functional recovery based on PPSS, LEMS, and UEMS could be used by clinicians early after tSCI to plan rehabilitation and establish realistic goals. An improved sensory function could be key for potentiating motor gains, as a PPSS ≥ 27 was a predictor of a good function. IMPLICATIONS FOR REHABILITATION: After a traumatic Spinal Cord Injury (SCI), early neurological examination using the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI) is recommended to determine initial injury severity and prognosis. This study identified three initial ISNCSCI components defining four subgroups of SCI patients with different expectations in functional outcomes, namely the initial pinprick sensory score, the Lower Extremity Motor Score, and the Upper Extremity Motor Score. Clinicians could use these subgroups early after tSCI to plan rehabilitation and set realistic therapeutic goals regarding functional outcomes. In clinical practice, careful and accurate assessment of pinprick sensation early after the SCI is crucial when predicting function or stratifying patients based on the expected function. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
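Note on record 12: the four subgroups reported in the abstract amount to a small decision rule over three scores. The sketch below encodes the thresholds and mean total SCIM values exactly as stated in the abstract; the function name `scim_subgroup` and the return format are hypothetical, and this is an illustration of how a fitted regression tree is read, not a clinical tool.

```python
def scim_subgroup(ppss, lems, uems):
    """Return (subgroup number, reported mean total SCIM) for one patient.

    Thresholds and means are the ones quoted in the abstract:
      1) PPSS < 27                         -> 28.4
      2) PPSS >= 27, LEMS < 1.5, UEMS < 45 -> 39.5
      3) PPSS >= 27, LEMS < 1.5, UEMS >= 45 -> 57.4
      4) PPSS >= 27, LEMS >= 1.5           -> 66.3
    """
    if ppss < 27:
        return 1, 28.4
    if lems >= 1.5:
        return 4, 66.3
    if uems < 45:
        return 2, 39.5
    return 3, 57.4


# Hypothetical patient scores, for illustration only.
print(scim_subgroup(ppss=30, lems=0, uems=50))
```

Each path from the root to a leaf corresponds to one subgroup, and the leaf value is the mean outcome of the training patients who fell into it.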
13. Assessing the predictive performance of the Bagging algorithm for genomic selection.
- Author
-
Ghafouri-Kesbi, Farhad
- Subjects
*BOOTSTRAP aggregation (Algorithms), *GAMMA distributions, *SINGLE nucleotide polymorphisms, *REGRESSION trees, *RANDOM forest algorithms
- Abstract
The aim of the present study was to compare the predictive performance of the Bagging algorithm with other decision tree-based methods, including Regression Tree (RT), Random Forest (RF) and Boosting in genomic selection. A genome including ten chromosomes for 1,000 individuals on which 10,000 single nucleotide polymorphisms (SNP) were evenly distributed was simulated. QTL effects were assigned to 10% of the polymorphic SNPs, with effects sampled from a gamma distribution. Predictive performance measures including accuracy of prediction, reliability and bias were used to compare the methods. Computing time and memory requirements of the studied methods were also measured. In all methods studied, the accuracy of genomic evaluation increased following increase in the heritability level from 0.10 to 0.50. While RT was the most efficient user of time and memory, it was not recommended for genomic selection due to its poor predictive performance. The obtained results showed that the predictive performance of Bagging was equal to RF and higher than RT and Boosting. However, it required significantly higher computational time and memory requirements. Considering the overall performance, Bagging was recommended for genomic selection, especially when due to the size and structure of the genomic data, the use of RF is limited. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
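Note on record 13: the abstract contrasts a single regression tree with Bagging (bootstrap aggregation). As a rough sketch of the difference, the code below fits depth-1 regression trees (stumps) on one feature and averages the predictions of stumps trained on bootstrap resamples. Everything here is an assumption made for illustration: the study used genomic SNP data and full trees, and the function names, the toy data, and the choice of 25 resamples are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Depth-1 regression tree on one feature, chosen by least squares."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total = sum(y for _, y in pairs)
    total_sq = sum(y * y for _, y in pairs)
    # Fallback when no split is possible: predict the overall mean.
    best = (float("inf"), None, total / n, total / n)
    left_sum = left_sq = 0.0
    for i in range(1, n):
        y_prev = pairs[i - 1][1]
        left_sum += y_prev
        left_sq += y_prev * y_prev
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot split between equal feature values
        right_sum = total - left_sum
        # Sum of squared errors of both children around their own means.
        sse = (left_sq - left_sum ** 2 / i) + (
            (total_sq - left_sq) - right_sum ** 2 / (n - i))
        if sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, left_sum / i, right_sum / (n - i))
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    if threshold is None or x <= threshold:
        return left_mean
    return right_mean


def fit_bagging(xs, ys, n_trees=25, seed=0):
    """Fit one stump per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_bagging(stumps, x):
    """Average the predictions of all bootstrap stumps."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Toy step-function data, for illustration only.
xs = list(range(20))
ys = [0.0 if x < 10 else 10.0 for x in xs]
single = fit_stump(xs, ys)
ensemble = fit_bagging(xs, ys)
print(single, predict_bagging(ensemble, 1), predict_bagging(ensemble, 18))
```

The ensemble stores and evaluates many trees instead of one, which is consistent with the abstract's remark that Bagging needs more computing time and memory than a single regression tree.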
14. Overlapping coefficient in network-based semi-supervised clustering.
- Author
-
Conversano, Claudio, Frigau, Luca, and Contu, Giulia
- Subjects
*REGRESSION trees, *REGRESSION analysis, *MATRICES (Mathematics), *ALGORITHMS, *CLASSIFICATION
- Abstract
Network-based Semi-Supervised Clustering (NeSSC) is a semi-supervised approach for clustering in the presence of an outcome variable. It uses a classification or regression model on resampled versions of the original data to produce a proximity matrix that indicates the magnitude of the similarity between pairs of observations measured with respect to the outcome. This matrix is transformed into a complex network on which a community detection algorithm is applied to search for underlying community structures, that is, a partition of the instances into highly homogeneous clusters to be evaluated in terms of the outcome. In this paper, we focus on the case where the outcome variable to be used in NeSSC is numeric and propose an alternative selection criterion of the optimal partition based on a measure of overlapping between density curves as well as a penalization criterion which accounts for the number of clusters in a candidate partition. Next, we consider the performance of the proposed method for some artificial datasets and for 20 different real datasets and compare NeSSC with the other three popular methods of semi-supervised clustering with a numeric outcome. Results show that NeSSC with the overlapping criterion works particularly well when a reduced number of clusters are scattered localized. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Tree-based analysis of longevity predictors and their ten-year changes: a 35-Year mortality follow-up.
- Author
-
Nosraty, Lily, Nevalainen, Jaakko, Raitanen, Jani, and Enroth, Linda
- Subjects
REGRESSION trees, LONGEVITY, REGRESSION analysis, LIFE tables, MACHINE learning
- Abstract
Background: Prior studies on longevity often examine predictors in isolation and rely solely on baseline information, limiting our understanding of the most important predictors and their dynamic nature. In this study, we used an innovative regression tree model to explore the common characteristics of those who lived longer than their age and sex peers in 35-years follow-up. We identified different pathways leading to a long life, and examined how changes in characteristics over 10 years (from 1979 to 1989) affect the findings on longevity predictors. Methods: Data was obtained from the "Tampere Longitudinal Study on Ageing" (TamELSA) in Finland. Survey data was collected in 1979 from 1056 participants aged 60–89 years (49.8% men). In 1989, a second survey was conducted among 432 survivors from the 1979 cohort (40.2% men). Dates of death were provided by the Finnish Population Register until 2015. We employed an individual measure of longevity known as the realized probability of dying (RPD), which was calculated based on each participant's age and sex, utilizing population life tables. RPD is based on a comparison of the survival time of each individual of a specific age and sex with the survival time of his/her peers in the total population. A regression tree analysis was used to examine individual-based longevity with RPD as an outcome. Results: This relative measure of longevity (RPD) provided a complex regression tree where the most important characteristics were self-rated health, years of education, history of smoking, and functional ability. We identified several pathways leading to a long life such as individuals with (1) good self-rated health (SRH), short smoking history, and higher education, (2) good SRH, short smoking history, lower education, and excellent mobility, and (3) poor SRH but able to perform less demanding functions, aged 75 or older, willing to do things, and sleeping difficulties. Changes in the characteristics over time did not change the main results. Conclusion: The simultaneous examination of a broad range of potential predictors revealed that longevity can be achieved under very different conditions and is achieved by heterogeneous groups of people. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. Calculation of the mechanical properties of high‐performance concrete employing hybrid and ensemble‐hybrid techniques.
- Author
-
Zhang, Leilei and Zhao, Yuwei
- Subjects
*OPTIMIZATION algorithms, *METAHEURISTIC algorithms, *STRUCTURAL engineering, *DATABASES, *REGRESSION trees
- Abstract
This study aims to apply machine learning methods to predict the mechanical properties, namely TS and CS, of HPC. They are essential parameters for the durability, workability, and efficiency of concrete structures in civil engineering. In this regard, estimating the mechanical properties of HPC is complex, energy-intensive, and time‐consuming. Due to this, an observed database was compiled, including 168 datasets for CS and 120 for TS. This database was used to train and validate two machine learning models: SVR and RT. The models combine the prediction outputs from the meta‐heuristic algorithms to build hybrid and ensemble‐hybrid models, which include dwarf mongoose optimization, PPSO, and moth flame optimization. According to the observed outputs, the ensemble models have great potential to be a recourse to deal with the overfitting problem of civil engineering, thus leading to the development of more supportable and less polluting concrete structures. This research significantly improves the efficiency and accuracy of predicting vital mechanical properties in high‐performance concrete by integrating machine learning and metaheuristic algorithms, offering promising avenues for enhanced concrete structure design and development. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. Machine learning prediction of methane, ethane, and propane solubility in pure water and electrolyte solutions: Implications for stray gas migration modeling.
- Author
-
Kooti, Ghazal, Taherdangkoo, Reza, Chen, Chaofan, Sergeev, Nikita, Doulati Ardejani, Faramarz, Meng, Tao, and Butscher, Christoph
- Subjects
*MACHINE learning, *OPTIMIZATION algorithms, *REGRESSION trees, *GAS migration, *HYDRAULIC fracturing, *ELECTROLYTE solutions, *SHALE gas
- Abstract
Hydraulic fracturing is an effective technology for hydrocarbon extraction from unconventional shale and tight gas reservoirs. A potential risk of hydraulic fracturing is the upward migration of stray gas from the deep subsurface to shallow aquifers. The stray gas can dissolve in groundwater leading to chemical and biological reactions, which could negatively affect groundwater quality and contribute to atmospheric emissions. The knowledge of light hydrocarbon solubility in the aqueous environment is essential for the numerical modelling of flow and transport in the subsurface. Herein, we compiled a database containing 2129 experimental data of methane, ethane, and propane solubility in pure water and various electrolyte solutions over wide ranges of operating temperature and pressure. Two machine learning algorithms, namely regression tree (RT) and boosted regression tree (BRT) tuned with a Bayesian optimization algorithm (BO) were employed to determine the solubility of gases. The predictions were compared with the experimental data as well as four well-established thermodynamic models. Our analysis shows that the BRT-BO is sufficiently accurate, and the predicted values agree well with those obtained from the thermodynamic models. The coefficient of determination (R²) between experimental and predicted values is 0.99 and the mean squared error (MSE) is 9.97 × 10⁻⁸. The leverage statistical approach further confirmed the validity of the model developed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Estimation of the time of concentration of small watersheds located in Northeastern North America.
- Author
-
Bolduc, Samuel, Mailhot, Alain, and Talbot, Guillaume
- Subjects
*REGRESSION trees, *SQUARE root, *HYDROLOGY, *RIVER channels, *LAKES
- Abstract
The time of concentration is an important concept in hydrology. It provides a characteristic hydrological response time (CHRT) useful in many applications. Estimation of the time of concentration is challenging because small watersheds (<100 km²) with sub-daily flow and precipitation records are uncommon. Many practitioners therefore use empirical equations developed from watersheds exposed to different climates and with different attributes. The main objective of this study is to develop an approach to estimate the CHRT from physiographic characteristics for small watersheds located in Ontario, Québec and the northeastern USA. Regression trees are used to identify the physiographic characteristics associated with CHRT. The fraction of lakes and wetlands was identified as the most significant attribute related to CHRT, followed by the ratio between the main watercourse length and the square root of the main watercourse slope. Uncertainties on estimated CHRT values based on regression tree are also provided. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Relationships between agronomic traits and characterization of the white oat ideotype for cultivation with and without chemical fertilization
- Author
-
Murilo Vieira Loro, Ivan Carvalho, Genesio Luiz Meggiolaro Junior, Leonardo Cesar Pradebon, Jaqueline Piesanti Sangiovo, João Pedro Dalla Roza, and Willyan Júnior Adorian Bandeira
- Subjects
avena sativa, correlation, path analysis, regression tree, kohonen map, Special aspects of education, LC8-6691, Technology
- Abstract
This paper aimed to characterize and verify whether the linear relationships between agronomic traits of white oat are different between crops with and without chemical fertilization; and identify the agronomic ideotype that enhances the agronomic performance of white oats. Two uniformity tests were carried out with and without chemical fertilization in the 2020 harvest. In each trial, on 285 plants, agronomic traits were measured. Pearson's linear correlation coefficients and the direct and indirect effects of the trial analysis were calculated. The regression tree algorithm and Kohonen map neural network were used to identify the agronomic ideotype. The linear relationships between agronomic characters of white oat are similar between crops with and without chemical fertilization. White oat genotypes with greater panicle grain weight can be selected indirectly by panicle weight, regardless of cultivation with or without fertilization. White oat genotypes measuring 114.57 cm in height, 97.41 cm in panicle insertion, 18.11 cm in panicle length, 1.31 g in panicle weight and 27.92 grains in the panicle characterize the agronomic ideotype that maximizes panicle grain weight.
- Published
- 2024
- Full Text
- View/download PDF
20. Prediction of body weight of mixed breeds of pigs in Nigeria through morpho-biometric traits using classification and regression tree models.
- Author
-
Mallam, I., Yakubu, A., and Achi, N. P.
- Subjects
*WEIGHT of swine, *BODY weight, *MARKET prices, *ANIMAL industry
- Abstract
The study was conducted to predict the body weight of mixed breeds of pigs in Nigeria through morpho-biometric traits (body length, chest girth, height at withers, ear length, head length, foreleg length, and hind leg length) using classification and regression tree models. The data were produced using 500 randomly selected mixed breeds of pigs from various farms in five Local Government Areas of Kaduna State, North West Nigeria. The collected data were analysed using the Statistical Package for Social Sciences (SPSS, 2016). Body weight correlated well with morphometric characteristics except with foreleg length, which had a low correlation and no significant (P>0.05) difference. Two body dimensions were shown to be more effective in predicting the body weight of the mixed-breeds based on the significance of the independent variables: chest girth and body length. The largest dividing variable was determined to be chest girth, which explained roughly 88.60 % of the difference in body weight. The decision tree model revealed that pigs with chest girth or chest circumference greater than 76.00 cm are expected to have a higher body weight, which livestock producers and researchers could use to determine the feed amount, drug dose, and market price of an animal, as well as the management, selection, and genetic improvement of mixed breeds of pigs in Nigeria. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. A new approach for classification of stretch-shortening cycle: Beyond 250 ms of ground contact time.
- Author
-
Ünver, Evrim, Konşuk Ünlü, Hande, Yıldız, Adalet E., and Cinemre, Şükrü Alpan
- Subjects
*SKELETAL muscle physiology, *BIOMECHANICS, *PLYOMETRICS, *RESEARCH funding, *ACHILLES tendon, *MUSCLE strength, *JUMPING, *COMPARATIVE studies, *TIME, *REGRESSION analysis
- Abstract
The stretch-shortening cycle (SSC) has been classified into fast (<250 ms) and slow (>250 ms) groups based on ground contact time (GCT) threshold values. However, there are gaps in the literature on how the 250 ms threshold value was found and which variables affect it. The purpose of this study is to validate the 250 ms threshold by investigating the factors affecting this threshold. For this purpose, force–time variables during a drop jump (DJ) with a force plate and Achilles tendon (AT) muscle-tendon unit mechanical properties using shear-wave elastography in 46 recreationally active men were analysed. A regression tree analysis was conducted using RStudio to classify GCT with correlated variables (p < 0.05). The new GCT threshold values (GCT < 188 ms, 188 ≤ GCT < 222 ms and GCT ≥ 222 ms) were found according to the lowest root mean square error of approximation value (0.1985) at reactive strength index. Comparisons of GCT groups showed significant differences in force, time, power variables and AT length (p < 0.05). AT length is the main variable differentiating GCT groups: Short AT results in a short GCT and long AT results in a long GCT. This study reveals that SSC can be classified into three groups using new GCT threshold values, offering a new perspective for SSC assessment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
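Note on record 21: the three ground contact time (GCT) groups in the abstract can be written as a simple threshold rule. The sketch below uses the 188 ms and 222 ms cut points as reported; the function name and the group labels "short", "intermediate", and "long" are hypothetical, since the abstract does not name the groups.

```python
def gct_group(gct_ms):
    """Assign a ground contact time (in milliseconds) to one of three groups.

    Cut points follow the abstract: GCT < 188, 188 <= GCT < 222, GCT >= 222.
    The labels are placeholders, not terminology from the study.
    """
    if gct_ms < 188:
        return "short"
    if gct_ms < 222:
        return "intermediate"
    return "long"


# Hypothetical measurements, for illustration only.
print([gct_group(t) for t in (150, 188, 221.9, 222, 260)])
```

Under the conventional two-group scheme, the same measurements would be divided only at 250 ms.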
22. Random forests regression for soft interval data.
- Author
-
Gaona-Partida, Paul, Yeh, Chih-Ching, Sun, Yan, and Cutler, Adele
- Abstract
Analyzing soft interval data for uncertainty quantification has attracted much attention recently. Within this context, regression methods for interval data have been extensively studied. As most existing works focus on linear models, it is important to note that many problems in practice are nonlinear in nature and the development of nonlinear regression tools for interval data is crucial. This paper proposes an interval-valued random forests model that defines the splitting criterion of variance reduction based on an L2-type metric in the space of compact intervals. The model simultaneously considers the centers and ranges of the interval data as well as their possible interactions. Unlike most linear models that require additional constraints to ensure mathematical coherences, the proposed random forests model estimates the regression function in a nonparametric way, and so the predicted interval length is naturally nonnegative without any constraints. Simulation studies show that the new method outperforms typical existing regression methods for various linear, semi-linear, and nonlinear data archetypes and under different error measures. To demonstrate the applicability, a real data example is presented where the price range data of the Dow Jones Industrial Average index and its component stocks are analyzed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
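Note on record 22: the abstract describes a variance-reduction splitting criterion for interval-valued responses. The sketch below shows one way such a criterion could look, under a loud assumption: the squared distance between two intervals is taken to be the squared difference of their centers plus the squared difference of their half-ranges. The paper's actual L2-type metric is not given in the abstract and may weight these terms differently or include interactions, so this is not the authors' method. All function names and data are hypothetical.

```python
def interval_sse(intervals):
    """Sum of squared distances from each interval to the mean interval.

    Assumed metric (not from the paper):
    d^2 = (center difference)^2 + (half-range difference)^2.
    """
    n = len(intervals)
    centers = [(lo + hi) / 2 for lo, hi in intervals]
    radii = [(hi - lo) / 2 for lo, hi in intervals]
    mean_c = sum(centers) / n
    mean_r = sum(radii) / n
    return sum((c - mean_c) ** 2 + (r - mean_r) ** 2
               for c, r in zip(centers, radii))


def best_split(xs, intervals):
    """Find the threshold on a scalar feature that maximises SSE reduction."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    parent = interval_sse(intervals)
    best_threshold, best_reduction = None, 0.0
    for k in range(1, len(order)):
        lo_x, hi_x = xs[order[k - 1]], xs[order[k]]
        if lo_x == hi_x:
            continue  # cannot split between equal feature values
        left = [intervals[i] for i in order[:k]]
        right = [intervals[i] for i in order[k:]]
        reduction = parent - interval_sse(left) - interval_sse(right)
        if reduction > best_reduction:
            best_threshold, best_reduction = (lo_x + hi_x) / 2, reduction
    return best_threshold, best_reduction


# Hypothetical feature values and interval responses, for illustration only.
xs = [1, 2, 3, 4]
intervals = [(0, 1), (0, 1), (4, 7), (4, 7)]
threshold, reduction = best_split(xs, intervals)
print(threshold, reduction)
```

Because each leaf predicts a mean center and a mean half-range, and half-ranges are never negative, the predicted interval length stays nonnegative without extra constraints, in line with the abstract's remark.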
23. A Regression Tree as Acquisition Function for Low-Dimensional Optimisation
- Author
-
de Paz, Erick G. G., Vaquera Huerta, Humberto, Albores Velasco, Francisco Javier, Bauer Mengelberg, John R., Romero Padilla, Juan Manuel, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Goos, Gerhard, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Mezura-Montes, Efrén, editor, Acosta-Mesa, Héctor Gabriel, editor, Carrasco-Ochoa, Jesús Ariel, editor, Martínez-Trinidad, José Francisco, editor, and Olvera-López, José Arturo, editor
- Published
- 2024
- Full Text
- View/download PDF
24. Comparative Study of Supervised Regression Algorithms in Machine Learning
- Author
-
Sabouri, Zineb, Gherabi, Noreddine, Amnai, Mohamed, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Gherabi, Noredine, editor, Awad, Ali Ismail, editor, Nayyar, Anand, editor, and Bahaj, Mohamed, editor
- Published
- 2024
- Full Text
- View/download PDF
25. Predicting Response Latencies on Test Questions Based on Features of the Questions
- Author
-
Denner, Madelynn, Xu, Xiangyu, Ober, Teresa M., Pei, Bo, Cheng, Ying, and Khine, Myint Swe, editor
- Published
- 2024
- Full Text
- View/download PDF
26. Diet and Food Restaurant in the Covid-19 Time by Machine Learning Approaches
- Author
-
Babul Islam, Md., Hasibunnahar, Swarna, Shukla, Piyush Kumar, Shukla, Prashant Kumar, Rawat, Paresh, Chatterjee, Prasenjit, Series Editor, Awasthi, Anjali, Series Editor, Tiwari, Manoj Kumar, Series Editor, Chakraborty, Shankar, Series Editor, Yazdani, Morteza, Series Editor, Kautish, Sandeep, editor, Pamucar, Dragan, editor, Pradeep, N., editor, and Singh, Deepmala, editor
- Published
- 2024
- Full Text
- View/download PDF
27. The historical lepto-variance of the US stock returns
- Author
-
Vassilis Polimenis
- Subjects
total variance ,regression tree ,lepto-variance ,macro-variance ,lepto-ratio ,lepto-regression ,Finance ,HG1-9999 ,Statistics ,HA1-4737 - Abstract
Regression trees (RT) involve sorting samples based on a particular feature and identifying the splitting point that yields the highest drop in variance from a parent node to its children. The optimal factor for reducing mean squared error (MSE) is the target variable itself. Consequently, employing the target variable as the basis for splitting sets an upper limit on the reduction of MSE and, equivalently, a lower limit on the residual MSE. Building upon this observation, we define lepto-regression as the process of constructing an RT of a target feature on itself. Lepto-variance pertains to the portion of variance that cannot be mitigated by any regression tree, providing a measure of inherent variance at a specific tree depth. This concept is valuable as it offers insights into the intrinsic structure of the dataset by establishing an upper boundary on the "resolving power" of RTs for a sample. The maximal variance that can be accounted for by RTs with depths up to k is termed the sample k-bit macro-variance. At each depth, the overall variance within a dataset is thus broken into lepto- and macro-variance. We perform 1- and 2-bit lepto-variance analysis for the entire US stock universe for a large historical period since 1926. We find that the optimal 1-bit split is a 30–70 balance. The two children subsets are centered roughly at −1% and 0.5%. The 1-bit macro-variance is almost 42% of the total US stock variability. The other 58% is structure beyond the resolving power of a 1-bit RT. The 2-bit lepto-variance equals 26.3% of the total, with 42% and 47% of the 1-bit lepto-variance of the left and right subtree, respectively.
- Published
- 2024
- Full Text
- View/download PDF
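The 1-bit lepto-variance described in the abstract of result 27 can be illustrated with a short sketch. This is not the author's code: the function name `one_bit_lepto_variance` and its return layout are hypothetical, and it assumes "1-bit" means a single split of the sorted target values, with variance measured as mean squared deviation over the whole sample.

```python
def one_bit_lepto_variance(values):
    """Best single split of a target on itself (illustrative sketch only).

    Returns (total_var, lepto_var, macro_var, left_share), where lepto_var is
    the residual variance after the best split and macro_var is the variance
    that split explains.
    """
    ys = sorted(values)
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(ys) / n
    total_var = sum((y - mean) ** 2 for y in ys) / n

    # Prefix sums let each candidate split be scored in constant time.
    prefix_sum = [0.0]
    prefix_sq = [0.0]
    for y in ys:
        prefix_sum.append(prefix_sum[-1] + y)
        prefix_sq.append(prefix_sq[-1] + y * y)

    def sse(lo, hi):
        # Sum of squared deviations from the mean for ys[lo:hi].
        count = hi - lo
        s = prefix_sum[hi] - prefix_sum[lo]
        sq = prefix_sq[hi] - prefix_sq[lo]
        return sq - s * s / count

    # A threshold can only separate distinct values, so skip tied neighbours.
    candidates = [(sse(0, k) + sse(k, n), k)
                  for k in range(1, n) if ys[k - 1] < ys[k]]
    if not candidates:
        # All values identical: nothing to split, nothing to explain.
        return total_var, total_var, 0.0, 1.0

    best_sse, best_k = min(candidates)
    lepto_var = best_sse / n
    macro_var = total_var - lepto_var
    return total_var, lepto_var, macro_var, best_k / n
```

Dividing `macro_var` by `total_var` gives the share of variance resolved by the single split, which is the kind of figure the abstract reports for US stock returns; `left_share` is the fraction of observations in the lower child.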
28. Effect of environmental factors on conjugative transfer of antibiotic resistance genes in aquatic settings.
- Author
-
Dadeh Amirfard, Katayoun, Moriyama, Momoko, Suzuki, Satoru, and Sano, Daisuke
- Subjects
- *
HORIZONTAL gene transfer , *DRUG resistance in bacteria , *BACTERIAL transformation , *BACTERIAL conjugation , *REGRESSION trees - Abstract
Antimicrobial-resistance genes (ARGs) are spread among bacteria by horizontal gene transfer; however, the effect of environmental factors on the dynamics of the ARG in water environments has not been very well understood. In this systematic review, we employed the regression tree algorithm to identify the environmental factors that facilitate/inhibit the transfer of ARGs via conjugation in planktonic/biofilm-formed bacterial cells based on the results of past relevant research. Escherichia coli strains were the most studied genus for conjugation experiments as donor/recipient in the intra-genera category. Conversely, Pseudomonas spp., Acinetobacter spp., and Salmonella spp. were studied primarily as recipients across inter-genera bacteria. The conjugation efficiency (ce) was found to be highly dependent on the incubation period. Some antibiotics, such as nitrofurantoin (at ≥0.2 µg ml−1) and kanamycin (at ≥9.5 mg l−1) as well as metallic compounds like mercury (II) chloride (HgCl2, ≥3 µmol l−1), and vanadium (III) chloride (VCl3, ≥50 µmol l−1) had an enhancing effect on conjugation. The highest ce value (−0.90 log10) was achieved at 15°C–19°C, with linoleic acid concentrations <8 mg l−1, a recognized conjugation inhibitor. Identifying critical environmental factors affecting ARG dissemination in aquatic environments will accelerate strategies to control their proliferation and combat antibiotic resistance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Performance Comparison of Machine Learning Models for Concrete Compressive Strength Prediction.
- Author
-
Sah, Amit Kumar and Hong, Yao-Ming
- Subjects
- *
ARTIFICIAL neural networks , *COMPRESSIVE strength , *STANDARD deviations , *CONCRETE testing , *MACHINE performance , *REGRESSION trees , *MACHINE learning - Abstract
This study explores the prediction of concrete compressive strength using machine learning models, aiming to overcome the time-consuming and complex nature of conventional methods. Four models—an artificial neural network (ANN), a multiple linear regression, a support vector machine, and a regression tree—are employed and compared for performance, using evaluation metrics such as mean absolute deviation, root mean square error, coefficient of correlation, and mean absolute percentage error. After preprocessing 1030 samples, the dataset is split into two subsets: 70% for training and 30% for testing. The ANN model, further divided into training, validation (15%), and testing (15%), outperforms others in accuracy and efficiency. This outcome streamlines compressive strength determination in the construction industry, saving time and simplifying the process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. PREDICTING BODY WEIGHT OF THREE CHICKEN GENOTYPES FROM LINEAR BODY MEASUREMENTS USING MARS AND CART DATA MINING ALGORITHMS.
- Author
-
ASSAN, N., MPOFU, M., MUSASIRA, M., MOKOENA, K., TYASI, T. L., and MWAREYA, N.
- Subjects
BODY weight ,STANDARD deviations ,LENGTH measurement ,DATA mining ,CHICKENS - Abstract
The aim of the current study was to predict the body weight from linear body measurements of Astrolope, Boschveld and indigenous Sacco genotype using Classification and regression tree (CART) and Multivariate Adaptive Regression Spline (MARS) algorithm. A total of 389 body weight (BW) records, including five continuous predictors such as Neck length (NL), body circumference (BC), shank length (SL), body length (BL) and shank circumference (SC) were used. The best model was selected based on goodness of fit, such as standard deviation ratio (SDR), root mean square error (RMSE), coefficient of variation (CV), adjusted coefficient of determination (ARsq), coefficient of determination (Rsq) and Pearson's correlation coefficients (PC). The Rsq (%) values ranged from 59 (MARS) to 69 (CART). The lowest SDR was recorded by CART (0.56) and the highest by MARS (0.70). The CART was selected to be the best algorithm with sex, genotype, SC, SL, BL, NL, and BC as influential predictors of BW. The heaviest body weight on females of genotype (Boschveld, Sacco) was recorded when BL was less than 43 cm and BL higher than 47 cm. The goodness of fit criteria suggest that CART model outperformed the MARS model on predicting the body weight of the three genotypes. The findings will assist farmers in the prediction of body weight and selection of heavier chickens. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Short-Term Load Forecasting Based on Optimized Random Forest and Optimal Feature Selection.
- Author
-
Magalhães, Bianca, Bento, Pedro, Pombo, José, Calado, Maria do Rosário, and Mariano, Sílvio
- Subjects
- *
RANDOM forest algorithms , *FEATURE selection , *REGRESSION trees , *FORECASTING , *ELECTRIC power consumption , *COST control - Abstract
Short-term load forecasting (STLF) plays a vital role in ensuring the safe, efficient, and economical operation of power systems. Accurate load forecasting provides numerous benefits for power suppliers, such as cost reduction, increased reliability, and informed decision-making. However, STLF is a complex task due to various factors, including non-linear trends, multiple seasonality, variable variance, and significant random interruptions in electricity demand time series. To address these challenges, advanced techniques and models are required. This study focuses on the development of an efficient short-term power load forecasting model using the random forest (RF) algorithm. RF combines regression trees through bagging and random subspace techniques to improve prediction accuracy and reduce model variability. The algorithm constructs a forest of trees using bootstrap samples and selects random feature subsets at each node to enhance diversity. Hyperparameters such as the number of trees, minimum sample leaf size, and maximum features for each split are tuned to optimize forecasting results. The proposed model was tested using historical hourly load data from four transformer substations supplying different campus areas of the University of Beira Interior, Portugal. The training data were from January 2018 to December 2021, while the data from 2022 were used for testing. The results demonstrate the effectiveness of the RF model in forecasting short-term hourly and one day ahead load and its potential to enhance decision-making processes in smart grid operations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Multiscale computation of different plan curvature forms to enhance the prediction of soil properties in a low-relief watershed.
- Author
-
Khanifar, Javad and Khademalrasoul, Ataallah
- Subjects
- *
CART algorithms , *DIGITAL soil mapping , *CURVATURE , *SOIL moisture , *DIGITAL elevation models - Abstract
This study focuses on the multiscale calculation of different plan curvature forms to enhance the modeling of soil penetration resistance and gravimetric soil water content utilizing the classification and regression trees algorithm in a low-relief watershed. To that end, three forms of plan curvature were derived using the Wood method from a two-meter digital elevation model on six neighborhood sizes. The results showed that the neighborhood size influenced the plan curvature values and there was little difference between the utilization of three forms of plan curvature in the landform determination. The modeling results indicated that the three forms of plan curvature on most neighborhood scales have different contributions to each other in modeling the spatial variability of each soil property. The neighborhood scale was a critical factor in soil modeling because it controls the smoothing rate of plan curvature. The overall results suggest that soil models with poor performance could be constructed if the plan curvature forms and the neighborhood size are not considered in the geomorphometric analysis. Therefore, it is recommended to use the procedure implemented in this study for digital soil mapping in various regions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. CAPM and a new investment decision method.
- Author
-
Taussig, Roi D.
- Subjects
REGRESSION trees ,MARKET prices ,MARKET pricing ,RESEARCH personnel - Abstract
The traditional CAPM explains the return on each security, while the current study suggests a new methodology for "picking" a broad index, based on regression trees. Analysts' recommendations on market prices are analyzed, and a one to three‐stage method is employed. The analysis suggests a few rules of thumb (only one to three stages) for buying or selling the CRSP US market index. The findings are robust, and the rules are reliable. Researchers and practitioners may benefit greatly from the new rules. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Taxi-out time prediction at Mohammed V Casablanca Airport.
- Author
-
Zbakh, Douae, El Gonnouni, Amina, Benkacem, Abderrahmane, Kasttet, Mohammed Said, and Lyhyaoui, Abdelouahid
- Subjects
AIR travel ,TRAFFIC estimation ,STANDARD deviations ,SUPPORT vector machines ,MACHINE learning ,REGRESSION trees ,AIRPORTS - Abstract
Airports are vital for global connectivity. However, the increasing volume of air travel has presented significant challenges in airport management. Accurate predictions of taxi-out times (TXOT) offer potential to enhance airport performance, minimize delays, optimize airline schedules, and enhance customer satisfaction. This paper focuses on developing a machine learning model to forecast taxi-out times at Mohammed V Airport. Historical taxiing data from various airports will be analyzed to predict taxi-out times based on diverse runway-stand combinations and congestion levels. We used neural network (NN), support vector machines (SVM), and regression tree (RT) in order to create a real-time model that forecasts TXOT and congestion levels for different runway-stand combinations. The result showed that the NN model outperformed other forecasting models when their performances are compared using the mean absolute percentage error and root mean square error as accuracy measures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. A Novel Machine Learning-Based Approach for Fault Detection and Location in Low-Voltage DC Microgrids.
- Author
-
Salehimehr, Sirus, Miraftabzadeh, Seyed Mahdi, and Brenna, Morris
- Abstract
DC microgrids have gained significant attention in recent years due to their potential to enhance energy efficiency, integrate renewable energy sources, and improve the resilience of power distribution systems. However, the reliable operation of DC microgrids relies on the early detection and location of faults to ensure an uninterrupted power supply. This paper aims to develop fast and reliable fault detection and location mechanisms for DC microgrids, thereby enhancing operational efficiency, minimizing environmental impact, and contributing to resource conservation and sustainability goals. The fault detection method is based on compressed sensing (CS) and Regression Tree (RT) techniques. Besides, an accurate fault location method using the feature matrix and long short-term memory (LSTM) model combination has been provided. To implement the proposed fault detection and location method, a DC microgrid equipped with photovoltaic (PV) panels, the vehicle-to-grid (V2G) charging station, and a hybrid energy storage system (ESS) are used. The simulation results represent the proposed methods' superiority over the recent studies. The fault occurrence in the studied DC microgrid is detected in 1 ms, and the proposed fault location method locates the fault with an accuracy of more than 93%. The presented techniques enhance DC microgrid reliability while conserving renewable resources, vital to promoting a greener and more sustainable power grid. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. 2D score-based estimation of heterogeneous treatment effects
- Author
-
Ye Steven Siwei, Chen Yanzhen, and Padilla Oscar Hernan Madrid
- Subjects
observational data ,subgroup treatment effects ,regression tree ,matching ,62d20 ,62g05 ,Mathematics ,QA1-939 ,Probabilities. Mathematical statistics ,QA273-280 - Abstract
Statisticians show growing interest in estimating and analyzing heterogeneity in causal effects in observational studies. However, there usually exists a trade-off between accuracy and interpretability for developing a desirable estimator for treatment effects, especially in the case when there are a large number of features in estimation. To make efforts to address the issue, we propose a score-based framework for estimating the conditional average treatment effect (CATE) function in this article. The framework integrates two components: (i) leverage the joint use of propensity and prognostic scores in a matching algorithm to obtain a proxy of the heterogeneous treatment effects for each observation and (ii) utilize nonparametric regression trees to construct an estimator for the CATE function conditioning on the two scores. The method naturally stratifies treatment effects into subgroups over a 2d grid whose axes are the propensity and prognostic scores. We conduct benchmark experiments on multiple simulated data and demonstrate clear advantages of the proposed estimator over state-of-the-art methods. We also evaluate empirical performance in real-life settings, using two observational data from a clinical trial and a complex social survey, and interpret policy implications following the numerical results.
- Published
- 2023
- Full Text
- View/download PDF
37. Patterns and drivers of amphibian and reptile road mortality vary among species and across scales: Evidence from eastern Ontario, Canada
- Author
-
Joshua D. Jones, Ori Urquhart, Evelyn Garrah, Ewen Eberhardt, and Ryan K. Danby
- Subjects
Road ecology ,Wildlife-vehicle collisions ,Roadkill ,Conservation biology ,Herpetofauna ,Regression tree ,Ecology ,QH540-549.5 - Abstract
The mortality of wildlife on roadways is a major conservation concern worldwide. Amphibians and reptiles are especially vulnerable to vehicular collisions, and this is of particular concern in the Frontenac Arch Biosphere Reserve (Ontario, Canada) where several species are near their geographic limits of distribution and designated as species-at-risk. We completed regular surveys (n = 270) of two major highways in the Reserve, each slightly less than 40 km in length. All observations of wildlife-vehicle collisions were documented for two years on each road, including 18,278 frogs, turtles, and snakes. We used kernel density estimation to map relative magnitude of this mortality and built a suite of regression tree models to assess the influence of landcover and other habitat factors on roadkill at two scales (1 ha and 20 ha). Sample size was large enough to conduct species-level analyses for Chrysemys picta marginata (midland painted turtle) and Nerodia sipedon (northern watersnake). Spatial clustering of roadkill was evident on both roads and for all taxa. However, the extent of clustering varied between the two roadways due to differences in landcover pattern and clustering was more discrete for frogs and turtles than for snakes. For frogs, turtles, and northern watersnakes we found that elevated levels of mortality were positively associated with the amount of wetland and open water in adjacent areas as well as the proximity of water features. However, mortality locations for other species of snakes were more closely associated with upland habitat types. While some generalities emerge from our study, the variation also suggests that caution be exercised when attempting to extend results to different taxa and roadways, especially since these results may vary with scale. Nonetheless, scale-related differences can be informative for identifying the location of roadkill mitigation efforts and we illustrate how such an approach could be implemented for snakes that exhibit less discrete clustering of mortality.
- Published
- 2024
- Full Text
- View/download PDF
38. A regression tree method for longitudinal and clustered data with multivariate responses.
- Author
-
Jing, Wenbo and Simonoff, Jeffrey S.
- Subjects
- *
REGRESSION trees , *PANEL analysis , *LONGITUDINAL method , *MULTICASTING (Computer networks) - Abstract
In this paper, we propose a tree-based method called Multivariate RE-EM tree, which combines the regression tree and the linear mixed effects model for modeling multivariate response longitudinal or clustered data. The Multivariate RE-EM tree method estimates a population-level single tree structure that is driven by the multiple responses simultaneously and object-level random effects for each response variable, where correlation between the response variables and between the associated random effects are each allowed. Through simulation studies, we verify the advantage of the Multivariate RE-EM tree over the use of multiple univariate RE-EM trees and the Multivariate Regression Tree. We apply the Multivariate RE-EM tree to analyze a real data set that contains multidimensional nonfinancial characteristics of poverty of different countries as responses, and various potential causes of poverty as predictors. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Explaining Central Government's Tax Revenue Categories through the Bradley-Terry Regression Trunk Model.
- Author
-
Baldassarre, Alessio, D'Ambrosio, Antonio, and Conversano, Claudio
- Subjects
- *
INTERNAL revenue , *INCOME tax , *PUBLIC finance , *REGRESSION trees , *LOG-linear models - Abstract
The Bradley-Terry Regression Trunk (BTRT) model combines the log-linear Bradley-Terry model, including subject-specific covariates, with a particular tree-based model, the so-called regression trunk. It aims to consider simultaneously the main effects and the interaction effects of covariates on data expressed as paired comparisons. We apply this model to financial data expressed as rankings and then transformed into paired comparisons. Tax revenues differentiated by category represent the statistical units of the analysis (i.e., taxes on income, social security contributions, taxes on property, and taxes on goods and services). We combine data from OECD, World Bank, and IMF databases for the year 2018 to investigate the effect size of socio-economic covariates and their interaction on the composition of tax revenues for a set of 100 countries worldwide. We also present a comparison with a more established method proposed in tax determinants literature and with two alternative models used for matched pairs. Finally, we discuss the implications of reported results for stakeholders and policymakers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. Predicting Claim Reserves When the Loss Development Factors are Unstable: A Case Study from Indonesia’s General Insurance Company
- Author
-
Nugraha, Ruth Cornelia, Qoyyimi, Danang Teguh, Mustapha, Aida, editor, Ibrahim, Norzuria, editor, Basri, Hatijah, editor, Rusiman, Mohd Saifullah, editor, and Zuhaib Haider Rizvi, Syed, editor
- Published
- 2023
- Full Text
- View/download PDF
41. A Gaussian process regression-based Noise level Prediction technique for assisting Image Super-resolution
- Author
-
Rai, Deepak, Rajput, Shyam Singh, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Arya, Karm Veer, editor, Tripathi, Vipin Kumar, editor, Rodriguez, Ciro, editor, and Yusuf, Eddy, editor
- Published
- 2023
- Full Text
- View/download PDF
42. Prediction of CO Emission in Cars Using Machine Learning Algorithms
- Author
-
Sayed, Gehad Ismail, Hassanien, Aboul Ella, Kacprzyk, Janusz, Series Editor, Hassanien, Aboul Ella, editor, and Darwish, Ashraf, editor
- Published
- 2023
- Full Text
- View/download PDF
43. The Bradley–Terry Regression Trunk approach for Modeling Preference Data with Small Trees.
- Author
-
Baldassarre, Alessio, Dusseldorp, Elise, D'Ambrosio, Antonio, Rooij, Mark de, and Conversano, Claudio
- Subjects
LOG-linear models ,DATA modeling ,JUDGES ,REGRESSION analysis ,TREES - Abstract
This paper introduces the Bradley–Terry regression trunk model, a novel probabilistic approach for the analysis of preference data expressed through paired comparison rankings. In some cases, it may be reasonable to assume that the preferences expressed by individuals depend on their characteristics. Within the framework of tree-based partitioning, we specify a tree-based model estimating the joint effects of subject-specific covariates over and above their main effects. We, therefore, combine a tree-based model and the log-linear Bradley-Terry model using the outcome of the comparisons as response variable. The proposed model provides a solution to discover interaction effects when no a-priori hypotheses are available. It produces a small tree, called trunk, that represents a fair compromise between a simple interpretation of the interaction effects and an easy to read partition of judges based on their characteristics and the preferences they have expressed. We present an application on a real dataset following two different approaches, and a simulation study to test the model's performance. Simulations showed that the quality of the model performance increases when the number of rankings and objects increases. In addition, the performance is considerably amplified when the judges' characteristics have a high impact on their choices. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Neural Networks with Dependent Inputs.
- Author
-
Boskabadi, Mostafa and Doostparast, Mahdi
- Subjects
REGRESSION trees ,NEUROPLASTICITY ,WEIGHT training ,MONTE Carlo method ,DATA science - Abstract
Neural networks and decision tree algorithms are essential tools in machine learning and data science. They deal with patterns among inputs and provide predictions for targets. In this article, we use a hybrid approach in regression trees by incorporating possible dependencies among inputs and apply neural networks in terminal nodes. The proposed approach implements neural networks on the basis of dependency structures among inputs. We allow that the weights in training neural networks differ in various terminal nodes. In both regression and classification problems, the performance of the new approach is assessed by analyzing various real datasets and by conducting a Monte–Carlo simulation study. We show that the proposed approach provides more flexibility for neural networks when associations among inputs are observed. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Decision Tree-Supported Analysis of Gallium Arsenide Growth Using the LEC Method.
- Author
-
Tang, Xia, Chappa, Gagan Kumar, Vieira, Lucas, Holena, Martin, and Dropka, Natasha
- Subjects
DECISION making ,GALLIUM arsenide ,COMPUTATIONAL fluid dynamics ,DECISION trees ,CRYSTAL growth - Abstract
In this study, an axisymmetric Czochralski furnace model for the LEC growth of gallium arsenide is presented. We produced 88 datasets through computational fluid dynamics simulations. Among the many parameters that affect crystal growth, a total of 13 input parameters were selected, including the geometry and material parameters of the hot zone (crucible, heaters, radiation shield, and crystal), as well as the process parameters (such as pulling and rotation rates, heating power, etc.). Voronkov criteria (v/Gn), interface deflection, and the average interface temperature gradient were selected as the output parameters. We carried out a correlation analysis between the variables and used decision trees to study the impact of the 13 input variables on the output variables. The results indicated that in the growth of gallium arsenide, the main factor affecting interface deflection and the average interface thermal gradients is the crucible rotation rate. For v/Gn, it is the pulling rate. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Comparative analysis of regression algorithms for the prediction of NavIC differential corrections.
- Author
-
Karthan, Madhu Krishna and Perumalla, Naveen Kumar
- Subjects
- *
REGRESSION analysis , *ARTIFICIAL satellites in navigation , *COMPARATIVE studies , *FORECASTING , *CITIES & towns , *REGRESSION trees - Abstract
Indian Regional Navigation Satellite System (IRNSS) or Navigation with Indian Constellation (NavIC) provides positioning, navigation and timing information services to various users in Indian region. Standalone NavIC may not meet the position accuracies for certain applications such as civil aviation. Differential NavIC is used for improving the position accuracy of rover receiver, which makes use of differential corrections (transmitted from reference station). However, if the satellite signals are temporarily lost due to abruptly changing atmosphere, satellite health issues or if the satellite signals are attenuated due to city infrastructures in urban areas, tree canopies, the accuracy of NavIC will be degraded. This article compares regression tree and bagging tree based differential corrections prediction algorithm with the actual differential corrections, by considering the NavIC satellite signal strength (C/No) and elevation angle (El), to improve the NavIC positioning accuracy. The improvement in the position accuracy is obtained by utilizing predicted differential corrections. The position accuracy of rover using actual differential corrections (2DRMS – 3.09 m), regression tree predicted differential corrections (2DRMS – 5.96 m) and bagged tree predicted differential corrections (2DRMS – 3.06 m) are compared. Here, the rover accuracy using actual differential corrections and bagged tree predicted differential corrections are approximately equal. So, the position accuracy using bagged tree predicted differential corrections is more accurate when compared to regression tree predicted differential corrections. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Predicting and Mapping Dominant Height of Oriental Beech Stands Using Environmental Variables in Sinop, Northern Turkey.
- Author
-
Yener, Ismet and Guvendi, Engin
- Abstract
The dominant height of forest stands (SDH) is an essential indicator of site productivity in operational forest management. It refers to the capacity of a particular site to support stand growth. Sites with taller dominant trees are typically more productive and may be more suitable for certain management practices. The present study investigated the relationship between the dominant height of oriental beech stands and numerous environmental variables, including physiographic, climatic, and edaphic attributes. We developed models and generated maps of SDH using multilinear regression (MLR) and regression tree (RT) techniques based on environmental variables. With this aim, the total height, diameter at breast height, and age of sample trees were measured on 222 sample plots. Additionally, topsoil samples (0–20 cm) were collected from each plot to analyze the physical and chemical soil properties. The statistical results showed that latitude, elevation, mean annual maximum temperature, and several soil attributes (i.e., bulk density, field capacity, organic carbon, and pH) were significantly correlated with the SDH. The RT model outperformed the MLR model, explaining 57% of the variation in the SDH with an RMSE of 2.37 m. The maps generated by both models clearly indicated an increasing trend in the SDH from north to south, suggesting that elevation above sea level is a driving factor shaping forest canopy height. The assessments, models, and maps provided by this study can be used by forest planners and land managers, as there is no reliable data on site productivity in the studied region. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Corruption, quality of institutions and growth
- Author
-
Beyaert, Arielle, García-Solanes, José, and Lopez-Gomez, Laura
- Published
- 2023
- Full Text
- View/download PDF
49. Segmenting tourists by length of stay using regression tree models
- Author
-
Jackman, Mahalia and Naitram, Simon
- Published
- 2023
- Full Text
- View/download PDF
50. Modelling the Symphyotrichum lanceolatum invasion in Slovakia, Central Europe
- Author
-
Michalová, Martina, Hrabovský, Michal, Kubalová, Silvia, and Miháliková, Tatiana
- Published
- 2024
- Full Text
- View/download PDF