50 results for "Additive model"
Search Results
2. Regression Modeling of Machining Processes
- Author
Tchigirinsky, Yu. L., Chigirinskaya, N. V., Tikhonova, Z. S., Radionov, Andrey A., editor, and Gasiyarov, Vadim R., editor
- Published
- 2021
- Full Text
- View/download PDF
3. Nighttime light intensity and child health outcomes in Bangladesh
- Author
Islam, Mohammad Rafiqul, Alam, Masud, Afzal, Munshi Naser İbne, and Alam, Sakila
- Published
- 2023
- Full Text
- View/download PDF
4. FITradeoff Method for the Location of Healthcare Facilities Based on Multiple Stakeholders’ Preferences
- Author
Dell’Ovo, Marta, Frej, Eduarda Asfora, Oppio, Alessandra, Capolongo, Stefano, Morais, Danielle Costa, de Almeida, Adiel Teixeira, Chen, Ye, editor, Kersten, Gregory, editor, Vetschera, Rudolf, editor, and Xu, Haiyan, editor
- Published
- 2018
- Full Text
- View/download PDF
5. Distributive Fairness in Educational Assessment: Psychometric Theory Meets Fuzzy Logic
- Author
Vossen, Paul Hubert, Balas, Valentina Emilia, editor, Jain, Lakhmi C., editor, and Balas, Marius Mircea, editor
- Published
- 2018
- Full Text
- View/download PDF
6. Empirical Methodology and Baseline Regression Results
- Author
Oto-Peralías, Daniel, and Romero-Ávila, Diego
- Published
- 2017
- Full Text
- View/download PDF
7. A DSS for Resolving Evaluation of Criteria by Interactive Flexible Elicitation Procedure
- Author
de Almeida, Adiel Teixeira, Costa, Ana Paula Cabral Seixas, de Almeida, Jonatas Araujo, de Almeida-Filho, Adiel Teixeira, Dargam, Fátima, editor, Hernández, Jorge E., editor, Zaraté, Pascale, editor, Liu, Shaofeng, editor, Ribeiro, Rita, editor, Delibašić, Boris, editor, and Papathanasiou, Jason, editor
- Published
- 2014
- Full Text
- View/download PDF
8. Multi-generation genomic prediction of maize yield using parametric and non-parametric sparse selection indices
- Author
Manje Gowda, Marco Lopez-Cruz, Gustavo de los Campos, Yoseph Beyene, Paulino Pérez-Rodríguez, and José Crossa
- Subjects
Genome, Nonparametric statistics, Genomics, Biology, Quantitative trait, Genetic models, Single nucleotide polymorphism, Zea mays, Article, Set (abstract data type), Kernel method, Phenotype, Kernel (statistics), Statistics, Genetics, Additive model, Genetics (clinical), Selection (genetic algorithm), Predictive modelling, Parametric statistics
- Abstract
Genomic prediction models are often calibrated using multi-generation data. Over time, as data accumulate, training data sets become increasingly heterogeneous. Differences in allele frequency and linkage disequilibrium patterns between the training and prediction genotypes may limit prediction accuracy. This leads to the question of whether all available data or a subset of it should be used to calibrate genomic prediction models. Previous research on training set optimization has focused on identifying a subset of the available data that is optimal for a given prediction set. However, this approach does not contemplate the possibility that different training sets may be optimal for different prediction genotypes. To address this problem, we recently introduced a sparse selection index (SSI) that identifies an optimal training set for each individual in a prediction set. Using additive genomic relationships, the SSI can provide increased accuracy relative to genomic-BLUP (GBLUP). Non-parametric genomic models using Gaussian kernels (KBLUP) have, in some cases, yielded higher prediction accuracies than standard additive models. Therefore, here we studied whether combining SSIs and kernel methods could further improve prediction accuracy when training genomic models using multi-generation data. Using four years of doubled haploid maize data from the International Maize and Wheat Improvement Center (CIMMYT), we found that when predicting grain yield, the KBLUP outperformed the GBLUP, and that using SSI with additive relationships (GSSI) led to 5–17% increases in accuracy relative to the GBLUP. However, differences in prediction accuracy between the KBLUP and the kernel-based SSI were smaller and not always significant.
- Published
- 2021
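For readers who want to experiment with the GBLUP baseline this entry builds on, here is a minimal Python sketch. It assumes simulated genotype/phenotype arrays rather than the CIMMYT data, fixes the variance ratio by hand instead of estimating it, and does not implement the sparse selection index itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 500                                      # individuals, SNP markers
X = rng.integers(0, 3, size=(n, p)).astype(float)    # 0/1/2 allele counts
beta = rng.normal(0, 0.1, p)
y = X @ beta + rng.normal(0, 1.0, n)                 # simulated phenotype

# Additive genomic relationship matrix (VanRaden-style centering)
freq = X.mean(axis=0) / 2.0
Z = X - 2.0 * freq
G = Z @ Z.T / (2.0 * np.sum(freq * (1.0 - freq)))

# GBLUP prediction with a fixed ratio lambda = sigma_e^2 / sigma_g^2
lam = 1.0
train, test = np.arange(150), np.arange(150, n)
K = G[np.ix_(train, train)] + lam * np.eye(len(train))
alpha = np.linalg.solve(K, y[train] - y[train].mean())
pred = y[train].mean() + G[np.ix_(test, train)] @ alpha
print("accuracy:", np.corrcoef(pred, y[test])[0, 1])
```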
9. Health gap for multimorbidity: comparison of models combining uniconditional health gap
- Author
Park, Bomi, Ock, Minsu, Jo, Min-Woo, Lee, Hye Ah, Lee, Eun-Kyung, Park, Bohyun, and Park, Hyesook
- Published
- 2020
- Full Text
- View/download PDF
10. Gaussian Mixture Model Based Semi-supervised Sparse Representation for Face Recognition
- Author
Ying Wen and Xinxin Shan
- Subjects
Machine learning, Computer science, Gaussian, Supervised learning, Pattern recognition, Semi-supervised learning, Sparse approximation, Mixture model, Facial recognition system, Artificial intelligence, Additive model
- Abstract
Sparse representation generally relies on supervised learning; however, samples in real life are often unlabeled, and sparse representation cannot make use of the information in unlabeled samples. In this paper, we propose a Gaussian Mixture Model based Semi-supervised Sparse Representation (GSSR) for face recognition, which takes full advantage of unlabeled samples to improve the performance of sparse representation. Firstly, we present a semi-supervised sparse representation, which is a linear additive model with rectification that makes all rectified samples conform to a Gaussian distribution. Then, we construct a new dictionary derived from predicting the labels of unlabeled samples through the Expectation-Maximization algorithm. Finally, we embed the new dictionary into sparse representation to recognize faces. Experiments on the AR, LFW and PIE databases show that our method effectively improves classification accuracy and retains its advantage even with only a few unlabeled samples.
- Published
- 2021
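A schematic of the EM-based pseudo-labeling step only, not the full GSSR pipeline: it assumes toy feature vectors and uses scikit-learn's GaussianMixture, with the enlarged labeled pool standing in for the reconstructed dictionary.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Toy "face" features: two labeled classes plus an unlabeled pool
labeled = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
labels = np.array([0] * 20 + [1] * 20)
unlabeled = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(3, 1, (50, 5))])

# One Gaussian per class, initialized from labeled class means; EM then
# assigns the unlabeled samples to a component
means = np.vstack([labeled[labels == c].mean(axis=0) for c in (0, 1)])
gmm = GaussianMixture(n_components=2, means_init=means, random_state=1)
gmm.fit(np.vstack([labeled, unlabeled]))
pseudo = gmm.predict(unlabeled)

# The now fully-labeled pool could serve as the sparse-coding dictionary
dictionary = np.vstack([labeled, unlabeled])
print("first pseudo-labels:", pseudo[:10])
```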
11. Prioritizing Improvement Actions in a Fish Distribution Company: Integrating Elicitation by Decomposition and Holistic Evaluation with FITradeoff Method
- Author
Eduarda Asfora Frej, Adiel Teixeira de Almeida, and Marina Carvalhedo Correia
- Subjects
Strategic planning, Decision support system, Process management, Computer science, Process (engineering), Critical success factor, Decomposition (computer science), Additive model, SWOT analysis, Preference
- Abstract
In this paper, a prioritization problem concerning strategic improvement actions of a Fresh Fish Distribution Company (FFDC) is approached. The improvement actions to be prioritized are defined based on critical success factors derived from the SWOT analysis of the company, conducted by multiple stakeholders of the organization. Four criteria were defined to evaluate these actions, and the FITradeoff (Flexible and Interactive Tradeoff) multicriteria method is then applied to aid the decision process and preference assessment. The method works with partial information from the decision makers (DMs) to elicit criteria scaling constants in additive models, using a structured elicitation process based on tradeoffs. Through an interactive Decision Support System (DSS), this work illustrates how the decision process can be carried out with FITradeoff based on an integration of two paradigms in preference modeling: elicitation by decomposition and holistic evaluation.
- Published
- 2021
12. Generalized Additive Modeling for Learning Trajectories in E-Learning Environments
- Author
Wim Van Den Noortgate, Dries Debeer, Jinho Kim, and Jung Yeon Park
- Subjects
Computer science, E-learning (theory), Learning environment, Linear prediction, Machine learning, Generalized linear mixed model, Component (UML), A priori and a posteriori, Artificial intelligence, Additive model, Parametric statistics
- Abstract
Adaptive E-learning is growing in popularity as it personalizes recommendations in response to learners’ learning needs. An a priori expectation of the learning environment is that learners’ performance levels may change in real time as they complete a sequence of items and receive feedback. Also, learners’ learning (performance) trajectories may be irregularly shaped over time. Therefore, a modeling approach that flexibly explores a learner’s learning change is desirable. In this study, we demonstrate the applicability of a semi-parametric modeling approach that can estimate learners’ unique learning trajectories in the E-learning environment. We use a generalized additive mixed model that integrates properties of generalized linear mixed models with those of additive models, in which the linear predictor is given by a sum of smooth functions of the covariates as well as a parametric component. The model we consider explores the effect of the time that learners spend inside and outside the learning environment. We demonstrate its applicability to log data generated by a real-life E-learning environment.
- Published
- 2021
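A hedged sketch of a plain GAM with one smooth term per time covariate, assuming the pygam library and synthetic response data; the paper's full generalized additive mixed model with learner-specific effects is not reproduced here.

```python
import numpy as np
from pygam import LogisticGAM, s

rng = np.random.default_rng(2)
n = 500
t_in = rng.uniform(0, 10, n)    # hours spent inside the environment
t_out = rng.uniform(0, 10, n)   # hours elapsed outside it
# Irregularly shaped trajectory: performance rises, plateaus, dips
logit = -1 + 1.5 * np.sin(t_in / 3) + 0.1 * t_out
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))   # item correct/incorrect

X = np.column_stack([t_in, t_out])
gam = LogisticGAM(s(0) + s(1)).fit(X, y)        # one smooth per covariate
gam.summary()                                    # smoothing terms and fit stats
```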
13. Explainable Boosting Machine for Predicting Alzheimer’s Disease from MRI Hippocampal Subfields
- Author
Alessia Sarica, Aldo Quattrone, and Andrea Quattrone
- Subjects
Boosting (machine learning), Computer science, Context (language use), Disease, Intelligibility (communication), Machine learning, Test set, Pairwise comparison, Artificial intelligence, Additive model, Interpretability
- Abstract
Although automatic prediction of Alzheimer’s disease (AD) from Magnetic Resonance Imaging (MRI) has shown excellent performance, Machine Learning (ML) algorithms often provide high accuracy at the expense of interpretability of findings. Indeed, building ML models that are understandable is of fundamental importance in the clinical context, especially for early diagnosis of neurodegenerative diseases. Recently, a novel interpretability algorithm has been proposed, the Explainable Boosting Machine (EBM), a glassbox model based on Generalized Additive Models plus Interactions (GA2Ms) and designed to achieve optimal accuracy while providing intelligibility. Thus, the aim of the present study was to assess, for the first time, the reliability of the EBM in predicting conversion to AD and its ability to explain its predictions. In particular, two hundred brain MRIs from ADNI of Mild Cognitive Impairment (MCI) patients, equally divided into stable (sMCI) and progressive (pMCI), were processed with FreeSurfer to extract twelve hippocampal subfield volumes, which have already shown good AD prediction power. EBM models with and without pairwise interactions were built on the training set (80%) comprising these volumes, and global explanations were investigated. The performance of the classifiers was evaluated with AUC-ROC on the test set (20%), and local explanations of four randomly selected test patients (sMCIs and pMCIs, correctly classified and misclassified) were given. EBMs without and with pairwise interactions showed accuracies of 80.5% and 84.2%, respectively, thus demonstrating high prediction accuracy. Moreover, the EBM provided practical clinical knowledge on why a patient was correctly or incorrectly predicted as AD and which hippocampal subfields drove the prediction.
- Published
- 2021
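A minimal sketch of an EBM fit, assuming the interpret (InterpretML) package and random stand-in features in place of the twelve hippocampal subfield volumes.

```python
import numpy as np
from interpret.glassbox import ExplainableBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 200
X = rng.normal(0, 1, (n, 12))       # stand-in for 12 subfield volumes
y = (X[:, 0] - 0.8 * X[:, 3] + rng.normal(0, 1, n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=3)
ebm = ExplainableBoostingClassifier(interactions=10, random_state=3)
ebm.fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, ebm.predict_proba(X_te)[:, 1]))

# Global explanation object holds per-feature (and pairwise) shape functions
global_exp = ebm.explain_global()
```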
14. Interpreting Classification Models Using Feature Importance Based on Marginal Local Effects
- Author
Kellyton dos Santos Brito, Paulo J. L. Adeodato, and Rogerio Luiz Cardoso Silva Filho
- Subjects
Mechanism (biology), Computer science, Machine learning, Logistic regression, Feature (computer vision), Metric (mathematics), Applied research, Artificial intelligence, Scale (map), Additive model, Interpretability
- Abstract
Machine learning models are widespread in many different fields due to their remarkable performance in many tasks. Some fields require greater interpretability, which often means it is necessary to understand the mechanism underlying the algorithms. Feature importance is the most common explanation and is essential in data mining, especially in applied research. There is a frequent need to compare the effect of features over time, across models, or even across studies. For this, a single metric for each feature, shared by all, may be more suitable. Thus, analysts may gain better first-order insights into feature behavior across these different scenarios. The β-coefficients of additive models, such as logistic regressions, have been widely used for this purpose. They describe the relationships among predictors and outcomes in a single number, indicating both their direction and size. However, for black-box models, there is no metric with these same characteristics. Furthermore, even the β-coefficients of logistic regression models have limitations. Hence, this paper discusses these limitations together with existing alternatives for overcoming them, and proposes new metrics of feature importance. As with the coefficients, these metrics indicate the size and direction of each feature’s effect, but on the probability scale within a model-agnostic framework. An experiment conducted on openly available breast cancer data from the UCI Archive verified the suitability of these metrics, and another on real-world data demonstrated how they may be helpful in practice.
- Published
- 2021
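The paper's exact metrics are not reproduced in this listing, so the sketch below illustrates only the general idea: a model-agnostic, probability-scale effect obtained by averaging the change in predicted probability under a perturbation of one feature, here with scikit-learn's bundled breast cancer data.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

X, y = load_breast_cancer(return_X_y=True)
model = RandomForestClassifier(random_state=0).fit(X, y)

def average_marginal_effect(model, X, j, delta=None):
    """Mean change in predicted probability when feature j increases by
    one standard deviation, holding the other features fixed."""
    delta = X[:, j].std() if delta is None else delta
    X_up = X.copy()
    X_up[:, j] = X_up[:, j] + delta
    return np.mean(model.predict_proba(X_up)[:, 1]
                   - model.predict_proba(X)[:, 1])

# Signed, beta-like importance for each feature, on the probability scale
effects = [average_marginal_effect(model, X, j) for j in range(X.shape[1])]
print(np.round(effects[:5], 4))
```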
15. Regression Modeling of Machining Processes
- Author
Z. S. Tikhonova, Yu. L. Tchigirinsky, and N. V. Chigirinskaya
- Subjects
Database normalization, Normalization (statistics), Multivariate statistics, Approximation error, Statistics, Regression analysis, Additive model, Reliability (statistics), Statistical hypothesis testing, Mathematics
- Abstract
Based on experimental studies of force patterns and of the formation of the surface-layer microgeometry, a comparative analysis of various methods for constructing multifactor (in the general case non-linear) regression models of machining processes is carried out. Estimates of the modeling error are determined for various methods of initial data normalization, in particular the relative error and the standard quadratic error of the model. For each multivariate model, a probability value is calculated from the value of the F-criterion and treated as a threshold for the adequacy of the model. The same value was established as the confidence probability determining the significance of the factors under consideration. It is shown that obtaining nonlinear dependencies is possible only as a result of preliminary processing of the results of statistical tests, and that the smallest relative error and the highest modeling reliability are obtained after preliminary normalization of the initial data in accordance with the rules of the “Italian cube”.
- Published
- 2021
16. Advanced Rule-Based Approaches in Customer Satisfaction Analysis: Recent Development and Future Prospects of fsQCA
- Author
Constantin Zopounidis, Evangelia Krassadaki, and Evangelos Grigoroudis
- Subjects
Operations research, Computer science, Regression analysis, Context (language use), Customer satisfaction, Rule-based system, Additive model, Multiple-criteria decision analysis, Outcome (game theory), Preference
- Abstract
Customer satisfaction is assessed by various quantitative and qualitative methods. Several quantitative methods adopt a regression analysis procedure, including Multiple Criteria Decision Aid (MCDA) techniques. However, most of them are compensatory approaches, based on an additive model that assumes preference independence among customer satisfaction criteria. In recent years, several rule-based methods have been proposed for the customer satisfaction analysis problem. Such approaches do not assume an analytical aggregation formula, and thus they may offer an alternative in this problem. The fsQCA method focuses on linguistic summarization of “if-then” type rules. This method provides all necessary/sufficient combinations (rules) of satisfaction criteria which lead to the output (overall satisfaction). In this context, the criteria (causal conditions) constitute the input variables, while the presence of overall satisfaction is the desired outcome. The main aim of this chapter is to present the current progress in advanced rule-based approaches applied to customer satisfaction analysis, as well as the future prospects of fsQCA. For this reason, the chapter presents the theoretical background of this alternative tool, which can identify any non-linear and asymmetric relationship between attribute performance and overall satisfaction. The applicability is illustrated through a case study. The dataset is analyzed using the fsQCA method, and the results are compared with an additive value-based model (the MUSA method). The results provide a more detailed and valid analysis of customer satisfaction data and indicate the complementary nature of the alternative approach. Finally, the chapter discusses potential future research efforts, given that rule-based approaches have gained increasing attention in recent years for analyzing customer satisfaction data.
- Published
- 2021
17. Green Buildings for Post Carbon City: Determining Market Premium Using Spline Smoothing Semiparametric Method
- Author
Francesco Del Giudice, Domenico Enrico Massimo, Pierfrancesco De Paola, Mariangela Musolino, and Vincenzo Del Giudice
- Subjects
Real estate market analysis, Semiparametric regression, Real estate, Function (mathematics), Green building, Semiparametric model, Spline (mathematics), Spline smoothing, Market analysis, Econometrics, Penalized spline semiparametric method, Additive model, Mathematics
- Abstract
In this paper a hedonic price function built through a semiparametric additive model was applied to the real estate market analysis of the central area of Reggio Calabria. Based on penalized spline functions, the semiparametric model aimed to detect and identify the existence of a market premium, in terms of higher real estate values, arising from the choice of sustainable interventions. The objective of the research is to demonstrate that choosing sustainability, i.e. policies oriented to Green Building practices, besides mitigating energy consumption while respecting the historical character of buildings, can also generate economic impacts in terms of increased market value of the properties.
- Published
- 2020
18. To Rank or to Permute When Comparing an Ordinal Outcome Between Two Groups While Adjusting for a Covariate?
- Author
Georg Zimmermann
- Subjects
Analysis of covariance, Statistics, Covariate, Nonparametric statistics, Covariance, Additive model, Outcome (probability), Normality, Mathematics, Type I and type II errors
- Abstract
The classical parametric analysis of covariance (ANCOVA) is frequently used when comparing an ordinal outcome variable between two groups, while adjusting for a continuous covariate. However, the normality assumption might be crucial and assuming an underlying additive model might be questionable. Therefore, in the present manuscript, we consider the outcome as truly ordinal and dichotomize the covariate by a median split, in order to transform the testing problem to a nonparametric factorial setting. We propose using either a permutation-based Anderson–Darling type approach in conjunction with the nonparametric combination method or the pseudo-rank version of a nonparametric ANOVA-type test. The results of our extensive simulation study show that both methods maintain the type I error level well, but that the ANOVA-type approach is superior in terms of power for location-shift alternatives. We also discuss some further aspects, which should be taken into account when deciding for the one or the other method. The application of both approaches is illustrated by the analysis of real-life data from a randomized clinical trial with stroke patients.
- Published
- 2020
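A simplified permutation test in the spirit of the setting described above, assuming synthetic ordinal data and a plain mean-difference statistic rather than the paper's Anderson–Darling/NPC or rank-based statistics.

```python
import numpy as np

rng = np.random.default_rng(4)
# Ordinal outcome (7-point scale) for two groups, plus a continuous covariate
n = 120
group = np.repeat([0, 1], n // 2)
age = rng.uniform(40, 80, n)
y = np.clip(np.round(3 + 0.5 * group + 0.02 * (age - 60)
                     + rng.normal(0, 1, n)), 1, 7)

# Median split of the covariate -> 2x2 factorial strata
stratum = (age > np.median(age)).astype(int)

def stat(y, group, stratum):
    # Mean group difference, averaged over the covariate strata
    return np.mean([y[(group == 1) & (stratum == s)].mean()
                    - y[(group == 0) & (stratum == s)].mean() for s in (0, 1)])

obs = stat(y, group, stratum)
# Labels are permuted globally here; within-stratum permutation is stricter
perms = [stat(y, rng.permutation(group), stratum) for _ in range(5000)]
p = np.mean(np.abs(perms) >= abs(obs))
print(f"observed diff = {obs:.3f}, permutation p = {p:.4f}")
```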
19. Interpretable Survival Gradient Boosting Models with Bagged Trees Base Learners
- Author
Wojciech Jarmulski and Alicja Wieczorkowska
- Subjects
Single variable, Computer science, Base (topology), Machine learning, Predictive power, Pairwise comparison, Gradient boosting, Artificial intelligence, Additive model, Interpretability
- Abstract
In this paper we present a novel survival analysis modeling approach based on gradient boosting using bagged trees as base learners. The resulting models consist of additive components of single variable models and their pairwise interactions, which makes them visually interpretable. We show that our method produces competitive results often having the predictive power higher than full-complexity models. This is achieved while maintaining full interpretability of the model, which makes our method useful in medical applications.
- Published
- 2020
20. Variable Selection and Feature Screening
- Author
Runze Li and Wanjun Liu
- Subjects
Independent and identically distributed random variables, Computer science, Feature vector, Linear model, Statistical inference, Feature selection, Data mining, Additive model, Data type, Curse of dimensionality
- Abstract
This chapter provides a selective review of feature screening methods for ultra-high dimensional data. The main idea of feature screening is to reduce the ultra-high dimensionality of the feature space to a moderate size in a fast and efficient way while retaining all the important features in the reduced feature space. This is referred to as the sure screening property. After feature screening, more sophisticated methods can be applied to the reduced feature space for further analysis, such as parameter estimation and statistical inference. This chapter focuses only on the feature screening stage. From the perspective of different types of data, we review feature screening methods for independent and identically distributed data, longitudinal data, and survival data. From the perspective of modeling, we review various models including the linear model, generalized linear model, additive model, varying-coefficient model, Cox model, etc. We also cover some model-free feature screening procedures.
- Published
- 2019
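One of the simplest procedures in the family reviewed here is marginal-correlation (sure independence) screening; below is a self-contained sketch on simulated ultra-high dimensional data.

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 100, 5000                        # ultra-high dimensional: p >> n
X = rng.normal(0, 1, (n, p))
true = [0, 1, 2]                        # only three active features
y = X[:, true] @ np.array([3.0, -2.0, 1.5]) + rng.normal(0, 1, n)

# Rank features by absolute marginal correlation with the response
Xc = (X - X.mean(0)) / X.std(0)
yc = (y - y.mean()) / y.std()
score = np.abs(Xc.T @ yc) / n

d = int(n / np.log(n))                  # a typical retained-set size
keep = np.argsort(score)[::-1][:d]
print("retained", d, "features; true actives recovered:",
      set(true) <= set(keep))
```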
21. Large-Scale Nonlinear Variable Selection via Kernel Random Features
- Author
Magda Gregorova, Stéphane Marchand-Maillet, Alexandros Kalousis, and Jason Ramapuram
- Subjects
Computer science, Feature selection, Nonlinear system, Kernel method, Kernel (statistics), Feature (machine learning), Kernel regression, Additive model, Nonlinear regression, Algorithm
- Abstract
We propose a new method for input variable selection in nonlinear regression. The method is embedded into a kernel regression machine that can model general nonlinear functions, not being a priori limited to additive models. This is the first kernel-based variable selection method applicable to large datasets. It sidesteps the typical poor scaling properties of kernel methods by mapping the inputs into a relatively low-dimensional space of random features. The algorithm discovers the variables relevant for the regression task together with learning the prediction model through learning the appropriate nonlinear random feature maps. We demonstrate the outstanding performance of our method on a set of large-scale synthetic and real datasets. Code related to this paper is available at: https://bitbucket.org/dmmlgeneva/srff_pytorch.
- Published
- 2019
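A loose illustration only: random Fourier features via scikit-learn's RBFSampler followed by a sparse linear fit and a crude leave-one-dimension-out relevance check. The paper learns the feature maps jointly with the selection, which this sketch does not attempt.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, d = 2000, 10
X = rng.uniform(-1, 1, (n, d))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.1, n)  # 2 relevant dims

# Map inputs to a low-dimensional random feature space, then fit sparsely
rff = RBFSampler(gamma=1.0, n_components=300, random_state=6)
Phi = rff.fit_transform(X)
base = Lasso(alpha=1e-3).fit(Phi, y).score(Phi, y)

# Crude relevance proxy: refit with one input dimension zeroed out
for j in range(d):
    Xj = X.copy()
    Xj[:, j] = 0.0
    Pj = rff.transform(Xj)
    drop = Lasso(alpha=1e-3).fit(Pj, y).score(Pj, y)
    print(f"dim {j}: R2 loss when zeroed = {base - drop:.3f}")
```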
22. A Hybrid Pan-Sharpening Approach Using Nonnegative Matrix Factorization for WorldView Imageries
- Author
Zhaoqiang Xia, Qiqi Zhang, Ji Jiaqi, and Guiqing He
- Subjects
Image fusion, Computer science, Multiplicative function, Pattern recognition, Sharpening, Mutual information, Non-negative matrix factorization, Panchromatic film, Component (UML), Artificial intelligence, Additive model
- Abstract
With the advent of the WorldView series of imageries (WorldView-2/3/4), it is necessary to develop new fusion approaches for remote sensing images with higher spatial and spectral resolutions. Since most existing fusion approaches are not capable of effectively merging multi-spectral images with eight bands, a new hybrid pan-sharpening approach is proposed in this paper. The hybrid framework integrates the multiplicative model and the additive model to improve the quality of multi-spectral images. In the additive procedure, the nonnegative matrix factorization (NMF) algorithm is utilized to synthesize the intensity component, capturing the mutual information of the multi-spectral bands. Then the difference information between the panchromatic image and the synthetic component is injected into the multi-spectral images through spectral-adjustable weights. In the multiplicative procedure, smoothing filter-based intensity modulation (SFIM) is used to modulate the preliminary fusion. A nonlinear fitting method is utilized to calculate the optimal parameters of the hybrid model. Visual and quantitative assessments of the fused images show that the proposed approach clearly improves fusion quality compared to state-of-the-art algorithms.
- Published
- 2019
23. Assessment of Energy Efficiency of Base Station Using SMART Approach in Wireless Communication Systems
- Author
Ait Ouhmane Abdellah, Gharnati Fatima, and Achki Samira
- Subjects
Base station, Wireless network, Computer science, Process (computing), Cellular network, Wireless, Energy consumption, Additive model, Reliability engineering, Efficient energy use
- Abstract
Optimization of energy consumption in wireless networks is considered a critical need, imposed by the physical constraint of the battery lifetime of embedded equipment such as base stations and mobile phones. In this work we study and optimize the Energy Efficiency (EE) of base stations (BS) in cellular networks; the main goal is maximizing EE in delivering data to users. To this end, we propose to integrate SMART (Simple Multi-Attribute Rating Technique), which is based on a linear additive model. This means that the overall value of a given alternative is calculated as the total sum of the performance score (value) on each criterion (attribute) multiplied by the weight of that criterion. The selection considered a number of essential criteria. Implementation results confirmed that the proposed technique is more efficient than the traditional process in wireless communication systems.
- Published
- 2019
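The abstract states the SMART aggregation rule explicitly (overall value = sum of criterion scores times weights), so a tiny numpy illustration with hypothetical criteria and weights is straightforward.

```python
import numpy as np

# Rows: candidate configurations; columns: criteria scored on 0-100
# (hypothetical criteria: energy use, coverage, throughput, cost)
scores = np.array([
    [80, 60, 70, 55],   # BS config A
    [65, 85, 60, 70],   # BS config B
    [70, 70, 80, 60],   # BS config C
])
weights = np.array([0.4, 0.25, 0.2, 0.15])
assert np.isclose(weights.sum(), 1.0)   # SMART weights normalized to 1

# SMART linear additive model: overall value = sum(weight * score)
overall = scores @ weights
best = int(overall.argmax())
print(dict(zip("ABC", np.round(overall, 2))), "-> choose", "ABC"[best])
```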
24. Weighted Compositional Vectors for Translating Collocations Using Monolingual Corpora
- Author
Marcos García-Salido, Marcos Garcia, and Margarita Alonso-Ramos
- Subjects
Computer science, Principle of compositionality, Artificial intelligence, Distributional semantics, Translation (geometry), Lexicon, Additive model, Natural language processing, Vector space
- Abstract
This paper presents a method to automatically identify bilingual equivalents of collocations using only monolingual corpora in two languages. The method takes advantage of cross-lingual distributional semantics models mapped into a shared vector space, and of compositional methods to find appropriate translations of non-congruent collocations (e.g., pay attention–prestar atenção in English–Portuguese). This strategy is evaluated in the translation of English–Portuguese and English–Spanish collocations belonging to two syntactic patterns: adjective-noun and verb-object, and compared to other methods proposed in the literature. The results of the experiments performed show that the compositional approach, based on a weighted additive model, behaves better than the other strategies evaluated, and that both the asymmetry and the compositional properties of collocations are captured by the combined vector representations. This paper also contributes two freely available gold-standard data sets which are useful to evaluate the performance of automatic extraction of multilingual equivalents of collocations.
- Published
- 2019
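A toy sketch of the weighted additive composition and nearest-neighbour lookup described above, assuming random stand-in embeddings already mapped into a shared space (the real method uses trained cross-lingual vectors).

```python
import numpy as np

rng = np.random.default_rng(7)
dim = 50
# Toy cross-lingual embeddings, assumed mapped into one shared space
src = {"pay": rng.normal(0, 1, dim), "attention": rng.normal(0, 1, dim)}
tgt_vocab = {w: rng.normal(0, 1, dim) for w in ["prestar", "pagar", "atencao"]}

def compose(verb, noun, alpha=0.3, beta=0.7):
    # Weighted additive model: the base (noun) carries more weight
    v = alpha * verb + beta * noun
    return v / np.linalg.norm(v)

def nearest(v, vocab):
    sims = {w: v @ u / np.linalg.norm(u) for w, u in vocab.items()}
    return max(sims, key=sims.get)

query = compose(src["pay"], src["attention"])
print(nearest(query, tgt_vocab))   # candidate target-language equivalent
```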
25. Fuzzy Transform in Time Series Decomposition
- Author
Linh Nguyen and Vilém Novák
- Subjects
Series (mathematics), Degree (graph theory), Component (UML), Applied mathematics, Additive model, Fuzzy logic, Decomposition of time series, Mathematics
- Abstract
In this paper, we provide a method for applying the fuzzy transform of higher degree to time series decomposition. Assuming that a time series can be decomposed into a trend-cycle, a seasonal component and an irregular fluctuation, we give theoretical justification for decomposing it according to an additive model. Several examples are considered to demonstrate our methodology.
- Published
- 2019
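As a stand-in for the fuzzy-transform trend extraction, the classical moving-average additive decomposition in statsmodels illustrates the same y = trend + seasonal + irregular structure.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

rng = np.random.default_rng(8)
t = np.arange(120)
series = (0.05 * t                              # trend-cycle
          + 2.0 * np.sin(2 * np.pi * t / 12)    # seasonal component
          + rng.normal(0, 0.5, 120))            # irregular fluctuation
y = pd.Series(series, index=pd.date_range("2010-01", periods=120, freq="MS"))

# Additive model: y_t = T_t + S_t + e_t
result = seasonal_decompose(y, model="additive", period=12)
print(result.trend.dropna().head())
print(result.seasonal.head(12).round(2))
```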
26. Using FITradeoff for Supporting a Decision Process of a Multicriteria Decision Problem
- Author
Adiel Teixeira de Almeida, Eduarda Asfora Frej, and Danielle Costa Morais
- Subjects
Flexibility (engineering), Decision support system, Operations research, Computer science, Process (engineering), Rank (computer programming), Additive model, Multiple-criteria decision analysis, Preference (economics), Facility location problem
- Abstract
In the scope of MAVT (Multi-Attribute Value Theory), one of the most difficult tasks is the elicitation of criteria scaling constants of an additive model for the aggregation of criteria. That might be the reason why there are so many MCDM/A (Multi-Criteria Decision Making/Aiding) methods, among which is the FITradeoff (Flexible and Interactive Tradeoff) method, which has been developed precisely to meet this challenge. One of its advantages is that it uses partial information about the preferences of a Decision Maker (DM). This requires less effort from the DM, since this method makes comparisons of consequences (or outcomes) based on strict preference rather than on indifference, which is what the traditional tradeoff procedure does. Two case studies are presented using the FITradeoff method: a supplier selection problem and a facility location problem. Using a Decision Support System of FITradeoff for the decision process, the flexibility of this process is analyzed, in order to determine the best one in a specified set of alternatives, or even to rank them.
- Published
- 2018
27. Isovalore Maps for the Spatial Analysis of Real Estate Market: A Case Study for a Central Urban Area of Reggio Calabria, Italy
- Author
Mariangela Musolino, Fabiana Forte, Alessandro Malerba, Vincenzo Del Giudice, Domenico Enrico Massimo, Pierfrancesco De Paola, F. Calabrò, L. Della Spina, and C. Bevilacqua
- Subjects
Computer Science (all), Post carbon city, Real estate market analysis, Semiparametric regression, Real estate, Urban area, Unitary state, Green building, Geoadditive model, Spline (mathematics), Geography, Decision Sciences (all), Kriging, Potential market, Econometrics, Additive model
- Abstract
Generally, with reference to the geographical variability of real estate values, the observed variables may have non-linear relationships with the response variable. For this reason it is possible to combine kriging techniques with additive models to obtain geoadditive models. In this paper a geoadditive model based on penalized spline functions has been applied, in order to obtain improvements with respect to usual kriging techniques and to provide a spatial distribution of unitary real estate values for a central area of the city of Reggio Calabria (Italy). This is the first preliminary phase for verifying the robustness of the real estate sample and for the subsequent identification of progressive real estate sub-samples, in order to detect and identify a possible market premium in the real estate exchange and rental markets for green buildings.
- Published
- 2018
28. Cubic Splines and Additive Models
- Author
Jonathon D. Brown
- Subjects
Nonlinear system, Relation (database), Linear model, Applied mathematics, Statistical model, Cross product, Additive model, Mathematics
- Abstract
Statistical models commonly assume that the relation between a predictor and a criterion can be described by a straight line. This assumption is often appropriate, but there are times when abandoning it is warranted. Under these circumstances, we have two choices: adapt a linear model to accommodate nonlinear relations (e.g., transform the variables; add cross product terms) or use statistical techniques that directly model nonlinear relations.
- Published
- 2018
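A sketch of both options the chapter mentions, assuming synthetic data: a hand-built truncated-power cubic spline basis for the nonlinear term plus a cross-product (interaction) column, fit by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 300
x = rng.uniform(0, 10, n)
z = rng.uniform(0, 1, n)
y = np.sin(x) + 2.0 * x * z + rng.normal(0, 0.3, n)  # nonlinear + interaction

def cubic_spline_basis(x, knots):
    # Truncated power basis: 1, x, x^2, x^3, (x - k)_+^3 for each knot
    cols = [np.ones_like(x), x, x ** 2, x ** 3]
    cols += [np.clip(x - k, 0, None) ** 3 for k in knots]
    return np.column_stack(cols)

knots = np.quantile(x, [0.25, 0.5, 0.75])
B = np.column_stack([cubic_spline_basis(x, knots), z, x * z])  # + cross product
coef, *_ = np.linalg.lstsq(B, y, rcond=None)
resid = y - B @ coef
print("RMSE:", np.sqrt(np.mean(resid ** 2)))
```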
29. FITradeoff Method for the Location of Healthcare Facilities Based on Multiple Stakeholders’ Preferences
- Author
Eduarda Asfora Frej, Alessandra Oppio, Marta Dell’Ovo, Adiel Teixeira de Almeida, Danielle Costa Morais, and Stefano Capolongo
- Subjects
Multicriteria decision, Decision support system, FITradeoff, Operations research, Process (engineering), Computer science, Healthcare facilities location, Multiple-criteria decision analysis, Preference, Facility location problem, Multicriteria decision-making, Health care, Additive model
- Abstract
Multiple stakeholders’ preferences are considered for solving a healthcare facility location problem in the city of Milan, Italy. The preference modeling is based on the Flexible and Interactive Tradeoff (FITradeoff) method, a Multicriteria Decision Making (MCDM) method used to elicit criteria scaling constants in additive models. FITradeoff is an easy tool for decision makers, because it requires them to exert less effort than traditional elicitation methods such as the tradeoff procedure. Therefore, it is expected that fewer inconsistencies will appear during the elicitation process. Sixteen criteria were used to evaluate in which of six potential areas a new hospital could be sited. An analyst with a strong background in MCDM interviewed four actors and elicited their preferences with the help of the FITradeoff Decision Support System (FITradeoff DSS).
- Published
- 2018
30. Bivariate Copula Additive Models for Location, Scale and Shape with Applications in Biomedicine
- Author
Óscar Lado-Baleato, Francisco Gude, Carmen Carollo-Limeres, Jenifer Espasandín-Domínguez, Carmen Cadarso-Suárez, and Luis Coladas-Uría
- Subjects
Trust region, Computer science, Tail dependence, Bivariate analysis, Regression, Copula (probability theory), Covariate, Econometrics, Additive model, Biomedicine
- Abstract
In many biomedical applications it is worthwhile to model not only the effect that covariates have on the mean but also their effect on other parameters of the response distribution, such as the variance. Moreover, it is sometimes necessary to study the association between two or more variables and how such associations may depend on certain factors or covariates. Different flexible regression models have recently been proposed in the statistical literature, but in this work we focus on the study of Copula Additive Models for Location, Scale and Shape, since this novel approach permits modeling the dependence of two variables through copula functions while covariates are also modelled in a flexible manner. Lastly, the benefits of using these models are illustrated with real biomedical data.
- Published
- 2018
31. Explaining the Predictions of an Arbitrary Prediction Model: Feature Contributions and Quasi-nomograms
- Author
Igor Kononenko and Erik Štrumbelj
- Subjects
Structure (mathematical logic), Computer science, Feature (machine learning), Prognostics, Data mining, Construct (python library), Type (model theory), Additive model, Predictive modelling, Regression
- Abstract
Acquisition of knowledge from data is the quintessential task of machine learning. The knowledge we extract this way might not be suitable for immediate use and one or more data postprocessing methods could be applied as well. Data postprocessing includes the integration, filtering, evaluation, and explanation of acquired knowledge. Nomograms, graphical devices for approximate calculations of functions, are a useful tool for visualising and comparing prediction models. It is well known that any generalised additive model can be represented by a quasi-nomogram – a nomogram where some summation performed by the human is required. Nomograms of this type are widely used, especially in medical prognostics. Methods for constructing such a nomogram were developed for specific types of prediction models thus assuming that the structure of the model is known. In this chapter we extend our previous work on a general method for explaining arbitrary prediction models (classification or regression) to a general methodology for constructing a quasi-nomogram for a black-box prediction model. We show that for an additive model, such a quasi-nomogram is equivalent to the one we would construct if the structure of the model was known.
- Published
- 2018
32. Statistical Model Development for Military Aircraft Engine Exhaust Emissions Data
- Author
Akhlitdin Nizamitdinov, Yasin Şöhret, T. Hikmet Karakoç, and Aladdin Shamilov
- Subjects
Variables, Mean squared error, Bayesian multivariate linear regression, Parametric model, Statistics, Linear model, Statistical model, Regression analysis, Additive model, Automotive engineering, Mathematics
- Abstract
Statistical regression models have wide usage in various estimation problems: they can be used to find a relationship between dependent and independent variables. Generally, parametric regression models are used to find the type of relationship between variables, but some problems cannot be estimated with linear models, as the covariates have a nonlinear effect on the dependent variable. This study aims to show the difference between linear and nonlinear techniques. Emission parameters of a military-type turboprop engine are determined at unmeasured operating points on the basis of data collected at various loads, with the aid of regression techniques: multivariate linear regression, additive models with B-spline basis functions, and smoothing splines. These three techniques are compared to find the best approximation to the dataset. The effects of three parameters, revolutions per minute (min-1), air/fuel ratio (kg air/kg fuel), and mass flow rate (kg/s), on the emission mass flow rates (CO (kg/s), CO2 (kg/s), UHC (kg/s), NO2 (kg/s)) are observed. At the end of the study, the results obtained from the approximations were compared with each other using the MSE (mean squared error) performance criterion.
- Published
- 2017
33. Statistics Instead of Stopover—Range Predictions for Electric Vehicles
- Author
Christian Kluge, Stefan Schuster, and Diana Sellner
- Subjects
Mathematical model, Linear model, Regression, Support vector machine, Statistics, Polygon, Ordinary least squares, Range (statistics), Additive model, Mathematics
- Abstract
Electric vehicles (EVs) can play a central role in today’s efforts to reduce CO2 emissions and slow down climate change. Two of the most important arguments against purchasing or using an EV are its short range and long charging times. In the project “E-WALD—Elektromobilität Bayerischer Wald”, we develop mathematical models to predict the range of EVs by estimating the electrical power consumption (EPC) along possible routes. Based on the EPC forecasts, the range is calculated and visualized by a range polygon on a navigation map. The models are based on data that are constantly collected by cars within a commercial car fleet. The dataset is modelled with three methods: a linear model, an additive model and a fully nonparametric model. To fit the linear model, ordinary least squares (OLS) regression as well as linear median regression are applied. The other models are fitted by modern machine learning algorithms: the additive model by a boosting algorithm and the fully nonparametric model by support vector regression (SVR). The models are compared by mean absolute error (MAE). Our research findings show that data preparation is more influential than the choice of model.
- Published
- 2017
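A skeleton of the model comparison by MAE, with synthetic consumption data and hypothetical features (speed, slope, temperature); scikit-learn's gradient boosting and SVR stand in for the paper's boosted additive model and support vector regression.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(10)
n = 2000
speed = rng.uniform(20, 130, n)          # km/h
slope = rng.normal(0, 3, n)              # %
temp = rng.uniform(-10, 30, n)           # deg C
epc = (0.1 * speed + 1.5 * np.maximum(slope, 0)
       + 0.005 * (20 - temp) ** 2 + rng.normal(0, 1, n))  # synthetic kWh/100km

X = np.column_stack([speed, slope, temp])
X_tr, X_te, y_tr, y_te = train_test_split(X, epc, random_state=10)

models = {
    "linear (OLS)": LinearRegression(),
    "boosted (additive-style)": GradientBoostingRegressor(random_state=10),
    "nonparametric SVR": SVR(C=10.0),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    print(f"{name}: MAE = {mean_absolute_error(y_te, m.predict(X_te)):.3f}")
```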
34. Spatial Analysis of Residential Real Estate Rental Market with Geoadditive Models
- Author
Pierfrancesco De Paola and Vincenzo Del Giudice
- Subjects
Mixed model, Finance, Real estate, Generalized linear mixed model, Multivariate interpolation, Spline (mathematics), Software, Kriging, Econometrics, Business, Additive model
- Abstract
A study of the geographical variability of real estate rents in the central urban area of Naples (Italy) benefits from geostatistical mapping, or kriging. Often, some of the observed variables can have non-linear relationships with the response variable. To account for such effects properly, we combine kriging techniques with additive models to obtain geoadditive models, expressing both as linear mixed models. The resulting mixed model representation of the geoadditive model allows for fitting and analysis using standard methodology and software. In effect, geoadditive models are efficient and flexible tools, useful for realistically modeling complex situations, often based on semi-parametric regressions integrated with kriging techniques for spatial interpolation. In this paper a geoadditive model based on penalized spline functions has been applied, in order to obtain improvements with respect to usual kriging techniques, in an analysis of rent values and their spatial distribution for the neighborhoods of Chiaia and Santa Lucia in Naples.
- Published
- 2017
35. Applications of SADR in Economics
- Author
Alexander Silbersdorff
- Subjects
Estimation, Earnings, Inequality, Wage, Frequentist inference, Scale (social sciences), Economic growth, Unemployment, Economics, Econometrics, Additive model
- Abstract
This chapter provides summaries of five papers that lie at the heart of the PhD thesis which underlies this book. The first paper considers the estimation of conditional income distributions for males in Germany using frequentist Generalised Additive Models for Location, Scale and Shape. The second paper considers the application of Bayesian Structured Additive Distributional Regression (SADR) to region-specific earnings inequalities. The third paper proposes a modification of the gender wage gap measure which also considers distributional differences in earnings beyond the mean, as well as an activity-based definition of labour. The fourth paper applies SADR to the assessment of health inequalities. The fifth paper considers the relation between unemployment scarring and earnings in later life from a distributional perspective.
- Published
- 2017
36. Genotype-by-Environment Interactions
- Author
P. M. Priyadarshan
- Subjects
Mixed model, Biplot, Linear regression, Statistics, Main effect, Regression analysis, Quantitative trait locus, Biology, Additive model, Regression
- Abstract
The ultimate success of a plant breeding programme depends on its ability to provide farmers with genotypes/clones with guaranteed superior performance (phenotype) in terms of yield and/or quality across a range of environments. While there can be clones that do well across a wide range of conditions (widely adapted genotypes), there are also clones that perform well exclusively under a restricted set of environments (specifically adapted genotypes). As with wide adaptation, specific adaptation of genotypes is closely related to the phenomenon of genotype-by-environment (GE) interaction. Information about phenotypic stability and adaptability assessed through GE interaction studies is of prime importance for the selection of crop varieties/clones. Since the phenotypic performance of a genotype is not necessarily the same under diverse agro-ecological conditions, the concept of stability has been defined and assessed in several ways, using several biometrical methods including univariate and multivariate analyses (Lin et al. 1986; Becker and Leon 1988; Crossa 1990). The most widely used is the regression method, based on regressing the mean value of each genotype on the environmental index or marginal means of environments (Romagosa and Fox 1993). A good method to measure stability was proposed by Finlay and Wilkinson (1963) and later improved by Eberhart and Russell (1966). These were followed by the AMMI model (Gauch and Zobel 1996) and the GGE biplot (Yan and Kang 2003). All of these merely try to group genotypes and environments and use no information other than the two-way table of means. Factorial regression was later introduced as an approach to explicitly utilize genotypic and environmental covariates for describing and explaining GE interactions. Finally, QTL modelling was put forth as a natural extension of factorial regression, where marker information is translated into genetic predictors. Tests for regression coefficients corresponding to these genetic predictors are tests for main-effect QTL expression and QTL-by-environment interaction (QEI). QTL models in which QEI depends on environmental covariables form an interesting model class for predicting GE interaction for new genotypes and new environments. QTL technology has not been efficient for predicting complex traits affected by a large number of loci. The recent delineation of high-density markers has been useful for predicting genomic breeding values, thus increasing the precision of genetic value prediction over that achieved with the traditional use of pedigree information (Crossa 2012). Genomic data also allow assessing chromosome regions through marker effects and studying the pattern of covariability of marker effects across differential environmental conditions. For realistic modelling of genotypic differences across multiple environments, sophisticated mixed models are necessary to allow for heterogeneity of genetic variances and correlations across environments. Models such as (a) the additive model, (b) the regression-on-the-mean model, (c) the additive main effects and multiplicative interactions (AMMI) model, (d) factorial regression models, (e) mixed models for genetic variances and covariances and (f) models of main-effect QTLs and QTL-by-environment interaction are some of the strategies highlighted for the study of GE interactions (Malosetti et al. 2013).
- Published
- 2017
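Of the models listed, AMMI is easy to sketch: additive main effects are removed as row and column means, and the residual interaction table is approximated by its leading SVD terms. A toy example on a simulated genotype-by-environment yield table:

```python
import numpy as np

rng = np.random.default_rng(11)
G, E = 8, 5                                     # genotypes x environments
yield_table = (5.0 + rng.normal(0, 0.5, (G, 1))     # genotype main effects
               + rng.normal(0, 0.5, (1, E))         # environment main effects
               + rng.normal(0, 0.3, (G, E)))        # interaction + noise

mu = yield_table.mean()
g_eff = yield_table.mean(axis=1, keepdims=True) - mu
e_eff = yield_table.mean(axis=0, keepdims=True) - mu

# Residual GxE table after removing the additive main effects
resid = yield_table - (mu + g_eff + e_eff)

# AMMI: keep the first k multiplicative terms of the SVD of the residuals
U, s, Vt = np.linalg.svd(resid, full_matrices=False)
k = 2
fitted = mu + g_eff + e_eff + U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print("share of GxE captured by 2 terms:",
      round(float((s[:k] ** 2).sum() / (s ** 2).sum()), 3))
```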
37. Multilevel Modeling with Structured Penalties for Classification from Imaging Genetics Data
- Author
Olivier Colliot and Pascal Lu
- Subjects
Modality (human–computer interaction), Modalities, Imaging genetics, Computer science, Pattern recognition, Data type, Lasso (statistics), Kernel (statistics), Proximal gradient methods, Artificial intelligence, Additive model
- Abstract
In this paper, we propose a framework for automatic classification of patients from multimodal genetic and brain imaging data by optimally combining them. Additive models with unadapted penalties (such as the classical group lasso penalty or ℓ1 multiple kernel learning) treat all modalities in the same manner and can result in undesirable elimination of specific modalities when their contributions are unbalanced. To overcome this limitation, we introduce a multilevel model that combines imaging and genetics and that considers joint effects between these two modalities for diagnosis prediction. Furthermore, we propose a framework that allows several penalties to be combined, taking into account the structure of the different types of data, such as a group lasso penalty over the genetic modality and an ℓ2 penalty on imaging modalities. Finally, we propose a fast optimization algorithm based on a proximal gradient method. The model has been evaluated on genetic (single nucleotide polymorphisms, SNP) and imaging (anatomical MRI measures) data from the ADNI database, and compared to additive models [13, 15]. It exhibits good performance in AD diagnosis and, at the same time, reveals relationships between genes, brain regions and the disease status.
- Published
- 2017
38. Neural Induction of a Lexicon for Fast and Interpretable Stance Classification
- Author
Nirmalie Wiratunga and Jérémie Clos
- Subjects
Computer science, Generalization, Probabilistic logic, Pointwise mutual information, Lexicon, Backpropagation, Readability, Limit (mathematics), Artificial intelligence, Additive model, Natural language processing
- Abstract
Large-scale social media classification faces the following two challenges: algorithms can be hard to adapt to Web-scale data, and the predictions that they provide are difficult for humans to understand. Those two challenges are solved at the cost of some accuracy by lexicon-based classifiers, which offer a white-box approach to text mining by using a trivially interpretable additive model. However current techniques for lexicon-based classification limit themselves to using hand-crafted lexicons, which suffer from human bias and are difficult to extend, or automatically generated lexicons, which are induced using point-estimates of some predefined probabilistic measure on a corpus of interest. In this work we propose a new approach to learn robust lexicons, using the backpropagation algorithm to ensure generalization power without sacrificing model readability. We evaluate our approach on a stance detection task, on two different datasets, and find that our lexicon outperforms standard lexicon approaches.
- Published
- 2017
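A hedged stand-in for the induced lexicon: logistic-regression weights over a bag-of-words are exactly the kind of trivially interpretable additive scores the abstract describes, though the paper learns them by backpropagation rather than by this closed-form fit.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["i fully support this proposal", "great idea, strongly agree",
        "i oppose this plan", "terrible idea, strongly disagree",
        "agree with the proposal", "disagree with the plan"]
stance = [1, 1, 0, 0, 1, 0]            # 1 = favour, 0 = against

vec = CountVectorizer()
X = vec.fit_transform(docs)
clf = LogisticRegression().fit(X, stance)

# Each learned weight is a word's additive lexicon score; a document is
# classified by summing the scores of its words (a white-box model)
lexicon = dict(zip(vec.get_feature_names_out(), clf.coef_[0]))
for word in ("support", "agree", "oppose", "disagree"):
    print(word, round(lexicon[word], 3))
```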
39. Bayesian Grouped Horseshoe Regression with Application to Additive Models
- Author
Guoqi Qian, Zemei Xu, Daniel F. Schmidt, John L. Hopper, and Enes Makalic
- Subjects
Computer science, Bayesian probability, Estimator, Pattern recognition, Regression, Robustness (computer science), Bayesian multivariate linear regression, Artificial intelligence, Bayesian linear regression, Additive model, Horseshoe (symbol)
- Abstract
The Bayesian horseshoe estimator is known for its robustness when handling noisy and sparse big data problems. This paper presents two extensions of the regular Bayesian horseshoe: (i) the grouped Bayesian horseshoe and (ii) the hierarchical Bayesian grouped horseshoe. The advantages of the proposed methods are their flexibility in handling grouped variables through extra shrinkage parameters at the group and within-group levels. We apply the proposed methods to the important class of additive models where group structures naturally exist, and we demonstrate that the grouped hierarchical Bayesian horseshoe has promising performance on both simulated and real data.
- Published
- 2016
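A sketch of the plain (ungrouped) Bayesian horseshoe that the paper extends, assuming the PyMC library; the grouped and hierarchical variants would add shrinkage parameters at the group and within-group levels, which are omitted here.

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(12)
n, p = 100, 30
X = rng.normal(0, 1, (n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]        # sparse signal
y = X @ beta_true + rng.normal(0, 1, n)

with pm.Model():
    tau = pm.HalfCauchy("tau", beta=1.0)            # global shrinkage
    lam = pm.HalfCauchy("lam", beta=1.0, shape=p)   # local shrinkage
    beta = pm.Normal("beta", mu=0.0, sigma=tau * lam, shape=p)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y", mu=pm.math.dot(X, beta), sigma=sigma, observed=y)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=12)

post_mean = idata.posterior["beta"].mean(dim=("chain", "draw")).values
print(post_mean.round(2)[:5])           # nonzero coefficients survive shrinkage
```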
40. Modeling the Evolution of Age and Cohort Effects
- Author
Sam Schulhofer-Wohl and Y. Claire Yang
- Subjects
Linear model, Statistical model, Parameter identification problem, Cohort effect, Cohort, Period effects, Additive model, Birth year, Demography, Mathematics
- Abstract
The conventional linear model of age, period, and cohort (APC) effects suffers from a well-known identification problem: If an outcome depends on the sum of an age effect, a period effect, and a cohort effect, one cannot distinguish these three effects because birth year = current year − age. Less well appreciated is that the linear model suffers from a conceptual problem: It assumes that the marginal effect of age is the same at all times, the marginal effect of current conditions is the same for people of all ages, and cohorts do not change over time. We propose a new way of modeling APC effects that improves substantively and methodologically on the conventional linear model. We define cohort effects as an accumulation of age-by-period interactions. Although a long-standing literature conceptualizes cohort effects in exactly this way, we are the first to provide a statistical model. Our model allows age profiles to change over time and period effects to have different marginal effects on people of different ages. Except in special cases, the parameters of our model are identified. We apply the model to analyze changes in age-specific mortality in Sweden over 150 years. Our model fits the Swedish data dramatically better than the additive model. The rate of increase of mortality with age became more steep from 1881 to 1941, but since then has been roughly constant. The impact of early-life conditions lasts for several years but is unlikely to reach to old age.
- Published
- 2016
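The identification problem the abstract refers to can be stated in one line (standard notation, not necessarily the authors'):

```latex
% Additive age-period-cohort model, with cohort c = t - a:
\[
  y_{a,t} = \alpha_a + \pi_t + \gamma_{t-a}.
\]
% For any constant delta, shifting the three effect sets by opposing
% linear trends leaves every fitted value unchanged:
\[
  \alpha'_a = \alpha_a + \delta a, \qquad
  \pi'_t = \pi_t - \delta t, \qquad
  \gamma'_c = \gamma_c + \delta c
  \quad\Longrightarrow\quad
  \alpha'_a + \pi'_t + \gamma'_{t-a} = \alpha_a + \pi_t + \gamma_{t-a},
\]
% since \delta a - \delta t + \delta (t - a) = 0, so the linear components
% of age, period, and cohort effects are not separately identified.
```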
41. Automatic Component Selection in Additive Modeling of French National Electricity Load Forecasting
- Author
Xavier Brossat, Anestis Antoniadis, Yannig Goude, Jean-Michel Poggi, and Vincent Thouvenot
- Subjects
Mathematical optimization ,Engineering ,Bayesian information criterion ,business.industry ,Model selection ,Econometrics ,Estimator ,Feature selection ,Akaike information criterion ,business ,Additive model ,Selection (genetic algorithm) ,Cross-validation - Abstract
We consider estimation and model selection in sparse high-dimensional linear additive models when multiple covariates need to be modeled nonparametrically, and propose multi-step estimators based on B-spline approximations of the additive components. In such models, the overall number of regressors d can be large, possibly much larger than the sample size n. However, we assume that the number of regressors capturing most of the impact of all covariates on the response variable is smaller than n. Our estimation and model selection results are valid without assuming the conventional “separation condition”, namely, without assuming that the norm of each of the true nonzero components is bounded away from zero. Instead, we relax this assumption by allowing the norms of nonzero components to converge to zero at a certain rate. The approaches investigated in this paper consist of two steps. The first step implements variable selection, typically by the Group Lasso, and the second step applies a penalized P-spline estimation to the selected additive components. For the model selection task, we discuss the application of several criteria, such as the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and generalized cross-validation (GCV), and study the consistency of BIC, i.e., its ability to select the true model with probability converging to 1. We then study the post-selection estimation consistency of the selected components. We end the paper by applying the proposed procedure to real data on electricity load consumption forecasting: the EDF (Électricité de France) portfolio.
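A minimal sketch of the two-step idea follows, under our own simplifications: a proximal-gradient Group Lasso solver and a plain ridge refit standing in for the penalized P-spline step.

```python
# Two-step sketch: (1) Group Lasso over per-covariate B-spline blocks,
# (2) penalized refit on the selected blocks. Solver and refit are our
# simplifications, not the paper's procedure.
import numpy as np
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(1)
n, d = 300, 10
X = rng.uniform(-1, 1, (n, d))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)
y = y - y.mean()                                  # center: no intercept block

# one B-spline basis block per covariate -> d groups of columns
st = SplineTransformer(n_knots=5, degree=3, include_bias=False)
blocks = [st.fit_transform(X[:, [j]]) for j in range(d)]
B = np.hstack(blocks)
k = blocks[0].shape[1]
groups = [np.arange(j * k, (j + 1) * k) for j in range(d)]

def group_lasso(B, y, groups, alpha, n_iter=1000):
    """Proximal gradient for 0.5*||y - B beta||^2 + alpha * sum_g ||beta_g||."""
    beta = np.zeros(B.shape[1])
    eta = 1.0 / np.linalg.norm(B, 2) ** 2         # step from a Lipschitz bound
    for _ in range(n_iter):
        z = beta - eta * B.T @ (B @ beta - y)     # gradient step
        for g in groups:                          # groupwise soft-thresholding
            nz = np.linalg.norm(z[g])
            z[g] *= 0.0 if nz == 0 else max(0.0, 1.0 - eta * alpha / nz)
        beta = z
    return beta

beta = group_lasso(B, y, groups, alpha=10.0)
selected = [j for j in range(d) if np.linalg.norm(beta[groups[j]]) > 1e-8]
Bs = np.hstack([blocks[j] for j in selected])     # step 2: penalized refit
coef = np.linalg.solve(Bs.T @ Bs + 0.1 * np.eye(Bs.shape[1]), Bs.T @ y)
print("selected covariates:", selected)
```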
- Published
- 2016
42. Veto Values Within MAUT for Group Decision Making on the basis of Dominance Measuring Methods with Fuzzy Weights
- Author
-
Alfonso Mateos, Antonio Jiménez-Martín, and Pilar Sabio
- Subjects
Mathematical optimization, Ranking, Veto, Fuzzy number, Context (language use), Function (mathematics), Additive model, Mathematical economics, Fuzzy logic, Group decision-making, Mathematics - Abstract
In this paper we extend the additive multi-attribute utility model to incorporate the concept of veto in a group decision-making context. Trapezoidal fuzzy numbers are used to represent the relative importance of criteria for each decision-maker (DM), and uncertainty about the alternative performances is accounted for by means of intervals. Although all DMs are allowed to provide veto values, the corresponding vetoes are effective only for the most important DMs; they are used to define veto ranges. Veto values corresponding to the less important DMs are partially taken into account, leading to the construction of adjust ranges. The veto and adjust functions are then incorporated into the additive model, and a fuzzy dominance matrix is computed. A dominance measuring method is then used to derive a ranking of alternatives for each DM, and these rankings are finally aggregated, accounting for the relative importance of the DMs.
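As an illustration only (the paper's exact veto and adjust functions are not given in the abstract), here is a sketch of an additive utility with defuzzified trapezoidal weights, a hard veto range, and a linear adjust penalty; all names and the penalty form are ours.

```python
# Illustrative sketch only: defuzzify trapezoidal weights by their mean and
# apply a veto / linear adjust penalty to an additive utility. The paper's
# exact adjust function is not specified in the abstract; this one is ours.
import numpy as np

def defuzzify(trap):
    # centroid-style average of a trapezoidal fuzzy number (a, b, c, d)
    return sum(trap) / 4.0

def utility(perf, weights_fuzzy, veto=None, adjust=None):
    """perf: per-criterion utilities in [0, 1]; veto/adjust: per-criterion
    thresholds below which the alternative is vetoed / penalized."""
    w = np.array([defuzzify(t) for t in weights_fuzzy])
    w = w / w.sum()
    u = float(w @ perf)
    for j, x in enumerate(perf):
        if veto and veto[j] is not None and x < veto[j]:
            return 0.0                       # hard veto: discard alternative
        if adjust and adjust[j] is not None and x < adjust[j]:
            u *= x / adjust[j]               # soft, linear adjust penalty
    return u

perf = np.array([0.8, 0.35, 0.9])
wf = [(0.2, 0.3, 0.4, 0.5), (0.1, 0.2, 0.3, 0.4), (0.3, 0.4, 0.5, 0.6)]
print(utility(perf, wf, veto=[0.2, 0.3, None], adjust=[None, 0.5, None]))
```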
- Published
- 2015
43. Sparse Bayesian ELM Handling with Missing Data for Multi-class Classification
- Author
-
Shiji Song, Xunan Zhang, and Jiannan Zhang
- Subjects
Engineering, Generalization, Bayesian probability, Missing data problem, Pattern recognition, Missing data, Machine learning, Multiclass classification, Artificial intelligence, State (computer science), Additive model, Extreme learning machine - Abstract
Extreme learning machine (ELM) is a successful machine learning approach known for its extremely fast training speed and good generalization performance. The sparse Bayesian ELM (SBELM), a variant of ELM, can produce a more accurate and compact model. However, SBELM cannot deal with missing data in its standard form. To solve this problem, we design two novel methods, additive models for missing data (AMMD) and self-adjusting neuron state for missing data (SANSMD), which adjust the calculation of the hidden-layer outputs in SBELM. Experimental results on several data sets from the UCI repository indicate that the proposed modified SBELM methods achieve high accuracy and good generalization performance compared with several existing methods. Moreover, the proposed methods enrich ELM with new tools for solving the missing data problem in multi-class classification, even with up to 50% of the features missing in the input data.
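For readers unfamiliar with ELM, here is a baseline sketch only: a plain (non-sparse, non-Bayesian) ELM with naive mean imputation. The paper's AMMD and SANSMD hidden-layer adjustments are not specified in the abstract and are not implemented here.

```python
# Baseline sketch only: plain ELM with naive mean imputation. The paper's
# AMMD / SANSMD adjustments are NOT reproduced here.
import numpy as np

rng = np.random.default_rng(2)

def elm_train(X, Y, n_hidden=50, ridge=1e-3, rng=rng):
    X = np.where(np.isnan(X), np.nanmean(X, axis=0), X)   # mean-impute NaNs
    W = rng.standard_normal((X.shape[1], n_hidden))       # random input weights
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                                # hidden activations
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ Y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    X = np.where(np.isnan(X), 0.0, X)                     # crude test-time fill
    return np.tanh(X @ W + b) @ beta

# toy 3-class problem with 30% missing entries
X = rng.standard_normal((200, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int) + (X[:, 2] > 1)
Y = np.eye(3)[y]                                          # one-hot targets
X[rng.random(X.shape) < 0.3] = np.nan
W, b, beta = elm_train(X, Y)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
print("train accuracy:", (pred == y).mean())
```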
- Published
- 2015
44. Model-Free vs. Model-Based Confidence Intervals
- Author
-
Dimitris N. Politis
- Subjects
Transformation (function), Resampling, Statistics, Prediction interval, Residual, Additive model, Robust confidence intervals, Confidence interval, Mathematics, Nonparametric regression - Abstract
The problem of confidence interval construction in nonparametric regression via the bootstrap is revisited. When an additive model holds, the usual residual bootstrap is available, but it often leads to confidence interval under-coverage; the case is made that this under-coverage can be partially corrected by resampling predictive, as opposed to fitted, residuals. Furthermore, it has been unclear to date whether a bootstrap approach is feasible in the absence of an additive model. The main thrust of this paper is to show how the transformation approach (Chap. 4), developed in the related setting of prediction intervals, can be used to construct bootstrap confidence intervals without an additive model.
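The fitted-versus-predictive distinction is easy to show in code. Below is a minimal sketch, assuming a Nadaraya-Watson smoother: a residual bootstrap interval for the regression function at one point, resampling leave-one-out (predictive) residuals; the smoother, bandwidth, and all names are ours.

```python
# Minimal sketch: residual bootstrap CI for m(x0) using predictive
# (leave-one-out) residuals rather than fitted residuals.
import numpy as np

rng = np.random.default_rng(3)

def nw(x_eval, x, y, h):
    """Nadaraya-Watson estimator with a Gaussian kernel."""
    K = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (K @ y) / K.sum(axis=1)

n, h, x0 = 200, 0.15, 0.5
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)
m_hat = nw(x, x, y, h)

# predictive residuals: leave point i out when predicting at x_i
pred_res = np.array([y[i] - nw(x[[i]], np.delete(x, i), np.delete(y, i), h)[0]
                     for i in range(n)])
pred_res -= pred_res.mean()

boot = []
for _ in range(500):
    y_star = m_hat + rng.choice(pred_res, n, replace=True)  # resample residuals
    boot.append(nw(np.array([x0]), x, y_star, h)[0])
lo, hi = np.quantile(boot, [0.025, 0.975])
print(f"95% CI for m({x0}): [{lo:.3f}, {hi:.3f}]")
```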
- Published
- 2015
45. Multidimensional IRT Models to Analyze Learning Outcomes of Italian Students at the End of Lower Secondary School
- Author
-
Stefania Mignani, Mariagiulia Matteucci, R. E. Millsap, D. M. Bolt, L. A. van der Ark, and W.-C. Wang
- Subjects
Model checking, Grammar, Student assessment, Bayesian probability, Item response theory, Multidimensional model, Bayesian inference, Machine learning, Gibbs sampling, Reading comprehension, Mathematics education, Artificial intelligence, Additive model, Psychology - Abstract
In this paper, different multidimensional IRT models are compared in order to choose the best approach for explaining response data from the Italian student assessment at the end of lower secondary school. The results show that the additive model with three specific dimensions (reading comprehension, grammar, and mathematics abilities) and an overall ability is able to recover the test structure meaningfully. In this model, the overall ability compensates for the specific ability (or vice versa) in determining the probability of a correct response. Given the item characteristics, the overall ability is interpreted as a reasoning and thinking capability. Model estimation is conducted via the Gibbs sampler within a Bayesian approach, which allows the use of Bayesian techniques such as posterior predictive model checking for assessing model comparison and fit.
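In our notation (the abstract does not give the parameterization), the compensatory additive structure described here can be written as follows.

```latex
% Our notation: item j belonging to subscale d, overall ability theta_0,
% specific ability theta_d, normal-ogive link Phi.
\[
  P\bigl(Y_{ij} = 1 \mid \theta_i\bigr)
  \;=\;
  \Phi\bigl(a_{j0}\,\theta_{i0} + a_{jd}\,\theta_{id} - b_j\bigr),
\]
% so a high overall ability can compensate for a low specific ability
% (and vice versa) through the sum inside the link.
```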
- Published
- 2015
46. MINED: An Efficient Mutual Information Based Epistasis Detection Method to Improve Quantitative Genetic Trait Prediction
- Author
-
Dan He, Zhanyong Wang, and Laxmi Parida
- Subjects
Brute-force search, Phenotypic trait, Mutual information, Biology, Machine learning, Thresholding, Trait, Epistasis, Pairwise comparison, Artificial intelligence, Additive model - Abstract
Whole-genome prediction of complex phenotypic traits using high-density genotyping arrays has attracted a great deal of attention, as it is highly relevant to plant and animal breeding: more effective breeding strategies can be developed from more accurate predictions. Most existing work considers an additive model on single markers, or genotypes, only. In this work, we study the problem of epistasis detection for genetic trait prediction, where different alleles, or genes, can interact with each other. We have developed a novel method, MINED, to detect the significant pairwise epistasis effects that contribute most to prediction performance. Dynamic thresholding and a sampling strategy make the detection very efficient: MINED is generally 20 to 30 times faster than an exhaustive search. In our experiments on real plant data sets, MINED is able to capture the pairwise epistasis effects that improve the prediction, and we show that it achieves better prediction accuracy than state-of-the-art methods. To our knowledge, MINED is the first algorithm to detect epistasis in the genetic trait prediction problem. We further propose a constrained version of MINED that converts the epistasis detection problem into a Weighted Maximum Independent Set problem, and we show that Constrained-MINED is able to improve the prediction accuracy even more.
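Here is a sketch of the underlying score only: mutual information between a pair of SNP genotypes and a discretized trait, evaluated by naive exhaustive search. MINED's dynamic thresholding and sampling speed-ups are not reproduced; the data and encoding below are made up.

```python
# Naive exhaustive pairwise-MI scan (the score MINED accelerates, per our
# reading of the abstract; the speed-ups themselves are not shown).
import numpy as np
from itertools import combinations
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(4)
n, m = 500, 20
G = rng.integers(0, 3, (n, m))                 # genotypes coded 0/1/2
trait = G[:, 3] * G[:, 7] + 0.5 * rng.standard_normal(n)   # planted epistasis
t_bins = np.digitize(trait, np.quantile(trait, [0.25, 0.5, 0.75]))

scores = {}
for i, j in combinations(range(m), 2):
    pair = G[:, i] * 3 + G[:, j]               # encode the 9 joint genotypes
    scores[(i, j)] = mutual_info_score(pair, t_bins)

best = max(scores, key=scores.get)
print("top-scoring SNP pair:", best)           # should usually recover (3, 7)
```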
- Published
- 2015
47. The Estimation of Standard Deviation of Premium Risk Under Solvency 2
- Author
-
Rocco Roberto Cerchiara and Vittorio Magatti
- Subjects
Estimation, Deviation risk measure, Solvency, Actuarial science, Order (exchange), Capital requirement, Econometrics, Range (statistics), Additive model, Standard deviation, Mathematics - Abstract
The Solvency 2 Directive provides a range of methods to calculate the Solvency Capital Requirement (SCR). Focusing on the Standard Formula (SF) approach with Undertaking-Specific Parameters (USPs), the Technical Specifications (TS) of the fifth Quantitative Impact Study (QIS5) describe a subset of the SF market parameters (standard deviations) that may be replaced by USPs, in order to calculate the SCR deriving from premium risk using three different standardised methods. This paper departs from the existing literature and practice in that the standard deviation is calculated using a Partial Internal Risk Model (PIRM) based on Generalised Linear or Additive Models (GLMs or GAMs), showing how techniques usually developed for premium calculation can be useful for this goal.
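As a generic illustration only (neither the QIS5 standardised USP methods nor the authors' PIRM are reproduced here), the following sketch fits a Gamma GLM with log link to simulated loss ratios and reads off a dispersion-based volatility estimate; all data and names are made up.

```python
# Generic GLM sketch in the spirit of using premium-calculation machinery
# for premium-risk volatility; NOT the paper's PIRM or the QIS5 methods.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
volume = rng.uniform(1e6, 5e6, n)              # earned premium per cell
lob = rng.integers(0, 3, n)                    # line-of-business factor
mu = np.exp(0.7 + 0.1 * lob)                   # true mean loss ratio
loss_ratio = rng.gamma(shape=20, scale=mu / 20)

X = sm.add_constant(np.column_stack([lob == 1, lob == 2]).astype(float))
fit = sm.GLM(loss_ratio, X,
             family=sm.families.Gamma(sm.families.links.Log()),
             var_weights=volume).fit()
print(fit.summary())
# A dispersion-based volatility estimate of the kind USP methods target:
# under the Gamma GLM, sqrt(dispersion) approximates the coefficient of
# variation of the loss ratio.
print("estimated sigma:", np.sqrt(fit.scale))
```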
- Published
- 2014
48. Slack-Based DEA Models
- Author
-
Joe Zhu
- Subjects
Mathematical optimization, Current (fluid), Additive model, Input reduction, Mathematics - Abstract
Input-oriented DEA models consider possible (proportional) input reductions while maintaining the current levels of outputs, whereas output-oriented DEA models consider possible (proportional) output augmentations while keeping the current levels of inputs. Charnes et al. developed an additive DEA model that considers possible input decreases and output increases simultaneously. The additive model is based upon input and output slacks.
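The additive model is a single linear program per decision-making unit (DMU): maximize total slack subject to the envelopment constraints. Below is a minimal sketch solved with SciPy; the data and dimensions are made up, and the convexity constraint follows the variable-returns-to-scale form.

```python
# Additive (slacks-based) DEA model as a linear program:
#   max 1's- + 1's+  s.t.  X'lam + s- = x_o,  Y'lam - s+ = y_o,
#                          sum(lam) = 1,  lam, s-, s+ >= 0.
import numpy as np
from scipy.optimize import linprog

X = np.array([[4.0, 3.0], [7.0, 3.0], [8.0, 1.0], [4.0, 2.0], [2.0, 4.0]])  # inputs
Y = np.array([[1.0], [1.0], [1.0], [1.0], [1.0]])                            # outputs
n, m = X.shape
r = Y.shape[1]

def additive_dea(o):
    # variables: lambda (n), input slacks s- (m), output slacks s+ (r)
    c = np.concatenate([np.zeros(n), -np.ones(m + r)])   # maximize total slack
    A_eq = np.block([
        [X.T, np.eye(m), np.zeros((m, r))],              # X'lam + s- = x_o
        [Y.T, np.zeros((r, m)), -np.eye(r)],             # Y'lam - s+ = y_o
        [np.ones((1, n)), np.zeros((1, m + r))],         # sum(lam) = 1 (VRS)
    ])
    b_eq = np.concatenate([X[o], Y[o], [1.0]])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * (n + m + r))
    return res.x[n:]                                     # the optimal slacks

for o in range(n):
    s = additive_dea(o)
    print(f"DMU {o}: efficient = {np.allclose(s, 0)}, slacks = {np.round(s, 3)}")
```

A DMU is additive-efficient exactly when all optimal slacks are zero, which is what the final loop reports.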
- Published
- 2014
49. Parallel Fitting of Additive Models for Regression
- Author
-
Markus Hegland and Valeriy Khakhutskyy
- Subjects
Mathematical optimization, Data point, Computer science, Big data, Generalized additive model, Data mining, Additive model, Regression - Abstract
To solve the big data problems that occur in modern data mining applications, a comprehensive approach is required: one that combines a flexible model with an optimisation algorithm that converges quickly and has the potential for efficient parallelisation in both the number of data points and the number of features.
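The abstract names the goal but not the algorithm. As a point of reference only, here is the classical serial backfitting loop that parallel additive-model fitters typically build on; the running-mean smoother and all names below are ours.

```python
# Classical serial backfitting baseline (NOT the chapter's parallel method):
# cycle over covariates, smoothing partial residuals against each one.
import numpy as np

rng = np.random.default_rng(6)
n, d = 400, 3
X = rng.uniform(0, 1, (n, d))
y = (np.sin(2 * np.pi * X[:, 0]) + (X[:, 1] - 0.5) ** 2
     + 0.1 * rng.standard_normal(n))

def smooth(x, r, k=25):
    """Crude running-mean smoother of residuals r along covariate x."""
    order = np.argsort(x)
    sm = np.convolve(r[order], np.ones(k) / k, mode="same")
    out = np.empty_like(sm)
    out[order] = sm
    return out

f = np.zeros((d, n))                         # one component function per covariate
for _ in range(20):                          # backfitting sweeps
    for j in range(d):
        partial = y - y.mean() - f[np.arange(d) != j].sum(axis=0)
        f[j] = smooth(X[:, j], partial)
        f[j] -= f[j].mean()                  # identifiability: center each f_j
fit = y.mean() + f.sum(axis=0)
print("residual std:", np.std(y - fit))
```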
- Published
- 2014
50. The Design of an Optimal Bonus-Malus System Based on the Sichel Distribution
- Author
-
George Tzougas, Nicholas Frangos, Dmitrii Silvestrov, and Anders Martin-Löf
- Subjects
HG Finance, Scale (ratio), Distribution (number theory), Gaussian, Negative binomial distribution, Regression analysis, Bonus-malus, Econometrics, QA Mathematics, Additive model, Mathematics, Count data - Abstract
This chapter presents the design of an optimal Bonus-Malus System (BMS) using the Sichel distribution to model the claim frequency distribution. This system is proposed as an alternative to the optimal BMS obtained with the traditional Negative Binomial model [19]. The Sichel distribution has a thicker tail than the Negative Binomial distribution and is considered a plausible model for highly dispersed count data. We also consider the optimal BMS provided by the Poisson-Inverse Gaussian (PIG) distribution, which is a special case of the Sichel distribution. Furthermore, we develop a generalised BMS that takes into account both the a priori and the a posteriori characteristics of each policyholder. For this purpose we adopt the framework of generalised additive models for location, scale and shape (GAMLSS), which allows all available information to be used in estimating the claim frequency distribution. Within this framework we propose the Sichel GAMLSS for assessing claim frequency as an alternative to the Negative Binomial Type I (NBI) regression model used by Dionne and Vanasse [9, 10], and we also consider the NBI and PIG GAMLSS.
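The Sichel and GAMLSS machinery is not sketched here; as background only, the following shows the traditional Negative Binomial (gamma-Poisson) optimal BMS baseline the chapter compares against: the premium is the posterior mean claim frequency relative to the a priori mean. Parameter names are ours.

```python
# Traditional Negative Binomial optimal BMS baseline: with Poisson(lambda)
# claims and a Gamma(a, tau) prior on lambda, the posterior mean after
# observing k total claims in t years gives the relative premium.
from math import isclose

def nb_bms_premium(a, tau, k, t):
    """Relative premium (100 = a priori premium) under the gamma-Poisson
    (Negative Binomial) credibility model."""
    posterior_mean = (a + k) / (tau + t)
    prior_mean = a / tau
    return 100.0 * posterior_mean / prior_mean

# a claim-free year lowers the premium; each claim raises it
print(nb_bms_premium(a=1.0, tau=10.0, k=0, t=1))   # ~90.9
print(nb_bms_premium(a=1.0, tau=10.0, k=1, t=1))   # ~181.8
assert isclose(nb_bms_premium(1.0, 10.0, 0, 0), 100.0)
```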
- Published
- 2014