Journal: Data Mining & Knowledge Discovery / Publisher: Springer Nature - Searchworks@Jio Institute Digital Library Search Results

1. For real: a thorough look at numeric attributes in subgroup discovery.

Author: Meeng, Marvin and Knobbe, Arno
Subjects: PAPER arts, DATA mining, STANDARD deviations, SEQUENTIAL pattern mining
Abstract: Subgroup discovery (SD) is an exploratory pattern mining paradigm that comes into its own when dealing with large real-world data, which typically involves many attributes, of a mixture of data types. Essential is the ability to deal with numeric attributes, whether they concern the target (a regression setting) or the description attributes (by which subgroups are identified). Various specific algorithms have been proposed in the literature for both cases, but a systematic review of the available options is missing. This paper presents a generic framework that can be instantiated in various ways in order to create different strategies for dealing with numeric data. The bulk of the work in this paper describes an experimental comparison of a considerable range of numeric strategies in SD, where these strategies are organised according to four central dimensions. These experiments are furthermore repeated for both the classification task (target is nominal) and regression task (target is numeric), and the strategies are compared based on the quality of the top subgroup, and the quality and redundancy of the top-k result set. Results of three search strategies are compared: traditional beam search, complete search, and a variant of diverse subgroup set discovery called cover-based subgroup selection. Although there are various subtleties in the outcome of the experiments, the following general conclusions can be drawn: it is often best to determine numeric thresholds dynamically (locally), in a fine-grained manner, with binary splits, while considering multiple candidate thresholds per attribute. [ABSTRACT FROM AUTHOR]
Published: 2021
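
The abstract's recommendation (binary splits, several candidate thresholds per attribute) can be illustrated with a minimal sketch. It assumes a binary target and uses weighted relative accuracy (WRAcc) as the quality measure; the abstract does not name a quality measure, so that choice, the function names, and the toy data are all assumptions for illustration, not the authors' framework.

```python
def wracc(covered, labels):
    """Weighted relative accuracy of a subgroup (assumed quality measure)."""
    n, size = len(labels), sum(covered)
    if size == 0:
        return 0.0
    positives_inside = sum(1 for c, y in zip(covered, labels) if c and y)
    return size / n * (positives_inside / size - sum(labels) / n)


def best_binary_split(values, labels):
    """Try every distinct value (except the largest) as a threshold, in both directions."""
    best = (float("-inf"), None, None)
    for t in sorted(set(values))[:-1]:
        candidates = (
            ("<=", [v <= t for v in values]),
            (">", [v > t for v in values]),
        )
        for op, covered in candidates:
            quality = wracc(covered, labels)
            if quality > best[0]:
                best = (quality, op, t)
    return best


# Hypothetical toy data: one numeric attribute and a binary target.
values = [1, 2, 3, 4, 5, 6]
labels = [0, 0, 0, 1, 1, 1]
best = best_binary_split(values, labels)
```

Evaluating every distinct value as a threshold corresponds to the fine-grained end of the spectrum; a coarser strategy would restrict the candidate list, for example to a few quantiles.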

2. Guest Editors' Introduction: special issue of selected papers from ECML PKDD 2011.

Author: Gunopulos, Dimitrios, Malerba, Donato, and Vazirgiannis, Michalis
Subjects: DATA mining, SOCIAL interaction, ONLINE social networks
Abstract: An introduction is presented in which the editors discuss various reports within the issue on topics including data mining, subgroup discovery, and social relationships in online social networks.
Published: 2012

3. Shapley values for cluster importance: How clusters of the training data affect a prediction.

Author: Brandsæter, Andreas and Glad, Ingrid K.
Subjects: ARTIFICIAL intelligence, GAME theory, PREDICTION models, FORECASTING, EXPLANATION
Abstract: This paper proposes a novel approach to explain the predictions made by data-driven methods. Since such predictions rely heavily on the data used for training, explanations that convey information about how the training data affects the predictions are useful. The paper proposes a novel approach to quantify how different data-clusters of the training data affect a prediction. The quantification is based on Shapley values, a concept which originates from coalitional game theory, developed to fairly distribute the payout among a set of cooperating players. A player's Shapley value is a measure of that player's contribution. Shapley values are often used to quantify feature importance, i.e., how features affect a prediction. This paper extends this to cluster importance, letting clusters of the training data act as players in a game where the predictions are the payouts. The novel methodology proposed in this paper lets us explore and investigate how different clusters of the training data affect the predictions made by any black-box model, allowing new aspects of the reasoning and inner workings of a prediction model to be conveyed to the users. The methodology is fundamentally different from existing explanation methods, providing insight which would not be available otherwise, and should complement existing explanation methods, including explanations based on feature importance. [ABSTRACT FROM AUTHOR]
Published: 2024
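
As a rough illustration of the game-theoretic idea in this abstract, the sketch below computes exact Shapley values by averaging each player's marginal contribution over all orderings. The two "clusters" and their payouts are made-up numbers; in the paper's setting the payout of a coalition would come from a model's prediction, which is not reproduced here.

```python
from itertools import permutations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all player orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


# Hypothetical payouts for every coalition of two training-data clusters.
PAYOUT = {
    frozenset(): 0.0,
    frozenset({"A"}): 1.0,
    frozenset({"B"}): 2.0,
    frozenset({"A", "B"}): 4.0,
}

phi = shapley_values(["A", "B"], lambda coalition: PAYOUT[coalition])
```

The values sum to the payout of the full coalition, which is the "fair distribution" property the abstract refers to. Enumerating all orderings grows factorially with the number of players, so this exact form is only practical for a handful of clusters.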

4. Explanatory artificial intelligence (YAI): human-centered explanations of explainable AI and complex data.

Author: Sovrano, Francesco and Vitali, Fabio
Subjects: ARTIFICIAL intelligence, SOFTWARE development tools, SYSTEMS software, INDIVIDUAL needs, EXPLANATION
Abstract: In this paper we introduce a new class of software tools engaged in delivering successful explanations of complex processes on top of basic Explainable AI (XAI) software systems. These tools, which we collectively call Explanatory AI (YAI) systems, enhance the quality of the basic output of an XAI by adopting a user-centred approach to explanation that can cater to the individual needs of the explainees with measurable improvements in usability. Our approach is based on Achinstein's theory of explanations, where explaining is an illocutionary (i.e., broad yet pertinent and deliberate) act of pragmatically answering a question. Accordingly, user-centrality enters the equation by considering that the overall amount of information generated by answering all questions can rapidly become overwhelming and that individual users may perceive the need to explore just a few of them. In this paper, we give the theoretical foundations of YAI, formally defining a user-centred explanatory tool and the space of all possible explanations, or explanatory space, generated by it. To this end, we frame the explanatory space as a hypergraph of knowledge and we identify a set of heuristics and properties that can help approximate a decomposition of it into a tree-like representation for efficient and user-centred explanation retrieval. Finally, we provide some old and new empirical results to support our theory, showing that explanations are more than textual or visual presentations of the sole information provided by an XAI. [ABSTRACT FROM AUTHOR]
Published: 2024

5. Guest editors’ introduction: special issue of selected papers from ECML PKDD 2010.

Author: Balcázar, José, Bonchi, Francesco, Gionis, Aristides, and Sebag, Michèle
Subjects: MACHINE learning, DATA mining
Abstract: An introduction to the journal is presented in which the editors discuss articles related to machine learning and data mining, including "A Game-Theoretic Framework to Identify Overlapping Communities in Social Networks," by Wei Chen and colleagues, "Accelerating Spectral Clustering with Partial Supervision," by Dimitrios Mavroeidis, and "Maximal Exceptions with Minimal Descriptions," by Matthijs van Leeuwen.
Published: 2010

6. Guest editors’ introduction: special issue of selected papers from ECML PKDD 2009.

Author: Kolcz, Aleksander, Mladenic, Dunja, Buntine, Wray, Grobelnik, Marko, and Shawe-Taylor, John
Subjects: MACHINE learning, DATA mining, SUPPORT vector machines, CONTENT mining, DATA analysis
Abstract: The article discusses the papers presented at the European Conference on Machine Learning (ECML) and the European Conference on Principles of Data Mining and Knowledge Discovery (PKDD). It notes that the papers are related to machine learning and data mining communities including "Sparse kernel SVMs via cutting-plane training," which features the state of the art efficiency in inducting and evaluating support vector machine (SVM) models. Other topics of the papers are also presented.
Published: 2009

7. Network embedding based on high-degree penalty and adaptive negative sampling.

Author: Ma, Gang-Feng, Yang, Xu-Hua, Ye, Wei, Xu, Xin-Li, and Ye, Lei
Subjects: RANDOM walks, CLASSIFICATION algorithms
Abstract: Network embedding can effectively dig out potentially useful information and discover the relationships and rules that exist in the data, and it has attracted increasing attention in many real-world applications. The goal of network embedding is to map high-dimensional and sparse networks into low-dimensional and dense vector representations. In this paper, we propose a network embedding method based on high-degree penalty and adaptive negative sampling (NEPS). First, we analyze the problem of imbalanced node training in random walk and propose an indicator based on high-degree penalty, which can control the random walk and avoid over-sampling high-degree neighbor nodes. Then, we propose a two-stage adaptive negative sampling strategy, which can dynamically obtain negative samples suitable for the current training according to the training stage to improve the training effect. By comparing with seven well-known network embedding algorithms on eight real-world data sets, experiments show that NEPS has good performance in node classification, network reconstruction and link prediction. The code is available at: https://github.com/Andrewsama/NEPS-master. [ABSTRACT FROM AUTHOR]
Published: 2024

8. Optimal selection of benchmarking datasets for unbiased machine learning algorithm evaluation.

Author: Pereira, João Luiz Junho, Smith-Miles, Kate, Muñoz, Mario Andrés, and Lorena, Ana Carolina
Subjects: MACHINE learning, SUPERVISED learning, METAHEURISTIC algorithms, CLASSIFICATION algorithms, ALGORITHMS
Abstract: Whenever a new supervised machine learning (ML) algorithm or solution is developed, it is imperative to evaluate the predictive performance it attains for diverse datasets. This is done in order to stress test the strengths and weaknesses of the novel algorithms and provide evidence for situations in which they are most useful. A common practice is to gather some datasets from public benchmark repositories for such an evaluation. But little or no specific criteria are used in the selection of these datasets, which is often ad-hoc. In this paper, the importance of gathering a diverse benchmark of datasets in order to properly evaluate ML models and really understand their capabilities is investigated. Leveraging from meta-learning studies evaluating the diversity of public repositories of datasets, this paper introduces an optimization method to choose varied classification and regression datasets from a pool of candidate datasets. The method is based on maximum coverage, circular packing, and the meta-heuristic Lichtenberg Algorithm for ensuring that diverse datasets able to challenge the ML algorithms more broadly are chosen. The selections were compared experimentally with a random selection of datasets and with clustering by k-medoids and proved to be more effective regarding the diversity of the chosen benchmarks and the ability to challenge the ML algorithms at different levels. [ABSTRACT FROM AUTHOR]
Published: 2024

9. Sky-signatures: detecting and characterizing recurrent behavior in sequential data.

Author: Gautrais, Clément, Cellier, Peggy, Guyet, Thomas, Quiniou, René, and Termier, Alexandre
Subjects: NATURAL language processing, DATA mining, POLITICAL oratory, RECURRENT neural networks
Abstract: This paper proposes the sky-signature model, an extension of the signature model (Gautrais et al., in: Proceedings of the Pacific-Asia conference on knowledge discovery and data mining (PAKDD), Springer, 2017b) to multi-objective optimization. The signature approach considers a sequence of itemsets and, given a number k, returns a segmentation of the sequence into k segments such that the number of items occurring in all segments is maximized. The limitation of this approach is that k must be set manually, which fixes the temporal granularity at which the data is analyzed. The sky-signature model proposed in this paper removes this requirement and allows the results to be examined at multiple levels of granularity, while keeping a compact output. This paper also proposes efficient algorithms to mine sky-signatures, as well as an experimental validation on real data from both the retail domain and natural language processing (political speeches). [ABSTRACT FROM AUTHOR]
Published: 2024
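
The signature problem described in this abstract (split a sequence of itemsets into k contiguous segments so that as many items as possible occur in every segment) can be sketched with a brute-force search. This is only an illustration on made-up data; it is not one of the efficient mining algorithms the paper proposes, and the function name is hypothetical.

```python
from itertools import combinations


def best_signature(sequence, k):
    """Brute force: try every way to cut the sequence into k contiguous segments."""
    n = len(sequence)
    best_items, best_cuts = set(), None
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        # Items seen anywhere inside each segment.
        segment_items = [
            set().union(*sequence[a:b]) for a, b in zip(bounds, bounds[1:])
        ]
        # Items that occur in every segment.
        common = set.intersection(*segment_items)
        if best_cuts is None or len(common) > len(best_items):
            best_items, best_cuts = common, cuts
    return best_items, best_cuts


# Hypothetical sequence of itemsets (for example, successive purchases).
sequence = [{"a", "b"}, {"a", "c"}, {"a", "b"}, {"a"}]

# Size of the best signature for each granularity k.
sizes = [len(best_signature(sequence, k)[0]) for k in range(1, len(sequence) + 1)]
```

Running the search for every k makes the trade-off visible: finer segmentations leave fewer items that recur in all segments, which is the tension a multi-objective formulation has to balance.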

10. Gradient-based explanation for non-linear non-parametric dimensionality reduction.

Author: Corbugy, Sacha, Marion, Rebecca, and Frénay, Benoît
Subjects: NEIGHBORHOODS, EXPLANATION, ALGORITHMS
Abstract: Dimensionality reduction (DR) is a popular technique that shows great results for analyzing high-dimensional data. Generally, DR is used to produce visualizations in 2 or 3 dimensions. While it can help in understanding correlations between data, embeddings generated by DR are hard to grasp. The position of instances in low dimension may be difficult to interpret, especially for non-linear, non-parametric DR techniques. Because most of the techniques are said to be neighborhood preserving (which means that explaining long distances is not relevant), some approaches try explaining them locally. These methods use simpler interpretable models to approximate the decision frontier locally. This can lead to misleading explanations. In this paper a novel approach to locally explain non-linear, non-parametric DR embeddings like t-SNE is introduced. It is the first gradient-based method for explaining these DR algorithms. The technique presented in this paper is applied to t-SNE, but is theoretically suitable for any DR method that is a minimization or maximization problem. The approach uses the analytical derivative of a t-SNE embedding to explain the position of an instance in the visualization. [ABSTRACT FROM AUTHOR]
Published: 2024

11. The grammar of interactive explanatory model analysis.

Author: Baniecki, Hubert, Parzych, Dariusz, and Biecek, Przemyslaw
Subjects: MACHINE learning, SEQUENTIAL analysis, SOFTWARE frameworks, DECISION making, ARTIFICIAL intelligence
Abstract: The growing need for in-depth analysis of predictive models leads to a series of new methods for explaining their local and global properties. Which of these methods is the best? It turns out that this is an ill-posed question. One cannot sufficiently explain a black-box machine learning model using a single method that gives only one perspective. Isolated explanations are prone to misunderstanding, leading to wrong or simplistic reasoning. This problem is known as the Rashomon effect and refers to diverse, even contradictory, interpretations of the same phenomenon. Surprisingly, most methods developed for explainable and responsible machine learning focus on a single-aspect of the model behavior. In contrast, we showcase the problem of explainability as an interactive and sequential analysis of a model. This paper proposes how different Explanatory Model Analysis (EMA) methods complement each other and discusses why it is essential to juxtapose them. The introduced process of Interactive EMA (IEMA) derives from the algorithmic side of explainable machine learning and aims to embrace ideas developed in cognitive sciences. We formalize the grammar of IEMA to describe human-model interaction. It is implemented in a widely used human-centered open-source software framework that adopts interactivity, customizability and automation as its main traits. We conduct a user study to evaluate the usefulness of IEMA, which indicates that an interactive sequential analysis of a model may increase the accuracy and confidence of human decision making. [ABSTRACT FROM AUTHOR]
Published: 2024

12. Counterfactual explanations as interventions in latent space.

Author: Crupi, Riccardo, Castelnovo, Alessandro, Regoli, Daniele, and San Miguel Gonzalez, Beatriz
Subjects: ARTIFICIAL intelligence, MACHINE learning, COUNTERFACTUALS (Logic), CAUSATION (Philosophy), TRUST
Abstract: Explainable Artificial Intelligence (XAI) is a set of techniques that allows the understanding of both technical and non-technical aspects of Artificial Intelligence (AI) systems. XAI is crucial to help satisfy the increasingly important demand for trustworthy Artificial Intelligence, characterized by fundamental aspects such as respect for human autonomy, prevention of harm, transparency, accountability, etc. Within XAI techniques, counterfactual explanations aim to provide end users with a set of features (and their corresponding values) that need to be changed in order to achieve a desired outcome. Current approaches rarely take into account the feasibility of actions needed to achieve the proposed explanations, and in particular, they fall short of considering the causal impact of such actions. In this paper, we present Counterfactual Explanations as Interventions in Latent Space (CEILS), a methodology to generate counterfactual explanations capturing by design the underlying causal relations from the data, and at the same time to provide feasible recommendations to reach the proposed profile. Moreover, our methodology has the advantage that it can be set on top of existing counterfactual generator algorithms, thus minimising the complexity of imposing additional causal constraints. We demonstrate the effectiveness of our approach with a set of different experiments using synthetic and real datasets (including a proprietary dataset of the financial domain). [ABSTRACT FROM AUTHOR]
Published: 2024

13. NICE: an algorithm for nearest instance counterfactual explanations.

Author: Brughmans, Dieter, Leyman, Pieter, and Martens, David
Subjects: MACHINE learning, ALGORITHMS, EXPLANATION, CLASSIFICATION
Abstract: In this paper we propose a new algorithm, named NICE, to generate counterfactual explanations for tabular data that specifically takes into account algorithmic requirements that often emerge in real-life deployments: (1) the ability to provide an explanation for all predictions, (2) being able to handle any classification model (also non-differentiable ones), (3) being efficient in run time, and (4) providing multiple counterfactual explanations with different characteristics. More specifically, our approach exploits information from a nearest unlike neighbor to speed up the search process, by iteratively introducing feature values from this neighbor in the instance to be explained. We propose four versions of NICE: one without optimization and three that optimize the explanations for one of the following properties: sparsity, proximity or plausibility. An extensive empirical comparison on 40 datasets shows that our algorithm outperforms the current state-of-the-art in terms of these criteria. Our analyses show a trade-off between plausibility on the one hand and proximity or sparsity on the other, with our different optimization methods offering users the choice to select the types of counterfactuals that they prefer. An open-source implementation of NICE can be found at https://github.com/ADMAntwerp/NICE. [ABSTRACT FROM AUTHOR]
Published: 2024
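
The core idea in this abstract (find a nearest unlike neighbor, then copy its feature values into the instance one at a time) can be sketched as follows. This is not the NICE implementation from the linked repository: the L1 distance, the left-to-right order in which features are copied, the stopping rule, and the toy classifier are all assumptions made for illustration.

```python
def nearest_unlike_neighbor(x, data, predict):
    """Closest data point (L1 distance, an assumed metric) with a different predicted class."""
    original = predict(x)
    unlike = [z for z in data if predict(z) != original]
    return min(unlike, key=lambda z: sum(abs(a - b) for a, b in zip(x, z)))


def counterfactual(x, data, predict):
    """Copy feature values from the unlike neighbor until the prediction flips."""
    original = predict(x)
    neighbor = nearest_unlike_neighbor(x, data, predict)
    candidate = list(x)
    for i, value in enumerate(neighbor):
        if candidate[i] == value:
            continue
        candidate[i] = value
        if predict(candidate) != original:
            break  # stop early so that as few features as possible change
    return tuple(candidate)


def predict(z):
    """Hypothetical black-box classifier used only for this example."""
    return int(z[0] + z[1] >= 10)


data = [(2, 3, 0), (9, 4, 1), (6, 6, 1), (1, 1, 0)]
x = (4, 5, 0)
cf = counterfactual(x, data, predict)
```

Because the neighbor itself has a different predicted class, the loop always ends with a flipped prediction, in the worst case by reproducing the neighbor. It only needs calls to `predict`, which is why an approach of this kind does not require a differentiable model.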

14. A case study of improving a non-technical losses detection system through explainability.

Author: Coma-Puig, Bernat, Calvo, Albert, Carmona, Josep, and Gavaldà, Ricard
Subjects: ARTIFICIAL intelligence, CORPORATE meetings, MACHINE learning, NATURE reserves, BUSINESS analysts
Abstract: Detecting and reacting to non-technical losses (NTL) is a fundamental activity that energy providers need to face in their daily routines. This is known to be challenging since the phenomenon of NTL is multi-factored, dynamic and extremely contextual, which makes artificial intelligence (AI) and, in particular, machine learning, natural areas to bring effective and tailored solutions. If the human factor is disregarded in the process of detecting NTL, there is a high risk of performance degradation since typical problems like dataset shift and biases cannot be easily identified by an algorithm. This paper presents a case study on incorporating explainable AI (XAI) in a mature NTL detection system that has been in production in the last years both in electricity and gas. The experience shows that incorporating this capability brings interesting improvements to the initial system and especially serves as a common ground where domain experts, data scientists, and business analysts can meet. [ABSTRACT FROM AUTHOR]
Published: 2024

15. A comprehensive taxonomy for explainable artificial intelligence: a systematic survey of surveys on methods and concepts.

Author: Schwalbe, Gesina and Finzel, Bettina
Subjects: ARTIFICIAL intelligence, RESEARCH personnel, TAXONOMY, TERMS & phrases, MOTIVATION (Psychology)
Abstract: In the meantime, a wide variety of terminologies, motivations, approaches, and evaluation criteria have been developed within the research field of explainable artificial intelligence (XAI). With the amount of XAI methods vastly growing, a taxonomy of methods is needed by researchers as well as practitioners: To grasp the breadth of the topic, compare methods, and to select the right XAI method based on traits required by a specific use-case context. Many taxonomies for XAI methods of varying level of detail and depth can be found in the literature. While they often have a different focus, they also exhibit many points of overlap. This paper unifies these efforts and provides a complete taxonomy of XAI methods with respect to notions present in the current state of research. In a structured literature analysis and meta-study, we identified and reviewed more than 50 of the most cited and current surveys on XAI methods, metrics, and method traits. After summarizing them in a survey of surveys, we merge terminologies and concepts of the articles into a unified structured taxonomy. Single concepts therein are illustrated by more than 50 diverse selected example methods in total, which we categorize accordingly. The taxonomy may serve both beginners, researchers, and practitioners as a reference and wide-ranging overview of XAI method traits and aspects. Hence, it provides foundations for targeted, use-case-oriented, and context-sensitive future research. [ABSTRACT FROM AUTHOR]
Published: 2024

16. Approximation trees: statistical reproducibility in model distillation.

Author: Zhou, Yichen, Zhou, Zhengze, and Hooker, Giles
Subjects: RANDOM forest algorithms, DECISION trees, DISTILLATION, STATISTICAL models, PREDICTION models
Abstract: This paper examines the reproducibility of learned explanations for black-box predictions via model distillation using classification trees. We find that common tree distillation methods fail to reproduce a single stable explanation when applied to the same teacher model due to the randomness of the distillation process. We study this issue of reliable interpretation and propose a standardized framework for tree distillation to achieve reproducibility. The proposed framework consists of (1) a statistical test to stabilize tree splits, and (2) a stopping rule for tree building when using a teacher that provides an estimate of the uncertainty of its predictions, e.g. random forests. We demonstrate the empirical performance of the proposed distillation method on a variety of synthetic and real-world datasets. [ABSTRACT FROM AUTHOR]
Published: 2024

17. Model-agnostic feature importance and effects with dependent features: a conditional subgroup approach.

Author: Molnar, Christoph, König, Gunnar, Bischl, Bernd, and Casalicchio, Giuseppe
Subjects: MACHINE learning, PERMUTATIONS, ARTIFICIAL intelligence
Abstract: The interpretation of feature importance in machine learning models is challenging when features are dependent. Permutation feature importance (PFI) ignores such dependencies, which can cause misleading interpretations due to extrapolation. A possible remedy is more advanced conditional PFI approaches that enable the assessment of feature importance conditional on all other features. Due to this shift in perspective and in order to enable correct interpretations, it is beneficial if the conditioning is transparent and comprehensible. In this paper, we propose a new sampling mechanism for the conditional distribution based on permutations in conditional subgroups. As these subgroups are constructed using tree-based methods such as transformation trees, the conditioning becomes inherently interpretable. This not only provides a simple and effective estimator of conditional PFI, but also local PFI estimates within the subgroups. In addition, we apply the conditional subgroups approach to partial dependence plots, a popular method for describing feature effects that can also suffer from extrapolation when features are dependent and interactions are present in the model. In simulations and a real-world application, we demonstrate the advantages of the conditional subgroup approach over existing methods: it allows computing conditional PFI that is more true to the data than existing proposals and enables a fine-grained interpretation of feature effects and importance within the conditional subgroups. [ABSTRACT FROM AUTHOR]
Published: 2024
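
The difference between marginal PFI and permuting within subgroups can be shown with a small sketch. The subgroups here are supplied by hand, whereas the paper constructs them with tree-based methods; the function names, the mean-squared-error loss, and the toy data are assumptions for illustration only.

```python
import random


def mse(model, X, y):
    """Mean squared error of a model given as a plain function of one row."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(model, X, y, feature, groups=None, seed=0):
    """Loss increase after shuffling one feature, optionally only within subgroups."""
    rng = random.Random(seed)
    if groups is None:
        groups = [0] * len(X)  # a single group gives ordinary (marginal) PFI
    permuted = [list(row) for row in X]
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        values = [X[i][feature] for i in idx]
        rng.shuffle(values)
        for i, v in zip(idx, values):
            permuted[i][feature] = v
    return mse(model, permuted, y) - mse(model, X, y)


# Hypothetical data in which feature 1 is an exact copy of feature 0.
X = [(0, 0), (0, 0), (1, 1), (1, 1)]
y = [0, 0, 1, 1]


def model(row):
    """Toy model that uses feature 0 only."""
    return row[0]


marginal = permutation_importance(model, X, y, feature=0)
conditional = permutation_importance(model, X, y, feature=0, groups=[0, 0, 1, 1])
```

With subgroups defined by the value of feature 1, feature 0 is constant inside each subgroup, so shuffling it there changes nothing and the conditional importance is zero: given feature 1, feature 0 carries no additional information. The marginal version shuffles across the whole data set and can create feature combinations that never occur in the data, which is the extrapolation issue the abstract mentions.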

18. Introduction to the special issue for the ECML PKDD 2019 journal track.

Author: Borgwardt, Karsten, Loh, Po-Ling, Terzi, Evimaria, and Ukkonen, Antti
Subjects: DATA mining
Abstract: Highlights from the article: This special issue contains papers accepted to Data Mining and Knowledge Discovery as part of the journal track of ECML PKDD 2019. Since its inception seven years ago, the ECML PKDD journal track has been organised jointly with Springer's Data Mining and Knowledge Discovery and Machine Learning journals. Authors could choose which of the two journals they would like their paper to appear in; the ECML PKDD journal track chairs serve as the guest editors in both journals.
Published: 2019

19. MODE-Bi-GRU: orthogonal independent Bi-GRU model with multiscale feature extraction.

Author: Wang, Wei, Ruan, Wenhan, and Meng, Xiangfu
Subjects: MULTISCALE modeling, FEATURE extraction, GENERALIZATION
Abstract: The core of sentence classification is to extract sentence semantic features. The existing hybrid methods have huge parameters and complex models. Due to the limited dataset, these methods are prone to feature redundancy and overfitting. To address this issue, this paper proposes an orthogonal independent Bi-GRU sentence classification model with multi-scale feature extraction, called Multi-scale Orthogonal Independent Bi-GRU (MODE-Bi-GRU). First, the hidden state of the Bi-GRU model is split into multiple small hidden states, and the corresponding recursive matrix is constrained orthogonally. Then, multiple sliding windows of different sizes are defined according to the forward and reverse angles of the sentence, and the sliding window is obtained. Finally, different sentence fragments are superimposed and input to the model, and the output results of multiple small Bi-GRU models are spliced and processed by soft pooling. The improved focal loss function is adopted to speed up the convergence of the model. Compared to the existing models, our proposed model achieves better results on four benchmark datasets, and it has better generalization ability with fewer parameters. [ABSTRACT FROM AUTHOR]
Published: 2024

20. Guest editors' introduction: special section of selected papers from ECML-PKDD 2012.

Author: Bie, Tijl and Flach, Peter
Subjects: DATA mining, SUBGRAPHS
Abstract: An introduction to the journal is presented in which the guest editors discuss various reports published in the issue including subgraph pattern and data mining.
Published: 2013

21. Guest editors introduction: special issue of the ECMLPKDD 2015 journal track.

Author: Bielza, Concha, Gama, João, Jorge, Alípio, and Žliobaitė, Indrė
Subjects: NETWORK analysis (Communication), DATA mining, DOCUMENT clustering
Abstract: An introduction to the journal is presented in which the editor discusses various articles within the issue on topics including visual data mining, clustering and network analysis.
Published: 2015

22. MMA: metadata supported multi-variate attention for onset detection and prediction.

Author: Ravindranath, Manjusha, Candan, K. Selçuk, Sapino, Maria Luisa, and Appavu, Brian
Subjects: SENSOR placement, TIME series analysis, ELECTROENCEPHALOGRAPHY, ELECTROCARDIOGRAPHY, SEIZURES (Medicine), DEEP learning
Abstract: Deep learning has been applied successfully in sequence understanding and translation problems, especially in univariate, unimodal contexts, where large amounts of supervision data are available. The effectiveness of deep learning in more complex (multi-modal, multi-variate) contexts, where supervision data is rare, however, is generally not satisfactory. In this paper, we focus on improving detection and prediction accuracy in precisely such contexts – in particular, we focus on the problem of predicting seizure onsets relying on multi-modal (EEG, ICP, ECG, and ABP) sensory data streams, some of which (such as EEG) are inherently multi-variate due to the placement of multiple sensors to capture spatial distribution of the relevant signals. In particular, we note that multi-variate time series often carry robust, spatio-temporally localized features that could help predict onset events. We further argue that such features can be used to support implementation of metadata supported multivariate attention (or MMA) mechanisms that help significantly improve the effectiveness of neural network architectures. In this paper, we use the proposed MMA approach to develop a multi-modal LSTM-based neural network architecture to tackle seizure onset detection and prediction tasks relying on EEG, ICP, ECG, and ABP data streams. We experimentally evaluate the proposed architecture under different scenarios – the results illustrate the effectiveness of the proposed attention mechanism, especially compared against other metadata driven competitors. [ABSTRACT FROM AUTHOR]
Published: 2024

23. Benchmarking and survey of explanation methods for black box models.

Author: Bodria, Francesco, Giannotti, Fosca, Guidotti, Riccardo, Naretto, Francesca, Pedreschi, Dino, and Rinzivillo, Salvatore
Subjects: MACHINE learning, ARTIFICIAL intelligence, EXPLANATION
Abstract: The rise of sophisticated black-box machine learning models in Artificial Intelligence systems has prompted the need for explanation methods that reveal how these models work in an understandable way to users and decision makers. Unsurprisingly, the state of the art currently exhibits a plethora of explainers providing many different types of explanations. With the aim of providing a compass for researchers and practitioners, this paper proposes a categorization of explanation methods from the perspective of the type of explanation they return, also considering the different input data formats. The paper accounts for the most representative explainers to date, also discussing similarities and discrepancies of returned explanations through their visual appearance. A companion website to the paper is provided as a continuous update to new explainers as they appear. Moreover, a subset of the most robust and widely adopted explainers is benchmarked with respect to a repertoire of quantitative metrics. [ABSTRACT FROM AUTHOR]
Published: 2023

24. Anomaly detection in sleep: detecting mouth breathing in children.

Author: Biedebach, Luka, Óskarsdóttir, María, Arnardóttir, Erna Sif, Sigurdardóttir, Sigridur, Clausen, Michael Valur, Sigurdardóttir, Sigurveig Þ., Serwatko, Marta, and Islind, Anna Sigridur
Subjects: MOUTH breathing, INTRUSION detection systems (Computer security), SUPERVISED learning, DEEP learning, MACHINE learning, SLEEP, RESPIRATION
Abstract: Identifying mouth breathing during sleep in a reliable, non-invasive way is challenging and currently not included in sleep studies. However, it has a high clinical relevance in pediatrics, as it can negatively impact the physical and mental health of children. Since mouth breathing is an anomalous condition in the general population with only 2% prevalence in our data set, we are facing an anomaly detection problem. This type of human medical data is commonly approached with deep learning methods. However, applying multiple supervised and unsupervised machine learning methods to this anomaly detection problem showed that classic machine learning methods should also be taken into account. This paper compared deep learning and classic machine learning methods on respiratory data during sleep using a leave-one-out cross validation. This way we observed the uncertainty of the models and their performance across participants with varying signal quality and prevalence of mouth breathing. The main contribution is identifying the model with the highest clinical relevance to facilitate the diagnosis of chronic mouth breathing, which may allow more affected children to receive appropriate treatment. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. Navigating the metric maze: a taxonomy of evaluation metrics for anomaly detection in time series.

Author: Sørbø, Sondre and Ruocco, Massimiliano
Subjects: TAXONOMY, MAZE tests, MAZE puzzles, EVALUATION methodology, TIME management
Abstract: The field of time series anomaly detection is constantly advancing, with several methods available, making it a challenge to determine the most appropriate method for a specific domain. The evaluation of these methods is facilitated by the use of metrics, which vary widely in their properties. Despite the existence of new evaluation metrics, there is limited agreement on which metrics are best suited for specific scenarios and domains, and the most commonly used metrics have faced criticism in the literature. This paper provides a comprehensive overview of the metrics used for the evaluation of time series anomaly detection methods, and also defines a taxonomy of these based on how they are calculated. By defining a set of properties for evaluation metrics and a set of specific case studies and experiments, twenty metrics are analyzed and discussed in detail, highlighting the unique suitability of each for specific tasks. Through extensive experimentation and analysis, this paper argues that the choice of evaluation metric must be made with care, taking into account the specific requirements of the task at hand. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Design and evaluation of highly accurate smart contract code vulnerability detection framework.

Author: Jeon, Sowon, Lee, Gilhee, Kim, Hyoungshick, and Woo, Simon S.
Subjects: LANGUAGE models, CONTRACTS, COMPUTER security vulnerabilities, BLOCKCHAINS
Abstract: Smart contracts are self-executing programs stored and executed on a blockchain platform. However, previous studies demonstrated that developing secure smart contracts is not easy. Unfortunately, the use of insecure smart contracts results in a significant financial loss for service providers or customers. Therefore, identifying security vulnerabilities in smart contracts would be essential in blockchain platforms using smart contracts. In this paper, we present SmartConDetect as a tool for detecting security vulnerabilities in Solidity smart contracts. SmartConDetect is a static analysis tool that extracts code fragments from Solidity smart contracts and uses a pre-trained BERT model to find susceptible code patterns. To demonstrate the performance of SmartConDetect, we use two public datasets, and our dataset (SmartConDataset) collected from the real-world Ethereum blockchain network. Our experimental results show that SmartConDetect significantly outperforms all state-of-the-art methods, achieving 90.9% F1-score when using our own dataset. Specifically, SmartConDetect is about 2 times faster than SmartCheck in detection. Furthermore, we conduct a real-world case study to analyze the distribution of detected vulnerabilities. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Traffic forecasting on new roads using spatial contrastive pre-training (SCPT).

Author: Prabowo, Arian, Xue, Hao, Shao, Wei, Koniusz, Piotr, and Salim, Flora D.
Subjects: TRAFFIC estimation, TRAFFIC signs & signals, DEEP learning, INTELLIGENT transportation systems
Abstract: New roads are being constructed all the time. However, the capabilities of previous deep forecasting models to generalize to new roads not seen in the training data (unseen roads) are rarely explored. In this paper, we introduce a novel setup called a spatio-temporal split to evaluate the models' capabilities to generalize to unseen roads. In this setup, the models are trained on data from a sample of roads, but tested on roads not seen in the training data. Moreover, we also present a novel framework called Spatial Contrastive Pre-Training (SCPT) where we introduce a spatial encoder module to extract latent features from unseen roads during inference time. This spatial encoder is pre-trained using contrastive learning. During inference, the spatial encoder only requires two days of traffic data on the new roads and does not require any re-training. We also show that the output from the spatial encoder can be used effectively to infer latent node embeddings on unseen roads during inference time. The SCPT framework also incorporates a new layer, named the spatially gated addition layer, to effectively combine the latent features from the output of the spatial encoder with existing backbones. Additionally, since there is limited data on the unseen roads, we argue that it is better to decouple traffic signals into trivial-to-capture periodic signals and difficult-to-capture Markovian signals, and for the spatial encoder to only learn the Markovian signals. Finally, we empirically evaluated SCPT using the ST split setup on four real-world datasets. The results showed that adding SCPT to a backbone consistently improves forecasting performance on unseen roads. More importantly, the improvements are greater when forecasting further into the future. The code is available on GitHub: https://github.com/cruiseresearchgroup/forecasting-on-new-roads. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Community detection in interval-weighted networks.

Author: Alves, Hélder, Brito, Paula, and Campos, Pedro
Subjects: SOCIAL network analysis, CONTINGENCY tables
Abstract: In this paper we introduce and develop the concept of interval-weighted networks (IWN), a novel approach in Social Network Analysis, where the edge weights are represented by closed intervals composed with precise information, comprehending intrinsic variability. We extend IWN for both Newman's modularity and modularity gain and the Louvain algorithm, considering a tabular representation of networks by contingency tables. We apply our methodology to two real-world IWN. The first is a commuter network in mainland Portugal, between the twenty-three NUTS 3 Regions (IWCN). The second focuses on annual merchandise trade between 28 European countries, from 2003 to 2015 (IWTN). The optimal partition of geographic locations (regions or countries) is developed and compared using two new approaches, designated as "Classic Louvain" and "Hybrid Louvain", which allow taking into account the variability observed in the original network, thereby minimizing the loss of information present in the raw data. Our findings suggest the division of the twenty-three Portuguese regions into three main communities for the IWCN and into two to three country communities for the IWTN. However, we find different geographical partitions according to the community detection methodology used. This analysis can be useful in many real-world applications, since it takes into account that the weights may vary within the ranges, rather than being constant. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. An anomaly aware network embedding framework for unsupervised anomalous link detection.

Author: Duan, Dongsheng, Zhang, Cheng, Tong, Lingling, Lu, Jie, Lv, Cunchi, Hou, Wei, Li, Yangxi, and Zhao, Xiaofang
Subjects: VIRTUAL networks, NETWORK performance
Abstract: Most existing network embedding based anomalous link detection methods regard network embedding and anomalous link detection as two independent tasks. However, removing anomalous links from the original network can reduce the data noise, thus hopefully improving the performance of network embedding models and anomalous link detection. In this paper, we propose an Anomaly Aware Network Embedding (AANE) framework by simultaneously learning node embedding and detecting anomalous links in a unified way. To instantiate the AANE framework, we propose a heuristic anomalous link selection based model AANE-H and an embedding disentangling based model AANE-D on Graph Auto-Encoder (GAE). In AANE-H, we adopt an anomalous link selector to iteratively select significant anomalous links based on a heuristic rule during model training, while in AANE-D the normal and anomalous links are generated by disentangled normal and anomalous embedding respectively. For the evaluation purpose, we propose a heuristic anomalous link generation algorithm to inject synthetic anomalous links into six real-world network datasets used in our experiments. Experimental results show that AANE outperforms both the state-of-the-art network embedding models and anomalous node detection models in terms of anomalous link detection performance. As a general network embedding model, AANE can also improve other downstream tasks like node classification. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Mondrian forest for data stream classification under memory constraints.

Author: Khannouz, Martin and Glatard, Tristan
Subjects: MACHINE learning, SUPERVISED learning, MEMORY, CLASSIFICATION algorithms
Abstract: Supervised learning algorithms generally assume the availability of enough memory to store data models during the training and test phases. However, this assumption is unrealistic when data comes in the form of infinite data streams, or when learning algorithms are deployed on devices with reduced amounts of memory. In this paper, we adapt the online Mondrian forest classification algorithm to work with memory constraints on data streams. In particular, we design five out-of-memory strategies to update Mondrian trees with new data points when the memory limit is reached. Moreover, we design node trimming mechanisms to make Mondrian trees more robust to concept drifts under memory constraints. We evaluate our algorithms on a variety of real and simulated datasets, and we conclude with recommendations on their use in different situations: the Extend Node strategy appears as the best out-of-memory strategy in all configurations, whereas different node trimming mechanisms should be adopted depending on whether a concept drift is expected. All our methods are implemented in the OrpailleCC open-source library and are ready to be used on embedded systems and connected objects. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. Regression tree-based active learning.

Author: Jose, Ashna, de Mendonça, João Paulo Almeida, Devijver, Emilie, Jakse, Noël, Monbet, Valérie, and Poloni, Roberta
Subjects: ACTIVE learning, MACHINE learning, REGRESSION analysis, SAMPLE size (Statistics), REGRESSION trees, SUPERVISED learning
Abstract: Machine learning algorithms often require large training sets to perform well, but labeling such large amounts of data is not always feasible, as in many applications, substantial human effort and material cost are needed. Finding effective ways to reduce the size of training sets while maintaining the same performance is then crucial: one wants to choose the best sample of fixed size to be labeled among a given population, aiming at an accurate prediction of the response. This challenge has been studied in detail in classification, but not deeply enough in regression, which is known to be a more difficult task for active learning despite its need in practice. Few model-free active learning methods have been proposed that detect the new samples to be labeled using unlabeled data, but they lack the information of the conditional distribution between the response and the features. In this paper, we propose a standard regression tree-based active learning method for regression that improves significantly upon existing active learning approaches. It provides impressive results for small and large training sets and an appreciably low variance within several runs. We also exploit model-free approaches, and adapt them to our algorithm to utilize maximum information. Through experiments on numerous benchmark datasets, we demonstrate that our framework improves existing methods and is effective in learning a regression model from a very limited labeled dataset, reducing the sample size for a fixed level of performance, even with many features. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. SALτ: efficiently stopping TAR by improving priors estimates.

Author: Molinari, Alessio and Esuli, Andrea
Subjects: TAR, ACTIVE learning, INFORMATION needs, OPTIMAL stopping (Mathematical statistics)
Abstract: In high recall retrieval tasks, human experts review a large pool of documents with the goal of satisfying an information need. Documents are prioritized for review through an active learning policy, and the process is usually referred to as Technology-Assisted Review (TAR). TAR tasks also aim to stop the review process once the target recall is achieved to minimize the annotation cost. In this paper, we introduce a new stopping rule called $\mathrm{SAL}_{\tau}^{R}$ (SLD for Active Learning), a modified version of the Saerens–Latinne–Decaestecker algorithm (SLD) that has been adapted for use in active learning. Experiments show that our algorithm stops the review well ahead of the current state-of-the-art methods, while providing the same guarantees of achieving the target recall. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Editorial.

Author: Fürnkranz, Johannes
Subjects: RESEARCH papers (Students), DATA mining, ADVISORY boards, EDITORIAL boards
Abstract: The author focuses on Geoffrey Webb, editor-in-chief of the journal, and his contributions to the journal. He mentions Webb's role in accepting submitted papers for the journal and in ensuring fast feedback of the highest quality. He also mentions his service on the Advisory Board of the journal and his focus on data mining.
Published: 2015
Full Text: View/download PDF

34. On computing exact means of time series using the move-split-merge metric.

Author: Holznigenkemper, Jana, Komusiewicz, Christian, and Seeger, Bernhard
Subjects: TIME management, K-means clustering, TIME series analysis, DYNAMIC programming
Abstract: Computing an accurate mean of a set of time series is a critical task in applications like nearest-neighbor classification and clustering of time series. While there are many distance functions for time series, the most popular distance function used for the computation of time series means is the non-metric dynamic time warping (DTW) distance. A recent algorithm for the exact computation of a DTW-Mean has a running time of $O(n^{2k+1} 2^k k)$, where $k$ denotes the number of time series and $n$ their maximum length. In this paper, we study the mean problem for the move-split-merge (MSM) metric that not only offers high practical accuracy for time series classification but also carries the advantages of the metric properties that enable further diverse applications. The main contribution of this paper is an exact and efficient algorithm for the MSM-Mean problem of time series. The running time of our algorithm is $O(n^{k+3} 2^k k^3)$, and thus better than the previous DTW-based algorithm. The results of an experimental comparison confirm the running time superiority of our algorithm in comparison to the DTW-Mean competitor. Moreover, we introduce a heuristic to improve the running time significantly without sacrificing much accuracy. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

35. Improving neural network's robustness on tabular data with D-layers.

Author: Xia, Haiyang, Zaidi, Nayyar, Zhang, Yishuo, and Li, Gang
Subjects: MACHINE learning, ARTIFICIAL neural networks
Abstract: Artificial neural networks (ANN) are widely used machine learning models. Their widespread use has attracted a lot of interest in their robustness. Many studies show that ANN's performance can be highly vulnerable to input manipulation such as adversarial attacks and covariate drift. Therefore, various techniques that focus on improving ANN's robustness have been proposed in the last few years. However, most of these works have focused on image data. In this paper, we investigate the role of discretization in improving ANN's robustness on tabular datasets. Two custom ANN layers, D1-Layer and D2-Layer (collectively called D-Layers), are proposed. The two layers integrate discretization during the training phase to improve ANN's ability to defend against adversarial attacks. Additionally, D2-Layer integrates dynamic discretization during the testing phase as well, to provide a unified strategy to handle adversarial attacks and covariate drift. The experimental results on 24 publicly available datasets show that our proposed D-Layers add much-needed robustness to ANN for tabular datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Variable screening for Lasso based on multidimensional indexing.

Author: Żogała-Siudem, Barbara and Jaroszewicz, Szymon
Subjects: NUMERIC databases, REGULARIZATION parameter, DATABASES, INDEXING, MULTIDIMENSIONAL databases
Abstract: In this paper we present a correlation based safe screening technique for building the complete Lasso path. Unlike many other Lasso screening approaches we do not consider prespecified values of the regularization parameter, but, instead, prune variables which cannot be the next best feature to be added to the model. Based on those results we present a modified homotopy algorithm for computing the regularization path. We demonstrate that, even though our algorithm provides the complete Lasso path, its performance is competitive with state of the art algorithms which, however, only provide solutions at a prespecified sample of regularization parameters. We also address problems of extremely high dimensionality, where the variables may not fit into main memory and are assumed to be stored on disk. A multidimensional index is used to quickly retrieve potentially relevant variables. We apply the approach to the important case when multiple models are built against a fixed set of variables, frequently encountered in statistical databases. We perform experiments using the complete Eurostat database as predictors and demonstrate that our approach allows for practical and efficient construction of Lasso models, which remain accurate and interpretable even when millions of highly correlated predictors are present. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Category tree distance: a taxonomy-based transaction distance for web user analysis.

Author: Zhang, Yinjia, Zhao, Qinpei, Shi, Yang, Li, Jiangfeng, and Rao, Weixiong
Subjects: INTERNET users, K-nearest neighbor classification, CLUSTER analysis (Statistics), VECTOR data, TREES, CYBERSPACE
Abstract: With the emergence of webpage services, huge amounts of customer transaction data are flooding cyberspace, and these data are becoming more and more useful for profiling users and making recommendations. Since web user transaction data are usually multi-modal, heterogeneous and large-scale, traditional data analysis methods meet new challenges. One of the challenges is defining the distance between two transactions or two web users. The distance definition plays an important role in further analysis, such as cluster analysis or k-nearest neighbor queries. We introduce a category tree distance in this paper, which makes use of the product taxonomy information to convert the user transaction data to vectors. Then, the similarity between web users can be evaluated by the vectors from their transaction data. The properties of the distance, such as upper and lower bounds, and the complexity analysis are also given in the paper. To investigate the performance of the proposal, we conduct experiments on real web user transaction data. The results show that the proposed distance outperforms the other distances on user transaction analysis. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

38. An alternating nonmonotone projected Barzilai–Borwein algorithm of nonnegative factorization of big matrices.

Author: Li, Ting, Tang, Jiayi, and Wan, Zhong
Subjects: ALGORITHMS, FACTORIZATION, MATRIX decomposition, NONNEGATIVE matrices, IMAGE reconstruction, TRANSCRIPTOMES, FACE
Abstract: In this paper, a new alternating nonmonotone projected Barzilai–Borwein (BB) algorithm is developed for solving large scale problems of nonnegative matrix factorization. Unlike the existing algorithms available in the literature, a nonmonotone line search strategy is proposed to find suitable step lengths, and an adaptive BB spectral parameter is employed to generate search directions such that the constructed subproblems are efficiently solved. Apart from establishment of global convergence for this algorithm, numerical tests on three synthetic datasets, four public face image datasets and a real-world transcriptomic dataset are conducted to show advantages of the developed algorithm in this paper. It is concluded that in terms of numerical efficiency, noise robustness and quality of matrix factorization, our algorithm is promising and applicable to face image reconstruction, and deep mining of transcriptomic profiles of the sub-genomes in hybrid fish lineage, compared with the state-of-the-art algorithms. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

39. Bounding the family-wise error rate in local causal discovery using Rademacher averages.

Author: Simionato, Dario and Vandin, Fabio
Subjects: STATISTICAL learning, ERROR rates, FALSE discovery rate, ALGORITHMS
Abstract: Many algorithms have been proposed to learn local graphical structures around target variables of interest from observational data, focusing on two sets of variables. The first one, called Parent–Children (PC) set, contains all the variables that are direct causes or consequences of the target while the second one, known as Markov boundary (MB), is the minimal set of variables with optimal prediction performances of the target. In this paper we introduce two novel algorithms for the PC and MB discovery tasks with rigorous guarantees on the Family-Wise Error Rate (FWER), that is, the probability of reporting any false positive in output. Our algorithms use Rademacher averages, a key concept from statistical learning theory, to properly account for the multiple-hypothesis testing problem arising in such tasks. Our evaluation on simulated data shows that our algorithms properly control for the FWER, while widely used algorithms do not provide guarantees on false discoveries even when correcting for multiple-hypothesis testing. Our experiments also show that our algorithms identify meaningful relations in real-world data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. Model-agnostic variable importance for predictive uncertainty: an entropy-based approach.

Author: Wood, Danny, Papamarkou, Theodore, Benatan, Matt, and Allmendinger, Richard
Subjects: MACHINE learning, CONDITIONAL expectations, PREDICTION models, ENTROPY, FORECASTING
Abstract: In order to trust the predictions of a machine learning algorithm, it is necessary to understand the factors that contribute to those predictions. In the case of probabilistic and uncertainty-aware models, it is necessary to understand not only the reasons for the predictions themselves, but also the reasons for the model's level of confidence in those predictions. In this paper, we show how existing methods in explainability can be extended to uncertainty-aware models and how such extensions can be used to understand the sources of uncertainty in a model's predictive distribution. In particular, by adapting permutation feature importance, partial dependence plots, and individual conditional expectation plots, we demonstrate that novel insights into model behaviour may be obtained and that these methods can be used to measure the impact of features on both the entropy of the predictive distribution and the log-likelihood of the ground truth labels under that distribution. With experiments using both synthetic and real-world data, we demonstrate the utility of these approaches to understand both the sources of uncertainty and their impact on model performance. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack.

Author: Manzanares-Salor, Benet, Sánchez, David, and Lison, Pierre
Subjects: LANGUAGE models, MACHINE learning, DATA mining, DISCLOSURE, PERSONALLY identifiable information
Abstract: The availability of textual data depicting human-centered features and behaviors is crucial for many data mining and machine learning tasks. However, data containing personal information should be anonymized prior to making them available for secondary use. A variety of text anonymization methods have been proposed in recent years, which are standardly evaluated by comparing their outputs with human-based anonymizations. The residual disclosure risk is estimated with the recall metric, which quantifies the proportion of manually annotated re-identifying terms successfully detected by the anonymization algorithm. Nevertheless, recall is not a risk metric, which leads to several drawbacks. First, it requires a unique ground truth, and this does not hold for text anonymization, where several masking choices could be equally valid to prevent re-identification. Second, it relies on human judgements, which are inherently subjective and prone to errors. Finally, the recall metric weights terms uniformly, thereby ignoring the fact that the influence on the disclosure risk of some missed terms may be much larger than that of others. To overcome these drawbacks, in this paper we propose a novel method to evaluate the disclosure risk of anonymized texts by means of an automated re-identification attack. We formalize the attack as a multi-class classification task and leverage state-of-the-art neural language models to aggregate the data sources that attackers may use to build the classifier. We illustrate the effectiveness of our method by assessing the disclosure risk of several methods for text anonymization under different attack configurations. Empirical results show substantial privacy risks for most existing anonymization methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. Efficient learning with projected histograms.

Author: Huang, Zhanliang, Kabán, Ata, and Reeve, Henry
Subjects: BUDGET management, GENERALIZATION, PRIVACY, CLASSIFICATION, ALGORITHMS
Abstract: High dimensional learning is a perennial problem due to challenges posed by the "curse of dimensionality"; learning typically demands more computing resources as well as more training data. In differentially private (DP) settings, this is further exacerbated by noise that needs adding to each dimension to achieve the required privacy. In this paper, we present a surprisingly simple approach to address all of these concerns at once, based on histograms constructed on a low-dimensional random projection (RP) of the data. Our approach exploits RP to take advantage of hidden low-dimensional structures in the data, yielding both computational efficiency, and improved error convergence with respect to the sample size—whereby less training data suffice for learning. We also propose a variant for efficient differentially private (DP) classification that further exploits the data-oblivious nature of both the histogram construction and the RP based dimensionality reduction, resulting in an efficient management of the privacy budget. We present a detailed and rigorous theoretical analysis of generalisation of our algorithms in several settings, showing that our approach is able to exploit low-dimensional structure of the data, ameliorates the ill-effects of noise required for privacy, and has good generalisation under minimal conditions. We also corroborate our findings experimentally, and demonstrate that our algorithms achieve competitive classification accuracy in both non-private and private settings. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. Opinion dynamics in social networks incorporating higher-order interactions.

Author: Zhang, Zuobai, Xu, Wanyue, Zhang, Zhongzhi, and Chen, Guanrong
Subjects: RANDOM walks, GRAPH theory, MATRIX multiplications, SPECTRAL theory, SOCIAL networks
Abstract: The issue of opinion sharing and formation has received considerable attention in the academic literature, and a few models have been proposed to study this problem. However, existing models are limited to the interactions among nearest neighbors, with those second, third, and higher-order neighbors only considered indirectly, despite the fact that higher-order interactions occur frequently in real social networks. In this paper, we develop a new model for opinion dynamics by incorporating long-range interactions based on higher-order random walks that can explicitly tune the degree of influence of higher-order neighbor interactions. We prove that the model converges to a fixed opinion vector, which may differ greatly from those models without higher-order interactions. Since direct computation of the equilibrium opinion is computationally expensive, which involves the operations of huge-scale matrix multiplication and inversion, we design a theoretically convergence-guaranteed estimation algorithm that approximates the equilibrium opinion vector nearly linearly in both space and time with respect to the number of edges in the graph. We conduct extensive experiments on various social networks, demonstrating that the new algorithm is both highly efficient and effective. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. De-confounding representation learning for counterfactual inference on continuous treatment via generative adversarial network.

Author: Zhao, Yonghe, Huang, Qiang, Zeng, Haolong, Peng, Yun, and Sun, Huiyan
Subjects: GENERATIVE adversarial networks, STATISTICAL models, CAUSAL inference, ERYTHROCYTES, COUNTERFACTUALS (Logic)
Abstract: Counterfactual inference for continuous rather than binary treatment variables is more common in real-world causal inference tasks. While there are already some sample reweighting methods based on Marginal Structural Model for eliminating the confounding bias, they generally focus on removing the treatment's linear dependence on confounders and rely on the accuracy of the assumed parametric models, which are usually unverifiable. In this paper, we propose a de-confounding representation learning (DRL) framework for counterfactual outcome estimation of continuous treatment by generating the representations of covariates decorrelated with the treatment variables. The DRL is a non-parametric model that eliminates both linear and nonlinear dependence between treatment and covariates. Specifically, we train the correlations between the de-confounding representations and the treatment variables against the correlations between the covariate representations and the treatment variables to eliminate confounding bias. Further, a counterfactual inference network is embedded into the framework to make the learned representations serve both de-confounding and trusted inference. Extensive experiments on synthetic and semi-synthetic datasets show that the DRL model performs superiorly in learning de-confounding representations and outperforms state-of-the-art counterfactual inference models for continuous treatment variables. In addition, we apply the DRL model to a real-world medical dataset MIMIC III and demonstrate a detailed causal relationship between red cell width distribution and mortality. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. Explainable decomposition of nested dense subgraphs.

Author: Tatti, Nikolaj
Subjects: DENSE graphs, NP-hard problems, TREE size, SUBGRAPHS, ALGORITHMS, GREEDY algorithms
Abstract: Discovering dense regions in a graph is a popular tool for analyzing graphs. While useful, analyzing such decompositions may be difficult without additional information. Fortunately, many real-world networks have additional information, namely node labels. In this paper we focus on finding decompositions that have dense inner subgraphs and that can be explained using labels. More formally, we construct a binary tree T with labels on non-leaves that we use to partition the nodes in the input graph. To measure the quality of the tree, we model the edges in the shell and the cross edges to the inner shells as a Bernoulli variable. We reward the decompositions with the dense regions by requiring that the model parameters are non-increasing. We show that our problem is NP-hard, even inapproximable if we constrain the size of the tree. Consequently, we propose a greedy algorithm that iteratively finds the best split and applies it to the current tree. We demonstrate how we can efficiently compute the best split by maintaining certain counters. Our experiments show that our algorithm can process networks with over a million edges in a few minutes. Moreover, we show that the algorithm can find the ground truth in synthetic data and produces interpretable decompositions when applied to real-world networks. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

46. On regime changes in text data using hidden Markov model of contaminated vMF distribution.

Author: Zhang, Yingying, Sarkar, Shuchismita, Chen, Yuanyuan, and Zhu, Xuwen
Subjects: FINANCIAL statements, MARKOV processes, EXPECTATION-maximization algorithms
Abstract: This paper presents a novel methodology for analyzing temporal directional data with scatter and heavy tails. A hidden Markov model with a contaminated von Mises-Fisher emission distribution is developed. The model is implemented using a forward and backward selection approach that provides additional flexibility for contaminated as well as non-contaminated data. The utility of the method for finding homogeneous time blocks (regimes) is demonstrated in several experimental settings and on two real-life text data sets containing presidential addresses and corporate financial statements, respectively. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

47. Discord-based counterfactual explanations for time series classification.

Author: Bahri, Omar, Li, Peiyu, Filali Boubrahimi, Soukaina, and Hamdi, Shah Muhammad
Subjects: MACHINE learning, ARTIFICIAL intelligence, TIME series analysis, COUNTERFACTUALS (Logic), DATA mining
Abstract: The opacity inherent in machine learning models presents a significant hindrance to their widespread incorporation into decision-making processes. To address this challenge and foster trust among stakeholders while ensuring decision fairness, the data mining community has been actively advancing the explainable artificial intelligence paradigm. This paper contributes to the evolving field by focusing on counterfactual generation for time series classification models, a domain where research is relatively scarce. We develop a post-hoc, model-agnostic counterfactual explanation algorithm that leverages the Matrix Profile to map time series discords to their nearest neighbors in a target sequence and uses this mapping to generate new counterfactual instances. To our knowledge, this is the first effort towards the use of time series discords for counterfactual explanations. We evaluate our algorithm on the University of California Riverside and University of East Anglia archives and compare it to three state-of-the-art univariate and multivariate methods. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. Robust explainer recommendation for time series classification.

Author: Nguyen, Thu Trang, Le Nguyen, Thach, and Ifrim, Georgiana
Subjects: HUMAN activity recognition, TIME series analysis, DATA analytics, LEAD time (Supply chain management), QUANTITATIVE research
Abstract: Time series classification is a task which deals with temporal sequences, a data type prevalent in domains such as human activity recognition, sports analytics and general sensing. In this area, interest in explainability has been growing, as explanation is key to understanding the data and the model better. Recently, a great variety of techniques (e.g., LIME, SHAP, CAM) have been proposed and adapted for time series to provide explanations in the form of saliency maps, where the importance of each data point in the time series is quantified with a numerical value. However, the saliency maps can, and often do, disagree, so it is unclear which one to use. This paper provides a novel framework to quantitatively evaluate and rank explanation methods for time series classification. We show how to robustly evaluate the informativeness of a given explanation method (i.e., relevance for the classification task), and how to compare explanations side-by-side. The goal is to recommend the best explainer for a given time series classification dataset. We propose AMEE, a Model-Agnostic Explanation Evaluation framework, for recommending saliency-based explanations for time series classification. In this approach, data perturbation is added to the input time series guided by each explanation. Our results show that perturbing discriminative parts of the time series leads to significant changes in classification accuracy, which can be used to evaluate each explanation. To be robust to different types of perturbations and different types of classifiers, we aggregate the accuracy loss across perturbations and classifiers. This novel approach allows us to recommend the best explainer among a set of different explainers, including random and oracle explainers. We provide a quantitative and qualitative analysis for synthetic datasets, a variety of time-series datasets, as well as a real-world case study with known expert ground truth. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. Introduction to the special issue of the ECML PKDD 2021 journal track.

Author: Appice, Annalisa, Escalera, Sergio, Gámez, Jose A., and Trautmann, Heike
Subjects: DATA mining, TIME series analysis, MACHINE learning
Published: 2021
Full Text: View/download PDF

50. Introduction to the special issue for the ECML PKDD 2018 journal track.

Author: Greene, Derek, Bringmann, Bjørn, Fromont, Elisa, and Davis, Jesse
Subjects: DATA mining, MACHINE learning, DISTRIBUTION (Probability theory), TIME series analysis, ANOMALY detection (Computer security)
Published: 2018
Full Text: View/download PDF

1,088 results
