28 results on '"Maddouri, Mondher"'
Search Results
2. Multiple instance learning for sequence data with across bag dependencies
- Author
-
Zoghlami, Manel, Aridhi, Sabeur, Maddouri, Mondher, and Mephu Nguifo, Engelbert
- Published
- 2020
3. A Structure Based Multiple Instance Learning Approach for Bacterial Ionizing Radiation Resistance Prediction
- Author
-
Zoghlami, Manel, Aridhi, Sabeur, Maddouri, Mondher, and Nguifo, Engelbert Mephu
- Published
- 2019
4. An experimental survey on big data frameworks
- Author
-
Inoubli, Wissem, Aridhi, Sabeur, Mezni, Haithem, Maddouri, Mondher, and Mephu Nguifo, Engelbert
- Published
- 2018
5. A New Feature Selection Method for Nominal Classifier based on Formal Concept Analysis
- Author
-
Trabelsi, Marwa, Meddouri, Nida, and Maddouri, Mondher
- Published
- 2017
6. Density-based data partitioning strategy to approximate large-scale subgraph mining
- Author
-
Aridhi, Sabeur, d'Orazio, Laurent, Maddouri, Mondher, and Mephu Nguifo, Engelbert
- Published
- 2015
7. A Novel Scalable Clustering Method for Distributed Networks
- Author
-
Inoubli, Wissem, Aridhi, Sabeur, Mezni, Haithem, Maddouri, Mondher, and Mephu Nguifo, Engelbert
- Affiliations
Université de Tunis El Manar (UTM); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Centre National de la Recherche Scientifique (CNRS)-Université de Lorraine (UL)-Institut National de Recherche en Informatique et en Automatique (Inria); Strategies for Modelling and ARtificial inTelligence Laboratory (SMART-LAB), Université de Tunis; University of Jeddah; Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])-CNRS; Computational Algorithms for Protein Structures and Interactions (CAPSID), Inria Nancy - Grand Est, Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; Structural graph clustering; Community detection; [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; Graph processing; Outliers detection; Hubs detection; Big Graph Analysis
- Abstract
Graph clustering is one of the key techniques for understanding the structures present in networks. In addition to clusters, the detection of bridges and outliers is a critical task, as it plays an important role in network analysis. Recently, several graph clustering methods have been developed and used in application domains such as biological network analysis, recommendation systems and community detection. Most of these algorithms are based on the structural clustering algorithm. This kind of algorithm relies on structural similarity, and computing it requires parsing all of the graph's edges. The high cost of similarity computation makes the algorithm better suited to small graphs, with no significant support for large-scale networks. In this paper, we propose a novel distributed graph clustering algorithm based on structural graph clustering. The experimental results show the efficiency, in terms of running time, of the proposed algorithm on large networks compared to existing structural graph clustering methods.
- Published
- 2020
8. New voting strategies designed for the classification of nucleic sequences
- Author
-
Elloumi, Mourad and Maddouri, Mondher
- Published
- 2005
9. Encoding of primary structures of biological macromolecules within a data mining perspective
- Author
-
Maddouri, Mondher and Elloumi, Mourad
- Published
- 2004
10. An experimental survey on big data frameworks (Highlight paper)
- Author
-
Inoubli, Wissem, Aridhi, Sabeur, Mezni, Haithem, Maddouri, Mondher, and Nguifo, Engelbert
- Affiliations
Université de Tunis El Manar (UTM); Computational Algorithms for Protein Structures and Interactions (CAPSID), Inria Nancy - Grand Est, Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS); Université de Jendouba (UJ); Taibah University; Laboratoire d'Informatique, de Modélisation et d'optimisation des Systèmes (LIMOS), Université Blaise Pascal - Clermont-Ferrand 2 (UBP)-Université d'Auvergne - Clermont-Ferrand I (UdA)-SIGMA Clermont-Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-CNRS
- Subjects
Spark; Samza; Big data; HDFS; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; Batch/stream processing; Hadoop; Storm; MapReduce; [INFO.INFO-DC] Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]; Flink
- Abstract
Recently, increasingly large amounts of data are generated from a variety of sources. Existing data processing technologies are not suitable to cope with the huge amounts of generated data. Yet, many research works focus on Big Data, a buzzword referring to the processing of massive volumes of (unstructured) data. Recently proposed frameworks for Big Data applications help to store, analyze and process the data. In this paper, we discuss the challenges of Big Data and we survey existing Big Data frameworks. We also present an experimental evaluation and a comparative study of the most popular Big Data frameworks with several representative batch.
- Published
- 2018
11. An Overview of in Silico Methods for the Prediction of Ionizing Radiation Resistance in Bacteria
- Author
-
Zoghlami, Manel, Aridhi, Sabeur, Maddouri, Mondher, Mephu Nguifo, Engelbert, and Tamar Reeve
- Affiliations
Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])-Centre National de la Recherche Scientifique (CNRS); LIMOS, CNRS-Sigma CLERMONT-Université d'Auvergne - Clermont-Ferrand I (UdA)-Université Blaise Pascal - Clermont-Ferrand 2 (UBP); Laboratoire d'Informatique, Programmation, Algorithmique et Heuristique (LIPAH), Faculté des Sciences Mathématiques, Physiques et Naturelles de Tunis (FST), Université de Tunis El Manar (UTM); Computational Algorithms for Protein Structures and Interactions (CAPSID), Inria Nancy - Grand Est, Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-CNRS; Taibah University; University of Jeddah; Centre National de la Recherche Scientifique (CNRS)
- Subjects
bacterial ionizing radiation resistance; phenotype prediction; multiple instance learning; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]; [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; ComputingMilieux_MISCELLANEOUS
- Abstract
Ionizing-radiation-resistant bacteria (IRRB) could be used for bioremediation of radioactive wastes and in the therapeutic industry. Few computational works address the prediction of bacterial ionizing radiation resistance (IRR). In this chapter, we present some works that study the causes of the high resistance of IRRB to ionizing radiation. Then we focus on in silico approaches that use the protein sequences of bacteria to predict whether an unknown bacterium belongs to IRRB or to ionizing-radiation-sensitive bacteria (IRSB). These approaches formulate the problem of predicting bacterial IRR as a multiple instance learning (MIL) problem where bacteria represent the bags and the primary structures of the basal DNA repair proteins of each bacterium represent the instances inside the bags. We also present a formulation of the problem of MIL in sequence data and explain how it could be used to solve the problem of IRR prediction in bacteria. A brief comparison of the presented approaches is provided.
- Published
- 2018
12. ABClass : Une approche d'apprentissage multi-instances pour les séquences (ABClass: A multiple instance learning approach for sequence data)
- Author
-
Zoghlami, Manel, Aridhi, Sabeur, Maddouri, Mondher, Mephu Nguifo, Engelbert, and DOREAU, Bastien
- Affiliations
Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]
- Published
- 2018
13. Protein sequences classification by means of feature extraction with substitution matrices
- Author
-
Saidi Rabie, Maddouri Mondher, and Mephu Nguifo Engelbert
- Subjects
Computer applications to medicine. Medical informatics; R858-859.7; Biology (General); QH301-705.5
- Abstract
Background: This paper deals with the preprocessing of protein sequences for supervised classification. Motif extraction is one way to address that task. It has been widely used to encode biological sequences into feature vectors, so that well-known machine-learning classifiers which require this format can be applied. However, designing a suitable feature space for a set of proteins is not a trivial task. For this purpose, we propose a novel encoding method that uses amino-acid substitution matrices to define similarity between motifs during the extraction step. Results: In order to demonstrate the efficiency of this approach, we compare several encoding methods using some machine learning classifiers. The experimental results showed that our encoding method outperforms the others in terms of classification accuracy and number of generated attributes. We also compared the classifiers in terms of accuracy. Results indicated that SVM generally outperforms the other classifiers with any encoding method. We showed that SVM, coupled with our encoding method, can be an efficient protein classification system. In addition, we studied the effect of varying the substitution matrix on the quality of our method and hence on the classification quality. We noticed that our method enables good classification accuracies with all the substitution matrices and that the variances of the accuracies obtained with the various substitution matrices are slight. However, the number of generated features varies from one substitution matrix to another. Furthermore, the use of already published datasets allowed us to carry out a comparison with several related works. Conclusions: The outcomes of our comparative experiments confirm the efficiency of our encoding method for representing protein sequences in classification tasks.
- Published
- 2010
14. A multiple instance learning approach for sequence data with across bag dependencies
- Author
-
Zoghlami, Manel, Aridhi, Sabeur, Sghaier, Haïtham, Maddouri, Mondher, Mephu Nguifo, Engelbert, and DOREAU, Bastien
- Affiliations
Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]
- Published
- 2016
15. Cost models for distributed pattern mining in the cloud: application to graph patterns
- Author
-
Aridhi, Sabeur, d'Orazio, Laurent, Maddouri, Mondher, and Mephu Nguifo, Engelbert
- Affiliations
University of Trento [Trento]; Université Blaise Pascal - Clermont-Ferrand 2 (UBP); Université de Tunis El Manar (UTM); Taibah University
- Subjects
[INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO] Computer Science [cs]
- Abstract
Recently, distributed pattern mining approaches have become very popular, especially in domains such as bioinformatics, chemoinformatics and social networks. In most cases, distributing the pattern mining process causes a loss of information in the output results. Reducing this loss may affect the performance of the distributed approach and thus the monetary cost when using cloud environments. In this context, cost models are needed to help select the best parameters of the approach in order to achieve better performance, especially in the cloud. In this paper, we address the multi-criteria optimization problem of tuning thresholds related to distributed frequent pattern mining in a cloud computing environment while optimizing the global monetary cost of storing and querying data in the cloud. To achieve this goal, we design cost models for managing and mining graph data with a large-scale pattern mining framework over a cloud architecture. We define four objective functions with respect to the needs of customers. We present an experimental validation of the proposed cost models in the case of distributed subgraph mining in the cloud.
- Published
- 2015
16. Efficiently Mining Recurrent Substructures from Protein Three-Dimensional Structure Graphs.
- Author
-
Saidi, Rabie, Dhifli, Wajdi, Maddouri, Mondher, and Mephu Nguifo, Engelbert
- Published
- 2019
17. Etude de stabilité de méthodes d'extraction de motifs à partir des séquences protéiques (A stability study of motif extraction methods for protein sequences)
- Author
-
Saidi, Rabie, Aridhi, Sabeur, Maddouri, Mondher, Mephu Nguifo, Engelbert, and DOREAU, Bastien
- Affiliations
Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes (LIMOS), Ecole Nationale Supérieure des Mines de St Etienne (ENSM ST-ETIENNE)-Université Clermont Auvergne [2017-2020] (UCA [2017-2020])-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-DB] Computer Science [cs]/Databases [cs.DB]; [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG]
- Published
- 2010
18. Prediction of Ionizing Radiation Resistance in Bacteria Using a Multiple Instance Learning Model.
- Author
-
Aridhi, Sabeur, Sghaier, Haïtham, Zoghlami, Manel, Maddouri, Mondher, and Nguifo, Engelbert Mephu
- Published
- 2016
19. Computational phenotype prediction of ionizing-radiation-resistant bacteria with a multiple-instance learning model.
- Author
-
Aridhi, Sabeur, Maddouri, Mondher, Sghaier, Haitham, and Nguifo, Engelbert Mephu
- Published
- 2013
20. Diversity Analysis on Boosting Nominal Concepts.
- Author
-
Meddouri, Nida, Khoufi, Héla, and Maddouri, Mondher Sadok
- Published
- 2012
21. Adaptive Learning of Nominal Concepts for Supervised Classification.
- Author
-
Meddouri, Nida and Maddouri, Mondher
- Abstract
In recent decades, several machine learning methods based on Formal Concept Analysis have been proposed. The learning process is based on constructing the mathematical structure of the Galois lattice. Two major limits characterize these methods. First, most of them are limited to processing binary data. Second, the exponential complexity of generating a Galois lattice limits their fields of application. In this paper, we consider the Boosting of classifiers, which is an adaptive approach to classification. We propose the Boosting of classifiers based on Nominal Concepts. This method builds only the part of the lattice that includes the best (pertinent) concepts. It is distinguished from other methods based on Formal Concept Analysis by its ability to handle nominal data. The discovered concepts are called Nominal Concepts and are used as classification rules. Comparative studies and experimental results demonstrate the value of this method compared with those in the literature.
- Published
- 2010
22. Comparing graph-based representations of protein for mining purposes.
- Author
-
Saidi, Rabie, Maddouri, Mondher, and Nguifo, Engelbert Mephu
- Published
- 2009
23. Boosting Formal Concepts to Discover Classification Rules.
- Author
-
Meddouri, Nida and Maddouri, Mondher
- Abstract
Supervised classification is a data mining task that consists of building a classifier from a set of examples labeled with their class (learning step) and then predicting the class of new examples with that classifier (classification step). Several approaches to supervised classification have been proposed, such as Induction of Decision Trees and Formal Concept Analysis. The learning of formal concepts is generally based on the mathematical structure of the Galois lattice (or concept lattice). The complexity of generating the Galois lattice limits the application fields of these systems. In this paper, we present several methods of supervised classification based on Formal Concept Analysis. We present methods based on the concept lattice or a sub-lattice. We also present the boosting of classifiers, an emerging classification technique. Finally, we propose the boosting of formal concepts: a new adaptive approach that builds only the part of the lattice that includes the best concepts. These concepts are used as classification rules. Experimental results are given to demonstrate the value of the proposed method.
- Published
- 2009
24. Improving Boosting by Exploiting Former Assumptions.
- Author
-
Bahri, Emna, Nicoloyannis, Nicolas, and Maddouri, Mondher
- Abstract
Reducing the generalization error is one of the principal motivations of research in machine learning. Thus, a great deal of work has been carried out on classifier aggregation methods, which generally use voting techniques to improve on the performance of a single classifier. Among these aggregation methods, Boosting is the most practical thanks to its adaptive update of the distribution of the examples, which exponentially increases the weight of the misclassified examples. However, this method is criticized for overfitting and for its convergence speed, especially in the presence of noise. In this study, we propose a new approach based on modifications to the AdaBoost algorithm. We demonstrate that it is possible to improve the performance of Boosting by exploiting the assumptions generated in former iterations to correct the weights of the examples. An experimental study shows the value of this new approach, called the hybrid approach.
- Published
- 2008
25. On Semantic Properties of Interestingness Measures for Extracting Rules from Data.
- Author
-
Hutchison, David, Kanade, Takeo, Kittler, Josef, Kleinberg, Jon M., Mattern, Friedemann, Mitchell, John C., Naor, Moni, Nierstrasz, Oscar, Rangan, C. Pandu, Steffen, Bernhard, Sudan, Madhu, Terzopoulos, Demetri, Tygar, Doug, Vardi, Moshe Y., Weikum, Gerhard, Beliczynski, Bartlomiej, Dzielinski, Andrzej, Iwanowski, Marcin, Ribeiro, Bernardete, and Maddouri, Mondher
- Abstract
The extraction of IF-THEN rules from data is a promising data mining task involving both Artificial Intelligence and Statistics. One of the difficulties encountered is how to evaluate the relevance of the extracted rules. Many authors use statistical interestingness measures to evaluate the relevance of each rule (taken alone). Recently, a few research works have surveyed the existing interestingness measures, but these studies have some limits. In this paper, we first present an overview of related works studying more than forty interestingness measures. Second, we establish a list of nineteen other interestingness measures not referenced by the related works. Then, we identify twelve semantic properties characterizing the behavior of interestingness measures. Finally, we carry out a theoretical study of sixty-two interestingness measures by outlining their semantic properties. The results of this study can help the users of a data-mining system choose an appropriate measure.
- Published
- 2007
26. Biological Sequences Encoding for Supervised Classification.
- Author
-
Istrail, Sorin, Pevzner, Pavel, Waterman, Michael S., Hochreiter, Sepp, Wagner, Roland, Saidi, Rabie, Maddouri, Mondher, and Nguifo, Engelbert Mephu
- Abstract
The classification of biological sequences is one of the significant challenges in bioinformatics, for both protein and nucleic sequences. The huge volume of these data, their ambiguity and especially the high cost of in vitro analysis in terms of time and money make the use of data mining a necessity rather than a choice. However, data mining techniques, which often process data in relational format, are confronted with the inappropriate format of biological sequences. Hence, a pre-processing step is unavoidable. This work presents biological sequence encoding as a preparation step before classification. We present three existing encoding methods based on motif extraction. We also propose an improvement to one of these methods and carry out a comparative study which takes into account the effect of each method on the classification accuracy, as well as the number of generated attributes and the CPU time.
- Published
- 2007
27. Parallel Learning and Classification for Rules based on Formal Concepts.
- Author
-
Meddouri, Nida, Khoufi, Hela, and Maddouri, Mondher
- Subjects
SUPERVISED learning; DATA mining; CLASSIFICATION rule mining; MACHINE learning; COMPUTATIONAL complexity
- Abstract
Supervised classification is a data mining task that consists of building a classifier from a set of instances labeled with their class (learning step) and then predicting the class of new instances with that classifier (classification step). Several approaches to supervised classification have been proposed, such as Induction of Decision Trees and Formal Concept Analysis. The learning of formal concepts is generally based on the mathematical structure of the Galois lattice (or concept lattice). The complexity of Galois lattice generation limits the application fields of these systems. In this paper, we discuss supervised classification based on Formal Concept Analysis and we present methods based on the concept lattice or a sub-lattice. We propose a new approach that builds only a part of the lattice, including the best concepts (i.e. pertinent concepts). These concepts are used as classifiers in a parallel combination using a voting rule. The proposed method is based on Dagging of the Nominal Classifier. Experimental results are given to demonstrate the value of the proposed method.
- Published
- 2014
28. Towards a machine learning approach based on incremental concept formation.
- Author
-
Maddouri, Mondher
- Subjects
MACHINE theory; ARTIFICIAL intelligence; ALGORITHMS; DATA flow computing; CANCER diagnosis
- Abstract
In many real-world learning problems the data flows continuously, and learning algorithms should be able to respond to this circumstance: the induced concept description should change gradually over time. In this paper, we outline some existing incremental learners based on the theory of Formal Concept Analysis (FCA). Then, we introduce a new learning approach that improves incremental concept formation. This approach has the advantage of handling data addition, data deletion, data update, attribute addition and attribute deletion. Finally, we apply the proposed approach to the problem of cancer diagnosis. We measure the effect of incrementality on the quality of the discovered rules using cross-validation.
- Published
- 2004
Discovery Service for Jio Institute Digital Library