Descriptor: "association rules" / Topic: association rule learning - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"association rules"' showing total 278 results

1. Apriori Association Rules Learning

Author: Dinov, Ivo D.
Published: 2018
Full Text: View/download PDF

2. Research on correlation factor analysis and prediction method of overhead transmission line defect state based on association rule mining and RBF-SVM

Author: Zuming Yan, Xinghua Wang, Xiangang Peng, Yongbin Zeng, Xiaoye Liu, and Haoliang Yuan
Subjects: Transmission lines, Support vector machine, Association rule learning, Computer science, 020209 energy, Defect state, Decision tree, Classification prediction, 02 engineering and technology, Association rules, computer.software_genre, General Energy, Electric power transmission, 020401 chemical engineering, Transmission line, 0202 electrical engineering, electronic engineering, information engineering, Feature (machine learning), Overhead (computing), lcsh:Electrical engineering. Electronics. Nuclear engineering, State (computer science), Data mining, 0204 chemical engineering, lcsh:TK1-9971, computer
Abstract: The effective assessment and prediction of the defect state of transmission lines can provide important technical support for the maintenance management of transmission lines. Because analysis and prediction of transmission line defect state often rely on a single operation parameter and ignore the influence of internal and external factors such as meteorological and operating conditions, this paper proposes a method of correlation factor analysis and prediction for transmission line defect state based on association rule mining and RBF-SVM. First, according to the defect state assessment of transmission lines and based on the existing data, a characteristic library of the defect state and correlation factors is constructed by considering various relevant influencing factors. Then the FP-Growth algorithm is introduced into the association rule mining to find the internal and external factors that have a strong association with defects; the association rules can be used as the input features of the prediction model, so as to avoid the influence of weakly associated factors on the accuracy of defect state prediction. Finally, RBF-SVM is used to predict the defect state, and it achieves better prediction accuracy than three commonly used methods: the linear SVM, the ANN and the decision tree. The proposed approach is illustrated by predicting the defect state of an overhead transmission line in a certain area. The results verify the effectiveness of the method and provide a reference for the maintenance of the transmission line.
Published: 2021
Full Text: View/download PDF

3. A novel artificial intelligent approach: comparison of machine learning tools and algorithms based on optimization DEA Malmquist productivity index for eco-efficiency evaluation

Author: Kamyar Kabirifar, Elham Shadkam, Mirpouya Mirmozaffari, Tayyebeh Asgari Gashteroodkhani, Seyyed Mohammad Khalili, Reza Yazdani, University of Texas at Arlington [Arlington], Khayyam University, Ferdowsi University Mashhad, University of New South Wales [Sydney] (UNSW), Islamic Azad University, and University of Guilan more...
Subjects: Artificial Intelligent, Optimization, [SPI.OTHER]Engineering Sciences [physics]/Other, Association rule learning, Eco-efficiency, Process (engineering), Computer science, 020209 energy, Strategy and Management, 02 engineering and technology, 010501 environmental sciences, Association rules, Machine learning, computer.software_genre, 7. Clean energy, 01 natural sciences, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], 0202 electrical engineering, electronic engineering, information engineering, Data envelopment analysis, Additive model, Productivity, 0105 earth and related environmental sciences, business.industry, [INFO.INFO-RO]Computer Science [cs]/Operations Research [cs.RO], Classification, Slack variable, Statistical classification, General Energy, Data Envelopment Analysis, 13. Climate action, Two-stage additive models, Artificial intelligence, business, computer, Algorithm
Abstract: Purpose: Cement, as one of the major components of construction activities, releases a tremendous amount of carbon dioxide (CO2) into the atmosphere, resulting in adverse environmental impacts and high energy consumption. Increasing demand for CO2 consumption has urged construction companies and decision-makers to consider ecological efficiency affected by CO2 consumption. Therefore, this paper aims to develop a method capable of analyzing and assessing the eco-efficiency determining factor in Iran’s 22 local cement companies over 2015–2019. Design/methodology/approach: This research uses two well-known artificial intelligence approaches, namely, optimization data envelopment analysis (DEA) and machine learning algorithms at the first and second steps, respectively, to fulfill the research aim. Meanwhile, to find the superior model, the CCR model, BCC model and additive DEA models are used to measure the efficiency of decision processes. A proportional decrease or increase of inputs/outputs is the main concern in measuring efficiency with radial models, which neglects slacks and hence is a critical limitation of those models. Thus, the additive model, a well-known non-proportional and non-radial DEA model, is used to solve the problem by considering desirable and undesirable outputs. Additive models measure efficiency via slack variables. Considering both input orientation and output orientation is one of the main advantages of the additive model. Findings: After applying the proposed model, the Malmquist productivity index is computed to evaluate the productivity of companies over 2015–2019. Although DEA is an appreciated method for evaluation, it fails to extract unknown information. Thus, machine learning algorithms play an important role in this step. Association rules are used to extract hidden rules and to introduce the three strongest rules. Finally, three data mining classification algorithms in three different tools have been applied to introduce the superior algorithm and tool. A new model converting the two-stage process to a single stage is proposed to obtain the eco-efficiency of the whole system. This model is proposed to fix the efficiency of a two-stage process and prevent the dependency on various weights. Converting undesirable outputs and desirable inputs to final desirable inputs in a single-stage model minimizes inputs, while turning desirable outputs into final desirable outputs in the single-stage model maximizes outputs, which has a positive effect on the efficiency of the whole process. Originality/value: The performance of the proposed approach provides a chance to recognize patterns of the whole system by combining DEA and data mining techniques during the selected period (five years from 2015 to 2019). Meanwhile, the cement industry is one of the foremost manufacturers of naturally harmful material using an undesirable by-product; specific stress is given to pollution control investment or undesirable output while evaluating energy use efficiency. The main focus of the study is to respond to five preliminary questions.
Published: 2021
Full Text: View/download PDF

4. Sustainable Development of College and University Education by use of Data Mining Methods

Author: Liwen Wang and Soo-Jin Chung
Subjects: Scheme (programming language), Education reform, Apriori algorithm, sustainable development, Warning system, Association rule learning, lcsh:T58.5-58.64, apriori algorithm, lcsh:Information technology, Teaching method, Mathematical statistics, General Engineering, data mining, Education, association rules, students-oriented, Mathematics education, ComputingMilieux_COMPUTERSANDEDUCATION, lcsh:L, computer, Decision tree model, computer.programming_language, lcsh:Education
Abstract: To improve the education efficiency of students, a student-centered education plan is explored. First, the Apriori algorithm of association rules is used to mine the potential related patterns in the score data of college students and establish a reasonable teaching method. Second, aided by the decision tree model, the factors affecting students' academic performance and the potential relationships between different courses are studied. Finally, the Apriori algorithm of association rules combined with the decision tree model is used to generate an early warning mechanism for students' achievement, and the course performance of college students is empirically analyzed. The results show that: C language has two sides of dependence on many subjects; higher mathematics → linear algebra → mathematical statistics → computer composition principle → computer network. The teaching scheme of C language → C++ → Java conforms better to the learning mechanism of college students. Empirical analysis shows that the early warning mechanism based on the Apriori association rule algorithm and the decision tree model can effectively analyze students' courses and give students' achievement. It is found that the proposed method can provide a theoretical basis for students, teachers, and university administrators to carry out education reform and education management decision-making, improve students' performance and education quality, and realize the "student-oriented" education concept, so it can be applied to actual education management.
Published: 2021

5. Research of insomnia on traditional Chinese medicine diagnosis and treatment based on machine learning

Author: Tao Liu, Zechen Li, Dongdong Yang, Shan Liang, Yu Fang, Yuqi Tang, and Shanshan Gao
Subjects: 0301 basic medicine, Insomnia, Association rule learning, Computer science, Sample (statistics), Traditional Chinese medicine, Machine learning, computer.software_genre, Association rules, 03 medical and health sciences, 0302 clinical medicine, Cluster analysis, Diagnosis, Acupuncture, medicine, Medical prescription, Pharmacology, business.industry, Research, lcsh:Other systems of medicine, lcsh:RZ201-999, Hierarchical clustering, Random forest, TCM, 030104 developmental biology, Complementary and alternative medicine, Artificial intelligence, medicine.symptom, business, computer, 030217 neurology & neurosurgery
Abstract: Background: Insomnia, as one of the dominant diseases of traditional Chinese medicine (TCM), has been extensively studied in recent years. To explore novel approaches to research on TCM diagnosis and treatment, this paper presents a strategy for the research of insomnia based on machine learning. Methods: First, 654 insomnia cases were collected from an experienced doctor of TCM as sample data. Second, in light of the characteristics of TCM diagnosis and treatment, the contents of the research samples were divided into four parts: the basic information, the four diagnostic methods, the treatment based on syndrome differentiation and the main prescription. These four parts were then analyzed by three analysis methods: frequency analysis, association rules and hierarchical cluster analysis. Finally, a comprehensive study of all four parts was conducted by random forest. Results: Research on the above four parts revealed some essential connections. In addition, based on the model established by the random forest, the accuracy of predicting the main prescription from the combinations of the four diagnostic methods and the treatment based on syndrome differentiation was 0.85. Furthermore, with features extracted by applying the random forest, the syndrome differentiation of five zang-organs was shown to be the most significant parameter of TCM diagnosis and treatment. Conclusions: The results indicate that machine learning methods are worth adopting to study the dominant diseases of TCM and to explore the crucial rules of diagnosis and treatment.
Published: 2021

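6. Web Kullanıcılarının Bilgi Erişim ve Ziyaret Desenlerinin Web Madenciliği ile Keşfi: Kırklareli Üniversitesi Örneği [Discovering the Information Retrieval and Visit Patterns of Web Users with Web Mining: The Case of Kırklareli University]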

Author: Çiğdem Selçukcan Erol and Veli Özcan Budak
Subjects: General Computer Science, Association rule learning, web usage mining, Computer science, 020209 energy, Communication. Mass media, QA75.5-76.95, 02 engineering and technology, P87-96, association rules, World Wide Web, apriori, web kullanım madenciliği, Web mining, bilgi erişim, birliktelik kuralları, Electronic computers. Computer science, 0202 electrical engineering, electronic engineering, information engineering, T1-995, 020201 artificial intelligence & image processing, information retrieval, Technology (General)
Abstract: Websites are an interaction tool through which first contact is made with the target audience, whether institutional or individual. During periods of heavy information retrieval and visit traffic, this tool holds significant potential for detecting different patterns in user behavior. These patterns can play a critical role in clarifying user needs and in enabling site developers to make updates in line with those needs. The aim of this study is to determine the change in the information retrieval needs of users of the Kırklareli University websites during the period when the worldwide Covid-19 pandemic intensified in Turkey. To this end, users' information retrieval and visit behaviors were examined with the apriori algorithm, both separately and jointly, in order to reveal the relationships between them. In terms of information retrieval, the results showed that users tried to meet their information needs with various search terms related to “thesis writing”. This result indicates that graduate students in particular were active during the period in question. In terms of visit behavior, pages themed “distance education”, “coronavirus” and “holiday” were predominantly visited. As for the visit behaviors exhibited after information retrieval behaviors, co-occurrences of visits themed “thesis writing”, “holiday” and “postponement of education” stood out. The behavior patterns uncovered by the study, and suggestions on how these patterns can be put to use, are explained in detail within the study.
Published: 2020
Full Text: View/download PDF

7. Machine Learning-Based HIV Risk Estimation Using Incidence Rate Ratios

Author: Oliver Haas, Andreas Maier, and Eva Rothgang
Subjects: Medicine (General), bias, QH471-489, Association rule learning, Population, Rate ratio, Machine learning, computer.software_genre, association rules, R5-920, Acquired immunodeficiency syndrome (AIDS), risk estimation, Medicine, Medical diagnosis, education, Estimation, education.field_of_study, Receiver operating characteristic, business.industry, Reproduction, HIV, General Medicine, Emergency department, medicine.disease, machine learning, clinical data, Artificial intelligence, ddc:620, business, computer
Abstract: HIV/AIDS is an ongoing global pandemic, with an estimated 39 million infected worldwide. Early detection is anticipated to help improve outcomes and prevent further infections. Point-of-care diagnostics make HIV/AIDS diagnoses available both earlier and to a broader population. Widespread and automated HIV risk estimation can offer objective guidance. This supports providers in making an informed decision when considering patients with high HIV risk for HIV testing or pre-exposure prophylaxis (PrEP). We propose a novel machine learning method that allows providers to use the data from a patient's previous stays at the clinic to estimate their HIV risk. All variables available in the clinical data are considered, making the set of variables objective and independent of expert opinions. The proposed method builds on association rules that are derived from the data. The incidence rate ratio (IRR) is determined for each rule. Given a new patient, the average IRR of all applicable rules is used to estimate their HIV risk. The method was tested and validated on the publicly available clinical database MIMIC-IV, which consists of around 525,000 hospital stays that included a stay at the intensive care unit or emergency department. We evaluated the method using the area under the receiver operating characteristic curve (AUC). The best performance with an AUC of 0.88 was achieved with a model consisting of 78 rules. A threshold value of 1.0, i.e. an IRR that denotes no association, leads to a sensitivity of 98% and a specificity of 51%. The rules were grouped into social factors (e.g. homelessness, violence), drug abuse, psychological illnesses (e.g. depression, PTSD), previously known associations (e.g. pulmonary, neurological diseases), and new associations (e.g. diabetes, insulin uptake). In conclusion, we propose a novel HIV risk estimation method that builds on existing clinical data. It incorporates a wide range of variables, leading to a model that is independent of expert opinions. It supports providers in making informed decisions in the point-of-care diagnostics process by estimating a patient's HIV risk.
Published: 2021
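
Note: the scoring step described in the abstract above (average the IRR of every rule that applies to a patient, then compare against a threshold of 1.0) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the rule antecedents, the IRR values, the function names and the fallback of 1.0 when no rule applies are all hypothetical placeholders.

```python
from statistics import mean

# Hypothetical rules: (set of required features, made-up IRR value).
# Neither the feature names nor the numbers come from the paper.
RULES = [
    (frozenset({"factor_a"}), 4.0),
    (frozenset({"factor_b", "factor_c"}), 6.0),
    (frozenset({"factor_d"}), 0.5),
]


def estimate_risk(features, rules=RULES, default=1.0):
    """Average the IRR of all rules whose antecedent is contained in `features`.

    Returning `default` (1.0, i.e. no association) when no rule applies is an
    assumption made for this sketch.
    """
    features = frozenset(features)
    applicable = [irr for antecedent, irr in rules if antecedent <= features]
    return mean(applicable) if applicable else default


def exceeds_threshold(features, threshold=1.0):
    """True when the averaged IRR is above the chosen threshold."""
    return estimate_risk(features) > threshold


score = estimate_risk({"factor_a", "factor_d"})  # averages 4.0 and 0.5
```

A rule with several antecedent items only contributes when all of them are present, which is why `factor_b` alone would not trigger the second rule.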

8. An Association Rules-Based Method for Outliers Cleaning of Measurement Data in the Distribution Network

Author: He Mi, Cheng Guo, Xian Meng, Xin He, Ruimin Duan, Hua Kuang, and Risheng Qin
Subjects: DBSCAN, Economics and Econometrics, Association rule learning, Computer science, Energy Engineering and Power Technology, computer.software_genre, General Works, association rules, Set (abstract data type), outliers cleaning, outliers repairing, distribution network, Reliability (statistics), Mahalanobis distance, measurement data, Renewable Energy, Sustainability and the Environment, InformationSystems_DATABASEMANAGEMENT, ComputingMethodologies_PATTERNRECOGNITION, Fuel Technology, Data quality, Outlier, outliers detection, Noise (video), Data mining, computer
Abstract: For any power system, the reliability of measurement data is essential in operation, management and planning. However, measurement data are inevitably prone to outliers, which may impact the results of data-based applications. In order to improve the data quality, an outlier cleaning method for measurement data in the distribution network is studied in this paper. The method is based on a set of association rules (AR) that are automatically generated from historical measurement data. First, the association rules are mined in conjunction with density-based spatial clustering of applications with noise (DBSCAN), k-means and the Apriori technique to detect outliers. Then, for the outlier repairing process after outlier detection, the proposed method uses a distance-based model to calculate the repairing cost of outliers, which describes the similarity between outlier and normal data. In addition, the Mahalanobis distance is employed in the repairing cost function to reduce the errors, which enables precise outlier cleaning of measurement data in the distribution network. The test results for the simulated datasets with artificial errors verify the superiority of the proposed outlier cleaning method for outlier detection and repairing.
Published: 2021
Full Text: View/download PDF

9. Multi-Objective Optimization for High-Dimensional Maximal Frequent Itemset Mining

Author: Xuan Ma, Hisakazu Ogura, Yalong Zhang, Dongfen Ye, and Wei Yu
Subjects: Technology, Association rule learning, Computer science, QH301-705.5, QC1-999, Big data, frequent itemset mining, High dimensional, Space (commercial competition), computer.software_genre, Multi-objective optimization, Set (abstract data type), association rules, big data, General Materials Science, Biology (General), Instrumentation, QD1-999, Fluid Flow and Transfer Processes, maximal frequent itemset, business.industry, Process Chemistry and Technology, Physics, General Engineering, InformationSystems_DATABASEMANAGEMENT, Engineering (General). Civil engineering (General), Computer Science Applications, Running time, Chemistry, multi-objective optimization, A priori and a posteriori, Data mining, TA1-2040, business, computer
Abstract: The solution space of a frequent itemset generally presents exponential explosive growth because of the high-dimensional attributes of big data. However, the premise of big data association rule analysis is to mine the frequent itemsets in high-dimensional transaction sets. Traditional and classical algorithms such as the Apriori and FP-Growth algorithms, as well as their derivative algorithms, are unacceptable in practical big data analysis in an explosive solution space because of their huge consumption of storage space and running time. A multi-objective optimization algorithm was proposed to mine the frequent itemsets of high-dimensional data. First, all frequent 2-itemsets were generated by scanning transaction sets, based on which new items were added in as the objects of population evolution. The algorithm aims to search for the maximal frequent itemsets to gather more non-void subsets, because every non-void subset of a frequent itemset is itself frequent. During the operation of the algorithm, lethal gene fragments in individuals were recorded and eliminated so that individuals may resurge. Finally, the set of Pareto optimal solutions of the frequent itemset was obtained. All non-void subsets of these solutions were frequent itemsets, and all supersets were non-frequent itemsets. The practicability and validity of the proposed algorithm in big data were proven by experiments.
Published: 2021
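
Note: two ideas from the abstract above, generating the frequent 2-itemsets in one scan and the property that every non-void subset of a frequent itemset is itself frequent, can be illustrated with a small sketch. This is not the paper's multi-objective algorithm; the toy transactions, the function names and the minimum support count are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_count):
    """Count every 2-itemset in one scan and keep those meeting min_count."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}


def nonempty_subsets(itemset):
    """Yield every non-void subset of `itemset` as a frozenset."""
    items = sorted(itemset)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def support_count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


# Toy data (hypothetical): five transactions over items a..d.
BASKETS = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"b", "d"}, {"a", "c", "d"}]
MIN_COUNT = 2

pairs = frequent_pairs(BASKETS, MIN_COUNT)

# {a, b, c} occurs twice, so it is frequent; check all of its subsets too.
closure_holds = all(
    support_count(BASKETS, s) >= MIN_COUNT for s in nonempty_subsets({"a", "b", "c"})
)
```

Because of this closure property, reporting only the maximal frequent itemsets is enough to recover all frequent itemsets by enumerating subsets.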

10. Big Data-Driven Abnormal Behavior Detection in Healthcare Based on Association Rules

Author: Hui Yang, Runtong Zhang, Donghua Chen, Jie He, and Shengyao Zhou
Subjects: General Computer Science, Association rule learning, Computer science, Big data, 02 engineering and technology, Disease cluster, abnormal behavior, association rules, 03 medical and health sciences, Order (exchange), Health care, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, health care economics and organizations, 030304 developmental biology, 0303 health sciences, business.industry, General Engineering, Medical insurance, Risk analysis (engineering), Benchmark (computing), 020201 artificial intelligence & image processing, healthcare insurance, lcsh:Electrical engineering. Electronics. Nuclear engineering, Abnormality, business, lcsh:TK1-9971
Abstract: Healthcare insurance frauds are causing millions of dollars of public healthcare fund losses around the world in various ways, which makes it very important to strengthen the management of medical insurance in order to guarantee the steady operation of medical insurance funds. Healthcare fraud detection methods can reduce the losses of healthcare insurance funds and improve medical quality. Existing fraud detection studies mostly focus on finding normal behavior patterns and treat those violating normal behavior patterns as fraudsters. However, fraudsters can often disguise themselves with some normal behaviors, such as some consistent behaviors when they seek medical treatments. To address these issues, we combined a MapReduce distributed computing model and association rule mining to propose a medical cluster behavior detection algorithm based on frequent pattern mining. It can detect certain consistent behaviors of patients in medical treatment activities. By analyzing 1.5 million medical claim records, we have verified the effectiveness of the method. Experiments show that this method has better performance than several benchmark methods.
Published: 2020

11. Cause Analysis of Traffic Accidents on Urban Roads Based on an Improved Association Rule Mining Algorithm

Author: Qiuru Cai
Subjects: Apriori algorithm, General Computer Science, Association rule learning, Lift (data mining), Computer science, Control (management), General Engineering, causes of traffic accidents, data mining, association rules, Transport engineering, Multiple factors, Cause analysis, General Materials Science, lcsh:Electrical engineering. Electronics. Nuclear engineering, Dimension (data warehouse), Urban roads, lcsh:TK1-9971
Abstract: Traffic accidents on urban roads are the result of joint actions between multiple factors, namely, human, vehicle, road and environment. To identify the key causes of such accidents, it is necessary to mine the association rules between relevant risk factors from the statistics on these accidents. Considering the multiple layers and dimensions of accident data, this paper improves the Apriori algorithm to mine the association rules between risk factors, and probes into the causes of traffic accidents on urban roads. According to the layer and dimension of specific attributes, parameters such as support, confidence and lift were adjusted to find the qualified association rules between risk factors. The results were further screened to obtain a series of meaningful association rules. The research results enable the traffic department to formulate pertinent accident control measures and promote traffic safety on urban roads.
Published: 2020
Full Text: View/download PDF
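
Note: the three rule-quality parameters named in the abstract above (support, confidence and lift) follow standard definitions, sketched below for a single rule. This is not the paper's improved Apriori algorithm; the toy accident records, the factor names and the function names are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions containing every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    s_ac = support(transactions, a | c)
    confidence = s_ac / s_a if s_a else 0.0   # P(consequent | antecedent)
    lift = confidence / s_c if s_c else 0.0   # > 1 suggests positive association
    return {"support": s_ac, "confidence": confidence, "lift": lift}


# Toy accident records (hypothetical risk factors, not data from the paper).
ACCIDENTS = [
    frozenset({"rain", "night", "speeding"}),
    frozenset({"rain", "speeding"}),
    frozenset({"night", "fatigue"}),
    frozenset({"rain", "night", "speeding"}),
    frozenset({"clear", "fatigue"}),
]

metrics = rule_metrics(ACCIDENTS, {"rain"}, {"speeding"})
```

In this toy data the rule rain → speeding holds in 3 of 5 records, every record with rain also has speeding, and the lift is above 1, so the two factors co-occur more often than independence would predict.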

12. Exploring the Correlation Between Attention and Cognitive Load Through Association Rule Mining by Using a Brainwave Sensing Headband

Author: Shu-Chen Cheng, Yu-Ping Cheng, Yueh-Min Huang, and You-Yi Chen
Subjects: General Computer Science, Association rule learning, Internet of Things, 0206 medical engineering, 02 engineering and technology, Electroencephalography, association rules, Correlation, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, medicine, General Materials Science, Baseline (configuration management), Wearable technology, medicine.diagnostic_test, business.industry, General Engineering, Cognition, data mining, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, Psychology, lcsh:TK1-9971, electroencephalography, 020602 bioinformatics, Cognitive load, Cognitive psychology
Abstract: In recent years, Internet of Things (IoT) technology has brought many applications and developments for wearable devices, and the use of non-invasive electroencephalography (EEG) instruments to measure attention has been a topic of discussion. However, the correlation between attention and cognitive load has rarely been analyzed by data mining. For this reason, this study used head-mounted non-invasive EEG instruments based on IoT technology to collect attention values related to two courses and extracurricular activities and used a cognitive load questionnaire to investigate the cognitive loads of subjects. Correlation analysis was carried out through data mining technology to find the correlation between attention and cognitive load. In addition, six short-term experiments and relaxation experiments were designed to measure the subjects' maximum attention and minimum attention values, so as to propose a strategy for setting the attention baseline. According to the results of the various experiments, subjects suffering from overload showed a state of inattention during the whole activity while subjects suffering a high load showed low sustained attention; only subjects with a medium load showed high sustained attention. Subjects with a low load showed inattention for nearly the entire activity. In this study, a strategy for setting an attention baseline was proposed to normalize the attention values from different EEG instruments. The correlation between attention value and cognitive load is analyzed using association rule mining technology so that the change of cognitive load could be effectively estimated by measuring the attention value instead of using a questionnaire in the future.
Published: 2020
Full Text: View/download PDF

13. Power System Fault Classification and Prediction Based on a Three-Layer Data Mining Structure

Author: Yunliang Wang, Xiaodong Wang, Yanjuan Wu, and Yannan Guo
Subjects: General Computer Science, Association rule learning, Computer science, 020209 energy, 02 engineering and technology, Association rules, Fault (power engineering), computer.software_genre, Data modeling, Electric power system, Local optimum, Linear regression, power system fault, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Cluster analysis, K-means, stochastic gradient descent algorithm, Training set, General Engineering, Regression analysis, data mining, TK1-9971, machine learning, Stochastic gradient descent, 020201 artificial intelligence & image processing, Electrical engineering. Electronics. Nuclear engineering, Data mining, Fault model, computer
Abstract: In traditional fault diagnosis methods in power systems, it is difficult to accurately classify and predict the types of faults. With the emergence of big data technology, the fault classification and prediction methods based on big data analysis and processing have been applied in power systems. To make the classification and prediction of the fault types more accurate, this paper proposes a hybrid data mining method for power system fault classification and prediction based on clustering, association rules and stochastic gradient descent. This method uses a three-layer data mining model. The first layer uses the K-means clustering algorithm to preprocess the original fault data source, and it proposes to use self-encoding to simplify the data form. The second layer effectively eliminates the data that have little impact on the prediction results by using association rules, and the highly correlated data are mined to become the regression training data. The third layer first uses the cross-validation method to obtain the optimal parameters of each fault model, and then it uses stochastic gradient descent for data regression training to obtain a classification and prediction model for each fault type. Finally, a verification example shows that compared with a single data mining algorithm model, the proposed method is more comparative in terms of the data mining, and the established power system fault classification and prediction model has global optimality and higher prediction accuracy, which has a certain feasibility for real-time online power system fault classification and prediction. This method reduces the disturbances from low-impact or irrelevant data by mining the fault data three times, and it uses cross-validation to optimize the multiple regression parameters of the regression model to solve the problems of low accuracy, large errors and easily falling into a local optimum in fault classification and prediction.
Published: 2020
Full Text: View/download PDF

14. High-Frequency Path Mining-Based Reward and Punishment Mechanism for Multi-Colony Ant Colony Optimization

Author: Han Pan, Xiaoming You, and Sheng Liu
Subjects: 0209 industrial biotechnology, Mathematical optimization, General Computer Science, Association rule learning, Computer science, 02 engineering and technology, minimum spanning tree, Minimum spanning tree, ComputingMethodologies_ARTIFICIALINTELLIGENCE, association rules, 020901 industrial engineering & automation, Local optimum, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Motion planning, Cluster analysis, path planning, gaussian filter, Reward and punishment mechanism, Lift (data mining), Ant colony optimization algorithms, General Engineering, Pheromone, 020201 artificial intelligence & image processing, lcsh:Electrical engineering. Electronics. Nuclear engineering, lcsh:TK1-9971
Abstract: To address the tendency of the traditional ant colony algorithm to fall into local optima and converge slowly, this paper proposes a High-frequency path mining-based Reward and Punishment mechanism for multi-colony Ant Colony Optimization (HRPACO). Firstly, the pheromone concentration on paths with an effective strong association is rewarded adaptively according to the lift of association rules to accelerate convergence. Secondly, the pheromone concentration on the path of the minimum spanning tree is punished adaptively according to the support of association rules to improve the diversity of the colony. The interaction of the reward and punishment mechanisms can effectively balance diversity and convergence. Finally, a self-evolutionary mechanism based on a Gaussian filter is proposed to adaptively adjust the pheromone concentration by dynamic smoothing of the pheromone matrix, so as to help the colony jump out of local optima. The TSP is used to verify the performance of the algorithm. The simulation results show that the proposed algorithm can effectively accelerate convergence and improve the accuracy of the solution, especially for large-scale problems. Meanwhile, path planning is used to verify the feasibility of the proposed algorithm. The simulation results show that the algorithm can find an effective and better path even in environments with complex obstacles.
Published: 2020
Full Text: View/download PDF
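
The abstract above rewards paths by the lift of association rules and punishes them by their support. As a reminder of what those measures are, here is a minimal sketch of the standard definitions. The toy data, in which each transaction is the set of edges used by one tour, and all names are assumptions for illustration and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions: List[Transaction],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions: List[Transaction],
         antecedent: Iterable[str],
         consequent: Iterable[str]) -> float:
    """confidence(antecedent -> consequent) / support(consequent)."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Hypothetical toy data: each transaction is the set of edges in one tour.
tours = [
    frozenset({"e1", "e2"}),
    frozenset({"e1", "e2", "e3"}),
    frozenset({"e1", "e3"}),
    frozenset({"e2", "e3"}),
]

print(support(tours, {"e1", "e2"}))
print(confidence(tours, {"e1"}, {"e2"}))
print(lift(tours, {"e1"}, {"e2"}))
```

A lift above 1 means the two edges co-occur more often than independence would predict; in this toy data the lift of e1 -> e2 is below 1.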

15. Discovery of Frequent Patterns of Episodes Within a Time Window for Alarm Management Systems

Author: Adel Hidri, Minyar Sassi Hidri, and Ahmed Selmi
Subjects: General Computer Science, Association rule learning, Computer science, 020209 energy, media_common.quotation_subject, alarm management, Sequential pattern mining, 02 engineering and technology, Machine learning, computer.software_genre, Field (computer science), Adaptability, association rules, 020401 chemical engineering, Alarm management, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, 0204 chemical engineering, media_common, business.industry, Scale (chemistry), General Engineering, data mining, artificial intelligence, Marketing strategy, Analytics, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, Line (text file), business, computer, lcsh:TK1-9971
Abstract: The field of sequential pattern mining is expanding through a growing body of research and has many applications, such as language processing, alarm management and, on a broader scale, event management. It was first used to process item baskets in order to learn patterns and direct marketing strategy, but several works have since generalized it to telecommunication alarm management. Our work follows this line: it locates and identifies patterns in order to make predictive statements about certain patterns. It centers on providing a way to break sequences into episodes and assign them confidence and support values, and more precisely on the discovery of frequent patterns of episodes within a time window. Experimental results have shown the effectiveness of our sequential pattern mining approach and its adaptability to alarm management and analytics.
Published: 2020
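
The abstract above assigns support and confidence values to episodes found within a time window. One common way to do this is a sliding-window count, sketched below. This is a generic illustration and not necessarily the authors' procedure; the integer timestamps, the window convention and every name here are assumptions.

```python
from typing import Iterator, List, Sequence, Tuple

Event = Tuple[int, str]  # (integer timestamp, alarm type)


def windows(events: List[Event], width: int) -> Iterator[List[str]]:
    """Yield the alarm types in every window [t, t + width) that overlaps the log."""
    first, last = events[0][0], events[-1][0]
    for t in range(first - width + 1, last + 1):
        yield [alarm for ts, alarm in events if t <= ts < t + width]


def contains_serial(window: Sequence[str], episode: Sequence[str]) -> bool:
    """True if the alarms of `episode` occur in `window` in the given order."""
    it = iter(window)
    return all(any(alarm == wanted for alarm in it) for wanted in episode)


def episode_frequency(events: List[Event], episode: Sequence[str], width: int) -> float:
    """Fraction of sliding windows that contain the episode (its 'support')."""
    all_windows = list(windows(events, width))
    return sum(contains_serial(w, episode) for w in all_windows) / len(all_windows)


# Hypothetical alarm log, sorted by timestamp.
log = [(1, "A"), (2, "B"), (4, "A"), (5, "B")]

freq_ab = episode_frequency(log, ["A", "B"], width=2)
freq_a = episode_frequency(log, ["A"], width=2)
conf_a_then_b = freq_ab / freq_a  # confidence of the rule "A is followed by B"
print(freq_ab, freq_a, conf_a_then_b)
```

With a window of width 2, "A then B" appears in 2 of the 6 windows and "A" in 4 of them, so the rule has confidence 0.5 on this toy log.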

16. A Combined Approach for Customer Profiling in Video on Demand Services Using Clustering and Association Rule Mining

Author: Serhat Peker, Cigdem Turhan, and Sinem Guney
Subjects: Apriori algorithm, General Computer Science, Association rule learning, business.industry, Computer science, General Engineering, Customer segmentation, IPTV, data mining, Service provider, computer.software_genre, Marketing strategy, association rules, VoD services, Market segmentation, Profiling (information science), General Materials Science, RFM model, Data mining, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, Cluster analysis, computer, lcsh:TK1-9971, clustering
Abstract: The purpose of this paper is to propose a combined data mining approach for analyzing and profiling customers in video on demand (VoD) services. The proposed approach integrates clustering and association rule mining. For customer segmentation, the LRFMP model is employed alongside the k-means and Apriori algorithms to generate association rules between the identified customer groups and content genres. The applicability of the proposed approach is demonstrated on real-world data obtained from an Internet protocol television (IPTV) operator. In this way, four main customer groups are identified: “high consuming-valuable subscribers”, “less consuming subscribers”, “less consuming-loyal subscribers” and “disloyal subscribers”. For each group of customers, a different marketing strategy or action is proposed, mainly campaigns, special-day promotions, discounted materials, offering favorite content, etc. Further, the genres preferred by these customer segments are extracted using the Apriori algorithm. The results obtained from this case study also show that the proposed approach provides an efficient tool to form different customer segments with specific content rental characteristics, and to generate useful association rules for these distinct groups. The proposed combined approach would help IPTV service providers implement effective CRM and customer-based marketing strategies.
Published: 2020
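
The abstract above uses the Apriori algorithm to relate customer segments to content genres. The level-wise idea behind Apriori can be sketched as follows. This is a minimal textbook-style version written for illustration, not the authors' implementation; the item labels such as "seg=loyal" are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List

Itemset = FrozenSet[str]


def apriori(transactions: List[Itemset], min_support: float) -> Dict[Itemset, float]:
    """Return every itemset whose support is at least `min_support`."""
    n = len(transactions)
    # Level 1 candidates: every single item that appears anywhere.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent: Dict[Itemset, float] = {}
    k = 1
    while current:
        # Count each candidate and keep those that meet the threshold.
        level = {}
        for candidate in current:
            count = sum(1 for t in transactions if candidate <= t)
            if count / n >= min_support:
                level[candidate] = count / n
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical data: one transaction per rental, tagged with segment and genres.
rentals = [
    frozenset({"seg=loyal", "genre=drama"}),
    frozenset({"seg=loyal", "genre=drama", "genre=news"}),
    frozenset({"seg=loyal", "genre=news"}),
    frozenset({"seg=disloyal", "genre=drama"}),
]

frequent_sets = apriori(rentals, min_support=0.5)
for itemset, supp in sorted(frequent_sets.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), supp)
```

Rules such as "seg=loyal -> genre=drama" can then be read off the frequent itemsets by dividing the support of the pair by the support of the antecedent.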

17. Sensing the Web for Induction of Association Rules and their Composition through Ensemble Techniques

Author: Giovanni Pilato, Filippo Vella, Ignazio Infantino, and Agnese Augello
Subjects: World Wide Web, Boosting (machine learning), Association rule learning, Computer science, 0202 electrical engineering, electronic engineering, information engineering, General Earth and Planetary Sciences, 020206 networking & telecommunications, 020201 artificial intelligence & image processing, Association Rules, Web Sensing, Emergency, Big Data, Boosting, Ensemble techniques, 02 engineering and technology, General Environmental Science
Abstract: Starting from geophysical data collected from heterogeneous sources, such as meteorological stations and information gathered from the web, we seek unknown connections between the sampled values through the extraction of association rules. These rules imply the co-occurrence of two or more symbols in the same representation, and the rule confidence may vary according to the collected data. Starting from traditional algorithms such as FP-Growth and Apriori, we propose the creation of complex association rules through boosting of simpler ones. The composition enables the creation of rules that are robust and lets a larger number of interesting rules emerge.
Published: 2020
Full Text: View/download PDF

18. A Novel Association Rule-Based Data Mining Approach for Internet of Things Based Wireless Sensor Networks

Author: Sohail Abbas, Walid Osamy, Ahmed Salim, and Ahmed M. Khedr
Subjects: Scheme (programming language), distributed databases, General Computer Science, Association rule learning, Computer science, Internet of Things, Stability (learning theory), 02 engineering and technology, computer.software_genre, Association rules, Base station, 0202 electrical engineering, electronic engineering, information engineering, Overhead (computing), General Materials Science, computer.programming_language, network lifetime, business.industry, General Engineering, Volume (computing), 020206 networking & telecommunications, stability, wireless sensor networks based-clustering, Analytics, 020201 artificial intelligence & image processing, Data mining, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, Wireless sensor network, computer, lcsh:TK1-9971
Abstract: The Wireless Sensor Network (WSN) is one of the fundamental technologies used in the Internet of Things (IoT) and is deployed for diverse applications to carry out precise real-time observations. The limited resources of a WSN, combined with the massive volume of fast-flowing IoT data, make the aggregation and analytics of data more challenging. Recently, data mining-based solutions have been proposed to effectively handle the data generated by the sensors and to analyze the data patterns in order to deduce the required information. The increasing need for these techniques motivated us to propose a distributed and efficient data mining technique that not only handles the massive and rapidly generated data from the nodes, but also increases the life span of the network. In this paper, we propose a novel scheme for the IoT-based WSN that mines the sensor data using association rules without moving it to any Cluster Head (CH) or Base Station (BS). The proposed scheme enables sensors to perform computations locally, and only the minimum higher-level statistical summaries of the data at Cluster Members (CMs) are exchanged with their CH. This considerably reduces the communication overhead, which ultimately prolongs the network lifetime. The proposed scheme is evaluated via extensive simulations, and the results demonstrate that integrating it into existing protocols significantly reduces the communication overhead, which ultimately prolongs the network lifetime and stability.
Published: 2020

19. A Study On Profiling Students via Data Mining

Author: Mustafa Temiz and Mehmet Ali Alan
Subjects: Apriori algorithm, Operations Research and Management Science, Association rule learning, lcsh:T55.4-60.8, Computer science, Big data, data warehouse, lcsh:Business, computer.software_genre, Personalization, association rules, Management of Technology and Innovation, ComputingMilieux_COMPUTERSANDEDUCATION, Profiling (information science), lcsh:Industrial engineering. Management engineering, student profile, business.industry, data mining, Data warehouse, Data Mining,Association Rules,Student Profile,Data Warehouse, Financial transaction, Data analysis, Data mining, business, lcsh:HF5001-6182, computer, Yöneylem, Araştırma ve Yönetim Bilimi
Abstract: Data mining is an important method for revealing hidden patterns and connections within big data. It is used in various fields such as financial transactions, banking, education, the health sector, logistics and security. Although association rule mining, one of the basic methods of data mining, is most often used to analyse the consumption habits of customers, it is also used to profile patients and students. Just as customizing for an individual customer is highly significant, so is distinguishing and customizing for an individual student. In this study, data mining was applied to the student data of a high school in an attempt to profile the students. A set of attributes that can directly affect student performance, such as health conditions, financial resources, living standards and the education level of the families, was taken into consideration. For that purpose, the data of 443 students in the database were analysed and a data warehouse was established. The Apriori algorithm, one of the popular algorithms of association rule mining, was used for the data analysis. It produced 72 rules with accuracy above 90%. It is thought that the produced rules can help in profiling the students and can contribute to the work of school management, teachers, parents and students.
Published: 2019

20. Mining Algorithm for Weighted FP-Growth Frequent Item Sets based on Ordered FP-Tree

Author: Shaohong Yin and Yuanyuan Li
Subjects: China, Association rule learning, Computer science, Computer Science::Information Retrieval, Weighted Model, General Engineering, Downward closure property, Ordered FPTree, Weighted Ordered FP-Tree, Space (commercial competition), computer.software_genre, Data mining algorithm, Association Rules, Tree (data structure), Data Mining, Data mining, computer
Abstract: The FP-growth algorithm is a classic algorithm for mining frequent item sets, but it has certain disadvantages when mining weighted frequent item sets. Based on the weighted downward closure property of the weighted model, this paper proposes a method that reduces the use of storage space by constructing a weight-ordered FP-tree, so as to improve the efficiency of generating weighted frequent item sets.
Published: 2019
Full Text: View/download PDF

21. Discovering hidden patterns in Turkish construction projects delays related to project characteristic

Author: İsmail Cengiz Yılmaz and Ezgi Kazan
Subjects: Association rule learning, Computer science, Turkish, Delay analysis, İnşaat Mühendisliği, construction projects, Data science, Civil Engineering, language.human_language, lcsh:TH1-9745, apriori algorithms, association rules, Construction Projects,Delay Analysis,Apriori Algorithms,Association Rules, delay analysis, language, lcsh:TA401-492, lcsh:Materials of engineering and construction. Mechanics of materials, lcsh:Building construction
Abstract: Construction projects, which have great effects on the economies of all countries, are delivered late for many reasons, and these delays bring many negative consequences. Controlling the source of these consequences or minimizing their effects is very important for time and cost savings in the construction sector, especially for countries undergoing continuous development such as Turkey. Determining the project factors that will cause delay, analysing their impacts and taking protective measures will help to reduce losses. In this study, the project factors that may cause delays are identified, and some important association rules between these factors and delay are extracted from data collected from Turkish public and private construction projects. Recommendations for reducing delays are also presented based on the extracted rules.
Published: 2019

22. Rough set‐based rule generation and Apriori‐based rule generation from table data sets: a survey and a combination

Author: Hiroshi Sakai and Michinori Nakata
Subjects: information incompleteness, 0209 industrial biotechnology, Apriori algorithm, database management systems, granular computing, Association rule learning, apriori algorithm, Computer Networks and Communications, Computer science, 02 engineering and technology, table data sets, computer.software_genre, incomplete information databases, information analysis, association rules, 020901 industrial engineering & automation, Software, Artificial Intelligence, Complete information, rule generators, authors, 0202 electrical engineering, electronic engineering, information engineering, Information system, rough set theory, software tools, rough sets nondeterministic information analysis, lcsh:Computer software, business.industry, Granular computing, nondeterministic information systems, lcsh:P98-98.5, data mining, outstanding researches, Knowledge acquisition, Human-Computer Interaction, knowledge acquisition, novel researches, lcsh:QA76.75-76.765, 020201 artificial intelligence & image processing, apriori-based rule generation, Computer Vision and Pattern Recognition, Rough set, Data mining, intelligent rule generator, lcsh:Computational linguistics. Natural language processing, business, computer, computational methodologies, Information Systems
Abstract: The authors have been working with computational methodologies such as rough sets, information incompleteness, data mining and granular computing, and have developed software tools for association rules as well as new mathematical frameworks. They term this research Rough sets Non-deterministic Information Analysis (RNIA). It builds on several lines of research, especially Pawlak's rough sets, Lipski's incomplete information databases, Orłowska's non-deterministic information systems and Agrawal's Apriori algorithm, which are outstanding contributions related to information incompleteness, data mining and rule generation. The authors have been trying to combine these lines of research to realise a more intelligent rule generator that handles data sets with information incompleteness. This study surveys the authors’ research highlights on rule generators and considers a combination of them.
Published: 2019
Full Text: View/download PDF

23. A novel machine learning approach for database exploitation detection and privilege control

Author: Chee Keong Wee and Richi Nayak
Subjects: reinforcement learning, Association rule learning, Computer Networks and Communications, Network security, Computer science, Control (management), privilege control, ComputingMilieux_LEGALASPECTSOFCOMPUTING, Privilege (computing), computer.software_genre, lcsh:Telecommunication, Database, association rules, lcsh:TK5101-6720, Computer Science (miscellaneous), Reinforcement learning, self-healing, Electrical and Electronic Engineering, lcsh:T58.5-58.64, business.industry, lcsh:Information technology, anomaly detection, Computer Science Applications, ComputingMilieux_MANAGEMENTOFCOMPUTINGANDINFORMATIONSYSTEMS, Anomaly detection, business, computer
Abstract: Despite being protected by firewalls and network security systems, databases are vulnerable to attacks, especially when the perpetrators are from within the organization and have authorized access to these systems. Detecting their malicious activities is difficult because each database has its own set of unique usage activities and generic exploitation avoidance rules are usually not applicable. This paper proposes a novel method to improve the security of a database by using machine learning to learn the user behaviour unique to a database environment and applying that learning to detect anomalous user activities through the analysis of sequences of user session data. Once suspicious users are detected, their privileges are systematically suppressed. The empirical analysis shows that the proposed approach can intuitively adapt to any database that supports a wide variety of clients and enforce stringent control customized to the specific IT systems.
Published: 2019

24. Market basket analysis with association rules in the retail sector using Orange. Case Study: Appliances Sales Company

Author: Garcia-Diaz Maria-Elena, Marcos Martinez, Belén Escobar, and Diego P. Pinto-Roa
Subjects: Association rule learning, Market Basket Analysis, Orange Canvas, Affinity analysis, QA75.5-76.95, General Medicine, Orange (colour), Association Rules, Commerce, Knowledge Discovery in Databases, Electronic computers. Computer science, Data Mining, Business, FP- Growth, Retail sector
Abstract: This research analyzes the shopping basket using association rules in the retail sector, specifically in a company that sells home goods such as appliances, computer items, furniture and sporting goods, among others. With the rise of globalization and the advancement of technology, retail companies constantly struggle to maintain and raise their profits, as well as to order the products and services that customers want to obtain. They therefore need new approaches and decision-making strategies in order to be more competitive and successful. Because business transactions generate large amounts of data, the need arises to analyze such data intelligently in order to extract useful knowledge that supports decision-making and an understanding of the association patterns that occur in sales and customer behavior. Information such as which product will make the most profit and which products are sold together is of great value for stocking inventory, and knowing when a product is out of fashion supports effective inventory management. This work presents the product association rules obtained by analyzing the data with the FP-Growth algorithm using the Orange tool.
Published: 2021
Full Text: View/download PDF

25. Study of the Behavior of Cryptocurrencies in Turbulent Times Using Association Rules

Author: Miguel Andrés Porro V., José Benito Hernández C., and Andrés García-Medina
Subjects: Apriori algorithm, Cryptocurrency, Association rule learning, Series (mathematics), Turbulence, General Mathematics, 02 engineering and technology, cryptocurrencies, 01 natural sciences, 010305 fluids & plasmas, association rules, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Computer Science (miscellaneous), Econometrics, Economics, QA1-939, 020201 artificial intelligence & image processing, time series, Engineering (miscellaneous), Database transaction, Mathematics
Abstract: We studied the effects of the financial turbulence of 2020 on the cryptocurrency market, taking into account both prices and volumes from December 2019 to July 2020. Time series were transformed into transaction matrices, and the Apriori algorithm was applied to find the association rules between different currencies, identifying whether the rules are composed of the prices or the volumes of the currencies. We divided the data set into two subsets and found that before the decline in cryptocurrency prices the association rules were generally formed by prices, whereas afterwards transaction volumes dominated the rules.
Published: 2021
Full Text: View/download PDF
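
The abstract above transforms time series into transaction matrices before applying Apriori, but it does not say how. One simple way such a transformation can be done is sketched below: each time step becomes a transaction whose items record which series moved up or down. The discretisation rule, the threshold and all names are assumptions for illustration, not the authors' procedure.

```python
from typing import Dict, FrozenSet, List


def to_transactions(series: Dict[str, List[float]],
                    threshold: float = 0.0) -> List[FrozenSet[str]]:
    """Turn aligned time series into one transaction per time step.

    An item "<name>_up" or "<name>_down" is added when the relative change
    from the previous step exceeds `threshold` in that direction.
    """
    length = len(next(iter(series.values())))
    transactions = []
    for t in range(1, length):
        items = set()
        for name, values in series.items():
            change = (values[t] - values[t - 1]) / values[t - 1]
            if change > threshold:
                items.add(f"{name}_up")
            elif change < -threshold:
                items.add(f"{name}_down")
        transactions.append(frozenset(items))
    return transactions


# Hypothetical daily observations for two price series.
observations = {
    "BTC_price": [100.0, 110.0, 99.0],
    "ETH_price": [10.0, 11.0, 11.0],
}

daily_transactions = to_transactions(observations)
for day, items in enumerate(daily_transactions, start=1):
    print(day, sorted(items))
```

Volume series can be added to the same dictionary under names such as "BTC_volume", so that a mined rule shows directly whether it is composed of price items or volume items.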

26. A Novel Decision-Making Process for COVID-19 Fighting Based on Association Rules and Bayesian Methods

Author: Adel Thaljaoui, Fayez Alfayez, and Salim El Khediri
Subjects: General Computer Science, Coronavirus disease 2019 (COVID-19), Association rule learning, Computer science, AcademicSubjects/SCI01540, Bayesian probability, 02 engineering and technology, Bayesian network’s structure learning based on data approach, Machine learning, computer.software_genre, 030218 nuclear medicine & medical imaging, association rules, 03 medical and health sciences, 0302 clinical medicine, Section C: Computational Intelligence, Machine Learning and Data Analytics, 0202 electrical engineering, electronic engineering, information engineering, Decision-making, autonomous decision-making, business.industry, Bayesian network, COVID-19, Bayesian networks, 020201 artificial intelligence & image processing, Original Article, Artificial intelligence, business, computer
Abstract: Since the first case was recorded in Wuhan in November 2019, COVID-19 has continued to spread widely and rapidly, affecting the health of millions all over the globe. Numerous strategies have been devised to fight this pandemic, among which early isolation is considered one of the most effective. Proposing useful methods to screen and diagnose the patient’s situation in order to specify adequate clinical management represents a significant challenge in reducing mortality rates. Inspired by this global health situation, we introduce a new autonomous decision-making process that consists of two modules. The first module is a data analysis based on a Bayesian network, which is employed to indicate the severity of coronavirus symptoms and then classify COVID-19 cases as severe, moderate or mild. The second module is a decision-making step based on the association rules method, which autonomously generates the adequate decision. To construct the Bayesian network model, we used an effective data-oriented method to learn its structure. As a result, the algorithm accuracy in making the correct decision is 30% and in making the adequate decision is 70%. These experimental results demonstrate the importance of the suggested methods for decision-making.
Published: 2021

27. A novel association rule mining method for the identification of rare functional dependencies in Complex Technical Infrastructures from alarm data

Author: Ahmed Shokry, Luigi Serio, Piero Baraldi, Federico Antonello, Enrico Zio, Ugo Gentile, Politecnico di Milano [Milan] (POLIMI), Centre de Mathématiques Appliquées - Ecole Polytechnique (CMAP), École polytechnique (X)-Centre National de la Recherche Scientifique (CNRS), Centre de recherche sur les Risques et les Crises (CRC), MINES ParisTech - École nationale supérieure des mines de Paris, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Kyung Hee University (KHU), and European Organization for Nuclear Research (CERN)
Subjects: 0209 industrial biotechnology, Association rule learning, Computer science, General Engineering, Alarm data, 02 engineering and technology, computer.software_genre, Association rules, Computer Science Applications, Identification (information), ALARM, 020901 industrial engineering & automation, Artificial Intelligence, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Abnormal behaviors, Data mining, Complex Technical Infrastructures, [SHS.GEST-RISQ]Humanities and Social Sciences/domain_shs.gest-risq, Representation (mathematics), Functional dependency, Rare functional dependencies, computer, ComputingMilieux_MISCELLANEOUS
Abstract: This work presents a data-driven method for identifying rare functional dependencies among components of different systems of Complex Technical Infrastructures (CTIs) from large-scale databases of alarm messages. It is based on the representation of the alarm data in a binary form, the use of a novel association rule mining algorithm properly tailored for discovering rare dependencies among components of different systems and on the identification of groups of functionally dependent components. The proposed method is applied to a synthetic alarm database generated by a simulated CTI model and to a real large-scale database of alarms collected in the CTI of CERN (European Organization for Nuclear Research). The obtained results show the effectiveness of the proposed method.
Published: 2021
Full Text: View/download PDF

28. Unsupervised Machine Learning and Data Mining Procedures Reveal Short Term, Climate Driven Patterns Linking Physico-Chemical Features and Zooplankton Diversity in Small Ponds

Author: Nicolò Bellin, Marco Bartoli, Valeria Rossi, Erica Racchetti, and Catia Maurone
Subjects: 0106 biological sciences, Fuzzy clustering, Association rule learning, Computer science, Ecology (disciplines), Geography, Planning and Development, Fuzzy set, Aquatic Science, 010603 evolutionary biology, 01 natural sciences, Biochemistry, Fuzzy logic, association rules, nutrients, chlorophyll, Cluster analysis, TD201-500, Water Science and Technology, Water supply for domestic and industrial purposes, 010604 marine biology & hydrobiology, Hydraulic engineering, Unsupervised learning, fuzzy clustering, Physical geography, Surface runoff, TC1-978
Abstract: Machine Learning (ML) is an increasingly accessible discipline in computer science that develops dynamic algorithms capable of data-driven decisions and whose use in ecology is growing. Fuzzy sets are suitable descriptors of ecological communities as compared to other standard algorithms and allow the description of decisions that include elements of uncertainty and vagueness. However, fuzzy sets are scarcely applied in ecology. In this work, an unsupervised machine learning algorithm, fuzzy c-means, and association rules mining were applied to assess the factors influencing the assemblage composition and distribution patterns of 12 zooplankton taxa in 24 shallow ponds in northern Italy. The fuzzy c-means algorithm was implemented to classify the ponds in terms of the taxa they support, and to identify the influence of chemical and physical environmental features on the assemblage patterns. Data retrieved during 2014 and 2015 were compared, taking into account that 2014 late spring and summer air temperatures were much lower than historical records, whereas 2015 mean monthly air temperatures were much warmer than historical averages. In both years, fuzzy c-means shows a strong clustering of ponds in two groups, contrasting sites characterized by different physico-chemical and biological features. Climatic anomalies, affecting the temperature regime, together with the main water supply to shallow ponds (e.g., surface runoff vs. groundwater) represent disturbance factors producing large interannual differences in the chemistry, biology and short-term dynamics of small aquatic ecosystems. Unsupervised machine learning algorithms and fuzzy sets may help in catching such apparently erratic differences.
Published: 2021

29. COVID-19 patient diagnosis and treatment data mining algorithm based on association rules

Author: Wei Miao and Zicheng Shan
Subjects: Decision support system, Information retrieval, Association rule learning, Computer science, business.industry, Online analytical processing, Decision tree, online analytical processing, data warehouse, Original Articles, computer.software_genre, Data warehouse, Expert system, Theoretical Computer Science, association rules, Computational Theory and Mathematics, Web mining, Knowledge base, Artificial Intelligence, Control and Systems Engineering, COVID‐19 patients, Original Article, diagnosis treatment data mining, business, computer
Abstract: Association rules are used in different data mining applications, including Web mining, intrusion detection, and bioinformatics. This study discusses a COVID-19 patient diagnosis and treatment data mining algorithm based on association rules. The general data include the key time intervals during the main diagnosis and treatment process (including onset to dyspnea, first diagnosis, admission, mechanical ventilation, death, and the time from first diagnosis to admission, etc.), the cause of death by laboratory examination, and so forth. The frequency of drug use was counted, and an association rule algorithm was used to analyse the effect of drug treatment. The results could provide a reference for rational drug use in COVID-19 patients. To improve the efficiency of data mining, the data must first be pre-processed. Secondly, because the main objective of this application is to extract association rules of COVID-19 complications, the attributes to be mined should be the various diseases, so individual disease types must be classified. During the construction of the association rules database, the data in the data warehouse are analysed online and mined for association rules. The results are stored in the knowledge base for decision support. For example, the prediction results of the decision tree can be displayed at this level. After the mining model is constructed, it can be exposed through a display interface in which the decision-maker inputs the corresponding attribute values and obtains a prediction. 0.76% of people had COVID-19, CHD and hypertension together, while 46.5% of people with COVID-19 and CHD were likely to have hypertension. This study is helpful for analysing the imaging factors of COVID-19 disease.
Published: 2021
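
The two percentages quoted at the end of the abstract above can be read as the support and confidence of a single rule. That reading is an assumption, since the abstract does not name the measures or state the population each percentage refers to. Taking $A = \{\text{COVID-19}, \text{CHD}\}$ and $B = \{\text{hypertension}\}$, the standard definitions give:

```latex
\[
\operatorname{supp}(A \cup B) = 0.0076,
\qquad
\operatorname{conf}(A \Rightarrow B)
  = \frac{\operatorname{supp}(A \cup B)}{\operatorname{supp}(A)} = 0.465
\]
\[
\operatorname{supp}(A)
  = \frac{\operatorname{supp}(A \cup B)}{\operatorname{conf}(A \Rightarrow B)}
  = \frac{0.0076}{0.465} \approx 0.0163
\]
```

Under this reading, roughly 1.6% of the records would involve both COVID-19 and CHD, and a little under half of those would also involve hypertension.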

30. Finding Effective Item Assignment Plans with Weighted Item Associations Using A Hybrid Genetic Algorithm

Author: Kwang Il Ahn, Kichun Lee, and Minho Ryu
Subjects: 0209 industrial biotechnology, Association rule learning, Computer science, Association (object-oriented programming), Crossover, hybrid genetic algorithm, 02 engineering and technology, computer.software_genre, lcsh:Technology, association rules, lcsh:Chemistry, 020901 industrial engineering & automation, Operator (computer programming), item assignment, Genetic algorithm, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, lcsh:QH301-705.5, Instrumentation, Fluid Flow and Transfer Processes, cross-selling, lcsh:T, Process Chemistry and Technology, General Engineering, lcsh:QC1-999, Tabu search, Purchasing, Computer Science Applications, lcsh:Biology (General), lcsh:QD1-999, lcsh:TA1-2040, Benchmark (computing), 020201 artificial intelligence & image processing, Data mining, lcsh:Engineering (General). Civil engineering (General), computer, lcsh:Physics
Abstract: By identifying useful relationships between massive datasets, association rule mining can provide new insights to decision-makers. Item assignment models based on association between items are used to place items in a retail or e-commerce environment to increase sales. However, existing models fail to combine these associations with item-specific information, such as profit and purchasing frequency. To find effective assignments with item-specific information, we propose a new hybrid genetic algorithm that incorporates a robust tabu search with a novel rectangular partially matched crossover, focusing on rectangular layouts. Interestingly, we show that our item assignment model is equivalent to popular NP-hard quadratic assignment problems. We show the effectiveness of the proposed algorithm, using benchmark instances from QAPLIB and synthetic databases that represent real-life retail situations, and compare our algorithm with other existing algorithms. We also show that the proposed crossover operator outperforms a few existing ones in both fitness values and search times. The experimental results show that not only does the proposed item assignment model generate a more profitable assignment plan than the other tested models based on association alone, but it also obtains better solutions than the other tested algorithms.
Published: 2021
Full Text: View/download PDF

31. Establishing a Multiple-Criteria Decision-Making Model for Stock Investment Decisions Using Data Mining Techniques

Author: Mu-Jung Huang, Cheng-Kai Fu, Kuo-Chih Cheng, Kuo-Hua Wang, Lan-Hui Lin, and Huo-Ming Wang
Subjects: Apriori algorithm, Association rule learning, Computer science, apriori algorithm, Geography, Planning and Development, Decision tree, Financial ratio, TJ807-830, 02 engineering and technology, Management, Monitoring, Policy and Law, TD194-195, Profit (economics), Renewable energy sources, association rules, decision tree, 0202 electrical engineering, electronic engineering, information engineering, Econometrics, multiple-criteria decision-making, GE1-350, Environmental effects of industries and plants, Renewable Energy, Sustainability and the Environment, Decision tree learning, 020207 software engineering, data mining, Investment (macroeconomics), Environmental sciences, Multiple criteria, 020201 artificial intelligence & image processing, Decision model, Decision-making models
Abstract: This study attempts to integrate the decision tree algorithm with the Apriori algorithm to explore the relationship among financial ratio, corporate governance, and stock returns to establish a stock investment decision model. The sports and leisure related industries are employed as the research target. The data are collected and processed for generating decision tree and association rules. Based on the analysis outcome, an investment decision model is constructed for investors expecting to decrease their investment risks and further increase their profits. This stock investment decision model is one type of multiple-criteria decision-making model. This study makes three critical contributions to investors. (1) It proposes a systematic model of exploring related data through the decision tree algorithm and the Apriori algorithm to reveal the implicit investment knowledge. (2) An effective investment decision model is established and expected to provide a reference basis during stock-picking decisions. (3) The investment decision model is enhanced with implicit rules found among variables using association rules.
Published: 2021

32. Clustering Based Approach to Enhance Association Rule Mining

Author: Anu Sahni, Paul Stynes, Samruddhi Kanhere, and Pramod Pathak
Subjects: Association rule learning, Computer science, differential market basket analysis, Differential (mechanical device), Affinity analysis, computer.software_genre, retail analytics, lcsh:Telecommunication, association rules, Set (abstract data type), Product (business), market basket analysis, lcsh:TK5101-6720, Scalability, Data mining, Cluster analysis, computer, scalability, Consumer behaviour, clustering
Abstract: Association rule mining algorithms such as Apriori and FPGrowth are extensively used in the retail industry to uncover consumer buying patterns. However, scaling these algorithms to deal with ever-increasing data is the major challenge. This research presents a novel clustering-based approach that reduces the dataset size as a solution. The products are clustered based on their frequency and price. Another important aspect of this study is to find interesting rules by performing differential market basket analysis to identify association rules which are likely ignored in the trivial approach. When using a cluster-based approach, it is observed that the same set of rules can be generated by using only 7% of the total 16,210 items, which in turn directly contributes to reducing the processing overheads and thus reducing the computation time. Furthermore, results obtained from differential market basket analysis have highlighted a few interesting rules which were missing from the original set of rules. The clustering-based approach used in this study not only covers frequent items but also considers their contribution to the overall revenue generation by considering their price. In addition to this, the least contributing product exclusion rate is also improved from 45% to 93%. These results suggest that the computation cost can be significantly reduced, and more accurate rules can be generated by applying differential market basket analysis.
Published: 2021
Full Text: View/download PDF

33. Construction of Materialized Views in Non-Binary Data Space

Author: Bibekananda Shit, Santanu Roy, Agostino Cortesi, and Soumya Sen
Subjects: Materialized view, Speedup, Settore INF/01 - Informatica, Association rule learning, Computer science, Non-binary data space, Association rules, Dynamic support count, Construct (python library), Space (commercial competition), computer.software_genre, Database-centric architecture, Binary data, Benchmark (computing), Data mining, computer
Abstract: Materialized views are heavily used to speed up the query response time of any data-centric application. In the literature, the construction and dynamic maintenance of materialized views are carried out in a Binary Data Space where all attributes are given the same weight. Considering different weights may be particularly significant when similar queries are posed by multiple users, as taking into account the number of accesses to the different attribute values may make it possible to tune the materialized views accordingly. The methodology to construct weighted materialized views introduced in this paper is based on association mining techniques, applied in a Non-Binary Data Space. The proposed algorithm has been verified by simulation experiments with two benchmark datasets using practical transactional queries. The experimental results prove the superiority of our proposal in terms of query Hit-Miss ratio and flexibility of view size extendibility according to the requirement of practical applications.
Published: 2021
Full Text: View/download PDF

34. Association rule-based malware classification using common subsequences of API calls

Author: Gianni D'Angelo, Francesco Palmieri, and Massimo Ficco
Subjects: Malware dynamic analysi, 0209 industrial biotechnology, Exploit, Association rule learning, Computer science, Markov chain, Evasion (network security), 02 engineering and technology, API call sequence, Association rules, Machine learning, Malware dynamic analysis, Markov chains, Sequence alignment, computer.software_genre, 020901 industrial engineering & automation, Obfuscation, 0202 electrical engineering, electronic engineering, information engineering, Code (cryptography), Obfuscation (software), Association rule, Malware, 020201 artificial intelligence & image processing, Data mining, computer, Software
Abstract: Emerging malware poses increasing challenges to detection systems as its variety and sophistication continue to increase. Malware developers use complex techniques to produce malware variants by removing, replacing, and adding useless API calls to the code; these changes are specifically designed to evade detection mechanisms without affecting the original functionality of the malicious code involved. In this work, a new recurring subsequences alignment-based algorithm that exploits associative rules has been proposed to infer malware behaviors. The proposed approach exploits the probabilities of transitioning between two API invocations in the call sequence and also considers their timeline, by extracting subsequences of API calls, not necessarily consecutive, that are representative of common malicious behaviors of specific subsets of malware. The resulting malware classification scheme, capable of operating within dynamic analysis scenarios in which API calls are traced at runtime, is inherently robust against evasion/obfuscation techniques based on API call flow perturbation. It has been experimentally compared with two detectors based on Markov chain and API call sequence alignment algorithms, which are among the most widely adopted approaches for malware classification. In this experimental assessment the proposed approach showed excellent classification performance, outperforming its competitors.
Published: 2021

35. Checking Sets of Pure Evolving Association Rules

Author: Carlo Combi, Romeo Rizzi, and Pietro Sala
Subjects: Algebra and Number Theory, Association rule learning, Computer science, computer.software_genre, Data complexity, Theoretical Computer Science, Association Rules, Computational Theory and Mathematics, Data Complexity, Data Mining, Data mining, computer, Information Systems, Data Mining, Association Rules, Data Complexity
Abstract: Extracting association rules from large datasets has been widely studied in many variants in the last two decades; they allow the extraction of relations between values that occur more “often” in a database. With temporal association rules the concept has been adapted to temporal databases. In this context the “most frequent” patterns of evolution of one or more attribute values are extracted. In the temporal setting, especially where the interference between temporal patterns cannot be neglected (e.g., in medical domains), it may be the case that we are looking for a set of temporal association rules for which a “significant” portion of the original database represents a consistent model for all of them. In this work, we introduce a simple and intuitive form for temporal association rules, called pure evolving association rules (PE-ARs for short), and we study the complexity of checking a set of PE-ARs over an instance of a temporal relation under approximation (i.e., a percentage of tuples that may be deleted from the original relation). As a by-product of our study we address the complexity class for a general problem on Directed Acyclic Graphs which is theoretically interesting per se.
Published: 2021

36. Knowledge Discovering on Graphene Green Technology by Text Mining in National R&D Projects in South Korea

Author: Richa Kumari, Byeong-Hee Lee, Jae Yun Jeong, Tae-Hyun Kim, and Ji Yeon Lee
Subjects: Topic model, Engineering, Association rule learning, 020209 energy, media_common.quotation_subject, Geography, Planning and Development, topic modeling, lcsh:TJ807-830, lcsh:Renewable energy sources, 02 engineering and technology, Management, Monitoring, Policy and Law, New Deal, association rules, Green New Deal, 0202 electrical engineering, electronic engineering, information engineering, Project management, Project management system, National Science and Technology Information Service, lcsh:Environmental sciences, media_common, lcsh:GE1-350, Government, Renewable Energy, Sustainability and the Environment, business.industry, lcsh:Environmental effects of industries and plants, 021001 nanoscience & nanotechnology, project management, Engineering management, lcsh:TD194-195, Service (economics), green new deal policy, 0210 nano-technology, business
Abstract: This paper reviews the development of South Korea’s national research and development (R&D) in graphene technology, focusing on projects that have been classified as “green” technology. A total of 826 projects (USD 210 billion) from 2010 to 2019 were collected from the National Science and Technology Information Service (NTIS), which is a full-cycle national R&D project management system in South Korea. We then analyzed the R&D trend by applying diverse text mining methods including frequency analysis, association rule mining, and topic modeling. The analysis suggests that the number of graphene green technology (GT) R&D projects and the research expenses will show a rising curve again in the incumbent government along with the implementation of the Korean New Deal policy, which integrates the Green New Deal and the Digital New Deal.
Published: 2020

37. Fault Diagnosis of Traction Transformer Based on Bayesian Network

Author: Pan Weiguo, Lin Sheng, Feng Ding, Sheng Bi, Xiao Yong, and Guo Xiaomin
Subjects: Control and Optimization, Association rule learning, Computer science, Energy Engineering and Power Technology, 02 engineering and technology, 01 natural sciences, lcsh:Technology, law.invention, association rules, Traction transformer, law, 0103 physical sciences, 0202 electrical engineering, electronic engineering, information engineering, Electrical and Electronic Engineering, Transformer, Engineering (miscellaneous), Leakage (electronics), 010302 applied physics, Renewable Energy, Sustainability and the Environment, business.industry, lcsh:T, Fossil fuel, Bayesian network, traction transformer, fault diagnosis, Reliability engineering, conditional probability, 020201 artificial intelligence & image processing, business, human activities, Energy (miscellaneous), Test data, Voltage
Abstract: As the core equipment of a traction power supply system, the traction transformer is very important to ensure the safe and reliable operation of the system. At present, the three-ratio method is mainly used to distinguish transformer faults, whereas such a method has some defects, such as insufficient coding and over-general fault classification. At the same time, on-site maintenance personnel make an empirical judgment based on various test data, which is subjective and uncertain to a certain extent. For cases with multiple abnormal data and relatively complex conditions, on-site personnel often need to discuss and even dismantle the transformer to identify the fault, which is time-consuming and costly. In order to improve the effect of fault diagnosis for traction transformer, this paper uses Bayesian network to correlate the cause and effect of various tests and faults. By combining the results of field tests, the fault is diagnosed by the causal probability of the Bayesian network, rather than relying on the exception that occurred in a single experiment to judge its fault. The diagnosis results are more accurate and objective by using the Bayesian network. In this paper, the frequent test anomalies of the traction transformer are taken into account in the network, so that the network can more comprehensively analyze the operation situation of the traction transformer and judge the type of fault. According to field situations, based on the existing set of symptoms of the Bayesian network fault diagnosis, this paper further considers the insulation resistance, dielectric loss tangent value, oil and gas, power frequency voltage, and leakage current. By combining the association rules algorithm and the experience of the field operators, the cause–effect relationship of test data and the conditional probability parameters of the network are obtained. Then, the Bayesian network is constructed and used for traction transformer fault diagnosis. 
The case study shows that the four types of fault diagnosed using the Bayesian network model proposed in this paper are consistent with the fault types inspected by on-site operators, which shows promising engineering application prospects.
Published: 2020

38. Apriori Algorithm for the Data Mining of Global Cyberspace Security Issues for Human Participatory Based on Association Rules

Author: Zhi Li, Xuyu Li, Runhua Tang, and Lin Zhang
Subjects: Apriori algorithm, Association rule learning, cyberspace security, Internet privacy, lcsh:BF1-990, Sample (statistics), 02 engineering and technology, 050105 experimental psychology, association rules, Web page, 0202 electrical engineering, electronic engineering, information engineering, Data Protection Act 1998, Psychology, 0501 psychology and cognitive sciences, General Psychology, Original Research, ComputingMilieux_THECOMPUTINGPROFESSION, business.industry, network sovereignty, 05 social sciences, data mining, lcsh:Psychology, Cyber-attack, 020201 artificial intelligence & image processing, The Internet, business, Cyberspace
Abstract: This study explored global cyberspace security issues, with the purpose of breaking the stereotype of people’s cognition of cyberspace problems, which reflects the relationship between interdependence and association. Based on the Apriori algorithm in association rules, a total of 181 strong rules were mined from 40 target websites, and 56,096 web pages were associated with global cyberspace security. Moreover, this study analyzed support, confidence, promotion, leverage, and reliability to achieve comprehensive coverage of data. A total of 15,661 sites mentioned cyberspace security-related words from the total sample of 22,493 professional websites, accounting for 69.6%, while only 735 sites mentioned cyberspace security-related words from the total sample of 33,603 non-professional sites, accounting for 2%. Due to restrictions of language, the number of samples of target professional websites and non-target websites is limited. Meanwhile, the number of strong rules selected is not satisfactory. Nowadays, the core global cyberspace security issues include internet sovereignty, cyberspace security, cyber attack, cyber crime, data leakage, and data protection.
Published: 2020
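
Several of the measures listed in the abstract above have standard definitions for a rule A ⇒ C. The following is a minimal sketch, not the authors' code: it computes support, confidence, and leverage, plus lift (whether the abstract's "promotion" denotes lift is an assumption). The function name `rule_metrics` and the four toy "pages" are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift and leverage for the rule antecedent -> consequent.

    Assumes at least one transaction contains the antecedent and at least
    one contains the consequent (otherwise the ratios are undefined).
    """
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    supp_a = sum(1 for t in transactions if a <= t) / n
    supp_c = sum(1 for t in transactions if c <= t) / n
    supp_ac = sum(1 for t in transactions if (a | c) <= t) / n
    return {
        "support": supp_ac,                      # P(A and C)
        "confidence": supp_ac / supp_a,          # P(C | A)
        "lift": supp_ac / (supp_a * supp_c),     # > 1 means positive association
        "leverage": supp_ac - supp_a * supp_c,   # difference from independence
    }


# Hypothetical toy data: each "page" is the set of security terms it mentions.
pages = [
    {"cyber attack", "data leakage"},
    {"cyber attack", "data leakage", "data protection"},
    {"cyber attack"},
    {"data protection"},
]
m = rule_metrics(pages, {"cyber attack"}, {"data leakage"})
print(m)
```

A "strong" rule in the Apriori sense is one whose support and confidence both exceed user-chosen thresholds; the thresholds used in the study are not given in the abstract.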

39. Medical Health Benefit Management System for Real-Time Notification of Fraud Using Historical Medical Records

Author: Shoab A. Khan, Irum Matloob, Habibur Rahman, and Farhan Hussain
Subjects: Knowledge management, Association rule learning, Computer science, Specialty, anomaly, 02 engineering and technology, 01 natural sciences, lcsh:Technology, association score, association rules, lcsh:Chemistry, Health care, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Instrumentation, lcsh:QH301-705.5, Reimbursement, Fluid Flow and Transfer Processes, Service (business), Government, business.industry, lcsh:T, Process Chemistry and Technology, 010401 analytical chemistry, General Engineering, outlier, lcsh:QC1-999, 0104 chemical sciences, Computer Science Applications, lcsh:Biology (General), lcsh:QD1-999, lcsh:TA1-2040, Management system, 020201 artificial intelligence & image processing, fraud, business, lcsh:Engineering (General). Civil engineering (General), Transaction data, lcsh:Physics, clustering
Abstract: This paper presents a novel framework for fraud detection in healthcare systems which self-learns from the historical medical data. Historical medical records are required for training and testing of machine learning models. The main problem being faced by both private and government health supported schemes is a rapid rise in the number of claims by beneficiaries, mostly based on fraudulent billing. Detection of fraudulent transactions in healthcare systems is a strenuous task due to intricate relationships among dynamic elements including doctors, patients, and services. In light of the aforementioned challenges in health support programs, there is a need to develop intelligent fraud detection models for tracing the loopholes in procedures which may lead to successful reimbursement of fraudulent medical bills. In order to address the issue of fraud in healthcare programs, our solution proposes a framework based on three entities (patient, doctor, service). Firstly, the framework computes association scores for three elements of the healthcare ecosystem, namely patients, doctors, and services. The framework filters out identified cases using association scores. The confidence values, after G-means clustering of transactional data, are computed for each service in each specialty. Rules are generated based on the confidence values of services for each specialty. Then, an evaluation of identified cases is done using a rule engine. The framework classifies cases into fraudulent activities based on the similarity bit's value. The validation of the framework is performed on local hospital employees' transactional data, which includes many reported cases of fraudulent activities in addition to some introduced anomalies.
Published: 2020

40. Automatic identification of knowledge related to dengue cases in the state of Piauí in public databases using Filtered-Association Rules Networks

Author: Jâina Carolina Meneses Calçada, Solange Oliveira Rezende, Joan Davi Santos Silva, and Dario Brito Calçada
Subjects: Association Rules, Dengue, Epidemiological surveillance, Knowledge Discovery, Networks, education.field_of_study, DENGUE, General Computer Science, Association rule learning, Computer science, Population, medicine.disease, computer.software_genre, Dengue fever, Identification (information), Knowledge extraction, Computer Science, Information system, medicine, State (computer science), Data mining, education, computer
Abstract: Dengue has been an endemic disease in Brazil since the 1980s and in Piauí since 1996. The number of cases increases each year, with the incidence of more severe symptoms. This research aimed to evaluate the use of an automatic knowledge identification technique in factors related to the number of dengue occurrences. We built a dataset formed by data available in the Information System for Notifiable Diseases (SINAN) and meteorological data of the municipalities of the coastal plain of Piauí. The technique used was that of Filtered Association Rules Networks, which allows visual analysis of knowledge through the use of network structures and rules filtering. As a main result, we confirmed the understanding that the most significant number of cases occurs in May, as it is the moment when the rainfall indexes are decreasing, and that socio-cultural and race factors do not interfere in the identification of the population of higher risk. This research presents the innovation of the use of a computational technique of automatic knowledge discovery that can assist in the elaboration of prevention actions by epidemiological surveillance.
Published: 2020

41. SAERMA: Stacked Autoencoder Rule Mining Algorithm for the Interpretation of Epistatic Interactions in GWAS for Extreme Obesity

Author: Carl Chalmers, Nurul Hashimah Ahamed Hassain Malim, Basma Abdulaimma, Casimiro Aday Curbelo Montañez, Denis Reilly, Paul Fergus, and Francesco Falciani
Subjects: FOS: Computer and information sciences, QA75, epistasis, Computer Science - Machine Learning, obesity, General Computer Science, Association rule learning, Computer science, Machine Learning (stat.ML), Genome-wide association study, 02 engineering and technology, Association rules, Machine Learning (cs.LG), QA76, 03 medical and health sciences, 0302 clinical medicine, Statistics - Machine Learning, autoencoders, Genetic variation, 0202 electrical engineering, electronic engineering, information engineering, genome-wide association studies (GWAS), General Materials Science, Quantitative Biology - Genomics, 030212 general & internal medicine, Gene, Genetic association, Interpretability, Genomics (q-bio.GN), Artificial neural network, business.industry, Deep learning, General Engineering, Linear model, deep learning, Autoencoder, R1, FOS: Biological sciences, Epistasis, 020201 artificial intelligence & image processing, Artificial intelligence, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, Algorithm, Classifier (UML), lcsh:TK1-9971, Curse of dimensionality
Abstract: One of the most important challenges in the analysis of high-throughput genetic data is the development of efficient computational methods to identify statistically significant Single Nucleotide Polymorphisms (SNPs). Genome-wide association studies (GWAS) use single-locus analysis where each SNP is independently tested for association with phenotypes. The limitation with this approach, however, is its inability to explain genetic variation in complex diseases. Alternative approaches are required to model the intricate relationships between SNPs. Our proposed approach extends GWAS by combining deep learning stacked autoencoders (SAEs) and association rule mining (ARM) to identify epistatic interactions between SNPs. Following traditional GWAS quality control and association analysis, the most significant SNPs are selected and used in the subsequent analysis to investigate epistasis. SAERMA controls the classification results produced in the final fully connected multi-layer feedforward artificial neural network (MLP) by manipulating the interestingness measures, support and confidence, in the rule generation process. The best classification results were achieved with 204 SNPs compressed to 100 units (77% AUC, 77% SE, 68% SP, 53% Gini, logloss=0.58, and MSE=0.20), although it was possible to achieve 73% AUC (77% SE, 63% SP, 45% Gini, logloss=0.62, and MSE=0.21) with 50 hidden units, both supported by close model interpretation. (12 pages, 6 figures, 12 tables, 9 equations.)
Published: 2020

42. Medical Data Stream Distribution Pattern Association Rule Mining Algorithm Based on Density Estimation

Author: Dong Li, Yanwei Wang, and Xiaofeng Li
Subjects: Data stream, General Computer Science, Association rule learning, distribution pattern, Computer science, 02 engineering and technology, mining, Stability (probability), Structural equation modeling, association rules, 020204 information systems, Histogram, density estimation, 0202 electrical engineering, electronic engineering, information engineering, Range (statistics), General Materials Science, Cluster analysis, medical data, Compound neural network, General Engineering, Density estimation, TK1-9971, Data redundancy, 020201 artificial intelligence & image processing, Electrical engineering. Electronics. Nuclear engineering, Algorithm
Abstract: Traditional data mining methods do not analyse the data distribution and derive incomplete association rules. As a result, the data mining results suffer from a large redundancy probability, a large root-mean-square error of approximation (RMSEA), and long running times. To handle these issues, this paper proposes a medical data stream distribution pattern association rule mining algorithm based on density estimation. This paper collects medical data, selects the distance method to detect abnormal orphan data in the data stream, detects the duplicate data in the data stream by the similar field matching degree, and eliminates the abnormal data and the duplicate data. Then, the data stream density is estimated based on the histogram estimation samples. According to the data density estimation results, this paper analyzes the distribution of the medical data stream from the perspectives of concentration, dispersion, and morphological characteristics of the data distribution. Afterwards, the data distribution pattern association rule mining model is constructed based on a compound neural network, data distribution parameters are entered into the model’s clustering layer, and in-depth training is conducted over the BP (Back Propagation) neural network at the model’s mining layer. Meanwhile, all rules under the combination of the hidden layer’s neuron activity value and corresponding output value, and all rules under the combination of the hidden layer’s neuron activity value and corresponding input value, are derived, so as to complete association rule mining of the medical data stream distribution pattern. The experimental results show that the proposed algorithm has a contour curve closest to the true probability density curve; the dispersion degree of medical data is within a reasonable range, and the medical data has high stability; the data redundancy probability is smaller; the mining result’s RMSEA is small; and data mining takes less time.
Published: 2019

43. Recommendations for Mobile Apps Based on the HITS Algorithm Combined With Association Rules

Author: Wei Li, Yiwen Zhang, Xiangliang Zhong, Qilin Wu, Dengcheng Yan, and Yuan Ting Yan
Subjects: app recommendation, 010302 applied physics, General Computer Science, Association rule learning, Computer science, Download, General Engineering, Mobile apps, data mining, 02 engineering and technology, HITS algorithm, 021001 nanoscience & nanotechnology, 01 natural sciences, association rules, World Wide Web, 0103 physical sciences, Recommender systems, General Materials Science, lcsh:Electrical engineering. Electronics. Nuclear engineering, Electrical and Electronic Engineering, 0210 nano-technology, lcsh:TK1-9971
Abstract: With the increasing popularity of intelligent devices, the mobile apps market has exploded. Due to the large number of candidate app services, it has become very difficult for users to choose the mobile apps that they want to install. Therefore, it is crucial to improve users' experience and make personalized recommendations. In some cases, the traditional recommendation methods can be convenient, but they still have some shortcomings, resulting in inaccurate recommendations in general. To address this issue, this paper proposes a method for mobile app recommendations that is based on the Hyperlink-Induced Topic Search (HITS) algorithm combined with association rules. This method integrates the authority and hub scores into the candidate applications through the download and rating information, and it not only considers the importance of mobile apps in association rules but also takes the reliability factor of users into account. Experiments with the Huawei application market datasets show that the proposed method significantly improves the recommendation accuracies compared with the traditional methods.
Published: 2019
Full Text: View/download PDF
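
The hub and authority scores mentioned in the abstract above come from the generic HITS iteration. As a rough illustration only (this is plain HITS on a hypothetical user-to-app download graph, not the paper's combined method, and the names `hits`, `u1`, `appA`, etc. are made up), a sketch might look like this:

```python
import math


def hits(edges, iterations=50):
    """Plain HITS by power iteration on a directed graph given as (source, target) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of nodes pointing to it.
        auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
        norm = math.sqrt(sum(x * x for x in auth.values())) or 1.0
        auth = {n: x / norm for n, x in auth.items()}
        # Hub of a node: sum of authority scores of nodes it points to.
        hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
        norm = math.sqrt(sum(x * x for x in hub.values())) or 1.0
        hub = {n: x / norm for n, x in hub.items()}
    return hub, auth


# Hypothetical download graph: an edge (user, app) means the user installed the app.
downloads = [("u1", "appA"), ("u2", "appA"), ("u3", "appA"), ("u3", "appB")]
hub, auth = hits(downloads)
print(sorted(auth.items(), key=lambda kv: -kv[1])[:2])
```

In this toy graph the app installed by more users gets the higher authority score, and the user who installed more apps gets the higher hub score. How the paper weights edges by ratings or folds these scores into rule mining is not specified in the abstract.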

44. TRICE: Mining Frequent Itemsets by Iterative TRimmed Transaction LattICE in Sparse Big Data

Author: Hamayoun Shahwani, Ch. Muhammad Nadeem Faisal, Muhammad Umar Chaudhry, Muhammad Yasir, Mudassar Ahmad, Muhammad Ashraf, Shahzad Sarwar, and Muhammad Asif Habib
Subjects: big data applications, General Computer Science, Association rule learning, Computer science, business.industry, Big data, pattern recognition, General Engineering, frequent itemset mining, data mining, computer.software_genre, Association rules, Lattice (order), pervasive computing, Unsupervised learning, General Materials Science, Data mining, lcsh:Electrical engineering. Electronics. Nuclear engineering, business, Database transaction, computer, lcsh:TK1-9971
Abstract: Sparseness is often witnessed in big data emanating from a variety of sources, including IoT, pervasive computing, and behavioral data. Frequent itemset mining is the first and foremost step of association rule mining, which is a distinguished unsupervised machine learning problem. However, techniques for frequent itemset mining are least explored for sparse real-world data, showing somewhat comparable performance. On the contrary, the methods are adequately validated for dense data and stand apart from each other in terms of performance. Hence, there arises an immense need for evaluating these techniques as well as proposing new ones for large sparse real-world datasets. In this study, a novel method, Mining Frequent Itemsets by Iterative TRimmed Transaction lattICE (TRICE), is proposed. TRICE iteratively generates combinations of varying-sized trimmed subsets of $I$, where $I$ denotes the set of distinct items in a database. Extensive experiments are conducted to assess TRICE against HARPP, FP-Growth, optimized SaM, and optimized RElim algorithms. The experimental results show that TRICE outperforms all these algorithms both in terms of running time and memory consumption. TRICE maintains a substantial performance gap for all sparse real-world datasets on all minimum support thresholds. Moreover, assessment of memory use of optimized SaM and RElim algorithms has been performed for the first time.
Published: 2019

45. Content Recommendation Algorithm for Intelligent Navigator in Fog Computing Based IoT Environment

Author: Jiuzhi Lin, Fuhong Lin, Xingshuo An, Yutong Zhou, Ilsun You, and Xing Lü
Subjects: Service (systems architecture), Content recommendation, General Computer Science, Association rule learning, Computer science, Cloud computing, 02 engineering and technology, association rules, Internet of Vehicles, 0202 electrical engineering, electronic engineering, information engineering, General Materials Science, Relevance (information retrieval), Mobile technology, Edge computing, business.industry, General Engineering, 020206 networking & telecommunications, Traffic congestion, 020201 artificial intelligence & image processing, The Internet, lcsh:Electrical engineering. Electronics. Nuclear engineering, fog computing, Internet of Things, business, Algorithm, lcsh:TK1-9971
Abstract: With the development of the Internet and mobile technologies, the Internet of Things (IoT) era has arrived. Vehicle networking technology can not only facilitate people’s travel but also effectively alleviate traffic congestion. The development of fog computing technology provides unlimited possibilities for the Internet of Vehicles (IoV). Intelligent navigator is a very important part of human–computer interaction in IoV. It carries a large number of tasks of recommending content for users. In order to get more accurate recommendation content, we propose a weighted interest degree recommendation algorithm using association rules for intelligence in the IoV. First, the user data are analyzed to establish the association rule mining algorithm. Second, the user interest score is predicted by analyzing the relevance between user interests to recommend personalized service for the user. From the simulation results, we can see that the proposed algorithm can achieve higher recommendation accuracy.
Published: 2019

46. Publishing Anonymized Set-Valued Data via Disassociation towards Analysis

Author: Nancy Awad, Bechara Al Bouna, Laurent Philippe, Jean-François Couchot, Franche-Comté Électronique Mécanique, Thermique et Optique - Sciences et Technologies (UMR 6174) (FEMTO-ST), Université de Technologie de Belfort-Montbeliard (UTBM)-Ecole Nationale Supérieure de Mécanique et des Microtechniques (ENSMM)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Centre National de la Recherche Scientifique (CNRS), and Université Antonine (UA) more...
Subjects: ant colony clustering, Association rule learning, Computer Networks and Communications, Property (programming), Computer science, media_common.quotation_subject, knowledge extraction, 02 engineering and technology, Data publishing, [INFO.INFO-SE]Computer Science [cs]/Software Engineering [cs.SE], computer.software_genre, privacy, anonymization, Set (abstract data type), [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing, association rules, [INFO.INFO-CR]Computer Science [cs]/Cryptography and Security [cs.CR], Knowledge extraction, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Cluster analysis, media_common, lcsh:T58.5-58.64, lcsh:Information technology, Probabilistic logic, Ambiguity, 16. Peace & justice, [INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation, [INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA], utility, disassociation, 020201 artificial intelligence & image processing, [INFO.INFO-ET]Computer Science [cs]/Emerging Technologies [cs.ET], Data mining, [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], computer
Abstract: Data publishing is a challenging task because of privacy preservation constraints. To ensure privacy, many anonymization techniques have been proposed. They differ in terms of the mathematical properties they verify and in terms of the functional objectives expected. Disassociation is one of the techniques that aim at anonymizing set-valued datasets (e.g., discrete locations, search and shopping items) while guaranteeing the confidentiality property known as $k^m$-anonymity. Disassociation separates the items of an itemset into vertical chunks to create ambiguity in the original associations. In a previous work, we defined a new ant-based clustering algorithm for the disassociation technique to preserve some items associated together, called utility rules, throughout the anonymization process, for accurate analysis. In this paper, we examine the disassociated dataset in terms of knowledge extraction. To make data analysis easy on top of the anonymized dataset, we define neighbor datasets, that is, datasets that are the result of a probabilistic re-association process. To assess the neighborhood notion, set-valued datasets are formalized into trees and a tree edit distance (TED) is directly applied between these neighbors. Finally, we demonstrate in the experiments the faithfulness of the neighbors to knowledge extraction for future analysis.
Published: 2020
Full Text: View/download PDF

47. Standardizing interestingness measures for association rules

Author: Mateen Shaikh, Paul D. McNicholas, M. Luiza Antonie, and Thomas Brendan Murphy
Subjects: FOS: Computer and information sciences, Association rule learning, Computer science, Machine Learning (stat.ML), 02 engineering and technology, Association rules, computer.software_genre, Statistics - Applications, 01 natural sciences, Machine Learning (cs.LG), 010104 statistics & probability, Text categorization, Statistics - Machine Learning, 020204 information systems, Frequency patterns, 0202 electrical engineering, electronic engineering, information engineering, Applications (stat.AP), 0101 mathematics, Interestingness measures, business.industry, Computer Science Applications, Computer Science - Learning, Standardizations, Artificial intelligence, business, computer, Analysis, Natural language processing, Information Systems
Abstract: Interestingness measures provide information about association rules. The value of an interestingness measure is often interpreted relative to the overall range of the interestingness measure. However, properties of individual association rules can further restrict what value an interestingness measure can achieve. These additional constraints are not typically taken into account in analysis, potentially misleading the investigator. Considering the value of an interestingness measure relative to this further constrained range provides greater insight than the original range alone and can even alter researchers' impressions of the data. Standardizing interestingness measures takes these additional restrictions into account, resulting in values that provide a relative measure of the attainable values. We explore the impacts of standardizing interestingness measures on real and simulated data. Insight Research Centre Natural Sciences and Engineering Research Council of Canada Ontario Ministry of Research and Innovation
Published: 2018
Full Text: View/download PDF

48. Usage Apriori and clustering algorithms in WEKA tools to mining dataset of traffic accidents

Author: Faisal Mohammed Nafie Ali and Abdelmoneim Ali Mohamed Hamed
Subjects: Association rule learning, Computer Networks and Communications, Computer science, computer.software_genre, traffic accidents, lcsh:Telecommunication, Set (abstract data type), association rules, lcsh:TK5101-6720, 0502 economics and business, Expectation–maximization algorithm, Computer Science (miscellaneous), 0501 psychology and cognitive sciences, Electrical and Electronic Engineering, Cluster analysis, EM algorithm, Data mining, 050107 human factors, 050210 logistics & transportation, lcsh:T58.5-58.64, lcsh:Information technology, 05 social sciences, InformationSystems_DATABASEMANAGEMENT, Apriori, Computer Science Applications, ComputingMethodologies_PATTERNRECOGNITION, A priori and a posteriori, computer, clustering
Abstract: The aim of this study is to investigate association rule mining and clustering algorithms in order to derive new rules from a broad set of discovered rules taken from traffic accident data at Alghat Province in KSA. Several tools are applied in data mining to extract data. WEKA provides implementations of learning algorithms that can be run efficiently on any dataset. In WEKA, many algorithms are available for mining data; Apriori and clustering are among the best known. Apriori is a simple algorithm applied to mine repeated patterns from a transaction dataset in order to find frequent itemsets and associations between itemsets. Clustering is a technique used to group a collection of items having similar features. Association rules are applied to find connections between data items in a transactional database, and association rule mining algorithms are used to discover frequent associations. WEKA was used to analyse a traffic dataset composed of 946 instances and 8 attributes. The Apriori algorithm and EM clustering were applied to the traffic dataset to discover the factors that cause accidents. The results show that the Apriori algorithm performs better than the EM clustering algorithm.
Published: 2018

49. On Two Apriori-Based Rule Generators: Apriori in Prolog and Apriori in SQL

Author: Kao-Yi Shen, Hiroshi Sakai, and Michinori Nakata
Subjects: Apriori algorithm, SQL, Association rule learning, Computer science, apriori algorithm, InformationSystems_DATABASEMANAGEMENT, 02 engineering and technology, computer.software_genre, Human-Computer Interaction, Prolog, association rules, prolog, ComputingMethodologies_PATTERNRECOGNITION, Artificial Intelligence, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, A priori and a posteriori, 020201 artificial intelligence & image processing, Computer Vision and Pattern Recognition, Data mining, computer, rule generation, computer.programming_language
Abstract: This paper focuses on two Apriori-based rule generators. The first is the rule generator in Prolog and C, and the second is the one in SQL. They are named Apriori in Prolog and Apriori in SQL, respectively. Each rule generator is based on the Apriori algorithm. However, each rule generator has its own properties. Apriori in Prolog employs the equivalence classes defined by table data sets and follows the framework of rough sets. On the other hand, Apriori in SQL employs a search for rule generation and does not make use of equivalence classes. This paper clarifies the properties of these two rule generators and considers effective applications of each to existing data sets.
Published: 2018

50. Method of Association Rules Mining and Its Application in Analysis of Seawater Samples

Author: Xinhang Xu, Yonghong Liu, Hongtao Zhang, and Qiuhong Sun
Subjects: Fitness function, Association rule learning, lcsh:T58.5-58.64, Computer science, lcsh:T, lcsh:Information technology, Crossover, Photoelectric sensor, General Engineering, computer.software_genre, lcsh:Technology, Association Rules, Set (abstract data type), Immune Genetic Algorithm (IGA), Potential Data, Mutation (genetic algorithm), Genetic algorithm, A priori and a posteriori, Data mining, computer
Abstract: This paper aims to set up new rules for processing seawater quality monitoring data collected by a photoelectric sensor network, and to mine out the useful information contained in the data. For this purpose, the immune algorithm was introduced to the classical genetic algorithm, the fitness function was designed, and the crossover and mutation probabilities were adjusted, thus creating the adaptive immune genetic algorithm (IIGA). The new algorithm was described in detail and applied in an actual case. Through the comparison between the IIGA, IGA and Apriori algorithms, the author concluded that the IIGA not only shortened the mining time, but also ensured the operation accuracy. The research findings are of great importance to association rule mining in various fields.
Published: 2018
