1,360 results
Search Results
2. Role of Machine Learning in Resource Allocation Strategy over Vehicular Networks: A Survey.
- Author
-
Nurcahyani I and Lee JW
- Subjects
- Resource Allocation, Algorithms, Machine Learning
- Abstract
The increasing demand for smart vehicles with many sensing capabilities will escalate data traffic in vehicular networks. Meanwhile, available network resources are limited. The emergence of AI implementation in vehicular network resource allocation opens the opportunity to improve resource utilization to provide more reliable services. Accordingly, many resource allocation schemes with various machine learning algorithms have been proposed to dynamically manage and allocate network resources. This survey paper presents how machine learning is leveraged in the vehicular network resource allocation strategy. We focus our study on determining its role in the mechanism. First, we provide an analysis of how authors designed their scenarios to orchestrate the resource allocation strategy. Secondly, we classify the mechanisms based on the parameters they chose when designing the algorithms. Finally, we analyze the challenges in designing a resource allocation strategy in vehicular networks using machine learning. Therefore, a thorough understanding of how machine learning algorithms are utilized to offer a dynamic resource allocation in vehicular networks is provided in this study.
- Published
- 2021
- Full Text
- View/download PDF
3. Wearable ECG Device and Machine Learning for Heart Monitoring.
- Author
-
Alimbayeva Z, Alimbayev C, Ozhikenov K, Bayanbay N, and Ozhikenova A
- Subjects
- Humans, Cardiovascular Diseases diagnosis, Monitoring, Physiologic instrumentation, Monitoring, Physiologic methods, Machine Learning, Wearable Electronic Devices, Electrocardiography methods, Electrocardiography instrumentation, Algorithms, Signal Processing, Computer-Assisted, Neural Networks, Computer
- Abstract
With cardiovascular diseases (CVD) remaining a leading cause of mortality, wearable devices for monitoring cardiac activity have gained significant, renewed interest among the medical community. This paper introduces an innovative ECG monitoring system based on a single-lead ECG machine, enhanced using machine learning methods. The system not only processes and analyzes ECG data but can also be used to predict potential heart disease at an early stage. The wearable device was built on the ADS1298 front end and an STM32L151xD microcontroller. A server module based on the REST API architectural style was designed to facilitate interaction with the web-based segment of the system. The module is responsible for receiving data in real time from the microcontroller and delivering it to the web-based segment of the system. Algorithms for analyzing ECG signals have been developed, including band-filter artifact removal, K-means clustering for signal segmentation, and PQRST analysis. Machine learning methods, such as isolation forests, have been employed for ECG anomaly detection. Moreover, a comparative analysis with various machine learning methods, including logistic regression, random forest, SVM, XGBoost, decision forest, and CNNs, was conducted to predict the incidence of cardiovascular diseases. Convolutional neural networks (CNNs) showed an accuracy of 0.926, proving their high effectiveness for ECG data processing.
- Published
- 2024
- Full Text
- View/download PDF
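The abstract above mentions K-means clustering for signal segmentation. Purely as an illustration of that step (not the authors' implementation, and with invented sample values), a minimal one-dimensional K-means sketch that could separate, say, baseline samples from QRS-peak amplitudes:

```python
import random

def kmeans_1d(values, k, iters=50, seed=0):
    """Cluster scalar samples (e.g. ECG amplitudes) into k groups."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for _ in range(iters):
        # Assign each value to its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            idx = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[idx].append(v)
        # Recompute each center as its group's mean.
        new_centers = [sum(g) / len(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [min(range(k), key=lambda i: abs(v - centers[i])) for v in values]
    return centers, labels
```

With two well-separated amplitude groups, the two clusters recover the low-amplitude and high-amplitude samples regardless of initialization.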
4. Elbow Gesture Recognition with an Array of Inductive Sensors and Machine Learning.
- Author
-
Abbasnia A, Ravan M, and K Amineh R
- Subjects
- Humans, Wearable Electronic Devices, Pattern Recognition, Automated methods, Signal Processing, Computer-Assisted, Male, Adult, Female, Gestures, Machine Learning, Elbow physiology, Algorithms
- Abstract
This work presents a novel approach for elbow gesture recognition using an array of inductive sensors and a machine learning algorithm (MLA). This paper describes the design of the inductive sensor array integrated into a flexible and wearable sleeve. The sensor array consists of coils sewn onto the sleeve, which form an LC tank circuit along with the externally connected inductors and capacitors. Changes in the elbow position modulate the inductance of these coils, allowing the sensor array to capture a range of elbow movements. The signal processing pipeline and the random forest MLA used to recognize 10 different elbow gestures are described. Rigorous evaluation on 8 subjects and data augmentation, which expanded the dataset to 1270 trials per gesture, enabled the system to achieve remarkable accuracies of 98.3% and 98.5% using 5-fold cross-validation and leave-one-subject-out cross-validation, respectively. The test performance was then assessed using data collected from five new subjects. The high classification accuracy of 94% demonstrates the generalizability of the designed system. The proposed solution addresses the limitations of existing elbow gesture recognition designs and offers a practical and effective approach for intuitive human-machine interaction.
- Published
- 2024
- Full Text
- View/download PDF
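Leave-one-subject-out cross-validation, used above to check generalization to unseen wearers, can be sketched generically (the `(subject_id, features)` pairing here is an assumption for illustration, not the paper's data format):

```python
def leave_one_subject_out(samples):
    """Yield one (held_out, train, test) split per subject, where the
    test set holds all samples from exactly that subject.
    `samples` is a list of (subject_id, features) pairs."""
    subjects = sorted({sid for sid, _ in samples})
    for held_out in subjects:
        train = [s for s in samples if s[0] != held_out]
        test = [s for s in samples if s[0] == held_out]
        yield held_out, train, test
```

Each subject appears in exactly one test fold, so the reported accuracy reflects performance on wearers the model never saw during training.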
5. Prediction of Protein-DNA Interface Hot Spots Based on Empirical Mode Decomposition and Machine Learning.
- Author
-
Fang Z, Li Z, Li M, Yue Z, and Li K
- Subjects
- Protein Binding, DNA-Binding Proteins chemistry, DNA-Binding Proteins genetics, DNA-Binding Proteins metabolism, Computational Biology methods, Binding Sites, Machine Learning, DNA genetics, DNA chemistry, DNA metabolism, Algorithms
- Abstract
Protein-DNA complex interactivity plays a crucial role in biological activities such as gene expression, modification, replication and transcription. Understanding the physiological significance of protein-DNA binding interfacial hot spots, as well as the development of computational biology, depends on the precise identification of these regions. In this paper, a hot spot prediction method called EC-PDH is proposed. First, we extracted the hot spots' conventional features of solvent-accessible surface area (ASA) and secondary structure, and then the mean, variance, energy and autocorrelation function values of the first three intrinsic mode functions (IMFs) of these conventional features were extracted as new features via the empirical mode decomposition (EMD) algorithm. A total of 218 dimensional features were obtained. For feature selection, we used the maximum relevance minimum redundancy sequential forward selection method (mRMR-SFS) to obtain an optimal 11-dimensional feature subset. To address the issue of data imbalance, we used the SMOTE-Tomek algorithm to balance positive and negative samples and finally used categorical gradient boosting (CatBoost) to construct our hot spot prediction model for protein-DNA binding interfaces. Our method performs well on the test set, with AUC, MCC and F1 score values of 0.847, 0.543 and 0.772, respectively. After a comparative evaluation, EC-PDH outperforms the existing state-of-the-art methods in identifying hot spots.
- Published
- 2024
- Full Text
- View/download PDF
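The SMOTE half of the SMOTE-Tomek step above oversamples the minority class by interpolating between nearby minority points (the Tomek-link cleaning step is omitted here). A minimal sketch of that interpolation idea, with invented toy points:

```python
import random

def smote_like(minority, n_new, seed=0):
    """Generate synthetic minority samples by interpolating between a
    random minority point and its nearest minority neighbour."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        # Nearest neighbour of a among the other minority points.
        b = min((p for p in minority if p is not a),
                key=lambda p: sum((x - y) ** 2 for x, y in zip(a, p)))
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic
```

Because each synthetic point lies on a segment between two existing minority points, it stays inside the minority region rather than being a blind duplicate.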
6. Hybrid Anomaly Detection in Time Series by Combining Kalman Filters and Machine Learning Models.
- Author
-
Puder A, Zink M, Seidel L, and Sax E
- Subjects
- Humans, Machine Learning, Algorithms
- Abstract
Due to connectivity and automation trends, the medical device industry is experiencing increased demand for safety and security mechanisms. Anomaly detection has proven to be a valuable approach for ensuring safety and security in other industries, such as automotive or IT. Medical devices must operate across a wide range of values due to variations in patient anthropometric data, making anomaly detection based on a simple threshold for signal deviations impractical. For example, surgical robots directly contacting the patient's tissue require precise sensor data. However, since the deformation of the patient's body during interaction or movement is highly dependent on body mass, it is impossible to define a single threshold for implausible sensor data that applies to all patients. The same limitation applies to statistical methods, such as the Z-score, that rely on the standard deviation. Even pure machine learning algorithms cannot be expected to provide the required accuracy, simply due to the lack of available training data. This paper proposes using hybrid filters by combining dynamic system models based on expert knowledge and data-based models for anomaly detection in an operating room scenario. This approach can improve detection performance and explainability while reducing the computing resources needed on embedded devices, enabling a distributed approach to anomaly detection.
- Published
- 2024
- Full Text
- View/download PDF
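The model-based half of such a hybrid detector is often an innovation (residual) test on a Kalman filter: a measurement is suspicious when it deviates from the model's prediction by more than a few standard deviations. A minimal one-dimensional sketch under an assumed constant-state model (noise parameters and threshold are invented for illustration):

```python
def kalman_anomaly(measurements, q=1e-3, r=0.1, threshold=3.0):
    """Run a 1-D Kalman filter and flag a measurement as anomalous
    when its innovation exceeds `threshold` standard deviations."""
    x, p = measurements[0], 1.0   # state estimate and its variance
    flags = [False]
    for z in measurements[1:]:
        p += q                    # predict: constant-state model
        s = p + r                 # innovation variance
        residual = z - x
        flags.append(abs(residual) > threshold * s ** 0.5)
        k = p / s                 # Kalman gain
        x += k * residual         # correct state with the measurement
        p *= (1 - k)
    return flags
```

Unlike a fixed threshold on the raw signal, the test adapts to the filter's current uncertainty, which is the explainability benefit the abstract alludes to.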
7. Domain Adaptation Based on Semi-Supervised Cross-Domain Mean Discriminative Analysis and Kernel Transfer Extreme Learning Machine.
- Author
-
Li X and Ma J
- Subjects
- Learning, Acclimatization, Algorithms, Machine Learning
- Abstract
Good data feature representation and high-precision classifiers are the key steps for pattern recognition. However, when the data distributions of testing samples and training samples do not match, traditional feature extraction methods and classification models usually degrade. In this paper, we propose a domain adaptation approach to handle this problem. In our method, we first introduce cross-domain mean approximation (CDMA) into semi-supervised discriminative analysis (SDA) and design semi-supervised cross-domain mean discriminative analysis (SCDMDA) to extract shared features across domains. Secondly, a kernel extreme learning machine (KELM) is applied as a subsequent classifier for the classification task. Moreover, we add a cross-domain mean constraint term on the source domain into KELM and construct a kernel transfer extreme learning machine (KTELM) to further promote knowledge transfer. Finally, the experimental results from four real-world cross-domain visual datasets prove that the proposed method is more competitive than many other state-of-the-art methods.
- Published
- 2023
- Full Text
- View/download PDF
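The paper's CDMA term lives inside a discriminative-analysis objective; as a standalone intuition only, the mean-matching idea can be shown as shifting target-domain features so their per-feature mean coincides with the source-domain mean (toy data, not the paper's method):

```python
def align_by_mean_shift(source, target):
    """Shift target-domain feature vectors so their per-feature mean
    matches the source-domain mean (mean-approximation intuition)."""
    dims = len(source[0])
    src_mean = [sum(v[d] for v in source) / len(source) for d in range(dims)]
    tgt_mean = [sum(v[d] for v in target) / len(target) for d in range(dims)]
    shift = [s - t for s, t in zip(src_mean, tgt_mean)]
    # Apply the same shift to every target vector.
    return [tuple(x + dx for x, dx in zip(v, shift)) for v in target]
```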
8. Robust Virtual Sensing of the Vehicle Sideslip Angle through the Cross-Combination of Multiple Filters Using a Decision Tree Algorithm.
- Author
-
Atheupe GP, El Mrhasli Y, Emabou U, Monsuez B, Bordignon K, and Tapus A
- Subjects
- Reproducibility of Results, Decision Trees, Algorithms, Machine Learning
- Abstract
This paper presents a state-of-the-art estimation technique by cross-combining a number n of filters for high-precision, reliable and robust vehicle sideslip angle state estimation, over a full range of vehicle operations irrespective of the driving mission and disruptions that may occur in the system. A machine-learning algorithm based on decision trees connects several filters together to switch between them according to the driving context, ensuring the best possible state estimate for relatively small and large sideslip angle values. In conjunction with the above-mentioned aspects, a seamless transition between different vehicle models is attained by observing the key parameters characterizing the lateral motion of the vehicle. The tests conducted using a prototype vehicle on a snow-covered track confirm the effectiveness and reliability of the proposed approach.
- Published
- 2023
- Full Text
- View/download PDF
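The core idea above is a learned rule that switches between filters according to driving context. A toy dispatch sketch of that switching pattern (the estimator names and thresholds are invented for illustration; the paper learns the rule with a decision tree rather than hand-coding it):

```python
def pick_estimator(speed_mps, lateral_acc):
    """Hand-written stand-in for the learned decision tree: choose a
    sideslip estimator from the driving context."""
    if lateral_acc < 2.0:
        return "kinematic_filter"      # near-linear regime
    if speed_mps < 15.0:
        return "linear_tyre_filter"
    return "nonlinear_observer"        # large sideslip expected

def estimate_sideslip(context, estimators):
    """Dispatch to whichever filter the rule selects."""
    name = pick_estimator(*context)
    return name, estimators[name](*context)
```

The seamless-transition requirement in the abstract would additionally demand blending the estimates around switch points, which this sketch omits.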
9. Machine Learning-Based Anomaly Detection in NFV: A Comprehensive Survey.
- Author
-
Zehra S, Faseeha U, Syed HJ, Samad F, Ibrahim AO, Abulfaraj AW, and Nagmeldin W
- Subjects
- Problem Solving, Technology, Algorithms, Machine Learning
- Abstract
Network function virtualization (NFV) is a rapidly growing technology that enables the virtualization of traditional network hardware components, offering benefits such as cost reduction, increased flexibility, and efficient resource utilization. Moreover, NFV plays a crucial role in sensor and IoT networks by ensuring optimal resource usage and effective network management. However, adopting NFV in these networks also brings security challenges that must be promptly and effectively addressed. This survey paper focuses on exploring the security challenges associated with NFV. It proposes the utilization of anomaly detection techniques as a means to mitigate the potential risks of cyber attacks. The research evaluates the strengths and weaknesses of various machine learning-based algorithms for detecting network-based anomalies in NFV networks. By providing insights into the most efficient algorithm for timely and effective anomaly detection in NFV networks, this study aims to assist network administrators and security professionals in enhancing the security of NFV deployments, thus safeguarding the integrity and performance of sensors and IoT systems.
- Published
- 2023
- Full Text
- View/download PDF
10. A Federated Learning-Inspired Evolutionary Algorithm: Application to Glucose Prediction.
- Author
-
De Falco I, Della Cioppa A, Koutny T, Ubl M, Krcma M, Scafuri U, and Tarantino E
- Subjects
- Humans, Glucose, Knowledge, Privacy, Algorithms, Machine Learning
- Abstract
In this paper, we propose an innovative Federated Learning-inspired evolutionary framework. Its main novelty is that this is the first time that an Evolutionary Algorithm is employed on its own to directly perform Federated Learning activity. A further novelty resides in the fact that, differently from the other Federated Learning frameworks in the literature, ours can efficiently deal at the same time with two relevant issues in Machine Learning, i.e., data privacy and interpretability of the solutions. Our framework consists of a master/slave approach in which each slave contains local data, protecting sensitive private data, and exploits an evolutionary algorithm to generate prediction models. The master shares among the slaves the models learned locally on each slave. Sharing these local models results in global models. Because data privacy and interpretability are especially significant in the medical domain, the algorithm is tested to forecast future glucose values for diabetic patients by exploiting a Grammatical Evolution algorithm. The effectiveness of this knowledge-sharing process is assessed experimentally by comparing the proposed framework with another where no exchange of local models occurs. The results show that the performance of the proposed approach is better and demonstrate the validity of its sharing process for the emergence of local models for personal diabetes management, usable as efficient global models. When further subjects not involved in the learning process are considered, the models discovered by our framework show higher generalization capability than those achieved without knowledge sharing: the improvement provided by knowledge sharing is equal to about 3.03% for precision, 1.56% for recall, 3.17% for F1, and 1.56% for accuracy. Moreover, statistical analysis reveals the statistical superiority of model exchange with respect to the case of no exchange taking place.
- Published
- 2023
- Full Text
- View/download PDF
11. An Insight into the Machine-Learning-Based Fileless Malware Detection.
- Author
-
Khalid O, Ullah S, Ahmad T, Saeed S, Alabbad DA, Aslam M, Buriro A, and Ahmad R
- Subjects
- Random Forest, Support Vector Machine, Logistic Models, Machine Learning, Algorithms
- Abstract
In recent years, massive development in the malware industry changed the entire landscape of malware development. Cybercriminals have become more sophisticated, advancing their techniques from file-based to fileless malware. Whereas file-based malware depends on files to spread itself, fileless malware does not require a traditional file system and uses benign processes to carry out its malicious intent; it therefore evades conventional detection techniques and remains stealthy. This paper briefly explains fileless malware, its life cycle, and its infection chain. Moreover, it proposes a detection technique based on feature analysis using machine learning for fileless malware detection. The virtual machine acquired the memory dumps upon executing the malicious and non-malicious samples. The necessary features are then extracted using the Volatility memory forensics tool and analyzed using machine learning classification algorithms. The best algorithm is then selected based on the k-fold cross-validation score. Experimental evaluation has shown that Random Forest outperforms other machine learning classifiers (Decision Tree, Support Vector Machine, Logistic Regression, K-Nearest Neighbor, XGBoost, and Gradient Boosting). It achieved an overall accuracy of 93.33% with a True Positive Rate (TPR) of 87.5% at zero False Positive Rate (FPR) for fileless malware collected from five widely used datasets (VirusShare, AnyRun, PolySwarm, Hatching Triage, and Joe Sandbox).
- Published
- 2023
- Full Text
- View/download PDF
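The accuracy/TPR/FPR triple reported above comes straight from the confusion-matrix counts. A small helper showing those definitions for a binary detector (1 = malicious, 0 = benign):

```python
def detection_metrics(y_true, y_pred):
    """Accuracy, TPR and FPR from paired true/predicted binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    tpr = tp / (tp + fn) if tp + fn else 0.0   # recall on malware
    fpr = fp / (fp + tn) if fp + tn else 0.0   # benign flagged as malware
    return acc, tpr, fpr
```

A TPR of 87.5% at zero FPR, as reported, means some malware slips through but no benign process is ever flagged.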
12. A Comprehensive Predictive-Learning Framework for Optimal Scheduling and Control of Smart Home Appliances Based on User and Appliance Classification.
- Author
-
Shafqat W, Lee KT, and Kim DH
- Subjects
- Algorithms, Machine Learning
- Abstract
Energy consumption is increasing daily, and with that comes a continuous increase in energy costs. Predicting future energy consumption and building an effective energy management system for smart homes has become essential for many industrialists to solve the problem of energy wastage. Machine learning has shown significant outcomes in the field of energy management systems. This paper presents a comprehensive predictive-learning based framework for smart home energy management systems. We propose five modules: classification, prediction, optimization, scheduling, and controllers. In the classification module, we classify the category of users and appliances by using k-means clustering and support vector machine (SVM) classification. We predict the future energy consumption and energy cost for each user category using long short-term memory (LSTM) networks in the prediction module. We define objective functions for optimization and use grey wolf optimization and particle swarm optimization for scheduling appliances. For each case, we give priority to user preferences and indoor and outdoor environmental conditions. We define control rules to control the usage of appliances according to the schedule while prioritizing user preferences and minimizing energy consumption and cost. We perform experiments to evaluate the performance of our proposed methodology, and the results show that our proposed approach significantly reduces energy cost while providing an optimized solution for energy consumption that prioritizes user preferences and considers both indoor and outdoor environmental factors.
- Published
- 2022
- Full Text
- View/download PDF
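Particle swarm optimization, one of the two schedulers named above, can be sketched on a one-variable toy problem such as choosing the start hour that minimizes an energy-cost curve (the cost function and all constants here are invented; the paper's scheduler also handles user preferences and environmental constraints):

```python
import random

def pso_min(cost, lo, hi, particles=12, iters=80, seed=1):
    """Minimal particle swarm optimization over one variable."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(particles)]
    vs = [0.0] * particles
    pbest = xs[:]                       # each particle's best position
    gbest = min(xs, key=cost)           # swarm-wide best position
    for _ in range(iters):
        for i in range(particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity pulled toward personal and global bests.
            vs[i] = (0.6 * vs[i] + 1.5 * r1 * (pbest[i] - xs[i])
                     + 1.5 * r2 * (gbest - xs[i]))
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))
            if cost(xs[i]) < cost(pbest[i]):
                pbest[i] = xs[i]
            if cost(xs[i]) < cost(gbest):
                gbest = xs[i]
    return gbest
```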
13. RHSOFS: Feature Selection Using the Rock Hyrax Swarm Optimization Algorithm for Credit Card Fraud Detection System.
- Author
-
Padhi BK, Chakravarty S, Naik B, Pattanayak RM, and Das H
- Subjects
- Algorithms, Machine Learning
- Abstract
In recent years, detecting credit card fraud transactions has been a difficult task due to the high dimensions and imbalanced datasets. Selecting a subset of important features from a high-dimensional dataset has proven to be the most prominent approach for solving high-dimensional dataset issues, and the selection of features is critical for improving classification performance, such as the fraud transaction identification process. To contribute to the field, this paper proposes a novel feature selection (FS) approach based on a metaheuristic algorithm called Rock Hyrax Swarm Optimization Feature Selection (RHSOFS), inspired by the actions of rock hyrax swarms in nature, and implements supervised machine learning techniques to improve credit card fraud transaction identification approaches. This approach is used to select a subset of optimal relevant features from a high-dimensional dataset. In a comparative efficiency analysis, RHSOFS is compared with Differential Evolutionary Feature Selection (DEFS), Genetic Algorithm Feature Selection (GAFS), Particle Swarm Optimization Feature Selection (PSOFS), and Ant Colony Optimization Feature Selection (ACOFS). The proposed RHSOFS outperforms existing approaches, such as DEFS, GAFS, PSOFS, and ACOFS, according to the experimental results. Various statistical tests have been used to validate the statistical significance of the proposed model.
- Published
- 2022
- Full Text
- View/download PDF
14. Classification of Normal and Malicious Traffic Based on an Ensemble of Machine Learning for a Vehicle CAN-Network.
- Author
-
Alalwany E and Mahgoub I
- Subjects
- Reproducibility of Results, Machine Learning, Algorithms
- Abstract
Connectivity and automation have expanded with the development of autonomous vehicle technology. One of several automotive serial protocols that can be used in a wide range of vehicles is the controller area network (CAN). The growing functionality and connectivity of modern vehicles make them more vulnerable to cyberattacks aimed at vehicular networks. The CAN bus protocol is vulnerable to numerous attacks, as it lacks security mechanisms by design. It is crucial to design intrusion detection systems (IDS) with high accuracy to detect attacks on the CAN bus. In this paper, we design an effective machine learning-based IDS scheme for binary classification that utilizes eight supervised ML algorithms, along with ensemble classifiers. The scheme achieved a higher effectiveness score in detecting normal and abnormal activities when trained with normal and malicious CAN traffic datasets. Random Forest, Decision Tree, and eXtreme Gradient Boosting (XGBoost) classifiers provided the most accurate results. Then we evaluated three ensemble methods, voting, stacking, and bagging, for this classification task. The ensemble classifiers achieved better accuracy than the individual models, since ensemble learning strategies have superior performance through a combination of multiple learning mechanisms. These mechanisms have a varied range of capabilities that improve the prediction reliability while lowering the possibility of classification errors. Our model outperformed the most recent study that used the same dataset, with an accuracy of 0.984.
- Published
- 2022
- Full Text
- View/download PDF
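The voting ensemble evaluated above is the simplest of the three: each base classifier labels the traffic sample and the majority label wins. A minimal hard-voting sketch (classifier functions and labels are placeholders for illustration):

```python
from collections import Counter

def majority_vote(classifiers, sample):
    """Hard-voting ensemble: each classifier labels the sample
    ('normal' or 'attack') and the most common label wins."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]
```

Stacking and bagging differ in how the base predictions are combined and trained, but the dispatch shape is the same.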
15. Developing an Improved Ensemble Learning Approach for Predictive Maintenance in the Textile Manufacturing Process.
- Author
-
Hung YH
- Subjects
- Data Science, Cloud Computing, Automation, Machine Learning, Algorithms
- Abstract
With the rapid development of digital transformation, paper forms are digitalized as electronic forms (e-Forms). Existing data can be applied in predictive maintenance (PdM) to enable intelligent, automated manufacturing. This study aims to enhance the utilization of collected e-Form data through machine learning approaches and cloud computing to predict and provide maintenance actions. The ensemble learning approach (ELA) requires less computation time and has a simple hardware requirement; it is suitable for processing e-Form data with specific attributes. This study proposed an improved ELA to predict the defective class of product data from a manufacturing site's work order form. This study also proposed a resource-dispatching approach to pair data with the corresponding emailing resource for automatic notification. This study's novelty is the integration of cloud computing and an improved ELA for PdM to assist the textile product manufacturing process. The data analytics results show that the improved ensemble learning algorithm has over 98% accuracy and precision for defective product prediction. The validation results of the dispatching approach show that data can be correctly transmitted in a timely manner to the corresponding resource, along with a notification being sent to users.
- Published
- 2022
- Full Text
- View/download PDF
16. A Review on Machine Learning Applications for Solar Plants.
- Author
-
Engel E and Engel N
- Subjects
- Reproducibility of Results, Sunlight, Nonlinear Dynamics, Machine Learning, Algorithms
- Abstract
A solar plant system has complex nonlinear dynamics with uncertainties due to variations in system parameters and insolation. It is therefore difficult to approximate these complex dynamics with conventional algorithms, whereas Machine Learning (ML) methods yield the required performance. ML models are key units in recent sensor systems for solar plant design, forecasting, maintenance, and control to provide the best safety, reliability, robustness, and performance as compared to classical methods which are usually employed in the hardware and software of solar plants. Considering this, the goal of our paper is to explore and analyze ML technologies and their advantages and shortcomings as compared to classical methods for the design, forecasting, maintenance, and control of solar plants. In contrast with other review articles, our research briefly summarizes our intelligent, self-adaptive models for sizing, forecasting, maintenance, and control of a solar plant; sets benchmarks for performance comparison of the reviewed ML models for a solar plant's system; proposes a simple but effective integration scheme of an ML sensor solar plant system's implementation and outlines its future digital transformation into a smart solar plant based on the integrated cutting-edge technologies; and estimates the impact of ML technologies based on the proposed scheme on a solar plant value chain.
- Published
- 2022
- Full Text
- View/download PDF
17. Machine Learning and Swarm Optimization Algorithm in Temperature Compensation of Pressure Sensors.
- Author
-
Wang H and Li J
- Subjects
- Temperature, Algorithms, Machine Learning
- Abstract
The main temperature compensation method for MEMS piezoresistive pressure sensors is software compensation, which processes the sensor data using various algorithms to improve the output accuracy. However, there are few algorithms designed for sensors with specific ranges, most of which ignore the operating characteristics of the sensors themselves. In this paper, we propose three temperature compensation methods based on swarm optimization algorithms fused with machine learning for three different ranges of sensors and explore the partitioning ratio of the calibration dataset on Sensor A. The results show that different algorithms are suitable for pressure sensors of different ranges. An optimal compensation effect was achieved on Sensor A when the splitting ratio was 33.3%, where the zero-drift coefficient was 2.88 × 10⁻⁷/°C and the sensitivity temperature coefficient was 4.52 × 10⁻⁶/°C. The algorithms were compared with other algorithms in the literature to verify their superiority. The optimal segmentation ratio obtained from the experimental investigation is consistent with the sensor's operating temperature interval.
- Published
- 2022
- Full Text
- View/download PDF
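The baseline form of software temperature compensation is fitting the thermal drift of the raw output and subtracting it. A minimal least-squares sketch of a linear drift model (far simpler than the swarm-optimized ML compensation in the paper; data values are invented):

```python
def fit_linear_compensation(temps, raw_outputs):
    """Fit raw = a*T + b by least squares; compensation subtracts the
    fitted drift and restores the mean output level."""
    n = len(temps)
    mt = sum(temps) / n
    mo = sum(raw_outputs) / n
    a = (sum((t - mt) * (o - mo) for t, o in zip(temps, raw_outputs))
         / sum((t - mt) ** 2 for t in temps))
    b = mo - a * mt
    def compensate(t, raw):
        return raw - (a * t + b) + mo   # remove drift, keep mean level
    return a, b, compensate
```

After compensation, readings taken at different temperatures of the same true pressure coincide, which is exactly what the zero-drift coefficient quantifies.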
18. A Survey on Medical Explainable AI (XAI): Recent Progress, Explainability Approach, Human Interaction and Scoring System.
- Author
-
Sheu RK and Pardeshi MS
- Subjects
- Humans, Machine Learning, Algorithms
- Abstract
The emerging field of eXplainable AI (XAI) in the medical domain is considered to be of utmost importance. Incorporating explanations that satisfy legal and ethical AI requirements is necessary to understand detailed decisions, results, and the current status of a patient's condition. We present a detailed survey of medical XAI covering model enhancements, evaluation methods, a significant overview of case studies with open-box architectures, medical open datasets, and future improvements. Differences between AI and XAI methods are discussed, with recent XAI methods grouped as (i) local and global methods for preprocessing, (ii) knowledge-base and distillation algorithms, and (iii) interpretable machine learning. XAI characteristics and future healthcare explainability are covered prominently, while the prerequisites provide insights for brainstorming sessions before beginning a medical XAI project. A practical case study illustrates recent XAI progress and the advanced developments within the medical field. Ultimately, this survey proposes critical ideas surrounding a user-in-the-loop approach, with an emphasis on human-machine collaboration, to better produce explainable solutions. The description of an XAI feedback system for human rating-based evaluation offers a constructive method to produce human-enforced explanation feedback. Longstanding XAI limitations around ratings, scores, and grading persist; therefore, a novel XAI recommendation system and XAI scoring system are designed in this work. Additionally, this paper underscores the importance of implementing explainable solutions in the high-impact medical field.
- Published
- 2022
- Full Text
- View/download PDF
19. A Systematic Review of Time Series Classification Techniques Used in Biomedical Applications.
- Author
-
Wang WK, Chen I, Hershkovich L, Yang J, Shetty A, Singh G, Jiang Y, Kotla A, Shang JZ, Yerrabelli R, Roghanizad AR, Shandhi MMH, and Dunn J
- Subjects
- Humans, Smartphone, Time Factors, Algorithms, Machine Learning
- Abstract
Background: Digital clinical measures collected via various digital sensing technologies such as smartphones, smartwatches, wearables, and ingestible and implantable sensors are increasingly used by individuals and clinicians to capture the health outcomes or behavioral and physiological characteristics of individuals. Time series classification (TSC) is very commonly used for modeling digital clinical measures. While deep learning models for TSC are very common and powerful, there exist some fundamental challenges. This review presents the non-deep learning models that are commonly used for time series classification in biomedical applications that can achieve high performance. Objective: We performed a systematic review to characterize the techniques that are used in time series classification of digital clinical measures throughout all the stages of data processing and model building. Methods: We conducted a literature search on PubMed, as well as the Institute of Electrical and Electronics Engineers (IEEE), Web of Science, and SCOPUS databases using a range of search terms to retrieve peer-reviewed articles that report on the academic research about digital clinical measures from a five-year period between June 2016 and June 2021. We identified and categorized the research studies based on the types of classification algorithms and sensor input types. Results: We found 452 papers in total from four different databases: PubMed, IEEE, Web of Science Database, and SCOPUS. After removing duplicates and irrelevant papers, 135 articles remained for detailed review and data extraction. Among these, engineered features using time series methods that were subsequently fed into widely used machine learning classifiers were the most commonly used technique, and also most frequently achieved the best performance metrics (77 out of 135 articles). 
Statistical modeling algorithms (24 out of 135 articles) were the second most common and the second-best-performing classification technique. Conclusions: In this review, time series classification models and interpretation methods for biomedical applications are summarized and categorized. While high time series classification performance has been achieved on digital clinical, physiological, or biomedical measures, no standard benchmark datasets, modeling methods, or reporting methodology exist. There is no single widely used method for time series model development or feature interpretation; however, many different methods have proven successful.
- Published
- 2022
- Full Text
- View/download PDF
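The review's most common and best-performing recipe, engineered time-series features fed into a classic machine learning classifier, can be sketched in a few lines. The feature set and the nearest-centroid classifier below are illustrative stand-ins, not taken from any of the surveyed papers:

```python
import math

def extract_features(window):
    """Engineered time-series features of the kind the review found most common."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    # Zero-crossing count around the mean: a cheap waveform-shape descriptor.
    crossings = sum(
        1 for a, b in zip(window, window[1:])
        if (a - mean) * (b - mean) < 0
    )
    return [mean, math.sqrt(var), min(window), max(window), crossings]

def nearest_centroid_fit(X, y):
    """Average the feature vectors per class (a stand-in for any classic classifier)."""
    grouped = {}
    for features, label in zip(X, y):
        grouped.setdefault(label, []).append(features)
    return {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in grouped.items()
    }

def nearest_centroid_predict(centroids, features):
    """Assign the class whose centroid is closest in feature space."""
    return min(
        centroids,
        key=lambda label: sum((f - c) ** 2 for f, c in zip(features, centroids[label])),
    )
```

The zero-crossing feature alone separates a flat signal from an oscillating one, which is why such hand-engineered descriptors remain competitive with deep models on small biomedical datasets.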
20. Modeling DTA by Combining Multiple-Instance Learning with a Private-Public Mechanism.
- Author
-
Wang C, Chen Y, Zhao L, Wang J, and Wen N
- Subjects
- Drug Development, Drug Discovery, Proteins, Algorithms, Machine Learning
- Abstract
The prediction of the strengths of drug-target interactions, also called drug-target binding affinities (DTA), plays a fundamental role in facilitating drug discovery, where the goal is to find prospective drug candidates. With the increase in the number of drug-protein interactions, machine learning techniques, especially deep learning methods, have become applicable for drug-target interaction discovery because they significantly reduce the required experimental workload. In this paper, we present a natural formulation of the DTA prediction problem as an instance of multiple-instance learning. We address the problem in three stages, first organizing the given drug and target sequences into instances via a private-public mechanism, then computing the predicted scores of all instances in the same bag, and finally combining all the predicted scores into the output prediction. A comprehensive evaluation demonstrates that the proposed method outperforms other state-of-the-art methods on three benchmark datasets.
- Published
- 2022
- Full Text
- View/download PDF
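The three-stage bag formulation can be illustrated with a toy multiple-instance pipeline. The n-gram overlap scorer below is a hypothetical stand-in for the learned per-instance model; only the structure (subsequences grouped into a bag, per-instance scores combined into one DTA prediction) mirrors the abstract:

```python
def score_instance(drug_sub, target_sub):
    """Hypothetical instance scorer: Jaccard overlap of character bigrams
    between a drug subsequence and a target subsequence (a stand-in for
    the learned scoring model described in the paper)."""
    def bigrams(s):
        return {s[i:i + 2] for i in range(len(s) - 1)}
    d, t = bigrams(drug_sub), bigrams(target_sub)
    return len(d & t) / max(1, len(d | t))

def predict_bag(drug_instances, target_instances, combine=max):
    """Score every (drug, target) instance pair in the bag, then combine
    all instance scores into a single affinity prediction."""
    scores = [score_instance(d, t) for d in drug_instances for t in target_instances]
    return combine(scores)
```

Swapping `combine=max` for a mean or a learned pooling function changes how much a single strong instance dominates the bag-level prediction.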
21. A Probability-Based Models Ranking Approach: An Alternative Method of Machine-Learning Model Performance Assessment.
- Author
-
Gajda S and Chlebus M
- Subjects
- Logistic Models, Algorithms, Machine Learning
- Abstract
Performance measures are crucial in selecting the best machine learning model for a given problem. Estimating classical model performance measures by subsampling methods like bagging or cross-validation has several weaknesses. The most important ones are the inability to test the significance of the difference and the lack of interpretability. The recently proposed Elo-based Predictive Power (EPP), a meta-measure of machine learning model performance, is an attempt to address these weaknesses. However, the EPP is based on wrong assumptions, so its estimates may not be correct. This paper introduces the Probability-based Models Ranking Approach (PMRA), a modified EPP approach with a correction that makes its estimates more reliable. PMRA is based on calculating the probability that one model achieves a better result than another, using a Mixed Effects Logistic Regression model. The empirical analysis was carried out on a real mortgage credits dataset. The analysis included a comparison of how PMRA and state-of-the-art k-fold cross-validation ranked 49 machine learning models, an example application of the novel method to a hyperparameter tuning problem, and a comparison of PMRA and EPP indications. PMRA makes it possible to compare a newly developed algorithm to state-of-the-art algorithms on statistical criteria, to select the best hyperparameter configuration, and to formulate criteria for continuing the search of the hyperparameter space.
- Published
- 2022
- Full Text
- View/download PDF
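The core idea, ranking models by the probability that one outperforms another across paired evaluations, can be sketched with a simple empirical win-probability. The paper estimates these probabilities with a Mixed Effects Logistic Regression, which is omitted here; this is a simplified stand-in:

```python
from itertools import combinations

def win_probability(scores_a, scores_b):
    """Empirical P(model A scores higher than model B) across paired folds;
    ties count as half a win."""
    wins = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    ties = sum(1 for a, b in zip(scores_a, scores_b) if a == b)
    return (wins + 0.5 * ties) / len(scores_a)

def rank_models(fold_scores):
    """fold_scores: {model_name: [score per fold]}.
    Rank models by their mean pairwise win probability."""
    names = list(fold_scores)
    avg = {n: [] for n in names}
    for a, b in combinations(names, 2):
        p = win_probability(fold_scores[a], fold_scores[b])
        avg[a].append(p)
        avg[b].append(1 - p)
    return sorted(names, key=lambda n: -sum(avg[n]) / len(avg[n]))
```

Unlike a plain mean of cross-validation scores, the pairwise formulation keeps the fold-by-fold pairing, which is what makes significance statements about "model A beats model B" possible.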
22. Hyper-Parameter Optimization of Stacked Asymmetric Auto-Encoders for Automatic Personality Traits Perception.
- Author
-
Jalaeian Zaferani E, Teshnehlab M, Khodadadian A, Heitzinger C, Vali M, Noii N, and Wick T
- Subjects
- Perception, Personality, Probability, Algorithms, Machine Learning
- Abstract
In this work, a method for automatic hyper-parameter tuning of the stacked asymmetric auto-encoder is proposed. In previous work, the ability of deep learning to extract personality perception from speech was shown, but hyper-parameter tuning was attained by trial and error, which is time-consuming and requires machine learning knowledge. Therefore, obtaining hyper-parameter values is challenging and places limits on deep learning usage. To address this challenge, researchers have applied optimization methods. Although there were successes, the search space is very large due to the large number of deep learning hyper-parameters, which increases the probability of getting stuck in local optima. Researchers have also focused on improving global optimization methods. In this regard, we suggest a novel global optimization method based on the cultural algorithm, the multi-island concept, and parallelism to search this large space efficiently. At first, we evaluated our method on three well-known optimization benchmarks and compared the results with recently published papers. The results indicate that the convergence of the proposed method speeds up thanks to its ability to escape from local optima, and the precision of the results improves dramatically. Afterward, we applied our method to optimize five hyper-parameters of an asymmetric auto-encoder for automatic personality perception. Since inappropriate hyper-parameters lead the network to over-fitting or under-fitting, we used a novel cost function to prevent both. As observed, the unweighted average recall (accuracy) was improved by 6.52% (9.54%) compared to our previous work and had remarkable outcomes compared to other published personality perception works.
- Published
- 2022
- Full Text
- View/download PDF
23. Towards Trustworthy Energy Disaggregation: A Review of Challenges, Methods, and Perspectives for Non-Intrusive Load Monitoring.
- Author
-
Kaselimi M, Protopapadakis E, Voulodimos A, Doulamis N, and Doulamis A
- Subjects
- Physical Phenomena, Reproducibility of Results, Signal Processing, Computer-Assisted, Algorithms, Machine Learning
- Abstract
Non-intrusive load monitoring (NILM) is the task of disaggregating the total power consumption into its individual sub-components. Over the years, signal processing and machine learning algorithms have been combined to achieve this. Many publications and extensive research efforts have been devoted to energy disaggregation, or NILM, to bring state-of-the-art methods to the desired performance. The initial interest of the scientific community in formulating and mathematically describing the NILM problem using machine learning tools has now shifted toward a more practical NILM. We are currently in the mature NILM period, in which there is an attempt to apply NILM in real-life application scenarios. Thus, the complexity of the algorithms, transferability, reliability, practicality, and, in general, trustworthiness are the main issues of interest. This review narrows the gap between the early, immature NILM era and the mature one. In particular, the paper provides a comprehensive literature review of NILM methods for residential appliances only. The paper analyzes, summarizes, and presents the outcomes of a large number of recently published scholarly articles. Furthermore, the paper discusses the highlights of these methods and introduces the research dilemmas that researchers should take into consideration when applying NILM methods. Finally, we show the need to transfer traditional disaggregation models into a practical and trustworthy framework.
- Published
- 2022
- Full Text
- View/download PDF
24. CTTGAN: Traffic Data Synthesizing Scheme Based on Conditional GAN.
- Author
-
Wang J, Yan X, Liu L, Li L, and Yu Y
- Subjects
- Algorithms, Machine Learning
- Abstract
Most machine learning algorithms achieve a good recognition rate only on balanced datasets. However, in the field of malicious traffic identification, benign traffic on the network far exceeds malicious traffic, and network traffic datasets are imbalanced, which leaves algorithms with a low identification rate for small categories of malicious traffic samples. This paper presents a traffic sample synthesizing model named Conditional Tabular Traffic Generative Adversarial Network (CTTGAN), which uses the Conditional Tabular Generative Adversarial Network (CTGAN) algorithm to expand the small-category traffic samples and balance the dataset in order to improve the malicious traffic identification rate. The CTTGAN model expands and recognizes feature data, which meets the requirements of a machine learning algorithm for training and prediction data. The contributions of this paper are as follows: first, the small-category samples are expanded and the traffic dataset is balanced; second, the storage cost and computational complexity are reduced compared to models using image data; third, discrete variables and continuous variables in traffic feature data are processed at the same time, and the data distribution is described well. The experimental results show that the recognition rate of the expanded samples is more than 0.99 with the MLP, KNN, and SVM algorithms. In addition, the recognition rate of the proposed CTTGAN model is better than that of oversampling and undersampling schemes.
- Published
- 2022
- Full Text
- View/download PDF
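As a point of reference for what CTTGAN is compared against, here is the naive random-oversampling baseline for balancing a minority class. CTGAN itself, which synthesizes new tabular samples rather than duplicating existing ones, is beyond a short sketch:

```python
import random
from collections import Counter

def oversample_minority(rows, labels, seed=0):
    """Naive random oversampling: duplicate minority-class samples at random
    until every class matches the majority-class count."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels
```

Duplicating samples balances class counts but adds no new information, which is precisely the shortcoming a generative model like CTGAN is meant to address.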
25. Online Domain Adaptation for Rolling Bearings Fault Diagnosis with Imbalanced Cross-Domain Data.
- Author
-
Chao KC, Chou CB, and Lee CH
- Subjects
- Acclimatization, Information Storage and Retrieval, Algorithms, Machine Learning
- Abstract
Traditional machine learning methods rely on the training data and target data having the same feature space and data distribution. The performance may be unacceptable if there is a difference in data distribution between the training and target data, which is called the cross-domain learning problem. In recent years, many domain adaptation methods have been proposed to solve this kind of problem, and much progress has been made. However, existing domain adaptation approaches share a common assumption that the amount of data in the source domain (labeled data) and the target domain (unlabeled data) is matched. In this paper, the scenario of a real manufacturing site is considered, in which target-domain data are much scarcer than source-domain data at the beginning, but the amount of target-domain data increases as time goes by. A novel method is proposed for fault diagnosis of rolling bearings with online imbalanced cross-domain data. The proposed method, tested on the bearing dataset CWRU, achieved a prediction accuracy of 95.89% with only 40 target samples. The results have been compared with those of other traditional methods; the comparisons show that the proposed online domain adaptation fault diagnosis method achieves significant improvements. In addition, a deep transfer learning model based on the adaptive-network-based fuzzy inference system (ANFIS) is introduced to interpret the results.
- Published
- 2022
- Full Text
- View/download PDF
26. Few-Shot Text Classification with Global-Local Feature Information.
- Author
-
Wang D, Wang Z, Cheng L, and Zhang W
- Subjects
- Vocabulary, Algorithms, Machine Learning
- Abstract
Meta-learning frameworks have been proposed to generalize machine learning models for domain adaptation without sufficient labeled data in computer vision. However, text classification with meta-learning is less investigated. In this paper, we propose SumFS, which finds global top-ranked sentences by extractive summarization and improves the local vocabulary category features. SumFS consists of three modules: (1) an unsupervised text summarizer that removes redundant information; (2) a weighting generator that associates feature words with attention scores to weight the lexical representations of words; (3) a regular meta-learning framework that trains with limited labeled data using a ridge regression classifier. In addition, a marine news dataset with limited labeled data was established. The performance of the algorithm was tested on the THUCnews, Fudan, and marine news datasets. Experiments show that SumFS can maintain or even improve accuracy while reducing input features. Moreover, the training time of each epoch is reduced by more than 50%.
- Published
- 2022
- Full Text
- View/download PDF
27. Integration of Digital Twin, Machine-Learning and Industry 4.0 Tools for Anomaly Detection: An Application to a Food Plant.
- Author
-
Tancredi GP, Vignali G, and Bottani E
- Subjects
- Humans, Plants, Edible, Algorithms, Machine Learning
- Abstract
This work describes a structured solution that integrates digital twin models, machine-learning algorithms, and Industry 4.0 technologies (Internet of Things in particular) with the ultimate aim of detecting the presence of anomalies in the functioning of industrial systems. The proposed solution has been designed to be suitable for implementation in industrial plants not directly designed for Industry 4.0 applications. More precisely, this manuscript delineates an approach for implementing three machine-learning algorithms into a digital twin environment and then applying them to a real plant. This paper is based on two previous studies in which the digital twin environment was first developed for the industrial plant under investigation, and then used for monitoring selected plant parameters. Findings from the previous studies are exploited in this work and advanced by implementing and testing the machine-learning algorithms. The results show that two out of the three machine-learning algorithms are effective enough in predicting anomalies, thus suggesting their implementation for enhancing the safety of employees working at industrial plants.
- Published
- 2022
- Full Text
- View/download PDF
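A minimal anomaly detector of the kind that could sit inside such a digital twin pipeline is a z-score threshold over a sensor stream. This is an illustrative baseline, not one of the three machine-learning algorithms the paper tested:

```python
def zscore_anomalies(readings, threshold=3.0):
    """Flag indices of readings further than `threshold` standard deviations
    from the mean of the stream (a minimal anomaly-detection baseline)."""
    n = len(readings)
    mean = sum(readings) / n
    std = (sum((x - mean) ** 2 for x in readings) / n) ** 0.5
    if std == 0:
        return []  # a perfectly constant stream has no outliers
    return [i for i, x in enumerate(readings) if abs(x - mean) / std > threshold]
```

In an IoT setting the same check would run over a sliding window of recent readings rather than the full history, so the threshold adapts to drift in normal operation.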
28. Exploiting Concepts of Instance Segmentation to Boost Detection in Challenging Environments.
- Author
-
Hashmi KA, Pagani A, Liwicki M, Stricker D, and Afzal MZ
- Subjects
- Face, Algorithms, Machine Learning
- Abstract
In recent years, due to advancements in machine learning, object detection has become a mainstream task in the computer vision domain. The first phase of object detection is to find the regions where objects can exist. With the improvements in deep learning, traditional approaches, such as sliding windows and manual feature selection techniques, have been replaced with deep learning techniques. However, like any other vision task, object detection struggles in low light, challenging weather, and crowded scenes; such an environment is termed a challenging environment. This paper exploits pixel-level information to improve detection under challenging situations. To this end, we exploit the recently proposed hybrid task cascade network, which works collaboratively with detection and segmentation heads at different cascade levels. We evaluate the proposed methods on three complex datasets, ExDark, CURE-TSD, and RESIDE, and achieve a mAP of 0.71, 0.52, and 0.43, respectively. Our experimental results assert the efficacy of the proposed approach.
- Published
- 2022
- Full Text
- View/download PDF
29. Feature Optimization Method of Material Identification for Loose Particles Inside Sealed Relays.
- Author
-
Sun Z, Jiang A, Wang G, Zhang M, and Yan H
- Subjects
- Reproducibility of Results, Algorithms, Machine Learning
- Abstract
Existing material identification for loose particles inside sealed relays focuses on the selection and optimization of classification algorithms, which ignores the features in the material dataset. In this paper, we propose a feature optimization method of material identification for loose particles inside sealed relays. First, for the missing-value problem, multiple methods were used to process the material dataset. By comparing the identification accuracy achieved by a Random-Forest-based classifier (RF classifier) on the differently processed datasets, the direct-discarding method was found to be optimal. Second, for the uneven data distribution problem, multiple methods were used to process the material dataset. By comparing the achieved identification accuracy, the min-max standardization method was found to be optimal. Then, for the feature selection problem, an innovative multi-index-fusion feature selection method was designed, and its superiority was verified through several tests. Test results show that the identification accuracy achieved by the RF classifier on the dataset improved from 59.63% to 63.60%. Test results on ten material verification datasets show that the identification accuracies achieved by the RF classifier were greatly improved, with an average improvement of 3.01%. This strongly promotes research progress in loose particle material identification and is an important supplement to existing loose particle detection research. It is also the highest loose particle material identification accuracy achieved to date in aerospace engineering, which has important practical value for improving the reliability of aerospace systems. Theoretically, the method can also be applied to feature optimization in machine learning.
- Published
- 2022
- Full Text
- View/download PDF
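The two preprocessing choices the paper found optimal, direct discarding of samples with missing values and min-max standardization, are straightforward to sketch (the data layout below, rows of numeric features with `None` marking a missing value, is an assumption for illustration):

```python
def discard_missing(rows):
    """The 'direct-discarding' strategy: drop any sample containing a
    missing value (represented here as None)."""
    return [r for r in rows if None not in r]

def min_max_scale(rows):
    """Min-max standardization: rescale every column into [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(x - l) / (h - l) if h != l else 0.0 for x, l, h in zip(r, lo, hi)]
        for r in rows
    ]
```

Discarding must happen before scaling: a `None` would break the column minima and maxima, and constant columns are mapped to 0.0 to avoid division by zero.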
30. A Review on Federated Learning and Machine Learning Approaches: Categorization, Application Areas, and Blockchain Technology.
- Author
-
Ogundokun, Roseline Oluwaseun, Misra, Sanjay, Maskeliunas, Rytis, and Damasevicius, Robertas
- Subjects
BLOCKCHAINS ,ARTIFICIAL intelligence ,MACHINE learning ,CONFERENCE papers ,ALGORITHMS ,SCIENCE publishing - Abstract
Federated learning (FL) is a scheme in which several consumers work collectively to unravel machine learning (ML) problems, with a dominant collector synchronizing the procedure. This design also enables the training data to remain distributed, guaranteeing that each individual device's data are kept private. The paper systematically reviewed the available literature using the Preferred Reporting Items for Systematic Review and Meta-analysis (PRISMA) guiding principle. The study presents a systematic review of applicable ML approaches for FL, reviews the categorization of FL, discusses the FL application areas, presents the relationship between FL and Blockchain Technology (BT), and discusses some existing literature that has used FL and ML approaches. The study also examined applicable machine learning models for federated learning. The inclusion criteria were papers that were (i) published between 2017 and 2021, (ii) written in English, (iii) published in a peer-reviewed scientific journal, or (iv) published as preprints. Excluded from the review were (i) unpublished studies, theses, and dissertations, (ii) conference papers, (iii) papers not in English, and (iv) papers that did not use artificial intelligence models and blockchain technology. In total, 84 eligible papers were finally examined in this study. In recent years, the amount of research on ML using FL has increased. Accuracy equivalent to standard feature-based techniques has been attained, and ensembles of many algorithms may yield even better results. We discovered that the best results were obtained from the hybrid design of an ML ensemble employing expert features. However, some additional difficulties and issues need to be overcome, such as efficiency, complexity, and smaller datasets. In addition, novel FL applications should be investigated from the standpoint of the datasets and methodologies. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
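The canonical aggregation step in FL, federated averaging (FedAvg), illustrates the "dominant collector synchronizing the procedure" described above. The weighting by local dataset size follows the standard FedAvg formulation, not any single paper in this review:

```python
def federated_average(client_weights, client_sizes):
    """One FedAvg aggregation round: the central collector averages client
    model weights, weighting each client by its local dataset size.
    Each client's weights are a flat list of parameters."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * s for w, s in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]
```

Only the weights travel to the collector; raw training data never leave the devices, which is the privacy property the abstract emphasizes.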
31. Zero-Day Malware Detection and Effective Malware Analysis Using Shapley Ensemble Boosting and Bagging Approach.
- Author
-
Kumar R and Subbiah G
- Subjects
- Computer Security, Data Collection, Software, Algorithms, Machine Learning
- Abstract
Software products from all vendors have vulnerabilities that can cause a security concern. Malware is used as a prime exploitation tool to exploit these vulnerabilities. Machine learning (ML) methods are efficient in detecting malware and are state-of-the-art. The effectiveness of ML models can be augmented by reducing false negatives and false positives. In this paper, the performance of bagging and boosting machine learning models is enhanced by reducing misclassification. Shapley values of features are a true representation of the amount each feature contributes to a prediction and help detect the top features for any prediction by an ML model. Shapley values are transformed to a probability scale so that they correlate with the prediction value of the ML model and reveal the top features for any prediction by a trained ML model. The trend of top features derived from false-negative and false-positive predictions by a trained ML model can be used to make inductive rules. In this work, the best-performing ML model among bagging and boosting approaches is determined by the accuracy and confusion matrix on three malware datasets from three different periods. The best-performing model is then used to make effective inductive rules using waterfall plots based on the probability scale of features. This work helps improve cyber security scenarios through effective detection of false-negative zero-day malware.
- Published
- 2022
- Full Text
- View/download PDF
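The probability-scale transform described above can be sketched as follows, assuming the model outputs log-odds (as boosted trees typically do) and that SHAP-style contributions sum with a base value. The function and feature names are illustrative, not the authors' code:

```python
import math

def to_probability_scale(base_value, shap_values):
    """Convert a log-odds prediction (base value plus per-feature
    contributions, SHAP-style) to the probability scale via the sigmoid."""
    log_odds = base_value + sum(shap_values)
    return 1 / (1 + math.exp(-log_odds))

def top_features(feature_names, shap_values, k=3):
    """Rank features by the magnitude of their contribution to one prediction,
    as a waterfall plot does."""
    ranked = sorted(zip(feature_names, shap_values), key=lambda p: -abs(p[1]))
    return ranked[:k]
```

Inspecting `top_features` on misclassified samples is the kind of per-prediction analysis from which the paper's inductive rules are derived.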
32. A Systematic Literature Review on Distributed Machine Learning in Edge Computing.
- Author
-
Filho CP, Marques E Jr, Chang V, Dos Santos L, Bernardini F, Pires PF, Ochi L, and Delicato FC
- Subjects
- Intelligence, Publications, Algorithms, Machine Learning
- Abstract
Distributed edge intelligence is a disruptive research area that enables the execution of machine learning and deep learning (ML/DL) algorithms close to where data are generated. Since edge devices are more limited and heterogeneous than typical cloud devices, many hindrances have to be overcome to fully extract the potential benefits of such an approach (such as data-in-motion analytics). In this paper, we investigate the challenges of running ML/DL on edge devices in a distributed way, paying special attention to how techniques are adapted or designed to execute on these restricted devices. The techniques under discussion pervade the processes of caching, training, inference, and offloading on edge devices. We also explore the benefits and drawbacks of these strategies.
- Published
- 2022
- Full Text
- View/download PDF
33. MFDroid: A Stacking Ensemble Learning Framework for Android Malware Detection.
- Author
-
Wang X, Zhang L, Zhao K, Ding X, and Yu M
- Subjects
- Privacy, Software, Algorithms, Machine Learning
- Abstract
As Android is a popular mobile operating system, Android malware is on the rise, which poses a great threat to user privacy and security. Considering the poor detection effects of single feature selection algorithms and the low detection efficiency of traditional machine learning methods, we propose an Android malware detection framework based on stacking ensemble learning, MFDroid, to identify Android malware. In this paper, we used seven feature selection algorithms to select permissions, API calls, and opcodes, and then merged the results of each feature selection algorithm to obtain a new feature set. Subsequently, we used this set to train the base learners, and set logistic regression as the meta-classifier, to learn the implicit information from the output of the base learners and obtain the classification results. After evaluation, the F1-score of MFDroid reached 96.0%. Finally, we analyzed each type of feature to identify the differences between malicious and benign applications. At the end of this paper, we present some general conclusions. In recent years, malicious applications and benign applications have become similar in terms of permission requests; in other words, a model trained only on permissions can no longer effectively or efficiently distinguish malicious applications from benign ones.
- Published
- 2022
- Full Text
- View/download PDF
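The merging of several feature-selection outputs into one feature set can be sketched as a simple union. The abstract does not specify MFDroid's merge rule, so the union here is an assumption, and the permission/API names are made up for illustration:

```python
def merge_selected_features(selections):
    """Merge the outputs of several feature-selection algorithms into a
    single feature set, here by taking their union (one possible merge rule)."""
    merged = set()
    for selected in selections:
        merged |= set(selected)
    return sorted(merged)
```

An intersection or a vote-count threshold (keep a feature only if k of the seven selectors chose it) would be stricter alternatives to the union shown here.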
34. A Novel Framework for Generating Personalized Network Datasets for NIDS Based on Traffic Aggregation.
- Author
-
Velarde-Alvarado P, Gonzalez H, Martínez-Peláez R, Mena LJ, Ochoa-Brust A, Moreno-García E, Félix VG, and Ostos R
- Subjects
- Research Design, Algorithms, Machine Learning
- Abstract
In this paper, we addressed the problem of dataset scarcity for the task of network intrusion detection. Our main contribution was to develop a framework that provides a complete process for generating network traffic datasets based on the aggregation of real network traces. In addition, we proposed a set of tools for attribute extraction and labeling of traffic sessions. A new dataset with botnet network traffic was generated by the framework to assess our proposed method with machine learning algorithms suitable for unbalanced data. The performance of the classifiers was evaluated in terms of the macro-averaged F1-score (0.97) and the Matthews Correlation Coefficient (0.94), showing a good overall performance average.
- Published
- 2022
- Full Text
- View/download PDF
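The two reported metrics, macro-averaged F1 and the Matthews Correlation Coefficient, can be computed from a confusion matrix as follows. This is a generic sketch (binary MCC), not the authors' evaluation code:

```python
import math

def confusion(y_true, y_pred, positive):
    """Counts of true/false positives and negatives for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores: each class counts equally,
    which is why it suits unbalanced data."""
    f1s = []
    for cls in set(y_true):
        tp, fp, fn, _ = confusion(y_true, y_pred, cls)
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

def mcc(y_true, y_pred, positive):
    """Matthews Correlation Coefficient for a binary problem."""
    tp, fp, fn, tn = confusion(y_true, y_pred, positive)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

Both metrics are robust to class imbalance, which is why they are preferred over plain accuracy when benign traffic dwarfs botnet traffic.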
35. The Application of Machine Learning ICA-VMD in an Intelligent Diagnosis System in a Low SNR Environment.
- Author
-
Lin SL
- Subjects
- Forecasting, Normal Distribution, Signal-To-Noise Ratio, Algorithms, Machine Learning
- Abstract
This paper proposes a new method called independent component analysis-variational mode decomposition (ICA-VMD), which combines ICA and VMD. The purpose is to study the application of ICA-VMD in low signal-to-noise ratio (SNR) signal processing and data analysis. ICA is a very important method in the field of machine learning: an unsupervised learning algorithm that can dig out the independent factors hidden in the observation signal. The VMD method estimates each signal component by solving a frequency-domain variational optimization problem, and it is very suitable for mechanical fault diagnosis. The advantage of ICA-VMD is that it requires two sensory cues to distinguish the original source from the unwanted noise. In the three cases studied here, the original source was first contaminated by white Gaussian noise. The three cases are under different SNR conditions: the SNR in the first case is -6.46 dB, in the second case -21.3728 dB, and in the third case -46.8177 dB. The simulation results show that the ICA-VMD method can effectively recover the original source from the contaminated data. It is hoped that, in the future, there will be new discoveries and advances in science and technology to solve the noise interference problem through this method.
- Published
- 2021
- Full Text
- View/download PDF
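The SNR figures quoted above follow the standard decibel definition, 10 times the base-10 logarithm of the signal-to-noise power ratio, which is easy to compute for any pair of sampled waveforms:

```python
import math

def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise),
    where P is the mean squared amplitude of each waveform."""
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)
```

A noise waveform with twice the amplitude of the signal has four times its power, giving roughly -6 dB, the same regime as the paper's first case.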
36. Improved Accuracy in Predicting the Best Sensor Fusion Architecture for Multiple Domains.
- Author
-
Molino-Minero-Re E, Aguileta AA, Brena RF, and Garcia-Ceja E
- Subjects
- Reproducibility of Results, Algorithms, Machine Learning
- Abstract
Multi-sensor fusion intends to boost the general reliability of a decision-making procedure or allow one sensor to compensate for others' shortcomings. This field has been so prominent that authors have proposed many different fusion approaches, or "architectures" as we call them when they are structurally different, so it is now challenging to prescribe which one is better for a specific collection of sensors and a particular application environment, other than by trial and error. We propose an approach capable of predicting the best fusion architecture (from predefined options) for a given dataset. This method involves the construction of a meta-dataset where statistical characteristics from the original dataset are extracted. One challenge is that each dataset has a different number of variables (columns). Previous work took the principal component analysis's first k components to make the meta-dataset columns coherent and trained machine learning classifiers to predict the best fusion architecture. In this paper, we take a new route to build the meta-dataset. We use the Sequential Forward Floating Selection algorithm and a T transform to reduce the features and match them to a given number, respectively. Our findings indicate that our proposed method could improve the accuracy in predicting the best sensor fusion architecture for multiple domains.
- Published
- 2021
- Full Text
- View/download PDF
37. Novel Prediction of Diagnosis Effectiveness for Adaptation of the Spectral Kurtosis Technology to Varying Operating Conditions.
- Author
-
Kolbe S, Gelman L, and Ball A
- Subjects
- Technology, Vibration, Algorithms, Machine Learning
- Abstract
In this paper, two novel consistency vectors are proposed, which when combined with appropriate machine learning algorithms, can be used to adapt the Spectral Kurtosis technology for optimum gearbox damage diagnosis in varying operating conditions. Much of the existing research in the field is limited to test apparatus run in constant and carefully controlled operating conditions, and the authors have previously publicised that the Spectral Kurtosis technology requires adaptation to achieve the highest possible probabilities of correct diagnosis when a gearbox is run in non-stationary conditions of speed and load. However, the authors' previous adaptation has been computationally heavy using a brute-force approach unsuited to online use, and therefore, created the requirement to develop these two newly proposed vectors and allow computationally lighter techniques more suited to online condition monitoring. The new vectors are demonstrated and experimentally validated on vibration data collected from a gearbox run in multiple combinations of operating conditions; for the first time, the two consistency vectors are used to predict diagnosis effectiveness, with the comparison and proof of relative gains between the traditional and novel techniques discussed. Consistency calculations are computationally light and thus, many combinations of Spectral Kurtosis technology parameters can be evaluated on a dataset in a very short time. This study shows that machine learning can predict the total probability of correct diagnosis from the consistency values and this can quickly provide pre-adaptation/prediction of optimum Spectral Kurtosis technology parameters for a dataset. The full adaptation and damage evaluation process, which is computationally heavier, can then be undertaken on a much lower number of combinations of Spectral Kurtosis resolution and threshold.
- Published
- 2021
- Full Text
- View/download PDF
38. Exploring the Predictability of Temperatures in a Scaled Model of a Smarthome.
- Author
-
Burns T, Fichthorn G, Ling J, Zehtabian S, Bacanlı SS, Bölöni L, and Turgut D
- Subjects
- Air Conditioning, Heating, Temperature, Algorithms, Machine Learning
- Abstract
In modern smarthomes, temperature regulation is achieved through a mix of traditional and emergent technologies including air conditioning, heating, intelligent utilization of the effects of sun, wind, and shade as well as using stored heat and cold. To achieve the desired comfort for the inhabitants while minimizing environmental impact and cost, the home controller must predict how its actions will impact the temperature and other environmental factors in various parts of the home. The question we are investigating in this paper is whether the temperature values in different rooms in a home are predictable based on readings from sensors in the home. We are also interested in whether increased accuracy can be achieved by adding sensors to capture the state of doors and windows of the given room and/or the whole home, and what type of machine learning algorithms can take advantage of the additional information. As experimentation on real-world homes is highly expensive, we use ScaledHome, a 1:12 scale, IoT-enabled model of a smart home for data acquisition. Our experiments show that while additional data can improve the accuracy of the prediction, the type of machine learning models needs to be carefully adapted to the number of data features available.
- Published
- 2021
- Full Text
- View/download PDF
39. The Algorithm of a Game-Based System in the Relation between an Operator and a Technical Object in Management of E-Commerce Logistics Processes with the Use of Machine Learning.
- Author
-
Miler RK, Kuriata A, Brzozowska A, Akoel A, and Kalinichenko A
- Subjects
- Bayes Theorem, Commerce, Computer Simulation, Humans, Algorithms, Machine Learning
- Abstract
Machine learning (ML) is applied in various logistic processes utilizing innovative techniques (e.g., the use of drones for automated delivery in e-commerce). Early challenges revealed the drones' insufficient steering capacity and a cognitive gap related to the lack of a theoretical foundation for control algorithms. The aim of this paper is to present a game-based algorithm for controlling behaviours in the relation between an operator (OP) and a technical object (TO), based on the assumption that the game is logistics-oriented and that the algorithm is to support ML applied in e-commerce optimization management. Algebraic methods, including matrices, Lagrange functions, systems of differential equations, and set-theoretic notation, have been used as the main tools. The outcome is a model of a game-based optimization process in a two-element logistics system and an algorithm applied to find optimal steering strategies. The algorithm has been initially verified through simulation based on a Bayesian network (BN) and a structured set of possible strategies (OP/TO) calculated with the QGeNie Modeller, and finally prepared for Python. It has been shown that the algorithm at this stage has no deadlocks or unforeseen loops and is ready to be challenged with the original big set of learning data from a drone-operating company (as the next stage of the planned research).
- Published
- 2021
- Full Text
- View/download PDF
40. MIND: A Multi-Source Data Fusion Scheme for Intrusion Detection in Networks.
- Author
-
Anjum N, Latif Z, Lee C, Shoukat IA, and Iqbal U
- Subjects
- Algorithms, Machine Learning
- Abstract
In recent years, there has been an exponential explosion of data generation, collection, and processing in computer networks. With this expansion of data, network attacks have become a persistent problem in complex networks. Resource utilization, complexity, and false alarm rates are major challenges for current Network Intrusion Detection Systems (NIDS). Data fusion is an emerging technique that merges data from multiple sources to form more certain, precise, informative, and accurate data. Moreover, most earlier intrusion detection models suffer from overfitting and lack optimal detection of intrusions. In this paper, we propose a multi-source data fusion scheme for intrusion detection in networks (MIND), in which data fusion is performed by horizontally merging two datasets. For this purpose, Hive, a Hadoop MapReduce tool, is used. In addition, a machine learning ensemble classifier with fewer parameters is trained on the fused dataset. Finally, the proposed model is evaluated with 10-fold cross-validation. The experiments show that the average accuracy, detection rate, false positive rate, true positive rate, and F-measure are 99.80%, 99.80%, 0.29%, 99.85%, and 99.82%, respectively. Moreover, the results indicate that the proposed model is significantly more effective in intrusion detection than other state-of-the-art methods.
- Published
- 2021
- Full Text
- View/download PDF
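The MIND abstract's key idea is horizontal fusion: records from two sources sharing an identifier are merged into one wider feature vector before classification. A minimal sketch of that step, with made-up record ids and feature values (the paper performs this at scale in Hive, not in Python):

```python
# Hypothetical sketch of horizontal data fusion: two feature sources keyed
# by a shared record id are merged into one wider dataset, analogous to
# MIND's fusion of two datasets before training the ensemble classifier.
flow_features = {101: [0.2, 1.5], 102: [0.9, 0.3], 103: [0.4, 0.8]}
payload_features = {101: [7.0], 102: [2.0], 103: [5.5], 104: [1.0]}

def fuse_horizontal(a, b):
    """Keep only ids present in both sources; concatenate their features."""
    common = sorted(set(a) & set(b))
    return {k: a[k] + b[k] for k in common}

fused = fuse_horizontal(flow_features, payload_features)
```

Record 104 appears in only one source, so it is dropped, matching the inner-join semantics a Hive-based merge would typically use.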
41. Diagnostic of Operation Conditions and Sensor Faults Using Machine Learning in Sucker-Rod Pumping Wells.
- Author
-
Nascimento J, Maitelli A, Maitelli C, and Cavalcanti A
- Subjects
- Brazil, Humans, Algorithms, Machine Learning
- Abstract
In sucker-rod pumping wells, in the absence of an early diagnosis of operating conditions or sensor faults, several problems can go unnoticed, increasing downtime and production loss. In these wells, operating conditions are diagnosed through downhole dynamometer cards matched against pre-established patterns, with human visual effort in the operation centers. Since the advent of machine learning algorithms, several papers have been published on the subject, but doubts remain concerning the difficulty of the dynamometer card classification task and the best practices for solving the problem. In search of answers to these questions, this work carried out sixty tests with more than 50,000 dynamometer cards from 38 wells in Mossoró, RN, Brazil. It presents test results for three algorithms (decision tree, random forest, and XGBoost) and three descriptors (Fourier, wavelet, and card load values), as well as pipelines provided by automated machine learning. Tests with and without hyperparameter tuning, different levels of dataset balancing, and various evaluation metrics were assessed. The research shows that it is possible to detect sensor failures from dynamometer cards. Of the results presented, 75% of the tests had an accuracy above 92%, and the maximum accuracy was 99.84%.
- Published
- 2021
- Full Text
- View/download PDF
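One of the descriptors the abstract names is the Fourier descriptor. A minimal sketch of the usual construction (the paper's exact variant may differ): treat the closed position-load card as a complex contour, take its FFT, and keep low-order coefficient magnitudes, which are invariant to where the card sits on the axes:

```python
import numpy as np

# Hypothetical sketch: a downhole dynamometer card is a closed
# position-load curve; low-order Fourier coefficient magnitudes of the
# complex contour give a compact descriptor for a tree-based classifier.
def fourier_descriptor(position, load, n_coeffs=8):
    z = position + 1j * load          # card as a complex contour
    coeffs = np.fft.fft(z)
    coeffs[0] = 0                     # drop DC term -> translation invariant
    mag = np.abs(coeffs[1:n_coeffs + 1])
    return mag / (mag[0] + 1e-12)     # scale-normalised magnitudes

theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
card = fourier_descriptor(np.cos(theta), 0.5 * np.sin(theta))
shifted = fourier_descriptor(np.cos(theta) + 3.0, 0.5 * np.sin(theta) + 10.0)
```

Shifting the synthetic card along either axis leaves the descriptor unchanged, which is why such features generalise across wells with different absolute load ranges.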
42. Image Sensors for Wave Monitoring in Shore Protection: Characterization through a Machine Learning Algorithm.
- Author
-
Lay-Ekuakille A, Djungha Okitadiowo JP, Di Luccio D, Palmisano M, Budillon G, Benassai G, and Maggi S
- Subjects
- Monitoring, Physiologic, Video Recording, Algorithms, Machine Learning
- Abstract
Waves propagating on the water surface can be considered as propagating in a dispersive medium, where gravity and surface tension at the air-water interface act as restoring forces. The velocity at which energy is transported in water waves is defined by the group velocity. The paper reports the use of video-camera observations to study the impact of water waves on an urban shore. The video-monitoring system consists of two separate cameras equipped with progressive RGB CMOS sensors that allow 1080p HDTV video recording. The sensing system delivers video signals that are processed by a machine learning technique. The aim of the research is to identify features of water waves that cannot normally be observed. First, conventional modelling was performed using data delivered by the image sensors together with additional data, such as temperature and wind speed, measured with dedicated sensors. Stealth waves are detected, as are the inverting phenomena encompassed in waves; the latter phenomenon can be detected only through machine learning. This double approach allows us to anticipate extreme events that can take place in offshore and onshore areas.
- Published
- 2021
- Full Text
- View/download PDF
43. New Results on Radioactive Mixture Identification and Relative Count Contribution Estimation.
- Author
-
Ayhan B and Kwan C
- Subjects
- Software, Algorithms, Machine Learning
- Abstract
Detecting nuclear materials in mixtures is challenging due to low concentration, environmental factors, sensor noise, source-detector distance variations, and other factors. This paper presents new results on nuclear material identification and relative count contribution (also known as mixing ratio) estimation for mixtures in which multiple isotopes are present. Conventional and deep-learning-based machine learning algorithms were compared. Realistic simulated data generated with the Gamma Detector Response and Analysis Software (GADRAS) were used in our comparative studies. It was observed that the deep learning approach is highly promising.
- Published
- 2021
- Full Text
- View/download PDF
44. On the Unfounded Enthusiasm for Soft Selective Sweeps III: The Supervised Machine Learning Algorithm That Isn't.
- Author
-
Elhaik E and Graur D
- Subjects
- Evolution, Molecular, Genetic Drift, Humans, Adaptation, Physiological, Algorithms, Artificial Intelligence, Genetics, Population, Genome, Human, Machine Learning, Selection, Genetic
- Abstract
In the last 15 years or so, soft selective sweep mechanisms have been catapulted from a curiosity of little evolutionary importance to a ubiquitous mechanism claimed to explain most adaptive evolution and, in some cases, most evolution. This transformation was aided by a series of articles by Daniel Schrider and Andrew Kern. Within this series, a paper entitled "Soft sweeps are the dominant mode of adaptation in the human genome" (Schrider and Kern, Mol. Biol. Evolut . 2017 , 34 (8), 1863-1877) attracted a great deal of attention, in particular in conjunction with another paper (Kern and Hahn, Mol. Biol. Evolut . 2018 , 35 (6), 1366-1371), for purporting to discredit the Neutral Theory of Molecular Evolution (Kimura 1968). Here, we address an alleged novelty in Schrider and Kern's paper, i.e., the claim that their study involved an artificial intelligence technique called supervised machine learning (SML). SML is predicated upon the existence of a training dataset in which the correspondence between the input and output is known empirically to be true. Curiously, Schrider and Kern did not possess a training dataset of genomic segments known a priori to have evolved either neutrally or through soft or hard selective sweeps. Thus, their claim of using SML is thoroughly and utterly misleading. In the absence of legitimate training datasets, Schrider and Kern used: (1) simulations that employ many manipulatable variables and (2) a system of data cherry-picking rivaling the worst excesses in the literature. These two factors, in addition to the lack of negative controls and the irreproducibility of their results due to incomplete methodological detail, lead us to conclude that all evolutionary inferences derived from so-called SML algorithms (e.g., S/HIC) should be taken with a huge shovel of salt.
- Published
- 2021
- Full Text
- View/download PDF
45. A Comparative Survey of Feature Extraction and Machine Learning Methods in Diverse Acoustic Environments.
- Author
-
Bonet-Solà D and Alsina-Pagès RM
- Subjects
- Humans, Neural Networks, Computer, Normal Distribution, Reproducibility of Results, Acoustics, Algorithms, Artificial Intelligence, Machine Learning
- Abstract
Acoustic event detection and analysis has been widely developed in the last few years for its valuable applications in monitoring elderly or dependent people, in surveillance, in multimedia retrieval, and even in biodiversity metrics for natural environments. For all these applications, sound source identification is the key issue in delivering a smart technological answer. Diverse types of sounds and varied environments, together with a number of application-specific challenges, widen the range of candidate artificial intelligence algorithms. This paper presents a comparative study combining several feature extraction algorithms (Mel Frequency Cepstrum Coefficients (MFCC), Gammatone Cepstrum Coefficients (GTCC), and Narrow Band (NB)) with a group of machine learning algorithms (k-Nearest Neighbor (kNN), Neural Networks (NN), and Gaussian Mixture Model (GMM)), tested over five different acoustic environments. The goal of this work is to detail a best-practice method and to evaluate the reliability of this general-purpose approach across all the classes. Preliminary results show that most combinations of feature extraction and machine learning achieve acceptable results in most of the described corpora. Nevertheless, one combination outperforms the others: GTCC together with kNN, whose results are further analyzed for all the corpora.
- Published
- 2021
- Full Text
- View/download PDF
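The winning combination in the abstract pairs cepstral features with a k-nearest-neighbour vote. A minimal sketch of the kNN half, on made-up two-dimensional feature vectors standing in for per-clip averaged GTCCs (labels and values are illustrative only):

```python
import numpy as np

# Hypothetical sketch: once per-frame cepstral features (e.g. GTCC) are
# averaged into one vector per sound clip, a k-nearest-neighbour rule
# needs only distances to labelled clips. All feature values are made up.
train_X = np.array([[1.0, 0.2], [1.1, 0.1], [0.1, 1.0], [0.0, 1.2]])
train_y = np.array([0, 0, 1, 1])   # 0 = traffic, 1 = birdsong (illustrative)

def knn_predict(x, X, y, k=3):
    d = np.linalg.norm(X - x, axis=1)      # Euclidean distance to each clip
    nearest = y[np.argsort(d)[:k]]         # labels of the k closest clips
    return np.bincount(nearest).argmax()   # majority vote

label = knn_predict(np.array([0.9, 0.3]), train_X, train_y, k=3)
```

The query point sits near the two class-0 clips, so two of its three neighbours vote for class 0.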
46. Multi-Scale Frequency Bands Ensemble Learning for EEG-Based Emotion Recognition.
- Author
-
Shen F, Peng Y, Kong W, and Dai G
- Subjects
- Brain, Emotions, Humans, Algorithms, Electroencephalography, Machine Learning
- Abstract
Emotion recognition has a wide range of potential applications in the real world. Among emotion recognition data sources, electroencephalography (EEG) signals record neural activity across the human brain, providing a reliable way to recognize emotional states. Most existing EEG-based emotion recognition studies directly concatenate features extracted from all EEG frequency bands for emotion classification. This approach assumes by default that all frequency bands share the same importance; however, it cannot always achieve optimal performance. In this paper, we present a novel multi-scale frequency bands ensemble learning (MSFBEL) method for emotion recognition from EEG signals. Concretely, we first re-organize all frequency bands into several local scales and one global scale. We then train a base classifier on each scale. Finally, we fuse the results of all scales with an adaptive weight learning method that automatically assigns larger weights to more important scales to further improve performance. The proposed method is validated on two public data sets. On the "SEED IV" data set, MSFBEL achieves average accuracies of 82.75%, 87.87%, and 78.27% on the three sessions under the within-session experimental paradigm. On the "DEAP" data set, it obtains an average accuracy of 74.22% for four-category classification under 5-fold cross-validation. The experimental results demonstrate that the scale of frequency bands influences the emotion recognition rate, and that the global scale, which directly concatenates all frequency bands, cannot always guarantee the best emotion recognition performance. Different scales provide complementary information, and the proposed adaptive weight learning method effectively fuses them to further enhance performance.
- Published
- 2021
- Full Text
- View/download PDF
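The fusion step in the abstract weights each frequency-band scale by how reliable it is. A minimal sketch under a simplified assumption (weights proportional to each scale's validation accuracy; the paper's adaptive weight learning is more elaborate, and all numbers below are illustrative):

```python
import numpy as np

# Hypothetical sketch of the fusion step: each frequency-band scale yields
# class probabilities; scales are weighted by validation accuracy so that
# more reliable scales dominate the ensemble decision.
def fuse_scales(probas, accuracies):
    w = np.asarray(accuracies, dtype=float)
    w = w / w.sum()                          # normalise weights to sum to 1
    fused = sum(wi * p for wi, p in zip(w, probas))
    return fused.argmax(axis=1)              # final class per trial

p_local = np.array([[0.6, 0.4], [0.2, 0.8]])   # scale 1 predictions
p_global = np.array([[0.4, 0.6], [0.3, 0.7]])  # scale 2 predictions
labels = fuse_scales([p_local, p_global], accuracies=[0.9, 0.6])
```

On the first trial the more accurate local scale overrules the global one; on the second both agree, illustrating how unequal weights resolve disagreements between scales.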
47. Elbow Motion Trajectory Prediction Using a Multi-Modal Wearable System: A Comparative Analysis of Machine Learning Techniques.
- Author
-
Little K, K Pappachan B, Yang S, Noronha B, Campolo D, and Accoto D
- Subjects
- Adult, Biomechanical Phenomena, Electromyography, Female, Humans, Male, Range of Motion, Articular, Signal Processing, Computer-Assisted, Algorithms, Elbow physiology, Machine Learning, Wearable Electronic Devices
- Abstract
Motion intention detection is fundamental in the implementation of human-machine interfaces for assistive robots. In this paper, multiple machine learning techniques are explored for creating upper limb motion prediction models, which generally depend on three factors: the signals collected from the user (such as kinematic or physiological), the extracted features, and the selected algorithm. We explore the use of different features extracted from various signals to train multiple algorithms for the prediction of elbow flexion angle trajectories. The accuracy of the prediction was evaluated based on the mean velocity and peak amplitude of the trajectory, which are sufficient to fully define it. Results show that prediction accuracy is low when using solely physiological signals; however, when kinematic signals are included, it improves considerably. This suggests that kinematic signals provide a reliable source of information for predicting elbow trajectories. Different models were trained using 10 algorithms. Regularization algorithms performed well in all conditions, whereas neural networks performed better when the most important features were selected. The extensive analysis provided in this study can be consulted to aid in the development of accurate upper limb motion intention detection models.
- Published
- 2021
- Full Text
- View/download PDF
48. Smartphone Motion Sensor-Based Complex Human Activity Identification Using Deep Stacked Autoencoder Algorithm for Enhanced Smart Healthcare System.
- Author
-
Alo UR, Nweke HF, Teh YW, and Murtaza G
- Subjects
- Delivery of Health Care, Humans, Algorithms, Human Activities, Machine Learning, Movement, Smartphone
- Abstract
Human motion analysis using a smartphone-embedded accelerometer sensor provides important context for the identification of static, dynamic, and complex sequences of activities. Research in smartphone-based motion analysis has been applied to tasks such as health status monitoring, fall detection and prevention, energy expenditure estimation, and emotion detection. However, current methods assume that the device is tightly attached in a pre-determined position and orientation, and changes in orientation can degrade the accelerometer data and recognition performance. It is therefore challenging to accurately and automatically identify activity details, given the complexity and orientation inconsistencies of the smartphone. Furthermore, current activity identification methods utilize conventional machine learning algorithms that are application dependent, and it is difficult to model the hierarchical and temporal dynamics of complex activity identification. This paper proposes a deep stacked autoencoder algorithm with orientation-invariant features for complex human activity identification. The proposed approach comprises several stages. First, we compute the magnitude norm vector and rotation features (pitch and roll angles) to augment the three-axis dimensions (3-D) of the accelerometer sensor. Second, we propose a deep stacked autoencoder based deep learning algorithm to automatically extract a compact feature representation from the motion sensor data. The results show that the proposed integration of the deep learning algorithm and orientation-invariant features can accurately recognize complex activity details using only smartphone accelerometer data. The proposed deep stacked autoencoder method achieved 97.13% identification accuracy, compared to conventional machine learning methods and the deep belief network algorithm. The results suggest that the proposed method improves smartphone-based complex human activity identification frameworks.
- Published
- 2020
- Full Text
- View/download PDF
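The abstract's first stage augments raw accelerometer axes with a magnitude norm and pitch/roll angles. A minimal sketch of those features, using one common angle convention (conventions vary; this is an illustrative assumption, not necessarily the paper's exact definition):

```python
import numpy as np

# Hypothetical sketch of the orientation-invariant features described above:
# the magnitude norm of the 3-axis accelerometer plus pitch and roll angles,
# which augment the raw (x, y, z) axes before the autoencoder sees them.
def orientation_features(x, y, z):
    mag = np.sqrt(x**2 + y**2 + z**2)             # magnitude norm vector
    pitch = np.arctan2(-x, np.sqrt(y**2 + z**2))  # tilt about the lateral axis
    roll = np.arctan2(y, z)                       # tilt about the longitudinal axis
    return mag, pitch, roll

# A stationary phone lying flat measures only gravity on the z axis.
mag, pitch, roll = orientation_features(np.array([0.0]), np.array([0.0]), np.array([9.81]))
```

The magnitude norm is unchanged by any rotation of the device, which is what makes it robust to the orientation inconsistencies the abstract describes.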
49. Analysis of Cattle Social Transitional Behaviour: Attraction and Repulsion.
- Author
-
Xu H, Li S, Lee C, Ni W, Abbott D, Johnson M, Lea JM, Yuan J, and Campbell DLM
- Subjects
- Animals, Cattle, Cluster Analysis, Unsupervised Machine Learning, Algorithms, Machine Learning, Social Behavior
- Abstract
Understanding social interactions in livestock groups could improve management practices, but this is difficult and time-consuming with traditional methods of live observation and video recording. Sensor technologies and machine learning techniques could provide insights not previously possible. In this study, based on animal location information acquired by a new cooperative wireless localisation system, unsupervised machine learning approaches were applied to identify the social structure of a small group of cattle yearlings (n = 10) and the social behaviour of individuals. The paper first defines the affinity between an animal pair based on the ranks of their distance. Unsupervised clustering algorithms were then applied, including K-means clustering and agglomerative hierarchical clustering; in particular, K-means clustering was applied based on both logical and physical distance. By comparing the clustering results based on logical and physical distance, the leader animals and the influence of individuals in the herd were identified, providing valuable information for studying the behaviour of animal herds. Improvements in device robustness and replication of this work would confirm the practical applicability of this technology and these analysis methodologies.
- Published
- 2020
- Full Text
- View/download PDF
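The abstract defines pairwise affinity from the ranks of inter-animal distances. A minimal sketch of one such rank-based affinity on made-up 2-D positions (the paper's exact definition may differ; here closer-ranked pairs simply get larger affinity):

```python
import numpy as np

# Hypothetical sketch: rank all pairwise distances between animals and
# assign higher affinity to pairs whose distance ranks closer.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.5, 10.0]])

def rank_affinity(pos):
    n = len(pos)
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    iu = np.triu_indices(n, k=1)              # each unordered pair once
    ranks = d[iu].argsort().argsort() + 1     # rank 1 = closest pair
    aff = np.zeros((n, n))
    aff[iu] = 1.0 / ranks                     # higher affinity, closer rank
    return aff + aff.T                        # symmetric affinity matrix

A = rank_affinity(positions)
```

The resulting matrix can feed the clustering step the abstract describes (e.g. agglomerative clustering over 1 - affinity), with the two near pairs forming the obvious sub-groups.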
50. Algorithmic Exploitation in Social Media Human Trafficking and Strategies for Regulation.
- Author
-
Moore, Derek M.
- Subjects
SOCIAL media ,TRAFFIC regulations ,HUMAN trafficking ,THEMATIC analysis ,MACHINE learning ,RESEARCH personnel ,EXPLOITATION of humans - Abstract
Human trafficking thrives in the shadows, and the rise of social media has provided traffickers with a powerful and unregulated tool. This paper delves into how these criminals exploit online platforms to target and manipulate vulnerable populations. A thematic analysis of existing research explores the tactics used by traffickers on social media, revealing how algorithms can be manipulated to facilitate exploitation. Furthermore, the paper examines the limitations of current regulations in tackling this online threat. The research underscores the urgent need for collaboration between governments and researchers to combat algorithmic exploitation. By harnessing data analysis and machine learning, proactive strategies can be developed to disrupt trafficking networks and protect those most at risk. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF