414 results
Search Results
2. Wearable IoT Sensor Combining Deep Learning for Enhanced Human Activity Recognition in Indoor and Outdoor Settings
- Author
-
Mhalla, Ala, Favreau, Jean-Marie, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Habachi, Oussama, editor, Chalhoub, Gerard, editor, Elbiaze, Halima, editor, and Sabir, Essaid, editor
- Published
- 2024
- Full Text
- View/download PDF
3. Calorie Measurement and Food Recognition Using Machine Learning
- Author
-
Peerzade, Muskan, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Rajagopal, Sridaran, editor, Popat, Kalpesh, editor, Meva, Divyakant, editor, and Bajeja, Sunil, editor
- Published
- 2024
- Full Text
- View/download PDF
4. Exploring Imaging Biomarkers for Early Detection of Alzheimer’s Disease Using Deep Learning: A Comprehensive Analysis
- Author
-
Sami, Nahid, Makkar, Aaisha, Meziane, Farid, Conway, Myra, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Santosh, KC, editor, Makkar, Aaisha, editor, Conway, Myra, editor, Singh, Ashutosh K., editor, Vacavant, Antoine, editor, Abou el Kalam, Anas, editor, Bouguelia, Mohamed-Rafik, editor, and Hegadi, Ravindra, editor
- Published
- 2024
- Full Text
- View/download PDF
5. Tuning of Hyperparameters and CNN Architecture to Detect Phone Usage During Driving
- Author
-
Bhardwaj, Nishant, Yadav, Ayushi, Daniel, Sunita, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Challa, Rama Krishna, editor, Aujla, Gagangeet Singh, editor, Mathew, Lini, editor, Kumar, Amod, editor, Kalra, Mala, editor, Shimi, S. L., editor, Saini, Garima, editor, and Sharma, Kanika, editor
- Published
- 2024
- Full Text
- View/download PDF
6. Deep Learning-Based Intelligent GUI Tool For Skin Disease Diagnosis System
- Author
-
Karmakar, Mithun, Mondal, Subhash, Nag, Amitava, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Dasgupta, Kousik, editor, Mukhopadhyay, Somnath, editor, Mandal, Jyotsna K., editor, and Dutta, Paramartha, editor
- Published
- 2024
- Full Text
- View/download PDF
7. Quantum-DL Integration for Precise Remote Sensing Image Classification
- Author
-
Duchal, Harshad, Salve, Selvin, Tamboli, Alfiya, Gosavi, Atharva, Ghorpade, Pradip, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Owoc, Mieczyslaw Lech, editor, Varghese Sicily, Felix Enigo, editor, Rajaram, Kanchana, editor, and Balasundaram, Prabavathy, editor
- Published
- 2024
- Full Text
- View/download PDF
8. Driver’s Distraction Detection via Hybrid CNN-LSTM
- Author
-
Hemashree, R., Vijay Anand, M., Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Owoc, Mieczyslaw Lech, editor, Varghese Sicily, Felix Enigo, editor, Rajaram, Kanchana, editor, and Balasundaram, Prabavathy, editor
- Published
- 2024
- Full Text
- View/download PDF
9. Energy-Efficient CNN Inferencing on GPUs with Dynamic Frequency Scaling
- Author
-
Drechsler, Rolf, Metz, Christopher A., Plump, Christina, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Bhattacharya, Abhishek, editor, Dutta, Soumi, editor, Dutta, Paramartha, editor, and Samanta, Debabrata, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Human Facial Age Group Recognizer Using Assisted Bottleneck Transformer Encoder
- Author
-
Priadana, Adri, Nguyen, Duy-Linh, Vo, Xuan-Thuy, Jo, Kanghyun, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Irie, Go, editor, Shin, Choonsung, editor, Shibata, Takashi, editor, and Nakamura, Kazuaki, editor
- Published
- 2024
- Full Text
- View/download PDF
11. Minor Object Recognition from Drone Image Sequence
- Author
-
Nguyen, Duy-Linh, Vo, Xuan-Thuy, Priadana, Adri, Jo, Kang-Hyun, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Irie, Go, editor, Shin, Choonsung, editor, Shibata, Takashi, editor, and Nakamura, Kazuaki, editor
- Published
- 2024
- Full Text
- View/download PDF
12. The Potential of 1D-CNN for EEG Mental Attention State Detection
- Author
-
Velaga, NandaKiran, Singh, Deepak, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Chauhan, Naveen, editor, Yadav, Divakar, editor, Verma, Gyanendra K., editor, Soni, Badal, editor, and Lara, Jorge Morato, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Multi-model Chatbot and Image Classifier for Plant Disease Detection
- Author
-
Mittal, Sonia, Upadhyay, Tejal, Avasthi, Kanav, Singh, Aditya Anuj Shah, Pachchigar, Aditya, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Rajagopal, Sridaran, editor, Popat, Kalpesh, editor, Meva, Divyakant, editor, and Bajeja, Sunil, editor
- Published
- 2024
- Full Text
- View/download PDF
14. Privacy Preserving Elder Fall Detection Using Deep Learning
- Author
-
Iftikhar, Faseeh, Khan, Muhammad Faizan, Wang, Guojun, Wahid, Fazli, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Wang, Guojun, editor, Wang, Haozhe, editor, Min, Geyong, editor, Georgalas, Nektarios, editor, and Meng, Weizhi, editor
- Published
- 2024
- Full Text
- View/download PDF
15. Real-Time AI-Enabled Cyber-Physical System Based Cattle Disease Detection System
- Author
-
Balamurugan, K. S., Rajalakshmi, R., Pradhan, Chinmaya Kumar, Meerja, Khalim Amjad, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Challa, Rama Krishna, editor, Aujla, Gagangeet Singh, editor, Mathew, Lini, editor, Kumar, Amod, editor, Kalra, Mala, editor, Shimi, S. L., editor, Saini, Garima, editor, and Sharma, Kanika, editor
- Published
- 2024
- Full Text
- View/download PDF
16. Attention-Residual Convolutional Neural Network for Image Restoration Due to Bad Weather
- Author
-
Dasgupta, Madhuchhanda, Bandyopadhyay, Oishila, Chatterji, Sanjay, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Dasgupta, Kousik, editor, Mukhopadhyay, Somnath, editor, Mandal, Jyotsna K., editor, and Dutta, Paramartha, editor
- Published
- 2024
- Full Text
- View/download PDF
17. Convolutional Neural Network (CNN) to Reduce Construction Loss in JPEG Compression Caused by Discrete Fourier Transform (DFT)
- Author
-
Kunwar, Suman, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Zhao, Feng, editor, and Miao, Duoqian, editor
- Published
- 2024
- Full Text
- View/download PDF
18. An FPGA-Accelerated CNN with Parallelized Sum Pooling for Onboard Realtime Routing in Dynamic Low-Orbit Satellite Networks.
- Author
-
Kim, Hyeonwoo, Park, Juhyeon, Lee, Heoncheol, Won, Dongshik, and Han, Myonghun
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, CONVOLUTIONAL neural networks, ROUTING algorithms, GATE array circuits, ORBITS of artificial satellites
- Abstract
This paper addresses the problem of real-time onboard routing for dynamic low earth orbit (LEO) satellite networks. It is difficult to apply general routing algorithms to dynamic LEO networks due to the frequent changes in satellite topology caused by the disconnection between moving satellites. Deep reinforcement learning (DRL) models trained by various dynamic networks can be considered. However, since the inference process with the DRL model requires too long a computation time due to multiple convolutional layer operations, it is not practical to apply to a real-time on-board computer (OBC) with limited computing resources. To solve the problem, this paper proposes a practical co-design method with heterogeneous processors to parallelize and accelerate a part of the multiple convolutional layer operations on a field-programmable gate array (FPGA). The proposed method was tested with a real heterogeneous processor-based OBC and showed that the proposed method was about 3.10 times faster than the conventional method while achieving the same routing results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Prediction and Optimization Analysis of the Performance of an Office Building in an Extremely Hot and Cold Region.
- Author
-
Liu, Yunbo, Wang, Wanjiang, and Huang, Yumeng
- Abstract
The White Paper on Peak Carbon and Carbon Neutral Action 2022 states that China is to achieve peak carbon by 2030 and carbon neutrality by 2060. Based on the "3060 dual-carbon" goal, how to improve the efficiency of energy performance is an important prerequisite for building a low-carbon, energy-saving, green, and beautiful China. The office performance building studied in this paper is located in the urban area of Turpan, where the climate is characterized by an extremely hot summer environment and a cold winter environment. At the same time, the building is oriented east–west, with the main façade facing west, and the main façade consists of a large area of single-layer glass curtain wall, which is affected by western sunlight. As a result, there are serious problems with the building's energy consumption, which in turn leads to excessive carbon emissions and high life cycle costs for the building. To address the above problems, this paper analyzes and optimizes the following four dimensions. First, the article creates a Convolutional Neural Network (CNN) prediction model with Total Energy Use in Buildings (TEUI), Global Warming Potential (GWP), and Life Cycle Costs (LCC) as the performance objectives. After optimization, the R² of the three are 0.9908, 0.9869, and 0.9969, respectively, thus solving the problem of low accuracy of traditional prediction models. Next, the NSGA-II algorithm is used to optimize the three performance objectives, which are reduced by 41.94%, 40.61%, and 31.29%, respectively. Then, in the program decision stage, this paper uses two empowered Topsis methods to optimize this building performance problem. Finally, the article analyzes the variables using two sensitivity analysis methods. Through the above research, this paper provides a framework of optimization ideas for office buildings in extremely hot and cold regions while focusing on the four major aspects of machine learning, multi-objective optimization, decision analysis, and sensitivity analysis systematically and completely. For the development of office buildings in the region, whether in the early program design or in the later stages, energy-saving measures to optimize the design have laid the foundation of important guidelines. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Lithium-Ion Battery SOH Estimation Method Based on Multi-Feature and CNN-BiLSTM-MHA.
- Author
-
Zhou, Yujie, Zhang, Chaolong, Zhang, Xulong, and Zhou, Ziheng
- Subjects
CONVOLUTIONAL neural networks, THERMAL batteries, ENERGY development, CLEAN energy, ELECTRIC vehicles
- Abstract
Electric vehicles can reduce the dependence on limited resources such as oil, which is conducive to the development of clean energy. An accurate battery state of health (SOH) is beneficial for the safety of electric vehicles. A multi-feature and Convolutional Neural Network–Bidirectional Long Short-Term Memory–Multi-head Attention (CNN-BiLSTM-MHA)-based lithium-ion battery SOH estimation method is proposed in this paper. First, the voltage, energy, and temperature data of the battery in the constant current charging phase are measured. Then, based on the voltage and energy data, the incremental energy analysis (IEA) is performed to calculate the incremental energy (IE) curve. The IE curve features including IE, peak value, average value, and standard deviation are extracted and combined with the thermal features of the battery to form a complete multi-feature sequence. A CNN-BiLSTM-MHA model is set up to map the features to the battery SOH. Experiments were conducted using batteries with different charging currents, and the results showed that even if the nonlinearity of battery SOH degradation is significant, this method can still achieve a fast and accurate estimation of the battery SOH. The Mean Absolute Error (MAE) is 0.1982%, 0.1873%, 0.1652%, and 0.1968%, and the Root-Mean-Square Error (RMSE) is 0.2921%, 0.2997%, 0.2130%, and 0.2625%, respectively. The average Coefficient of Determination (R²) is above 96%. Compared to the BiLSTM model, the training time is reduced by an average of about 36%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
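Note on entry 20: the abstract describes deriving an incremental energy (IE) curve from voltage and energy data and then extracting its peak, average, and standard deviation. A minimal sketch of that feature step follows, assuming the IE curve is approximated by the finite difference ΔE/ΔV between consecutive samples. The function name `incremental_energy_features` and this discretization are illustrative assumptions, not the authors' published procedure.

```python
import statistics


def incremental_energy_features(voltages, energies):
    """Summarize an incremental energy (IE) curve.

    Assumption: IE is approximated as dE/dV using consecutive samples from a
    constant-current charging phase. Steps with no voltage change are skipped.
    """
    if len(voltages) != len(energies) or len(voltages) < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")

    ie_curve = []
    for i in range(1, len(voltages)):
        dv = voltages[i] - voltages[i - 1]
        if dv == 0:
            continue  # avoid division by zero on flat voltage segments
        ie_curve.append((energies[i] - energies[i - 1]) / dv)

    if not ie_curve:
        raise ValueError("voltage never changes; IE curve is undefined")

    return {
        "peak": max(ie_curve),
        "mean": statistics.fmean(ie_curve),
        "std": statistics.pstdev(ie_curve),  # population standard deviation
    }
```

In the method the abstract describes, features of this kind are combined with thermal features before being passed to the CNN-BiLSTM-MHA model; that combination is not shown here.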
21. A NOVEL APPROACH USING MACHINE LEARNING TO DETECT AND CLASSIFY RICE PLANT DISEASES.
- Author
-
PATRA, H. PARTHASARATHI, SRIDHAR, GANDATTI, and SHAIK, LATEEFA
- Subjects
RICE diseases & pests, MACHINE learning, IMAGE processing, ARTIFICIAL neural networks, IMAGE segmentation, DIAGNOSIS of plant diseases, RICE yields
- Abstract
Agriculture is the most important sector of the Indian economy. Rice cultivation plays an important role in many regions of India. Most farmers in India are fully dependent on rice. Early detection of diseases in rice plants plays an important role in yielding more. This paper proposes a solution to detect and classify rice plant diseases too early using automatic image processing techniques. Automatic detection uses image segmentation and neural networks for classification of plant leaves. It takes the image as input and applies techniques to that image, like pre-processing and segmentation, and then the input is given to the convolutional neural network in order to classify the disease. Most Indian farmers are not well educated to detect the disease of the plant before it gets damaged, which results in less production. Rice production has the main role in Indian economics, so adequate efforts are needed to improve it. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
22. Development of a Wafer Defect Pattern Classifier Using Polar Coordinate System Transformed Inputs and Convolutional Neural Networks.
- Author
-
Kim, Moo Hyun and Kim, Tae Seon
- Subjects
DEEP learning, CONVOLUTIONAL neural networks, FEATURE extraction, CARTESIAN coordinates, MACHINE learning
- Abstract
Defect pattern analysis of wafer bin maps (WBMs) is an important means of identifying process problems. Recently, automated analysis methods using machine learning or deep learning have been studied as alternatives to manual classification by engineers. In this paper, we propose a method to improve the feature extraction performance of defect patterns by transforming the polar coordinate system instead of the existing WBM image input. To reduce the variability of the location representation, defect patterns in the Cartesian coordinate system, where the location of the distributed defect die is not constant, were converted to a polar coordinate system. The CNN classifier, which uses polar coordinate transformed input, achieved a classification accuracy of 91.3%, which is 4.8% better than the existing WBM image-based CNN classifier. Additionally, a tree-structured classifier model that sequentially connects binary classifiers achieved a classification accuracy of 94%. The method proposed in this paper is also applicable to the defect pattern classification of WBMs consisting of different die sizes than the training data. Finally, the paper proposes an automated pattern classification method that uses individual classifiers to learn defect types and then applies ensemble techniques for multiple defect pattern classification. This method is expected to reduce labor, time, and cost and enable objective labeling instead of relying on subjective judgments of engineers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
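Note on entry 22: the abstract describes converting defect locations on a wafer bin map (WBM) from Cartesian to polar coordinates before classification. The sketch below shows one way such a conversion could look, assuming a square map stored as a list of rows with 1 marking a defective die. The function name `wbm_to_polar` and the bin counts are hypothetical choices for illustration and are not taken from the paper.

```python
import math


def wbm_to_polar(wbm, n_radius=8, n_angle=16):
    """Count defective dies of a square wafer bin map on a (radius, angle) grid.

    wbm: list of rows, each a list of 0/1 values (1 = defective die).
    Returns a list of n_radius rows, each holding n_angle counts.
    """
    size = len(wbm)
    center = (size - 1) / 2.0
    # Largest possible distance from the center (a corner die); 1.0 for a 1x1 map.
    max_r = center * math.sqrt(2) or 1.0

    polar = [[0] * n_angle for _ in range(n_radius)]
    for y, row in enumerate(wbm):
        for x, die in enumerate(row):
            if not die:
                continue
            dx, dy = x - center, y - center
            r = math.hypot(dx, dy)
            theta = math.atan2(dy, dx) % (2 * math.pi)  # angle in [0, 2*pi)
            r_bin = min(int(r / max_r * n_radius), n_radius - 1)
            a_bin = min(int(theta / (2 * math.pi) * n_angle), n_angle - 1)
            polar[r_bin][a_bin] += 1
    return polar
```

In this representation a ring-shaped defect pattern falls into a single radius row regardless of wafer orientation, which is consistent with the abstract's stated goal of reducing variability in how defect locations are represented.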
23. A scheme combining feature fusion and hybrid deep learning models for epileptic seizure detection and prediction.
- Author
-
Zhang, Jincan, Zheng, Shaojie, Chen, Wenna, Du, Ganqin, Fu, Qizhi, and Jiang, Hongwei
- Subjects
RECURRENT neural networks, EPILEPSY, DEEP learning, DISCRETE wavelet transforms, FEATURE extraction, CONVOLUTIONAL neural networks
- Abstract
Epilepsy is one of the most well-known neurological disorders globally, leading to individuals experiencing sudden seizures and significantly impacting their quality of life. Hence, there is an urgent necessity for an efficient method to detect and predict seizures in order to mitigate the risks faced by epilepsy patients. In this paper, a new method for seizure detection and prediction is proposed, which is based on multi-class feature fusion and the convolutional neural network-gated recurrent unit-attention mechanism (CNN-GRU-AM) model. Initially, the Electroencephalography (EEG) signal undergoes wavelet decomposition through the Discrete Wavelet Transform (DWT), resulting in six subbands. Subsequently, time–frequency domain and nonlinear features are extracted from each subband. Finally, the CNN-GRU-AM further extracts features and performs classification. The CHB-MIT dataset is used to validate the proposed approach. The results of tenfold cross validation show that our method achieved a sensitivity of 99.24% and 95.47%, specificity of 99.51% and 94.93%, accuracy of 99.35% and 95.16%, and an AUC of 99.34% and 95.15% in seizure detection and prediction tasks, respectively. The results show that the method proposed in this paper can effectively achieve high-precision detection and prediction of seizures, so as to remind patients and doctors to take timely protective measures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Leakage Detection in Water Distribution Systems Based on Logarithmic Spectrogram CNN for Continuous Monitoring.
- Author
-
Peng, Hao, Xu, Zhe, Huang, Qinglong, Qi, Liqiang, and Wang, Haitao
- Subjects
WATER pipelines, WATER leakage, WATER distribution, CONVOLUTIONAL neural networks, LEAK detection, WATER conservation, PHOTOACOUSTIC spectroscopy
- Abstract
In the context of the Internet of Things, there is a growing demand for real-time monitoring of water distribution systems (WDS). Among the various leak detection methods, acoustic leak detection is considered to be a suitable method. However, existing methods are not very effective in environments with high daytime ambient noise. To address this issue, this paper conducted on-site data collection experiments and designed a monitoring system that combines traditional nighttime monitoring with daytime monitoring, combining water company pipeline inspections and repair work. A large number of daytime audio samples were collected. In this paper, the logarithmic spectrogram (log spectrogram) was used to represent the features of the leak signal. By comparing the features of the signal during day and night, noisy and quiet environments, and leak and normal signals, we identified the interfering frames that required noise reduction, and applied frame-level noise reduction processing to the signal. Based on this, a log PS-ResNet18 model was developed to identify leaks, and its performance was compared with other classification models [including traditional nighttime detection methods, random forests, XGBoost, and convolutional neural network (CNN)]. The results showed that the log PS-ResNet18 model had the best performance, with an all-day accuracy rate of 99.4% and a daytime accuracy rate of 99.3%. In addition, by conducting ablation experiments to explore the role and contribution of the log PS-ResNet18 and noise reduction methods in the model, the results showed that the log spectrogram and noise reduction methods increased the all-day accuracy rate by 18.8% and 23.2%, respectively, and by 24.7% when used together. In another practical application, the log PS-ResNet18 model achieved an all-day detection accuracy rate of 99.6%. 
This study demonstrated the applicability of the log spectrogram and CNN combination in daytime leak detection, overcoming research limitations in the field. This research presents the log PS-ResNet18 framework, which combines deep learning models and denoised logarithmic spectrograms to improve leak detection in water supply pipelines under daytime environmental noise. The research focuses on field data collection and analysis of cast iron pipes with different diameters in Hangzhou (HZ). The model was tested on cast iron pipes in Lishui (LS) and proved to be effective. The proposed method is highly versatile and can be applied to different regions and pipe materials after sufficient sample collection and model training validation. The research recommends a comprehensive leak monitoring solution that involves initial intelligent detection using front-end noise meters and secondary identification of suspicious audio signals using the log PS-ResNet18 model in the cloud. This enables water utility operators to respond quickly to pipeline leaks, leading to more efficient water resource conservation and improved water supply service quality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
25. Deep Time Series Forecasting Models: A Comprehensive Survey.
- Author
-
Liu, Xinhe and Wang, Wenmin
- Subjects
DEEP learning, ARTIFICIAL neural networks, TIME series analysis, CONVOLUTIONAL neural networks, ARTIFICIAL intelligence, LANGUAGE models
- Abstract
Deep learning, a crucial technique for achieving artificial intelligence (AI), has been successfully applied in many fields. The gradual application of the latest architectures of deep learning in the field of time series forecasting (TSF), such as Transformers, has shown excellent performance and results compared to traditional statistical methods. These applications are widely present in academia and in our daily lives, covering many areas including forecasting electricity consumption in power systems, meteorological rainfall, traffic flow, quantitative trading, risk control in finance, sales operations and price predictions for commercial companies, and pandemic prediction in the medical field. Deep learning-based TSF tasks stand out as one of the most valuable AI scenarios for research, playing an important role in explaining complex real-world phenomena. However, deep learning models still face challenges: they need to deal with the challenge of large-scale data in the information age, achieve longer forecasting ranges, reduce excessively high computational complexity, etc. Therefore, novel methods and more effective solutions are essential. In this paper, we review the latest developments in deep learning for TSF. We begin by introducing the recent development trends in the field of TSF and then propose a new taxonomy from the perspective of deep neural network models, comprehensively covering articles published over the past five years. We also organize commonly used experimental evaluation metrics and datasets. Finally, we point out current issues with the existing solutions and suggest promising future directions in the field of deep learning combined with TSF. This paper is the most comprehensive review related to TSF in recent years and will provide a detailed index for researchers in this field and those who are just starting out. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. FusionNN: A Semantic Feature Fusion Model Based on Multimodal for Web Anomaly Detection.
- Author
-
Li Wang, Mingshan Xia, Hao Hu, Jianfang Li, Fengyao Hou, and Gang Chen
- Abstract
With the rapid development of the mobile communication and the Internet, the previous web anomaly detection and identification models were built relying on security experts' empirical knowledge and attack features. Although this approach can achieve higher detection performance, it requires huge human labor and resources to maintain the feature library. In contrast, semantic feature engineering can dynamically discover new semantic features and optimize feature selection by automatically analyzing the semantic information contained in the data itself, thus reducing dependence on prior knowledge. However, current semantic features still have the problem of semantic expression singularity, as they are extracted from a single semantic mode such as word segmentation, character segmentation, or arbitrary semantic feature extraction. This paper extracts features of web requests from dual semantic granularity, and proposes a semantic feature fusion method to solve the above problems. The method first preprocesses web requests, and extracts word-level and character-level semantic features of URLs via convolutional neural network (CNN), respectively. By constructing three loss functions to reduce losses between features, labels and categories. Experiments on the HTTP CSIC 2010, Malicious URLs and HttpParams datasets verify the proposed method. Results show that compared with machine learning, deep learning methods and BERT model, the proposed method has better detection performance. And it achieved the best detection rate of 99.16% in the dataset HttpParams. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Skeleton-based 3D human pose estimation with low-resolution infrared array sensor using attention based CNN-BiGRU.
- Author
-
Chen, Jing, Chen, Deying, Jiang, Hao, Miao, Xiren, and Yin, Cunyi
- Abstract
Human sensing based on the low-resolution infrared sensor is widely used in hand gestures recognition, activity recognition, intrusion detection, etc. However, the information about humans acquired by the previous human sensing system using the infrared sensor is limited. In this paper, a human pose estimation system is proposed to realize the three-dimensional skeleton information acquisition by low-resolution infrared sensors. It is a difficult task to acquire human pose estimation with more rich human information from low-resolution infrared sensors. The system leverages the 8 × 8 pixels low-resolution infrared array sensor to collect the activity data and the Kinect v2 camera to capture the three-dimensional skeleton of the human body as annotations of the infrared data. The convolutional neural network-bidirectional gated recurrent unit model with attention mechanism (CNN-BiGRU-AM) model is employed for model training to effectively extract the characteristics of the infrared data from spatial and temporal dimensions. The attention mechanism (AM) can improve the ability of the model to capture important local information. The bone joint point data predicted by the model are utilized to draw the three-dimensional skeleton diagram. The k-means clustering algorithm is applied to eliminate the outliers that affect the overall visualization effect in the prediction. The accuracy and completeness of human pose estimation are measured by the euclidean distance between the real coordinates of the bone joint points obtained by Kinect v2 camera and the coordinates predicted by the model. The proportion of the number of predictions with euclidean distance less than a threshold 20 mm is 90.151%, representing the accuracy of human pose estimation. 
The experimental results show that three-dimensional skeleton information can be acquired accurately by the low-resolution infrared array sensor and the subtle difference within each activity can be observed through the 3D human pose to improve the effect of activity recognition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
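Note on entry 27: the abstract reports accuracy as the proportion of predictions whose Euclidean distance to the Kinect v2 reference coordinates is below a 20 mm threshold. A minimal sketch of that metric follows, assuming joints are given as (x, y, z) tuples in millimetres. The function name `fraction_within_threshold` is hypothetical, and how the paper aggregates over frames and joints is not stated in the abstract.

```python
import math


def fraction_within_threshold(true_joints, pred_joints, threshold_mm=20.0):
    """Share of predicted joints lying closer than threshold_mm to the reference.

    true_joints, pred_joints: equally long sequences of (x, y, z) tuples in mm.
    """
    if len(true_joints) != len(pred_joints):
        raise ValueError("reference and prediction must have the same length")
    if not true_joints:
        raise ValueError("at least one joint is required")

    hits = sum(
        1
        for t, p in zip(true_joints, pred_joints)
        if math.dist(t, p) < threshold_mm  # strict "less than", as in the abstract
    )
    return hits / len(true_joints)
```

For example, with four joints whose prediction errors are 5 mm, 30 mm, 19 mm, and exactly 20 mm, the function returns 0.5, because only the first and third fall strictly below the threshold.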
28. Significant wave height estimation from shipborne marine radar data using convolutional and self-attention network
- Author
-
Wang, Fupeng, Chu, Xiaoliang, and Zhang, Baoxue
- Published
- 2024
- Full Text
- View/download PDF
29. Food products pricing theory with application of machine learning and game theory approach.
- Author
-
Mamoudan, Mobina Mousapour, Mohammadnazari, Zahra, Ostadi, Ali, and Esfahbodi, Ali
- Subjects
CONVOLUTIONAL neural networks, MICROECONOMICS, MACHINE theory, FOOD prices, GAME theory, PERISHABLE goods
- Abstract
Demand for perishable food is sensitive to product prices and is affected by the prices of similar or alternative products. While brand loyalty influences the demand for products, determining a reasonable price requires a precise pricing strategy. In this paper, a pricing model for perishable food is presented in which the brand value of the product and the price of other manufacturers as competitors are considered. To this end, this study first predicts the price of competitors using a combination of optimized Neural Networks and presents an optimized model using a Genetic Algorithm. This algorithm combines a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and a Genetic Algorithm (GA). The proposed model is then used to merge with a game-theory model for the pricing of perishable foods. In this game-theory model, pricing approaches are developed based on identified prices of competitors. In the coordination contract game-theory model, Multi Retailer- one Supplier and Price-sensitive demand of Perishable product are developed with and without quantity discount contract. Obtained results indicate that independent procurement provides retailers with higher profit, while lower profit will be presented when coordination is not considered. Also, with coordination, the ordering cycle increases, and the ordering frequency decrease. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Exploring low-level statistical features of n-grams in phishing URLs: a comparative analysis with high-level features
- Author
-
Tashtoush, Yahya, Alajlouni, Moayyad, Albalas, Firas, and Darwish, Omar
- Published
- 2024
- Full Text
- View/download PDF
31. Joint Classification of Hyperspectral and LiDAR Data Based on Adaptive Gating Mechanism and Learnable Transformer.
- Author
-
Wang, Minhui, Sun, Yaxiu, Xiang, Jianhong, Sun, Rui, and Zhong, Yu
- Subjects
TRANSFORMER models, CONVOLUTIONAL neural networks, LIDAR, DIGITAL elevation models, TRANSFER matrix, DATA fusion (Statistics)
- Abstract
Utilizing multi-modal data, as opposed to hyperspectral images (HSI) alone, enhances target identification accuracy in remote sensing. Transformers are applied to multi-modal data classification for their ability to capture long-range dependencies but often overlook intrinsic image structure by directly flattening image blocks into vectors. Moreover, as the encoder deepens, unprofitable information negatively impacts classification performance. Therefore, this paper proposes a learnable transformer with an adaptive gating mechanism (AGMLT). Firstly, a spectral–spatial adaptive gating mechanism (SSAGM) is designed to comprehensively extract the local information from images. It mainly contains point depthwise attention (PDWA) and asymmetric depthwise attention (ADWA). The former is for extracting spectral information of HSI, and the latter is for extracting spatial information of HSI and elevation information of LiDAR-derived rasterized digital surface models (LiDAR-DSM). By omitting linear layers, local continuity is maintained. Then, layer scale and a learnable transition matrix are introduced to the original transformer encoder and self-attention to form the learnable transformer (L-Former). It improves data dynamics and prevents performance degradation as the encoder deepens. Subsequently, learnable cross-attention (LC-Attention) with the learnable transfer matrix is designed to augment the fusion of multi-modal data by enriching feature information. Finally, poly loss, known for its adaptability with multi-modal data, is employed in training the model. Experiments in the paper are conducted on four well-known multi-modal datasets: Trento (TR), MUUFL (MU), Augsburg (AU), and Houston2013 (HU). The results show that AGMLT achieves optimal performance over some existing models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
32. Robust human locomotion and localization activity recognition over multisensory.
- Author
-
Khan, Danyal, Alonazi, Mohammed, Abdelhaq, Maha, Al Mudawi, Naif, Algarni, Asaad, Jalal, Ahmad, and Hui Liu
- Subjects
HUMAN locomotion, HUMAN activity recognition, HYBRID systems, CONVOLUTIONAL neural networks, DEEP learning
- Abstract
Human activity recognition (HAR) plays a pivotal role in various domains, including healthcare, sports, robotics, and security. With the growing popularity of wearable devices, particularly Inertial Measurement Units (IMUs) and Ambient sensors, researchers and engineers have sought to take advantage of these advances to accurately and efficiently detect and classify human activities. This research paper presents an advanced methodology for human activity and localization recognition, utilizing smartphone IMU, Ambient, GPS, and Audio sensor data from two public benchmark datasets: the Opportunity dataset and the Extrasensory dataset. The Opportunity dataset was collected from 12 subjects participating in a range of daily activities, and it captures data from various body-worn and object-associated sensors. The Extrasensory dataset features data from 60 participants, including thousands of data samples from smartphone and smartwatch sensors, labeled with a wide array of human activities. Our study incorporates novel feature extraction techniques for signal, GPS, and audio sensor data. Specifically, for localization, GPS, audio, and IMU sensors are utilized, while IMU and Ambient sensors are employed for locomotion activity recognition. To achieve accurate activity classification, state-of-the-art deep learning techniques, such as convolutional neural networks (CNN) and long short-term memory (LSTM), have been explored. For indoor/outdoor activities, CNNs are applied, while LSTMs are utilized for locomotion activity recognition. The proposed system has been evaluated using the k-fold cross-validation method, achieving accuracy rates of 97% and 89% for locomotion activity over the Opportunity and Extrasensory datasets, respectively, and 96% for indoor/outdoor activity over the Extrasensory dataset. These results highlight the efficiency of our methodology in accurately detecting various human activities, showing its potential for real-world applications. 
Moreover, the research paper introduces a hybrid system that combines machine learning and deep learning features, enhancing activity recognition performance by leveraging the strengths of both approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2024
33. Collapse Susceptibility Assessment in Taihe Town Based on Convolutional Neural Network and Information Value Method.
- Author
-
Li, Houlu, Hu, Bill X., Lin, Bo, Zhu, Sihong, Meng, Fanqi, and Li, Yufei
- Subjects
CONVOLUTIONAL neural networks, INFORMATION networks, EMERGENCY management, FEATURE extraction
- Abstract
The causal mechanism of collapse disasters is complex, and there are many influencing factors. Convolutional Neural Networks (CNNs) have a strong feature extraction ability, which can better simulate the formation of collapse disasters and accurately predict them. Collapses in Taihe town threaten roads, buildings, and people. In this paper, road distance, water distance, normalized vegetation index, platform curvature, profile curvature, slope, slope direction, and geological data are used as input variables. This paper generates collapse susceptibility zoning maps based on the information value (IV) method and CNN, respectively. The results show that the accuracies of the susceptibility assessment of the IV method and the CNN method are 85.1% and 87.4%, respectively; the CNN method is more accurate. The research results can provide some reference for the formulation of disaster prevention and control strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
34. Capturing Visual Narratives: Employing GRU and Attention Mechanism in a Deep Learning Framework for Automatic Image Captioning.
- Author
-
Jaiswal, Sushma, Pallthadka, Harikumar, Chinhewadi, Rajesh P., and Jaiswal, Tarun
- Subjects
NATURAL language processing, RECURRENT neural networks, DEEP learning, CONVOLUTIONAL neural networks, COMPUTER vision, IMAGE analysis, CRANES (Birds)
- Abstract
The goal of automatic image captioning in computer vision and natural language processing is to provide precise and insightful image captions. The paper provides a novel approach for automatic image captioning, which combines an attention mechanism with a deep learning model based on gated recurrent units (GRUs). Using the attention mechanism, which dynamically weighs the image's attributes, the model may focus on relevant portions of the image during the caption production process. This makes it easier to match related words in the generated captions with the features of the image. The output captions accurately depict word relationships and mimic the sequential structure of natural language through the application of recurrent neural networks of the GRU variety. With a large collection of images and captions, the network is taught to generate rational and contextually relevant descriptions for various image types. By assessing the proposed model with widely-used metrics such as BLEU, METEOR, ROUGE, and CIDEr, we demonstrate its ability to generate high-quality captions. The findings show that the approach outperforms baseline techniques, highlighting the advantages of combining GRU with an attention mechanism for image captioning. The method produces captions that are accurate and convey a deeper understanding of the visual content in the photos, making it highly applicable in real-world applications such as image interpretation, accessibility and content suggestion. [ABSTRACT FROM AUTHOR]
- Published
- 2024
35. Clinical feature calibrated attention network for cataract recognition.
- Author
-
章晓庆, 肖尊杰, 赵宇航, 巫晓, 东田理沙, and 刘江
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
36. Efficient FPGA Implementation of Convolutional Neural Networks and Long Short-Term Memory for Radar Emitter Signal Recognition.
- Author
-
Wu, Bin, Wu, Xinyu, Li, Peng, Gao, Youbing, Si, Jiangbo, and Al-Dhahir, Naofal
- Subjects
DEEP learning, CONVOLUTIONAL neural networks, FIELD programmable gate arrays, MACHINE learning, SIGNAL classification
- Abstract
In recent years, radar emitter signal recognition has enjoyed a wide range of applications in electronic support measure systems and communication security. More and more deep learning algorithms have been used to improve the recognition accuracy of radar emitter signals. However, complex deep learning algorithms and data preprocessing operations have a huge demand for computing power, which cannot meet the requirements of low power consumption and high real-time processing scenarios. Therefore, many research works have remained in the experimental stage and cannot be actually implemented. To tackle this problem, this paper proposes a resource reuse computing acceleration platform based on field programmable gate arrays (FPGA), and implements a one-dimensional (1D) convolutional neural network (CNN) and long short-term memory (LSTM) neural network (NN) model for radar emitter signal recognition, directly targeting the intermediate frequency (IF) data of radar emitter signal for classification and recognition. The implementation of the 1D-CNN-LSTM neural network on FPGA is realized by multiplexing the same systolic array to accomplish the parallel acceleration of 1D convolution and matrix vector multiplication operations. We implemented our network on Xilinx XCKU040 to evaluate the effectiveness of our proposed solution. Our experiments show that the system can achieve 7.34 giga operations per second (GOPS) data throughput with only 5.022 W power consumption when the radar emitter signal recognition rate is 96.53%, which greatly improves the energy efficiency ratio and real-time performance of the radar emitter recognition system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
37. Evaluation of contemporary intrusion detection systems for internet of things environment.
- Author
-
Choudhary, Vandana, Tanwar, Sarvesh, and Choudhury, Tanupriya
- Abstract
The Internet of Things (IoT) involves a wide range of devices connected through the Internet, with the aim of enabling coherent communication among them without human intervention and realizing numerous smart applications that make our lives easier and more productive. These connected devices continuously sense and gather information from their surroundings, thereby producing an immense amount of data for big data analytics. In the current era, the number of smart devices is increasing rapidly due to the features they offer. Moreover, public access to the Internet makes the system even more vulnerable to intrusions. This has attracted numerous cybercriminals, who have turned the IoT ecosystem into a hotbed of illicit activities. The need for an Intrusion Detection System (IDS) in IoT is therefore apparent. The literature suggests a number of IDS to address intrusions/attacks in the discipline of IoT. In the current paper, besides a Systematic Literature Review of IDS for the IoT environment, a deep learning model with aquila optimization is proposed to predict anomalies using the IoTID20, UNSW-NB15–1 and UNSW_2018_IoT_Botnet_Full5pc_4 datasets. The hybrid model that we have developed uses a combined network structure of a convolutional neural network and the aquila optimization algorithm. In all of the studies that were carried out, the swarm intelligence-driven deep learning strategy outperformed other, comparable approaches. Based on current findings, it is reasonable to conclude that the suggested technique offers an efficient method for early anomaly detection and contributes to viable control of anomalies in the IoT environment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
38. Exploring the effects of pandemics on transportation through correlations and deep learning techniques.
- Author
-
Gamel, Samah A., Hassan, Esraa, El-Rashidy, Nora, and Talaat, Fatma M.
- Abstract
The COVID-19 pandemic has had a significant impact on human migration worldwide, affecting transportation patterns in cities. Many cities have issued "stay-at-home" orders during the outbreak, causing commuters to change their usual modes of transportation. For example, some transit/bus passengers have switched to driving or car-sharing. As a result, urban traffic congestion patterns have changed dramatically, and understanding these changes is crucial for effective emergency traffic management and control efforts. While previous studies have focused on natural disasters or major accidents, only a few have examined pandemic-related traffic congestion patterns. This paper uses correlations and machine learning techniques to analyze the relationship between COVID-19 and transportation. The authors simulated traffic models for five different networks and proposed a Traffic Prediction Technique (TPT), which includes an Impact Calculation Methodology that uses Pearson's Correlation Coefficient and Linear Regression, as well as a Traffic Prediction Module (TPM). The paper's main contribution is the introduction of the TPM, which uses Convolutional Neural Network to predict the impact of COVID-19 on transportation. The results indicate a strong correlation between the spread of COVID-19 and transportation patterns, and the CNN has a high accuracy rate in predicting these impacts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
39. Attention-based 3D convolutional recurrent neural network model for multimodal emotion recognition.
- Author
-
Yiming Du, Penghai Li, Longlong Cheng, Xuanwei Zhang, Mingji Li, and Fengzhou Li
- Subjects
CONVOLUTIONAL neural networks, RECURRENT neural networks, EMOTION recognition, RECOGNITION (Psychology), FACIAL expression & emotions (Psychology), EMOTIONS, LARGE-scale brain networks
- Abstract
Introduction: Multimodal emotion recognition has become a hot topic in human-computer interaction and intelligent healthcare fields. However, combining information from different human modalities for emotion computation is still challenging. Methods: In this paper, we propose a three-dimensional convolutional recurrent neural network model (referred to as 3FACRNN network) based on multimodal fusion and attention mechanism. The 3FACRNN network model consists of a visual network and an EEG network. The visual network is composed of a cascaded convolutional neural network–time convolutional network (CNN-TCN). In the EEG network, the 3D feature building module was added to integrate band information, spatial information and temporal information of the EEG signal, and the band attention and self-attention modules were added to the convolutional recurrent neural network (CRNN). The former explores the effect of different frequency bands on network recognition performance, while the latter obtains the intrinsic similarity of different EEG samples. Results: To investigate the effect of different frequency bands on the experiment, we obtained the average attention mask for all subjects in different frequency bands. The distribution of the attention masks across the different frequency bands suggests that signals more relevant to human emotions may be active in the high-frequency γ band (31–50 Hz). Finally, we try to use the multi-task loss function Lc to force the approximation of the intermediate feature vectors of the visual and EEG modalities, with the aim of using the knowledge of the visual modalities to improve the performance of the EEG network model. The mean recognition accuracy and standard deviation of the proposed method on the two multimodal sentiment datasets DEAP and MAHNOB-HCI (arousal, valence) were 96.75 ± 1.75, 96.86 ± 1.33; 97.55 ± 1.51, 98.37 ± 1.07, better than those of the state-of-the-art multimodal recognition approaches.
Discussion: The experimental results show that starting from the multimodal information, the facial video frames and electroencephalogram (EEG) signals of the subjects are used as inputs to the emotion recognition network, which can enhance the stability of the emotion network and improve the recognition accuracy of the emotion network. In addition, in future work, we will try to utilize sparse matrix methods and deep convolutional networks to improve the performance of multimodal emotion networks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
40. A Literature Review of Age Estimation using Dental Images.
- Author
-
Arethiya, Nikhil Jivraj, Sakhidas, Jainish Jitendra, Jain, Divya Ajay, Patil, Linesh Sachin, and Devmane, Vidyullata
- Subjects
LITERATURE reviews, DECIDUOUS teeth, COMPUTED tomography, MOLARS, CONVOLUTIONAL neural networks
- Abstract
Odontology is the study of dental disease and structure. The human jaw consists of 32 teeth divided into four categories: incisors, canines, premolars, and molars. Every human has two sets of teeth in their life, a set of temporary teeth and a set of permanent teeth. To detect issues in the jaw, the dentist usually suggests scans such as OPG, CBCT, or CTs. OPG is a panoramic X-ray of the upper and lower jaws, whereas CBCT is a medical imaging procedure that uses divergent X-ray computed tomography. While OPG X-rays are generally preferred by dentists since they do not leave any radiation on the patient's anatomical tissues and are effective, easy, and quicker to analyze, CT scan images offer more detailed information than ordinary X-rays. In this paper, we review many methods and models that other researchers in the same field have used to determine a person's age just by looking at scans or X-rays. The radiographs can also be used to identify other patterns, such as missing teeth, impacted teeth, the contour of the jaw, and the growth of the molar teeth. Some of these are also very important in the age estimation procedure. [ABSTRACT FROM AUTHOR]
- Published
- 2024
41. RF-based fingerprinting for indoor localization: deep transfer learning approach.
- Author
-
Safwat, Rokaya, Shaaban, Eman, Al-Tabbakh, Shahinaz. M., and Emara, Karim
- Abstract
Transfer Learning (TL) has emerged as a powerful approach for improving the performance of Deep Learning systems in various domains by leveraging pre-trained models. It has been shown that features learned by deep learning can be reused smoothly across similar domains. Deep transfer learning schemes compensate for limited training data by transferring knowledge from a data-rich environment. This paper investigates the effectiveness of applying TL schemes in indoor localization. It proposes four deep TL models where the knowledge is transferred from the rich-measurement data source domain to multiple target domains with limited data measurements. The architecture of the source domain is based on a Convolutional Neural Network (CNN), and the four deep TL models for the target domain are: standalone feature extractor, integrated feature extractor, selective fine-tuning, and weight initialization. We employed a dataset of RF fingerprinting measurement signals representing common interior conditions, including extremely crowded, medium cluttered, low cluttered, and open environments, to test the effectiveness of the proposed TL models. We measured the accuracy and computation time of target-domain models trained with varied percentages of restricted data sizes: 40%, 30%, 20%, 15%, 10%, 5%, and 2.5%. The experimental results show that all TL models are effective in achieving significant improvement in accuracy when compared to non-transferred models, even with minimal training data size. However, the proper determination of the TL model and the amount of training data profoundly influence the performance results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
42. Leveraging two-level deep learning classifiers for 2D shape recognition to automatically solve geometry math word problems.
- Author
-
Boob, Archana and Radke, Mansi
- Abstract
In mathematics, closed-domain systems for Question Answering (QA) have shown a distinct advantage over open-domain systems, primarily due to their focused use of supporting knowledge bases. This advantage is particularly salient in the era of online and hybrid tutoring, where automatic QA systems have become vital in addressing complex mathematical problems. This paper focuses on the challenge of geometric shape recognition in math word problems (MWPs) accompanied by figures that aid in the solution process. Existing systems rely on manually inputted shape information, which is less efficient. In this work, a novel customized two-layer deep learning model ‘2DGeoShapeNet’ for 2D geometric shape recognition has been developed. At the first level, it recognizes images in broad categories such as circles, quadrilaterals, or triangles. At the second level, the subtypes of quadrilaterals and triangles are detected. The proposed 2D shape detection model is trained and tested on a newly created integrated dataset, ‘GeoCQT’ (Circle, Quadrilateral, and Triangle), consisting of 6K+ images. The proposed deep learning technique achieved 93.98% accuracy on the ‘GeoCQT’ dataset. The performance of the proposed techniques is also evaluated on other geometry math word problem solver datasets such as GeoS, Geometry3K, GeoQA, PGDP5K, and PGPS9K. The proposed technique is compared with the already-published work that employed traditional image processing techniques for 2D shape detection. Findings highlight the superiority of two-level deep learning classifiers in detecting geometric shapes, marking a significant advancement in automated geometry problem-solving. [ABSTRACT FROM AUTHOR]
- Published
- 2024
43. Deep Learning–Based Detection of Vehicle Axle Type with Images Collected via UAV.
- Author
-
Wang, Zhipeng, Zhu, Junqing, and Ma, Tao
- Subjects
CONVOLUTIONAL neural networks, TRAFFIC flow
- Abstract
The identification and quantification of vehicle axle type are essential to evaluate the operational status of road traffic. Uncrewed aerial vehicles (UAV) are helpful in obtaining information about vehicles in most road scenes. This paper proposed the collection of road vehicle information using UAVs with a high-resolution camera. The UAV flight scheme for optimal image quality acquisition was studied, and the collected UAV images were processed. An image data set was established with four vehicle types and nine vehicle axle types. Three state-of-the-art object-detection algorithms, namely, CenterNet, you only look once (YOLO)v7, and Detection Transformer (DETR), were trained on the data set, and their prediction performance was compared. YOLOv7 performed the best among the three algorithms with a mean average precision (MAP) of 97.1%. The YOLOv7 object-detection algorithm was combined with the DeepSORT object-tracking algorithm to achieve detection and statistics of vehicle axle type in traffic flow. The findings of this study help to quickly obtain basic information about the vehicles on the road. [ABSTRACT FROM AUTHOR]
- Published
- 2024
44. Bengali-Sign: A Machine Learning-Based Bengali Sign Language Interpretation for Deaf and Non-Verbal People.
- Author
-
Raihan, Md. Johir, Labib, Mainul Islam, Jim, Abdullah Al Jaid, Tiang, Jun Jiat, Biswas, Uzzal, and Nahid, Abdullah-Al
- Subjects
CONVOLUTIONAL neural networks, MACHINE learning, SIGN language, BENGALI language, MOBILE apps
- Abstract
Sign language is undoubtedly a common way of communication among deaf and non-verbal people. But it is not common among hearing people to use sign language to express feelings or share information in everyday life. Therefore, a significant communication gap exists between deaf and hearing individuals, despite both groups experiencing similar emotions and sentiments. In this paper, we developed a convolutional neural network–squeeze excitation network to predict sign language signs and developed a smartphone application that gives users access to the ML model. The SE block provides attention to the channels of the image, thus improving the performance of the model. On the other hand, the smartphone application brings the ML model close to people so that everyone can benefit from it. In addition, we used the Shapley additive explanation (SHAP) to interpret the black-box nature of the ML model and understand the model's workings from within. Using our ML model, we achieved an accuracy of 99.86% on the KU-BdSL dataset. The SHAP analysis shows that the model primarily relies on hand-related visual cues to predict sign language signs, aligning with human communication patterns. [ABSTRACT FROM AUTHOR]
- Published
- 2024
45. Intra-Pulse Modulation Recognition of Radar Signals Based on Efficient Cross-Scale Aware Network.
- Author
-
Liang, Jingyue, Luo, Zhongtao, and Liao, Renlong
- Subjects
CONVOLUTIONAL neural networks, PARALLEL processing, COMPUTATIONAL complexity, IMAGE recognition (Computer vision), RADAR
- Abstract
Radar signal intra-pulse modulation recognition can be addressed with convolutional neural networks (CNNs) and time–frequency images (TFIs). However, current CNNs have high computational complexity and do not perform well in low-signal-to-noise ratio (SNR) scenarios. In this paper, we propose a lightweight CNN known as the cross-scale aware network (CSANet) to recognize intra-pulse modulation based on three types of TFIs. The cross-scale aware (CSA) module, designed as a residual and parallel architecture, comprises a depthwise dilated convolution group (DDConv Group), a cross-channel interaction (CCI) mechanism, and spatial information focus (SIF). DDConv Group produces multiple-scale features with a dynamic receptive field, CCI fuses the features and mitigates noise in multiple channels, and SIF is aware of the cross-scale details of TFI structures. Furthermore, we develop a novel time–frequency fusion (TFF) feature based on three types of TFIs by employing image preprocessing techniques, i.e., adaptive binarization, morphological processing, and feature fusion. Experiments demonstrate that CSANet achieves higher accuracy with our TFF compared to other TFIs. Meanwhile, CSANet outperforms cutting-edge networks across twelve radar signal datasets, providing an efficient solution for high-precision recognition in low-SNR scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
46. Person re-identification via cross-attention fusion learning of Transformer and CNN features.
- Author
-
项 俊, 张金城, 江小平, and 侯建华
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
47. A CNN-Transformer-based parking slot perception algorithm for automated parking.
- Author
-
王玉龙, 翁茂楠, 黄辉, and 覃小艺
- Abstract
Copyright of Automobile Technology is the property of Automobile Technology Editorial Office and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
48. Innovative Research on Intelligent Recognition of Winter Jujube Defects by Applying Convolutional Neural Networks.
- Author
-
Zhang, Jianjun, Wang, Weihui, and Che, Qinglun
- Subjects
CONVOLUTIONAL neural networks, ARTIFICIAL intelligence, JUJUBE (Plant), MANUAL labor, SURFACE defects
- Abstract
The current sorting process for winter jujubes relies heavily on manual labor, lacks uniform sorting standards, and is inefficient. Furthermore, existing devices have simple structures and can only sort based on size. This paper introduces a method for detecting surface defects on winter jujubes using convolutional neural networks (CNNs). According to the current situation in the winter jujube industry in Zhanhua District, Binzhou City, Shandong Province, China, we collected winter jujubes with different surface qualities in Zhanhua District; produced a winter jujube dataset containing 2000 winter jujube images; improved it based on the traditional AlexNet model; selected a total of four classical convolutional neural networks, AlexNet, VGG-16, Inception-V3, and ResNet-34, to conduct different learning rate comparison training experiments; and then took the accuracy rate, loss value, and F1-score of the validation set as evaluation indexes while analyzing and discussing the training results of each model. The experimental results show that the improved AlexNet model had the highest accuracy in the binary classification case, with an accuracy of 98% on the validation set; the accuracy of the Inception-V3 model reached 97%. In the detailed classification case, the accuracy of the Inception-V3 model was 95%. The models differ in performance and hardware requirements, so the model used to build the system can be chosen according to need. This study can provide a theoretical basis and technical reference for researching and developing winter jujube detection devices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
49. Detection of phishing URLs with deep learning based on GAN-CNN-LSTM network and swarm intelligence algorithms.
- Author
-
Albahadili, Abbas Jabr Saleh, Akbas, Ayhan, and Rahebi, Javad
- Abstract
Phishing attacks are one of the challenges of the Internet and its users. Phishing attacks are an example of social engineering attacks based on deceiving users. In phishing attacks, fake pages that are very similar to legitimate pages are created on the Internet. In phishing attacks, the victim is directed to fake pages, and their valuable information is stolen. Most of the targets of phishing attacks include online payment services, banking, and online sales, so the losses of these attacks are significant. One way to detect phishing attacks is to use machine learning and deep learning methods. The challenge of machine learning and deep learning methods is intelligent feature selection. The lack of feature extraction and intelligent feature selection reduces the accuracy of learning methods in detecting phishing attacks. This paper presents a combined method with deep learning, machine learning, and swarm intelligence algorithms to detect phishing attacks. In the first phase, the dataset is balanced by deep learning based on the GAN. In the second step, the convolutional neural network extracts the primary features from the links and code of web pages. In the third step, the white shark optimizer algorithm selects the essential features. In the last step, the LSTM neural network classifies the samples. The proposed method has been evaluated on ISCX-URL-2016 and Phishtank datasets for feature extraction and selection. The proposed method's accuracy, precision, and sensitivity in the ISCX-URL-2016 dataset are 97.94, 97.82, and 97.76%, respectively. In the Phishtank dataset, the proposed method has accuracy, precision, and sensitivity of 96.78, 95.67, and 95.71%. The proposed method is more accurate than LSTM, CNN, CNN-LSTM, CNN + GA, DNN, VAE-DNN, and AE-DNN methods in detecting phishing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
50. Intrapulse Modulation Radar Signal Recognition Using CNN with Second-Order STFT-Based Synchrosqueezing Transform.
- Author
-
Dong, Ning, Jiang, Hong, Liu, Yipeng, and Zhang, Jingtao
- Subjects
CONVOLUTIONAL neural networks, SIGNAL classification, FOURIER transforms, SIGNAL-to-noise ratio, RADAR, PHOTOPLETHYSMOGRAPHY
- Abstract
Intrapulse modulation classification of radar signals plays an important role in modern electronic reconnaissance, countermeasures, etc. In this paper, to improve the recognition rate at low signal-to-noise ratio (SNR), we propose a recognition method using the second-order short-time Fourier transform (STFT)-based synchrosqueezing transform (FSST2) combined with a modified convolutional neural network, which we name MeNet. In particular, the radar signals are first preprocessed via time–frequency analysis and STFT-based FSST2. Then, the informative features of the time–frequency images (TFIs) are deeply learned and classified through MeNet with several specific convolutional blocks. The simulation results show that the overall recognition rate for seven types of intrapulse modulation radar signals can reach 95.6%, even when the SNR is −12 dB. Compared with other networks, the excellent recognition rate proves the superiority of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2024