64 results for "Kian Ming Lim"
Search Results
2. Enhanced Text-to-Image Synthesis With Self-Supervision
- Author
- Yong Xuan Tan, Chin Poo Lee, Mai Neo, Kian Ming Lim, and Jit Yan Lim
- Subjects
- General Computer Science, General Engineering, General Materials Science, Electrical and Electronic Engineering
- Published
- 2023
- Full Text
- View/download PDF
3. 2SRS: Two-Stream Residual Separable Convolution Neural Network for Hyperspectral Image Classification
- Author
- Zharfan Zahisham, Kian Ming Lim, Voon Chet Koo, Yee Kit Chan, and Chin Poo Lee
- Subjects
- Electrical and Electronic Engineering, Geotechnical Engineering and Engineering Geology
- Published
- 2023
- Full Text
- View/download PDF
4. HGR-ViT: Hand Gesture Recognition with Vision Transformer
- Author
- Chun Keat Tan, Kian Ming Lim, Roy Kwang Yang Chang, Chin Poo Lee, and Ali Alqahtani
- Subjects
- hand gesture recognition, sign language recognition, vision transformer, ViT, attention
- Abstract
Hand gesture recognition (HGR) is a crucial area of research that enhances communication by overcoming language barriers and facilitating human-computer interaction. Although previous works in HGR have employed deep neural networks, they fail to encode the orientation and position of the hand in the image. To address this issue, this paper proposes HGR-ViT, a Vision Transformer (ViT) model with an attention mechanism for hand gesture recognition. Given a hand gesture image, it is first split into fixed-size patches, which are projected into patch embeddings. Positional embeddings are added to these embeddings to form learnable vectors that capture the positional information of the hand patches. The resulting sequence of vectors then serves as the input to a standard Transformer encoder to obtain the hand gesture representation. A multilayer perceptron head is added to the output of the encoder to classify the hand gesture into the correct class. The proposed HGR-ViT obtains an accuracy of 99.98%, 99.36% and 99.85% for the American Sign Language (ASL) dataset, ASL with Digits dataset, and National University of Singapore (NUS) hand gesture dataset, respectively.
- Published
- 2023
- Full Text
- View/download PDF
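The patchify-embed-encode-classify pipeline described in the HGR-ViT abstract above can be made concrete in a few lines of PyTorch. This is a minimal illustrative sketch, not the authors' code; the patch size, embedding dimension, encoder depth, and the 26-class output are assumed values for demonstration.

```python
import torch
import torch.nn as nn

class TinyViTClassifier(nn.Module):
    """Illustrative ViT-style classifier: patchify -> embed -> encode -> MLP head."""
    def __init__(self, image_size=224, patch_size=16, dim=256, depth=4,
                 heads=8, num_classes=26):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        # Linear projection of flattened patches, implemented as a strided conv.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        # Learnable class token and positional embeddings.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)   # MLP head on the class token

    def forward(self, x):                      # x: (B, 3, H, W)
        x = self.patch_embed(x)                # (B, dim, H/p, W/p)
        x = x.flatten(2).transpose(1, 2)       # (B, num_patches, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed  # add positional info
        x = self.encoder(x)                    # standard Transformer encoder
        return self.head(x[:, 0])              # classify from the class token

logits = TinyViTClassifier()(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 26])
```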
5. Fine-Tuned Temporal Dense Sampling with 1D Convolutional Neural Network for Human Action Recognition
- Author
- Kian Ming Lim, Chin Poo Lee, Kok Seang Tan, Ali Alqahtani, and Mohammed Ali
- Subjects
- human action recognition, temporal dense sampling, 1D convolutional neural network (1D ConvNet), 1D-CNN, Inception-ResNet-V2
- Abstract
Human action recognition is a constantly evolving field that is driven by numerous applications. In recent years, significant progress has been made in this area due to the development of advanced representation learning techniques. Despite this progress, human action recognition still poses significant challenges, particularly due to the unpredictable variations in the visual appearance of an image sequence. To address these challenges, we propose the fine-tuned temporal dense sampling with 1D convolutional neural network (FTDS-1DConvNet). Our method involves the use of temporal segmentation and temporal dense sampling, which help to capture the most important features of a human action video. First, the human action video is partitioned into segments through temporal segmentation. Each segment is then processed through a fine-tuned Inception-ResNet-V2 model, where max pooling is performed along the temporal axis to encode the most significant features as a fixed-length representation. This representation is then fed into a 1DConvNet for further representation learning and classification. The experiments on UCF101 and HMDB51 demonstrate that the proposed FTDS-1DConvNet outperforms the state-of-the-art methods, with a classification accuracy of 88.43% on the UCF101 dataset and 56.23% on the HMDB51 dataset.
- Published
- 2023
- Full Text
- View/download PDF
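A minimal sketch of the temporal dense sampling idea from the FTDS-1DConvNet abstract above: per-frame CNN features are split into temporal segments, each segment is max-pooled along the temporal axis into a fixed-length representation, and a small 1D ConvNet classifies the result. The feature size of 1536 matches Inception-ResNet-V2's final pooling output; the segment count and channel sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

def temporal_segments(frame_feats, num_segments=8):
    """Split a (T, D) sequence of per-frame features into segments and
    max-pool each segment along the temporal axis -> (num_segments, D)."""
    chunks = torch.chunk(frame_feats, num_segments, dim=0)
    return torch.stack([c.max(dim=0).values for c in chunks])

# Tiny 1D ConvNet classifier over the fixed-length segment representation.
conv1d_net = nn.Sequential(
    nn.Conv1d(1536, 256, kernel_size=3, padding=1),  # 1536 = Inception-ResNet-V2 feature size
    nn.ReLU(),
    nn.AdaptiveMaxPool1d(1),
    nn.Flatten(),
    nn.Linear(256, 101),                     # e.g. 101 classes for UCF101
)

frames = torch.randn(64, 1536)               # hypothetical CNN features for 64 frames
segs = temporal_segments(frames)              # (8, 1536)
logits = conv1d_net(segs.T.unsqueeze(0))      # Conv1d expects (B, D, T)
print(logits.shape)                           # torch.Size([1, 101])
```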
6. ComSense-CNN: acoustic event classification via 1D convolutional neural network with compressed sensing
- Author
- Pooi Shiang Tan, Kian Ming Lim, Cheah Heng Tan, Chin Poo Lee, and Lee Chung Kwek
- Subjects
- Signal Processing, Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
7. Text-to-image synthesis with self-supervised learning
- Author
- Yong Xuan Tan, Chin Poo Lee, Mai Neo, and Kian Ming Lim
- Subjects
- Artificial Intelligence, Signal Processing, Computer Vision and Pattern Recognition, Software
- Published
- 2022
- Full Text
- View/download PDF
8. 3D Shape Generation via Variational Autoencoder with Signed Distance Function Relativistic Average Generative Adversarial Network
- Author
- Ebenezer Akinyemi Ajayi, Kian Ming Lim, Siew-Chin Chong, and Chin Poo Lee
- Subjects
- 3D shape generation, variational autoencoder, generative adversarial network, signed distance function, relativistic average, Fluid Flow and Transfer Processes, Process Chemistry and Technology, General Engineering, General Materials Science, Instrumentation, Computer Science Applications
- Abstract
3D shape generation is widely applied in various industries to create, visualize, and analyse complex data, designs, and simulations. Typically, 3D shape generation uses a large dataset of 3D shapes as the input. This paper proposes a variational autoencoder with a signed distance function relativistic average generative adversarial network, referred to as 3D-VAE-SDFRaGAN, for 3D shape generation from 2D input images. Both the generative adversarial network (GAN) and variational autoencoder (VAE) algorithms are typical algorithms used to generate realistic 3D shapes. However, it is very challenging to train a stable 3D shape generation model using VAE-GAN. This paper proposes an efficient approach to stabilize the training process of VAE-GAN to generate high-quality 3D shapes. A 3D mesh-based shape is first generated using a 3D signed distance function representation by feeding a single 2D image into a 3D-VAE-SDFRaGAN network. The signed distance function is used to maintain inside–outside information in the implicit surface representation. In addition, a relativistic average discriminator loss function is employed as the training loss function. The polygon mesh surfaces are then produced via the marching cubes algorithm. The proposed 3D-VAE-SDFRaGAN is evaluated with the ShapeNet dataset. The experimental results indicate a notable enhancement in the qualitative performance, as evidenced by the visual comparison of the generated samples, as well as the quantitative performance evaluation using the chamfer distance metric. The proposed approach achieves an average chamfer distance score of 0.578, demonstrating superior performance compared to existing state-of-the-art models.
- Published
- 2023
- Full Text
- View/download PDF
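The relativistic average discriminator loss mentioned in the abstract above scores each sample against the average critic output of the opposite set, rather than in isolation. A generic RaGAN objective in PyTorch follows (this is the standard formulation, assumed here, not the paper's exact code):

```python
import torch
import torch.nn.functional as F

def ragan_d_loss(real_logits, fake_logits):
    """Relativistic average discriminator loss: real samples should look
    more realistic than the *average* fake, and vice versa."""
    real_rel = real_logits - fake_logits.mean()
    fake_rel = fake_logits - real_logits.mean()
    return (F.binary_cross_entropy_with_logits(real_rel, torch.ones_like(real_rel)) +
            F.binary_cross_entropy_with_logits(fake_rel, torch.zeros_like(fake_rel)))

def ragan_g_loss(real_logits, fake_logits):
    """Generator side: the symmetric objective with the labels swapped."""
    real_rel = real_logits - fake_logits.mean()
    fake_rel = fake_logits - real_logits.mean()
    return (F.binary_cross_entropy_with_logits(real_rel, torch.zeros_like(real_rel)) +
            F.binary_cross_entropy_with_logits(fake_rel, torch.ones_like(fake_rel)))

print(ragan_d_loss(torch.randn(8, 1), torch.randn(8, 1)))  # toy critic outputs
```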
9. Gait-CNN-ViT: Multi-Model Gait Recognition with Convolutional Neural Networks and Vision Transformer
- Author
- Jashila Nair Mogan, Chin Poo Lee, Kian Ming Lim, Mohammed Ali, and Ali Alqahtani
- Subjects
- Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry, deep learning, ensemble, gait, gait recognition
- Abstract
Gait recognition, the task of identifying an individual based on their unique walking style, can be difficult because walking styles can be influenced by external factors such as clothing, viewing angle, and carrying conditions. To address these challenges, this paper proposes a multi-model gait recognition system that integrates Convolutional Neural Networks (CNNs) and Vision Transformer. The first step in the process is to obtain a gait energy image, which is achieved by applying an averaging technique to a gait cycle. The gait energy image is then fed into three different models, DenseNet-201, VGG-16, and a Vision Transformer. These models are pre-trained and fine-tuned to encode the salient gait features that are specific to an individual’s walking style. Each model provides prediction scores for the classes based on the encoded features, and these scores are then summed and averaged to produce the final class label. The performance of this multi-model gait recognition system was evaluated on three datasets, CASIA-B, OU-ISIR dataset D, and OU-ISIR Large Population dataset. The experimental results showed substantial improvement compared to existing methods on all three datasets. The integration of CNNs and ViT allows the system to learn both the pre-defined and distinct features, providing a robust solution for gait recognition even under the influence of covariates.
- Published
- 2023
- Full Text
- View/download PDF
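Two of the building blocks in the Gait-CNN-ViT abstract above are easy to make concrete: the gait energy image is a pixel-wise average over one gait cycle, and the fusion step sums and averages the per-class scores of the three models. A toy NumPy sketch (array shapes and values are illustrative stand-ins):

```python
import numpy as np

def gait_energy_image(silhouettes):
    """Average the aligned binary silhouettes over one gait cycle -> GEI."""
    return np.mean(np.stack(silhouettes, axis=0), axis=0)

def fuse_predictions(score_lists):
    """Sum-and-average the per-class scores from several models, then
    pick the class with the highest fused score."""
    fused = np.mean(np.stack(score_lists, axis=0), axis=0)
    return fused.argmax(axis=-1)

cycle = [np.random.randint(0, 2, (128, 88)) for _ in range(30)]  # toy silhouettes
gei = gait_energy_image(cycle)
scores = [np.random.rand(10) for _ in range(3)]  # e.g. DenseNet-201, VGG-16, ViT
print(gei.shape, fuse_predictions(scores))
```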
10. Enhanced Traffic Sign Recognition with Ensemble Learning
- Author
- Xin Roy Lim, Chin Poo Lee, Kian Ming Lim, and Thian Song Ong
- Subjects
- Control and Optimization, Computer Networks and Communications, Instrumentation, traffic sign recognition, convolutional neural network, ensemble learning
- Abstract
With the growing trend in autonomous vehicles, accurate recognition of traffic signs has become crucial. This research focuses on the use of convolutional neural networks for traffic sign classification, specifically utilizing pre-trained models of ResNet50, DenseNet121, and VGG16. To enhance the accuracy and robustness of the model, the authors implement an ensemble learning technique with majority voting to combine the predictions of multiple CNNs. The proposed approach was evaluated on three different traffic sign datasets: the German Traffic Sign Recognition Benchmark (GTSRB), the Belgium Traffic Sign Dataset (BTSD), and the Chinese Traffic Sign Database (TSRD). The results demonstrate the efficacy of the ensemble approach, with recognition rates of 98.84% on the GTSRB dataset, 98.33% on the BTSD dataset, and 94.55% on the TSRD dataset.
- Published
- 2023
- Full Text
- View/download PDF
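The majority-voting step described above reduces to taking the mode of the per-model label predictions. A small sketch with SciPy (the label arrays are made up):

```python
import numpy as np
from scipy import stats

def majority_vote(predictions):
    """Hard-voting ensemble: each row holds one model's predicted labels;
    the per-column mode across models is the final label."""
    preds = np.asarray(predictions)          # shape: (num_models, num_samples)
    return np.ravel(stats.mode(preds, axis=0).mode)

# Hypothetical label predictions from three CNNs (e.g. ResNet50,
# DenseNet121, VGG16) on four test images:
preds = [[3, 7, 7, 1],
         [3, 7, 2, 1],
         [5, 7, 2, 1]]
print(majority_vote(preds))  # [3 7 2 1]
```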
11. Sentiment Analysis With Ensemble Hybrid Deep Learning Model
- Author
- Kian Long Tan, Chin Poo Lee, Kian Ming Lim, and Kalaiarasi Sonai Muthu Anbananthen
- Subjects
- General Computer Science, General Engineering, General Materials Science, Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
12. Efficient-PrototypicalNet with self knowledge distillation for few-shot learning
- Author
- Jit Yan Lim, Chin Poo Lee, Shih Yin Ooi, and Kian Ming Lim
- Subjects
- Contextual image classification, Computer science, Cognitive Neuroscience, Machine learning, Computer Science Applications, Task (computing), Artificial Intelligence, Metric (mathematics), Benchmark (computing), Feature (machine learning), Generalizability theory, Performance improvement, Transfer of learning
- Abstract
The focus of recent few-shot learning research has been on the development of learning methods that can quickly adapt to unseen tasks with small amounts of data and low computational cost. In order to achieve higher performance in few-shot learning tasks, the generalizability of the method is essential, enabling it to generalize well from seen tasks to unseen tasks with a limited number of samples. In this work, we investigate a new metric-based few-shot learning framework which transfers the knowledge from another effective classification model to produce well-generalized embeddings and improve the effectiveness in handling unseen tasks. The idea of our proposed Efficient-PrototypicalNet involves transfer learning, knowledge distillation, and few-shot learning. We employed a pre-trained model as a feature extractor to obtain useful features from tasks and decrease the task complexity. These features reduce the training difficulty in few-shot learning and increase the performance. Besides that, we further apply knowledge distillation to our framework and achieve extra performance improvement. The proposed Efficient-PrototypicalNet was evaluated on five benchmark datasets, i.e., Omniglot, miniImageNet, tieredImageNet, CIFAR-FS, and FC100. The proposed Efficient-PrototypicalNet achieved state-of-the-art performance on most datasets in the 5-way K-shot image classification task, especially on the miniImageNet dataset.
- Published
- 2021
- Full Text
- View/download PDF
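A compact sketch of the metric-based core of Prototypical Networks, plus a generic knowledge-distillation loss, as both appear in the abstract above. The distillation term follows the standard temperature-softened KL formulation, which is an assumption here; the paper's exact loss may differ.

```python
import torch
import torch.nn.functional as F

def prototypes(support_emb, support_lbl, n_way):
    """Class prototype = mean embedding of that class's support samples."""
    return torch.stack([support_emb[support_lbl == c].mean(0) for c in range(n_way)])

def proto_logits(query_emb, protos):
    """Metric-based classification: negative Euclidean distance to each prototype."""
    return -torch.cdist(query_emb, protos)

def distill_loss(student_logits, teacher_logits, T=4.0):
    """Soften both distributions with temperature T and match them (generic KD)."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T

# Toy 5-way episode with 64-d embeddings: 5 support samples per class, 15 queries.
sup = torch.randn(25, 64)
lbl = torch.arange(5).repeat_interleave(5)
qry = torch.randn(15, 64)
logits = proto_logits(qry, prototypes(sup, lbl, 5))
print(logits.shape)  # torch.Size([15, 5])
```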
13. Three-dimensional shape generation via variational autoencoder generative adversarial network with signed distance function
- Author
- Ebenezer Akinyemi Ajayi, Kian Ming Lim, Siew-Chin Chong, and Chin Poo Lee
- Subjects
- General Computer Science, Electrical and Electronic Engineering
- Abstract
Mesh-based 3-dimensional (3D) shape generation from a 2-dimensional (2D) image using a convolution neural network (CNN) framework is an open problem in the computer graphics and vision domains. Most existing CNN-based frameworks lack robust algorithms that can scale well without combining different shape parts. Also, most CNN-based algorithms lack suitable 3D data representations that can fit into CNN without modification(s) to produce high-quality 3D shapes. This paper presents an approach that integrates a variational autoencoder (VAE) and a generative adversarial network (GAN) called 3 dimensional variational autoencoder signed distance function generative adversarial network (3D-VAE-SDFGAN) to create a 3D shape from a 2D image that considerably improves scalability and visual quality. The proposed method only feeds a single 2D image into the network to produce a mesh-based 3D shape. The network encodes a 2D image of the 3D object into the latent representations, and implicit surface representations of 3D objects corresponding to those 2D images are subsequently generated. Hence, a signed distance function (SDF) is proposed to maintain object inside-outside information in the implicit surface representation. Polygon mesh surfaces are then produced using the marching cubes algorithm. The ShapeNet dataset was used in the experiments to evaluate the proposed 3D-VAE-SDFGAN. The experimental results show that 3D-VAE-SDFGAN outperforms other state-of-the-art models.
- Published
- 2023
- Full Text
- View/download PDF
14. Wearable sensor-based human activity recognition with ensemble learning: a comparison study
- Author
- Yee Jia Luwe, Chin Poo Lee, and Kian Ming Lim
- Subjects
- General Computer Science, Electrical and Electronic Engineering
- Abstract
The spectacular growth of wearable sensors has provided a key contribution to the field of human activity recognition. Due to its effective and versatile usage and application in various fields such as smart homes and medical areas, human activity recognition has always been an appealing research topic in artificial intelligence. From this perspective, there are a lot of existing works that make use of accelerometer and gyroscope sensor data for recognizing human activities. This paper presents a comparative study of ensemble learning methods for human activity recognition. The methods include random forest, adaptive boosting, gradient boosting, extreme gradient boosting, and light gradient boosting machine (LightGBM). Among the ensemble learning methods in comparison, light gradient boosting machine and random forest demonstrate the best performance. The experimental results revealed that light gradient boosting machine yields the highest accuracy of 94.50% on the UCI-HAR dataset and 100% on the Single Accelerometer dataset, while random forest records the highest accuracy of 93.41% on the Motion Sense dataset.
- Published
- 2023
- Full Text
- View/download PDF
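The comparison described above maps directly onto scikit-learn and LightGBM estimators. A runnable sketch on synthetic stand-in data (a real experiment would use windowed accelerometer/gyroscope features; XGBoost is omitted here for brevity):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from lightgbm import LGBMClassifier  # pip install lightgbm

# Synthetic stand-in for windowed wearable-sensor features (6 activities).
X, y = make_classification(n_samples=2000, n_features=60, n_classes=6,
                           n_informative=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "adaptive boosting": AdaBoostClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
    "LightGBM": LGBMClassifier(random_state=0),
}
for name, model in models.items():
    model.fit(Xtr, ytr)
    print(name, accuracy_score(yte, model.predict(Xte)))
```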
15. Speech emotion recognition with light gradient boosting decision trees machine
- Author
- Kah Liang Ong, Chin Poo Lee, Heng Siong Lim, and Kian Ming Lim
- Subjects
- General Computer Science, Electrical and Electronic Engineering
- Abstract
Speech emotion recognition aims to identify the emotion expressed in the speech by analyzing the audio signals. In this work, data augmentation is first performed on the audio samples to increase the number of samples for better model learning. The audio samples are comprehensively encoded as the frequency and temporal domain features. In the classification, a light gradient boosting machine is leveraged. The hyperparameter tuning of the light gradient boosting machine is performed to determine the optimal hyperparameter settings. As the speech emotion recognition datasets are imbalanced, the class weights are regulated to be inversely proportional to the sample distribution where minority classes are assigned higher class weights. The experimental results demonstrate that the proposed method outshines the state-of-the-art methods with 84.91% accuracy on the emo-DB dataset, 67.72% on the Ryerson audio-visual database of emotional speech and song (RAVDESS) dataset, and 62.94% on the interactive emotional dyadic motion capture (IEMOCAP) dataset.
- Published
- 2023
- Full Text
- View/download PDF
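The class-weighting scheme in the abstract above (weights inversely proportional to class frequency) can be computed in a few lines and passed to LightGBM's scikit-learn interface. A sketch on toy imbalanced labels:

```python
import numpy as np
from lightgbm import LGBMClassifier  # pip install lightgbm

def inverse_frequency_weights(labels):
    """Class weight inversely proportional to class frequency, so minority
    emotion classes contribute more to the training loss."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = len(labels) / (len(classes) * counts)   # balanced weighting
    return dict(zip(classes.tolist(), weights.tolist()))

y = np.array([0] * 300 + [1] * 60 + [2] * 40)          # imbalanced toy labels
clf = LGBMClassifier(class_weight=inverse_frequency_weights(y))
print(inverse_frequency_weights(y))  # roughly {0: 0.44, 1: 2.22, 2: 3.33}
```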
16. Wearable Sensor-Based Human Activity Recognition with Hybrid Deep Learning Model
- Author
- Yee Jia Luwe, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Human-Computer Interaction, human activity recognition, convolutional neural network, long short-term memory, wearable sensor, Computer Networks and Communications, Communication
- Abstract
It is undeniable that mobile devices have become an inseparable part of humans' daily routines, owing to the persistent growth of high-quality sensor devices, powerful computational resources and massive storage capacity. Similarly, the fast development of Internet of Things technology has motivated the research and wide application of sensors, such as in human activity recognition systems. This has resulted in a substantial body of existing work that utilizes wearable sensors to identify human activities with a variety of techniques. In this paper, a hybrid deep learning model that amalgamates a one-dimensional Convolutional Neural Network with a bidirectional long short-term memory (1D-CNN-BiLSTM) model is proposed for wearable sensor-based human activity recognition. The one-dimensional Convolutional Neural Network transforms the prominent information in the sensor time series data into high-level representative features. Thereafter, the bidirectional long short-term memory encodes the long-range dependencies in the features by gating mechanisms. The performance evaluation reveals that the proposed 1D-CNN-BiLSTM outshines the existing methods with a recognition rate of 95.48% on the UCI-HAR dataset, 94.17% on the Motion Sense dataset and 100% on the Single Accelerometer dataset.
- Published
- 2022
- Full Text
- View/download PDF
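A minimal PyTorch sketch of the 1D-CNN-BiLSTM structure the abstract above describes: a 1D convolution extracts local features from the sensor time series, then a bidirectional LSTM encodes long-range dependencies. The channel counts, kernel size, and 6-class output are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    """1D conv front-end for local patterns, BiLSTM back-end for
    long-range temporal dependencies."""
    def __init__(self, channels=9, num_classes=6):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
        )
        self.bilstm = nn.LSTM(64, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, num_classes)

    def forward(self, x):              # x: (B, channels, T)
        x = self.conv(x)               # (B, 64, T/2)
        x = x.transpose(1, 2)          # LSTM expects (B, T, features)
        out, _ = self.bilstm(x)
        return self.fc(out[:, -1])     # classify from the final time step

print(CNNBiLSTM()(torch.randn(4, 9, 128)).shape)  # torch.Size([4, 6])
```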
17. Gait-ViT: Gait Recognition with Vision Transformer
- Author
- Jashila Nair Mogan, Chin Poo Lee, Kian Ming Lim, and Kalaiarasi Sonai Muthu
- Subjects
- Biometry, gait, gait recognition, deep learning, transformers, vision transformer, ViT, attention, Neural Networks, Computer, Electrical and Electronic Engineering, Biochemistry, Instrumentation, Gait, Atomic and Molecular Physics, and Optics, Analytical Chemistry, Pattern Recognition, Automated
- Abstract
Identifying an individual based on their physical/behavioral characteristics is known as biometric recognition. Gait is one of the most reliable biometrics due to its advantages, such as being perceivable at a long distance and difficult to replicate. The existing works mostly leverage Convolutional Neural Networks for gait recognition. The Convolutional Neural Networks perform well in image recognition tasks; however, they lack the attention mechanism to emphasize more on the significant regions of the image. The attention mechanism encodes information in the image patches, which facilitates the model to learn the substantial features in the specific regions. In light of this, this work employs the Vision Transformer (ViT) with an attention mechanism for gait recognition, referred to as Gait-ViT. In the proposed Gait-ViT, the gait energy image is first obtained by averaging the series of images over the gait cycle. The images are then split into patches and transformed into sequences by flattening and patch embedding. Position embedding, along with patch embedding, are applied on the sequence of patches to restore the positional information of the patches. Subsequently, the sequence of vectors is fed to the Transformer encoder to produce the final gait representation. As for the classification, the first element of the sequence is sent to the multi-layer perceptron to predict the class label. The proposed method obtained 99.93% on CASIA-B, 100% on OU-ISIR D and 99.51% on OU-LP, which exhibit the ability of the Vision Transformer model to outperform the state-of-the-art methods.
- Published
- 2022
18. Hand Gesture Recognition via Lightweight VGG16 and Ensemble Classifier
- Author
- Lee Chung Kwek, Kian Ming Lim, Edmond Ewe, and Chin Poo Lee
- Subjects
- Fluid Flow and Transfer Processes, Process Chemistry and Technology, General Engineering, General Materials Science, Instrumentation, sign language recognition, hand gesture recognition, convolutional neural network (CNN), ensemble classifier, lightweight VGG16, random forest, transfer learning, Computer Science Applications
- Abstract
Gesture recognition has been studied for some time within the fields of computer vision and pattern recognition. A gesture can be defined as a meaningful physical movement of the fingers, hands, arms, or other parts of the body with the purpose of conveying information for environment interaction. For instance, hand gesture recognition (HGR) can be used to recognize sign language, which is the primary means of communication by the deaf and mute. Vision-based HGR is critical in its application; however, there are challenges that need to be overcome, such as variations in the background, illumination, hand orientation and size, and similarities among gestures. The traditional machine learning approach has been widely used in vision-based HGR in recent years, but the complexity of its processing has been a major challenge, especially in the handcrafted feature extraction. The effectiveness of the handcrafted feature extraction technique was not proven across various datasets in comparison to deep learning techniques. Therefore, a hybrid network architecture dubbed Lightweight VGG16 and Random Forest (Lightweight VGG16-RF) is proposed for vision-based hand gesture recognition. The proposed model adopts feature extraction techniques via the convolutional neural network (CNN) while using a machine learning method to perform classification. Experiments were carried out on publicly available datasets such as American Sign Language (ASL), ASL Digits and the NUS Hand Posture dataset. The experimental results demonstrate that the proposed model, a combination of lightweight VGG16 and random forest, outperforms other methods.
- Published
- 2022
- Full Text
- View/download PDF
19. Design and Development of a Drone Based Hyperspectral Imaging System
- Author
- Yee Kit Chan, Voon Chet Koo, Muhammad Zharfan Adli Zahisham, Kian Ming Lim, Tee Connie, Chee Siong Lim, Yang-Lang Chang, Yang Ping Lee, and Haryati Abidin
- Published
- 2022
- Full Text
- View/download PDF
20. Advances in Vision-Based Gait Recognition: From Handcrafted to Deep Learning
- Author
- Jashila Nair Mogan, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Biometry, Deep Learning, Humans, Walking, Electrical and Electronic Engineering, Biochemistry, Instrumentation, Gait, Atomic and Molecular Physics, and Optics, Algorithms, Analytical Chemistry
- Abstract
Identifying people by their behavioral biometrics has attracted many researchers' attention in the biometrics industry. Gait is a behavioral trait, whereby an individual is identified based on their walking style. Over the years, gait recognition has been performed using handcrafted approaches. However, due to the effects of several covariates, the competence of these approaches has been compromised. Deep learning is an emerging algorithm in the biometrics field, which has the capability to tackle the covariates and produce highly accurate results. In this paper, a comprehensive overview of existing deep learning-based gait recognition approaches is presented. In addition, a summary of the performance of these approaches on different gait datasets is provided.
- Published
- 2022
21. COVID-19 Diagnosis on Chest Radiographs with Enhanced Deep Neural Networks
- Author
- Kian Ming Lim and Chin Poo Lee
- Subjects
- COVID-19, deep neural networks, chest X-ray, chest radiograph, DenseNet, fine-tuning, pre-trained, CNN, Clinical Biochemistry
- Abstract
The COVID-19 pandemic has caused a devastating impact on social activity, the economy and politics worldwide. Techniques to diagnose COVID-19 cases by examining anomalies in chest X-ray images are urgently needed. Inspired by the success of deep learning in various tasks, this paper evaluates the performance of four deep neural networks in detecting COVID-19 patients from their chest radiographs. The deep neural networks studied include VGG16, MobileNet, ResNet50 and DenseNet201. Preliminary experiments show that all deep neural networks perform promisingly, while DenseNet201 outshines other models. Nevertheless, the sensitivity rates of the models are below expectations, which can be attributed to several factors: limited publicly available COVID-19 images, imbalanced sample size for the COVID-19 class and non-COVID-19 class, overfitting or underfitting of the deep neural networks and that the feature extraction of pre-trained models does not adapt well to the COVID-19 detection task. To address these factors, several enhancements are proposed, including data augmentation, adjusted class weights, early stopping and fine-tuning, to improve the performance. Empirical results on DenseNet201 with these enhancements demonstrate outstanding performance, with an accuracy of 0.999, precision of 0.9899, sensitivity of 0.98, specificity of 0.9997 and F1-score of 0.9849 on the COVID-Xray-5k dataset.
- Published
- 2022
22. Convolutional neural network with spatial pyramid pooling for hand gesture recognition
- Author
- Yong Soon Tan, Cheng-Yaw Low, Chin Poo Lee, Kian Ming Lim, and Connie Tee
- Subjects
- American Sign Language, Computer science, Speech recognition, Pooling, Convolutional neural network, Artificial Intelligence, Gesture recognition, Feature (machine learning), Pyramid (image processing), Representation (mathematics), Software, Gesture
- Abstract
Hand gestures provide a means for humans to interact through a series of gestures. While hand gestures play a significant role in human-computer interaction, they also break down the communication barrier and simplify the communication process between the general public and the hearing-impaired community. This paper outlines a convolutional neural network (CNN) integrated with spatial pyramid pooling (SPP), dubbed CNN-SPP, for vision-based hand gesture recognition. SPP mitigates the problem found in conventional pooling by stacking multi-level pooling together to extend the features fed into a fully connected layer. Provided with inputs of varying sizes, SPP also yields a fixed-length feature representation. Extensive experiments have been conducted to scrutinize the CNN-SPP performance on two well-known American Sign Language (ASL) datasets and one NUS hand gesture dataset. Our empirical results disclose that CNN-SPP prevails over other deep learning-driven instances.
- Published
- 2020
- Full Text
- View/download PDF
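The key property of SPP described above, a fixed-length output from variable-sized feature maps, is easy to demonstrate. A minimal PyTorch sketch with an assumed 3-level pyramid:

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Pool the conv feature map at several grid sizes and concatenate,
    yielding a fixed-length vector regardless of input spatial size."""
    b = feature_map.size(0)
    pooled = [F.adaptive_max_pool2d(feature_map, (n, n)).reshape(b, -1)
              for n in levels]
    return torch.cat(pooled, dim=1)

fmap_small = torch.randn(2, 128, 7, 7)
fmap_large = torch.randn(2, 128, 13, 13)
# Both inputs produce the same 128 * (1 + 4 + 16) = 2688-d representation:
print(spatial_pyramid_pool(fmap_small).shape,
      spatial_pyramid_pool(fmap_large).shape)
```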
23. Recent Advances in Traffic Sign Recognition: Approaches and Datasets
- Author
- Xin Roy Lim, Chin Poo Lee, Kian Ming Lim, Thian Song Ong, Ali Alqahtani, and Mohammed Ali
- Subjects
- Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry
- Abstract
Autonomous vehicles have become a topic of interest in recent times due to the rapid advancement of automobile and computer vision technology. The ability of autonomous vehicles to drive safely and efficiently relies heavily on their ability to accurately recognize traffic signs. This makes traffic sign recognition a critical component of autonomous driving systems. To address this challenge, researchers have been exploring various approaches to traffic sign recognition, including machine learning and deep learning. Despite these efforts, the variability of traffic signs across different geographical regions, complex background scenes, and changes in illumination still poses significant challenges to the development of reliable traffic sign recognition systems. This paper provides a comprehensive overview of the latest advancements in the field of traffic sign recognition, covering various key areas, including preprocessing techniques, feature extraction methods, classification techniques, datasets, and performance evaluation. The paper also delves into the commonly used traffic sign recognition datasets and their associated challenges. Additionally, this paper sheds light on the limitations and future research prospects of traffic sign recognition.
- Published
- 2023
- Full Text
- View/download PDF
24. A Survey of Sentiment Analysis: Approaches, Datasets, and Future Research
- Author
- Kian Long Tan, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Fluid Flow and Transfer Processes, Process Chemistry and Technology, General Engineering, General Materials Science, Instrumentation, Computer Science Applications
- Abstract
Sentiment analysis is a critical subfield of natural language processing that focuses on categorizing text into three primary sentiments: positive, negative, and neutral. With the proliferation of online platforms where individuals can openly express their opinions and perspectives, it has become increasingly crucial for organizations to comprehend the underlying sentiments behind these opinions to make informed decisions. By comprehending the sentiments behind customers’ opinions and attitudes towards products and services, companies can improve customer satisfaction, increase brand reputation, and ultimately increase revenue. Additionally, sentiment analysis can be applied to political analysis to understand public opinion toward political parties, candidates, and policies. Sentiment analysis can also be used in the financial industry to analyze news articles and social media posts to predict stock prices and identify potential investment opportunities. This paper offers an overview of the latest advancements in sentiment analysis, including preprocessing techniques, feature extraction methods, classification techniques, widely used datasets, and experimental results. Furthermore, this paper delves into the challenges posed by sentiment analysis datasets and discusses some limitations and future research prospects of sentiment analysis. Given the importance of sentiment analysis, this paper provides valuable insights into the current state of the field and serves as a valuable resource for both researchers and practitioners. The information presented in this paper can inform stakeholders about the latest advancements in sentiment analysis and guide future research in the field.
- Published
- 2023
- Full Text
- View/download PDF
25. RoBERTa-GRU: A Hybrid Deep Learning Model for Enhanced Sentiment Analysis
- Author
- Kian Long Tan, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Fluid Flow and Transfer Processes, Process Chemistry and Technology, General Engineering, General Materials Science, sentiment analysis, deep learning, Transformer, RoBERTa, GRU, Instrumentation, Computer Science Applications
- Abstract
This paper proposes a novel hybrid model for sentiment analysis. The model leverages the strengths of both the Transformer model, represented by the Robustly Optimized BERT Pretraining Approach (RoBERTa), and the Recurrent Neural Network, represented by Gated Recurrent Units (GRU). The RoBERTa model provides the capability to project the texts into a discriminative embedding space through its attention mechanism, while the GRU model captures the long-range dependencies of the embedding and addresses the vanishing gradients problem. To overcome the challenge of imbalanced datasets in sentiment analysis, this paper also proposes the use of data augmentation with word embeddings by over-sampling the minority classes. This enhances the representation capacity of the model, making it more robust and accurate in handling the sentiment classification task. The proposed RoBERTa-GRU model was evaluated on three widely used sentiment analysis datasets: IMDb, Sentiment140, and Twitter US Airline Sentiment. The results show that the model achieved an accuracy of 94.63% on IMDb, 89.59% on Sentiment140, and 91.52% on Twitter US Airline Sentiment. These results demonstrate the effectiveness of the proposed RoBERTa-GRU hybrid model in sentiment analysis.
- Published
- 2023
- Full Text
- View/download PDF
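A minimal sketch of the RoBERTa-GRU pairing described above, using the Hugging Face transformers library: RoBERTa's token-level hidden states feed a GRU, whose final hidden state is classified. The GRU hidden size of 128 and the 3-class output are assumptions for illustration, not the paper's settings.

```python
import torch
import torch.nn as nn
from transformers import RobertaModel, RobertaTokenizer  # pip install transformers

class RobertaGRU(nn.Module):
    """RoBERTa token embeddings fed through a GRU, then a classification layer."""
    def __init__(self, num_classes=3, hidden=128):
        super().__init__()
        self.roberta = RobertaModel.from_pretrained("roberta-base")
        self.gru = nn.GRU(self.roberta.config.hidden_size, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        states = self.roberta(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        _, h_n = self.gru(states)          # h_n: final GRU hidden state
        return self.fc(h_n[-1])

tok = RobertaTokenizer.from_pretrained("roberta-base")
batch = tok(["great movie", "terrible service"], return_tensors="pt", padding=True)
model = RobertaGRU()
print(model(batch["input_ids"], batch["attention_mask"]).shape)  # torch.Size([2, 3])
```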
26. Herb Classification with Convolutional Neural Network
- Author
- Chin Poo Lee, Jia Wei Tan, and Kian Ming Lim
- Subjects
- Computer science, Machine learning, Convolutional neural network, Medical services, Herb, Softmax function, Artificial intelligence, Max pooling, Dropout (neural networks)
- Abstract
Herbs are plants with savory or aromatic properties that are widely used for flavoring, food, medicine or perfume. The worldwide use of herbal products for healthcare has increased tremendously over the past decades. The plethora of herb species means that recognizing herbs remains a challenge. This has spurred great interest among researchers in pursuing artificial intelligence methods for herb classification. This paper presents a convolutional neural network (CNN) for herb classification. The proposed CNN consists of two convolution layers, two max pooling layers, a fully-connected layer and a softmax layer. The ReLU activation function and dropout regularization are leveraged to improve the performance of the proposed CNN. A dataset with 4067 herb images was collected for evaluation purposes. The proposed CNN model achieves an accuracy above 93%, despite the fact that some herbs are visually similar.
- Published
- 2021
- Full Text
- View/download PDF
27. 1D Convolutional Neural Network with Long Short-Term Memory for Human Activity Recognition
- Author
- Chin Poo Lee, Kian Ming Lim, and Jia Xin Goh
- Subjects
- Early stopping, Computer science, Dimensionality reduction, Pattern recognition, Overfitting, Convolutional neural network, Activity recognition, Softmax function, Artificial intelligence
- Abstract
Human activity recognition aims to determine the actions or behavior of a person based on time series data. In recent years, more large human activity recognition datasets have become available, as data can be collected in easier and cheaper ways. In this work, a 1D Convolutional Neural Network with Long Short-Term Memory Network for human activity recognition is proposed. The 1D Convolutional Neural Network is employed to learn high-level representative features from the accelerometer and gyroscope signal data. The Long Short-Term Memory network is then used to encode the temporal dependencies of the features. The final classification is performed with a softmax classifier. The proposed 1D Convolutional Neural Network with Long Short-Term Memory Network is evaluated on the MotionSense, UCI-HAR, and USC-HAD datasets. The class distributions of these datasets are imbalanced. In view of this, adjusted class weights are proposed to mitigate the imbalanced class issue. Furthermore, early stopping is utilized to reduce overfitting during training. The proposed method achieved promising performance on the MotionSense, UCI-HAR, and USC-HAD datasets, with F1-scores of 98.14%, 91.04%, and 76.42%, respectively.
- Published
- 2021
- Full Text
- View/download PDF
28. Stacked Bidirectional Long Short-Term Memory for Stock Market Analysis
- Author
- Chin Poo Lee, Kian Ming Lim, and Jing Yee Lim
- Subjects
- Stock market prediction, Artificial neural network, Mean squared error, Computer science, Deep learning, Empirical research, Task analysis, Stock market, Data mining, Artificial intelligence
- Abstract
Stock market prediction is a difficult task, as the market is extremely complex and volatile. Researchers are exploring methods to obtain good performance in stock market prediction. In this paper, we propose a Stacked Bidirectional Long Short-Term Memory (SBLSTM) network for stock market prediction. The proposed SBLSTM stacks three bidirectional LSTM networks to form a deep neural network model that can gain better prediction performance in stock price forecasting. Unlike LSTM-based methods, the proposed SBLSTM uses bidirectional LSTM layers to obtain the temporal information in both forward and backward directions. In this way, the long-term dependencies from the past and future stock market values are encapsulated. The performance of the proposed SBLSTM is evaluated on six datasets collected from Yahoo Finance. Additionally, the proposed SBLSTM is compared with the state-of-the-art methods using root mean square error. The empirical studies on six datasets demonstrate that the proposed SBLSTM outperforms the state-of-the-art methods.
- Published
- 2021
- Full Text
- View/download PDF
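A minimal PyTorch sketch of a three-layer stacked bidirectional LSTM regressor with an RMSE metric, mirroring the SBLSTM idea above; the feature count, hidden size, and window length are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SBLSTM(nn.Module):
    """Three stacked bidirectional LSTM layers followed by a regression head."""
    def __init__(self, n_features=5, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=3,
                            batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, 1)   # predict the next closing price

    def forward(self, x):                    # x: (B, T, n_features)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])           # use the last time step's output

def rmse(pred, target):
    return torch.sqrt(nn.functional.mse_loss(pred, target))

window = torch.randn(8, 30, 5)               # 30-day windows of OHLCV features
model = SBLSTM()
print(rmse(model(window), torch.randn(8, 1)))
```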
29. Cryptocurrency Price Prediction with Convolutional Neural Network and Stacked Gated Recurrent Unit
- Author
- Chuen Yik Kang, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Information Systems and Management, cryptocurrency price prediction, price prediction, convolutional neural network, gated recurrent unit, CNN, GRU, Bitcoin, Ethereum, Ripple, Computer Science Applications, Information Systems
- Abstract
Virtual currencies have been declared financial assets that are widely recognized as exchange currencies. Cryptocurrency trading has caught the attention of investors, as cryptocurrencies can be considered highly profitable investments. To optimize the profit of cryptocurrency investments, accurate price prediction is essential. Given that price prediction is a time series task, a hybrid deep learning model is proposed to predict the future price of the cryptocurrency. The hybrid model integrates a 1-dimensional convolutional neural network and stacked gated recurrent unit (1DCNN-GRU). Given the cryptocurrency price data over time, the 1-dimensional convolutional neural network encodes the data into a high-level discriminative representation. Subsequently, the stacked gated recurrent unit captures the long-range dependencies of the representation. The proposed hybrid model was evaluated on three different cryptocurrency datasets, namely Bitcoin, Ethereum, and Ripple. Experimental results demonstrated that the proposed 1DCNN-GRU model outperformed the existing methods with the lowest RMSE values of 43.933 on the Bitcoin dataset, 3.511 on the Ethereum dataset, and 0.00128 on the Ripple dataset.
- Published
- 2022
- Full Text
- View/download PDF
30. Facial Emotion Recognition Using Transfer Learning of AlexNet
- Author
- Chin Poo Lee, Kian Ming Lim, and Sarmela A-P Raja Sekaran
- Subjects
- Facial expression, Contextual image classification, Computer science, Deep learning, Feature extraction, Feature (machine learning), Pattern recognition, Artificial intelligence, Transfer of learning, Facial recognition system, Convolutional neural network
- Abstract
In recent years, facial emotion recognition (FER) has become a prevalent research topic as it can be applied in various areas. The existing FER approaches include handcrafted feature-based methods (HCF) and deep learning methods (DL). HCF methods rely on how well the manual feature extractor performs. The manually extracted features may be exposed to bias, as they depend on the researcher's prior knowledge of the domain. In contrast, DL methods, especially the Convolutional Neural Network (CNN), are good at performing image classification. The downfall of DL methods is that they require extensive data to train and perform recognition efficiently. Hence, we propose a deep learning method based on transfer learning of the pre-trained AlexNet architecture for FER. We perform full-model fine-tuning on the AlexNet, which was previously trained on the ImageNet dataset, using emotion datasets. The proposed model is trained and tested on two widely used facial expression datasets, namely the extended Cohn-Kanade (CK+) dataset and the FER dataset. The proposed framework outperforms the existing state-of-the-art methods in facial emotion recognition, achieving accuracies of 99.44% and 70.52% on the CK+ dataset and the FER dataset, respectively.
- Published
- 2021
- Full Text
- View/download PDF
31. Enhanced AlexNet with Super-Resolution for Low-Resolution Face Recognition
- Author
- Kian Ming Lim, Chin Poo Lee, and Jin Chyuan Tan
- Subjects
- Computer science, Deep learning, Feature extraction, Normalization (image processing), Pattern recognition, Overfitting, Facial recognition system, Regularization (mathematics), Visualization, Artificial intelligence, Dropout (neural networks)
- Abstract
With the advancement in deep learning, high-resolution face recognition has achieved outstanding performance that makes it widely adopted in many real-world applications. Face recognition plays a vital role in visual surveillance systems. However, the images captured by security cameras are at low resolution, making the performance of low-resolution face recognition relatively inferior. In view of this, we propose an enhanced AlexNet with Super-Resolution and Data Augmentation (SRDA-AlexNet) for low-resolution face recognition. Firstly, image super-resolution improves the quality of the low-resolution images to high-resolution images. Subsequently, data augmentation is applied to generate variations of the images for a larger data size. An enhanced AlexNet with batch normalization and dropout regularization is then used for feature extraction. The batch normalization aims to reduce the internal covariate shift by normalizing the input distributions of the mini-batches. Apart from that, the dropout regularization improves the generalization capability and alleviates the overfitting of the model. The extracted features are then classified using the k-Nearest Neighbors method for low-resolution face recognition. Empirical results demonstrate that the proposed SRDA-AlexNet outshines the methods in comparison.
- Published
- 2021
- Full Text
- View/download PDF
32. Visually Similar Handwritten Chinese Character Recognition with Convolutional Neural Network
- Author
- Wei Han Liu, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Early stopping, Computer science, Character (computing), Pattern recognition, Overfitting, Convolutional neural network, Handwriting recognition, Artificial intelligence, Chinese characters, Dropout (neural networks)
- Abstract
Computer vision has penetrated many domains, for instance, security, sports, health and medicine, agriculture, transportation, manufacturing, retail, and the like. One of the computer vision tasks is character recognition. In this work, a visually similar handwritten Chinese character dataset is collected. Subsequently, an enhanced convolutional neural network is proposed for the recognition of visually similar handwritten Chinese characters. The convolutional neural network is enhanced by dropout regularization and an early stopping mechanism to reduce the overfitting problem. The Adam optimizer is also leveraged to accelerate and optimize the training process of the convolutional neural network. The empirical results demonstrate that the enhanced convolutional neural network achieves 97% accuracy, corroborating its better discriminating power in visually similar handwritten Chinese character recognition.
- Published
- 2021
- Full Text
- View/download PDF
33. Traffic Sign Recognition with Convolutional Neural Network
- Author
- Chin Poo Lee, Zhong Bo Ng, and Kian Ming Lim
- Subjects
- Normalization (statistics), Computer science, Stability (learning theory), Pattern recognition, Convolutional neural network, Multilayer perceptron, Traffic sign recognition, Artificial intelligence, Traffic sign, Dropout (neural networks)
- Abstract
Traffic sign recognition is a computer vision technique to recognize the traffic signs placed along the road. In this paper, a traffic sign dataset with approximately 5000 images is collected. This paper presents an ablation analysis of the Multilayer Perceptron and Convolutional Neural Network in traffic sign recognition. The ablation analysis studies the effects of different architectures of the Multilayer Perceptron and Convolutional Neural Network, batch normalization, and dropout. A total of 8 different models are reviewed and their performance is studied. The experimental results demonstrate that Convolutional Neural Networks outperform the Multilayer Perceptron in general. Leveraging dropout layers and batch normalization is effective in improving the stability of the model, achieving 98.62% accuracy in traffic sign recognition.
- Published
- 2021
- Full Text
- View/download PDF
34. Fake News Detection with Hybrid CNN-LSTM
- Author
- Kian Long Tan, Kian Ming Lim, and Chin Poo Lee
- Subjects
- Sequence, Computer science, Feature extraction, Overfitting, Machine learning, Convolutional neural network, Regularization (mathematics), Information and Communications Technology, Social media, Artificial intelligence
- Abstract
In the past decades, information and communication technology has developed rapidly. Therefore, social media has become the main platform for people to share and spread information to others. Although social media has brought a lot of convenience to people, fake news also spreads more rapidly than before. This situation has had a destructive impact on people. In view of this, we propose a hybrid model of a Convolutional Neural Network and Long Short-Term Memory for fake news detection. The Convolutional Neural Network model plays the role of extracting representative high-level sequence features, whereas the Long Short-Term Memory model encodes the long-term dependencies of the sequence features. Two regularization techniques are applied to reduce the model complexity and to mitigate the overfitting problem. The empirical results demonstrate that the proposed Convolutional Neural Network-Long Short-Term Memory model yields the highest F1-score on four fake news datasets.
- Published
- 2021
- Full Text
- View/download PDF
35. FN-Net: A Deep Convolutional Neural Network for Fake News Detection
- Author
- Chin Poo Lee, Kian Long Tan, and Kian Ming Lim
- Subjects
- Computer science, Overfitting, Machine learning, Convolutional neural network, Empirical research, Information and Communications Technology, Social media, Artificial intelligence, Gradient descent
- Abstract
Information and communication technology has evolved rapidly over the past decades, with a substantial development being the emergence of social media. It is the new norm that people share their information instantly and massively through social media platforms. The downside of this is that fake news also spreads more rapidly and diffuses deeper than before. This has caused a devastating impact on people who are misled by fake news. In the interest of mitigating this problem, fake news detection is crucial to help people differentiate the authenticity of the news. In this research, an enhanced convolutional neural network (CNN) model, referred to as Fake News Net (FN-Net), is devised for fake news detection. The FN-Net consists of more pairs of convolution and max pooling layers to better encode the high-level features at different granularities. Besides that, two regularization techniques are incorporated into the FN-Net to address the overfitting problem. The gradient descent process of FN-Net is also accelerated by the Adam optimizer. The empirical studies on four datasets demonstrate that FN-Net outshines the original CNN model.
- Published
- 2021
- Full Text
- View/download PDF
36. New phase of lead chalcogenide alloy: Ternary alloy PbSrSe2 for future thermoelectric application
- Author
- Lay Chen Low, Yee Hui Robin Chang, Yik Seng Yong, Thong Leng Lim, Tiem Leong Yoon, and Kian Ming Lim
- Subjects
- General Materials Science
- Published
- 2022
- Full Text
- View/download PDF
37. Stock Market Prediction using Ensemble of Deep Neural Networks
- Author
- Chin Poo Lee, Lu Sin Chong, and Kian Ming Lim
- Subjects
- Stock market prediction, Computer science, Overfitting, Machine learning, Ensemble learning, Convolutional neural network, Deep neural networks, Stock market, Artificial intelligence, Time series
- Abstract
Stock market prediction has been a challenging task for machines, as time series analysis is needed. In recent years, deep neural networks have been widely applied in many financial time series tasks. Typically, deep neural networks require a huge amount of data samples to train a good model. However, the data samples for the stock market are limited, which makes the networks prone to overfitting. In view of this, this paper leverages deep neural networks with ensemble learning to address this problem. We propose an ensemble of a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and 1DConvNet with LSTM (Conv1DLSTM) to predict the stock market price, named EnsembleDNNs. The performance of the proposed EnsembleDNNs is evaluated on the stock markets of several companies. The experimental results show encouraging performance compared to other baselines.
- Published
- 2020
- Full Text
- View/download PDF
38. Acoustic Event Detection with MobileNet and 1D-Convolutional Neural Network
- Author
- Chin Poo Lee, Kian Ming Lim, Pooi Shiang Tan, and Cheah Heng Tan
- Subjects
- Computer science, Event (computing), Deep learning, Pattern recognition, Overfitting, Convolutional neural network, Convolution, Feature (computer vision), Spectrogram, Artificial intelligence, Energy (signal processing), Dropout (neural networks), Sound wave
- Abstract
Sound waves are a form of energy produced by a vibrating object, travelling through a medium, that can be heard. Generally, sound is used in human communication, music, alerts, and so on. Furthermore, it also helps us understand what events are occurring at the moment, and thereby provides hints about what is happening around us. This has prompted researchers to study how humans understand what event is occurring based on sound waves. In recent years, researchers have also studied how to equip machines with this ability, i.e., acoustic event detection. This study focuses on acoustic event detection leveraging both the frequency spectrogram technique and deep learning methods. Initially, a spectrogram image is generated from the acoustic data by using the frequency spectrogram technique. Then, the generated frequency spectrogram is fed into a pre-trained MobileNet model to extract robust feature representations. In this work, a 1-Dimensional Convolutional Neural Network (1D-CNN) is adopted to train a model for acoustic event detection on the features extracted from the pre-trained MobileNet. The proposed 1D-CNN consists of several alternating convolution and pooling layers. The last pooling layer is flattened and fed into a fully connected layer to classify the events. Dropout is employed to prevent overfitting. The proposed frequency spectrogram with pre-trained MobileNet and 1D-CNN is then evaluated on three datasets: Soundscapes1, Soundscapes2, and UrbanSound8k. From the experimental results, the proposed method obtained F1-scores of 81, 86, and 70 for Soundscapes1, Soundscapes2, and UrbanSound8k, respectively.
- Published
- 2020
- Full Text
- View/download PDF
39. Human Action Recognition with Sparse Autoencoder and Histogram of Oriented Gradients
- Author
- Pooi Shiang Tan, Chin Poo Lee, and Kian Ming Lim
- Subjects
- Computer science, Deep learning, Feature extraction, Pattern recognition, Filter (signal processing), Autoencoder, Grayscale, Histogram of oriented gradients, Hausdorff distance, Region of interest, Histogram, Artificial intelligence
- Abstract
This paper presents a video-based human action recognition method leveraging a deep learning model. Prior to the filtering phase, the input images are pre-processed by converting them into grayscale images. Thereafter, the region of interest that contains the human performing the action is cropped out by a pre-trained pedestrian detector. Next, the region of interest is resized and passed as the input image to the filtering phase. In this phase, the filter kernels are trained using a Sparse Autoencoder on natural images. After obtaining the filter kernels, a convolution operation is performed between the input image and the filter kernels. The filtered images are then passed to the feature extraction phase. The Histogram of Oriented Gradients descriptor is used to encode the local and global texture information of the filtered images. Lastly, in the classification phase, a Modified Hausdorff Distance is applied to classify the test sample to its nearest match based on the histograms. The performance of the deep learning algorithm is evaluated on three benchmark datasets, namely the Weizmann Action Dataset, CAD-60 Dataset and Multimedia University (MMU) Human Action Dataset. The experimental results show that the proposed deep learning algorithm outperforms other methods on the Weizmann Dataset, CAD-60 Dataset and MMU Human Action Dataset, with recognition rates of 100%, 88.24% and 99.5%, respectively.
- Published
- 2020
- Full Text
- View/download PDF
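Two pieces of the pipeline above are straightforward to sketch: the HOG descriptor (here via scikit-image) and the Modified Hausdorff Distance, which takes the larger of the two mean directed nearest-neighbour distances between point sets. The array sizes are arbitrary stand-ins:

```python
import numpy as np
from skimage.feature import hog  # pip install scikit-image

def modified_hausdorff(a, b):
    """Modified Hausdorff distance between two feature-point sets:
    max of the two mean directed nearest-neighbour distances."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return max(d.min(axis=1).mean(), d.min(axis=0).mean())

img = np.random.rand(64, 64)                       # stand-in filtered frame
descriptor = hog(img, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2))           # local + global texture encoding
print(descriptor.shape,
      modified_hausdorff(np.random.rand(10, 2), np.random.rand(12, 2)))
```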
40. Expert System for University Program Recommendation
- Author
- Chin Poo Lee, Yong Ee Low, Kian Ming Lim, and Zhong Bo Ng
- Subjects
- Secondary education, Computer science, Applied psychology, Expert system, Holland Codes, Personality type, Knowledge base, Personality, Personality test, Inference engine
- Abstract
Deciding on the most suitable university program to pursue after completing secondary education is a major step for school-leavers, as it sets the tone of the career path they will embark on in the future. Undeniably, deciding on a university program is a sophisticated problem that involves a broad range of factors, such as geographic location, overall cost, campus security, personality, etc. Most of these factors, for instance, geographic location and campus security, only have a temporary impact on school-leavers while they are attending university. Nevertheless, the personality factor appears to be the long-term and most crucial determinant of success in their professional life. When choosing a university program, school-leavers often make the mistake of following friends' footsteps or adhering to parents' decisions, only to realize years later that it is not what suits their personality or what they want. Considering these challenges, this work studies Holland's Personality Test to examine the personality of school-leavers. There are six personality types in Holland's Personality Test, represented as a RIASEC code or Holland code. On top of that, this work also compiles a comprehensive list of university programs with their corresponding Holland codes. An expert system is then engineered to encode the information into a knowledge base. Upon taking the Holland Personality Test, the expert system updates the facts list based on the answers given. Subsequently, the inference engine activates the production rules if the conditions are fulfilled. Ultimately, the personality type and recommended university programs are presented to the school-leavers.
- Published
- 2020
- Full Text
- View/download PDF
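Purely as a toy illustration of the fact-list/production-rule flow described above, the sketch below scores RIASEC types from test answers and fires a rule when a program's Holland code overlaps the derived code. The program list and Holland codes here are invented placeholders, not the paper's knowledge base.

```python
# Hedged sketch of a rule-style recommender: answers increment RIASEC scores
# (the facts), and a program "rule" fires on Holland-code overlap.
RIASEC = "RIASEC"  # Realistic, Investigative, Artistic, Social, Enterprising, Conventional

# Hypothetical production rules: program -> Holland code.
PROGRAMS = {
    "Mechanical Engineering": "RIC",
    "Graphic Design": "AER",
    "Psychology": "SIA",
    "Accounting": "CEI",
}

def holland_code(scores, top=3):
    """Derive the three-letter Holland code from the accumulated type scores."""
    return "".join(sorted(scores, key=scores.get, reverse=True)[:top])

def recommend(answers):
    """answers: list of (type_letter, agreed) pairs from the personality test."""
    scores = {t: 0 for t in RIASEC}
    for letter, agreed in answers:          # update the fact list
        if agreed:
            scores[letter] += 1
    code = holland_code(scores)
    # A rule fires more strongly the more letters it shares with the user's code.
    ranked = sorted(PROGRAMS.items(),
                    key=lambda kv: len(set(kv[1]) & set(code)), reverse=True)
    return code, ranked[0][0]

print(recommend([("R", True), ("I", True), ("C", True), ("A", False)]))
# -> ('RIC', 'Mechanical Engineering')
```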
41. Food Recognition with ResNet-50
- Author
-
Kian Ming Lim, Zharfan Zahisham, and Chin Poo Lee
- Subjects
0209 industrial biotechnology ,Computer science ,business.industry ,Process (engineering) ,Deep learning ,Cognitive neuroscience of visual object recognition ,02 engineering and technology ,Machine learning ,computer.software_genre ,Convolutional neural network ,Residual neural network ,Field (computer science) ,Food recognition ,020901 industrial engineering & automation ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,computer - Abstract
Object recognition has attracted much attention in recent years. The fact that computers are now able to detect and recognize objects has made the field of Artificial Intelligence, especially machine learning, grow rapidly. The proposed framework uses a Deep Convolutional Neural Network (DCNN) based on the ResNet-50 architecture. Due to limited computational resources for training the whole model, the ResNet-50 architecture is replicated and the pre-trained weights are imported. Thereafter, the last few layers of the model are trained on three datasets acquired online. This process, called fine-tuning a pre-trained model, is one of the most common approaches to building a DCNN architecture. The datasets used to evaluate the performance of the model are ETHZ-FOOD101, UECFOOD100 and UECFOOD256. The parameter settings and results of the proposed method are also presented in this paper; a minimal fine-tuning sketch follows this record.
- Published
- 2020
- Full Text
- View/download PDF
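A minimal sketch of the fine-tuning recipe described above, using torchvision's ImageNet-pretrained ResNet-50: everything is frozen except the last residual stage and a new classifier head. Which layers to unfreeze, the learning rate and the 101-class head (matching ETHZ-FOOD101) are assumptions for illustration, not the paper's reported settings.

```python
# Hedged sketch: fine-tune only the last layers of a pre-trained ResNet-50.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 101                              # ETHZ-FOOD101; adjust per dataset

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for param in model.parameters():               # freeze the whole backbone first
    param.requires_grad = False
for param in model.layer4.parameters():        # unfreeze the last residual stage
    param.requires_grad = True
model.fc = nn.Linear(model.fc.in_features, num_classes)  # new head (trainable)

optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
criterion = nn.CrossEntropyLoss()
# A standard training loop over the food-image loaders would follow here.
```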
42. Isolated sign language recognition using Convolutional Neural Network hand modelling and Hand Energy Image
- Author
-
Chin Poo Lee, Kian Ming Lim, Alan W. C. Tan, and Shing Chiang Tan
- Subjects
Ground truth ,Computer Networks and Communications ,Computer science ,business.industry ,020207 software engineering ,Pattern recognition ,02 engineering and technology ,Sign language ,Tracking (particle physics) ,Convolutional neural network ,Hardware and Architecture ,Position (vector) ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,Artificial intelligence ,Representation (mathematics) ,business ,Software ,Energy (signal processing) - Abstract
This paper presents an isolated sign language recognition system that comprises two main phases: hand tracking and hand representation. In the hand tracking phase, an annotated hand dataset is used to extract hand patches to pre-train Convolutional Neural Network (CNN) hand models. The hand tracking is performed by a particle filter that combines hand motion and the CNN pre-trained hand models into a joint likelihood observation model. The predicted hand position corresponds to the location of the particle with the highest joint likelihood. Based on the predicted hand position, a square hand region centered around the predicted position is segmented and serves as the input to the hand representation phase. In this phase, a compact hand representation is computed by averaging the segmented hand regions; the result is referred to as the "Hand Energy Image (HEI)". Quantitative and qualitative analyses show that the proposed hand tracking method predicts hand positions that are closer to the ground truth. Similarly, the proposed HEI hand representation outperforms other methods in isolated sign language recognition. A minimal sketch of the HEI computation follows this record.
- Published
- 2019
- Full Text
- View/download PDF
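A minimal sketch of the Hand Energy Image step described above: average the tracked, segmented hand patches over the gesture video. The tracker (a particle filter with a CNN likelihood in the paper) is abstracted away as a `track_hand` callback, and the 64x64 patch size is an assumption for illustration.

```python
# Hedged sketch of the Hand Energy Image (HEI): the mean of the segmented
# hand regions across all frames of a gesture video.
import numpy as np
import cv2

PATCH = 64  # assumed square hand-region size

def hand_energy_image(frames, track_hand):
    """frames: iterable of grayscale frames; track_hand(frame) -> (x, y) center."""
    acc = np.zeros((PATCH, PATCH), dtype=np.float64)
    n = 0
    half = PATCH // 2
    for frame in frames:
        x, y = track_hand(frame)                      # predicted hand position
        patch = frame[max(y - half, 0):y + half, max(x - half, 0):x + half]
        patch = cv2.resize(patch, (PATCH, PATCH))     # guard against border crops
        acc += patch.astype(np.float64)
        n += 1
    return (acc / max(n, 1)).astype(np.float32)       # the HEI representation
```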
43. DeepScene: Scene classification via convolutional neural network with spatial pyramid pooling
- Author
-
Pui Sin Yee, Kian Ming Lim, and Chin Poo Lee
- Subjects
Artificial Intelligence ,General Engineering ,Computer Science Applications - Published
- 2022
- Full Text
- View/download PDF
44. A four dukkha state-space model for hand tracking
- Author
-
Kian Ming Lim, Shing Chiang Tan, and Alan W. C. Tan
- Subjects
State-space representation ,American Sign Language ,Computer science ,business.industry ,Cognitive Neuroscience ,020206 networking & telecommunications ,02 engineering and technology ,Sign language ,language.human_language ,Computer Science Applications ,Artificial Intelligence ,Dukkha ,0202 electrical engineering, electronic engineering, information engineering ,language ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Gesture - Abstract
In this paper, we propose a hand tracking method inspired by the notion of the four dukkha in Buddhism: birth, aging, sickness and death (BASD). Based on this philosophy, we formalize the hand tracking problem in the BASD framework and apply it to track hand gestures in isolated sign language videos. The proposed BASD method is a novel nature-inspired computational intelligence method that is able to handle complex real-world tracking problems. The BASD framework operates in a manner similar to a standard state-space model, but maintains multiple hypotheses and integrates hypothesis update and propagation mechanisms that resemble the effects of birth, aging, sickness and death. The survival of a hypothesis depends on the strength, aging and sickness of the existing hypotheses, and new hypotheses are birthed by the fittest pairs of parent hypotheses. These properties resolve the sample impoverishment problem of the particle filter. The estimated hand trajectories show promising results for American Sign Language. A heavily simplified sketch of such a hypothesis loop follows this record.
- Published
- 2017
- Full Text
- View/download PDF
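Purely to make the birth/aging/sickness/death mechanics concrete, the sketch below maintains a weighted hypothesis population: weights decay with age, are re-weighted by an observation score, weak hypotheses die, and new ones are bred from the fittest pair. The scoring function, decay rate, thresholds and blending rule are all invented for illustration; the paper's actual update equations differ.

```python
# Heavily hedged sketch of a BASD-style hypothesis update, one time step.
import random

def basd_step(hypotheses, score, pop_size=50, aging=0.95, death_thresh=0.1):
    """hypotheses: list of (state_tuple, strength); score(state) -> likelihood."""
    # Aging and sickness: strength decays and is re-weighted by the observation.
    aged = [(s, w * aging * score(s)) for s, w in hypotheses]
    # Death: prune hypotheses whose strength falls below the threshold.
    alive = [(s, w) for s, w in aged if w > death_thresh]
    alive.sort(key=lambda sw: sw[1], reverse=True)
    # Birth: the fittest pair produces offspring (here, random blends of states).
    while 2 <= len(alive) < pop_size:
        (s1, w1), (s2, w2) = alive[0], alive[1]
        a = random.random()
        child = tuple(a * x + (1 - a) * y for x, y in zip(s1, s2))
        alive.append((child, (w1 + w2) / 2))
    return alive
```

The estimated hand position at each step would be the state of the strongest surviving hypothesis, analogous to the highest-likelihood particle in a particle filter.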
45. Hand gesture recognition via enhanced densely connected convolutional neural network
- Author
-
Kian Ming Lim, Yong Soon Tan, and Chin Poo Lee
- Subjects
0209 industrial biotechnology ,Network architecture ,Training set ,Computer science ,business.industry ,Deep learning ,Speech recognition ,Feature extraction ,Supervised learning ,General Engineering ,02 engineering and technology ,Sign language ,Convolutional neural network ,Computer Science Applications ,020901 industrial engineering & automation ,Artificial Intelligence ,Gesture recognition ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,Gesture - Abstract
Hand gesture recognition (HGR) serves as a fundamental means of communication and interaction for human beings. While HGR can be applied in human-computer interaction (HCI) to facilitate user interaction, it can also be utilized to bridge language barriers. For instance, HGR can be used to recognize sign language, a visual language represented by hand gestures and used by deaf and mute people all over the world as a primary means of communication. The hand-crafted approach to vision-based HGR typically involves multiple stages of specialized processing, such as hand-crafted feature extraction methods, which are usually designed to deal with particular challenges. Hence, the effectiveness of the system and its ability to deal with varied challenges across multiple datasets are heavily reliant on the methods being utilized. In contrast, a deep learning approach such as a convolutional neural network (CNN) adapts to varied challenges via supervised learning. However, attaining satisfactory generalization on unseen data depends not only on the architecture of the CNN, but also on the quantity and variety of the training data. Therefore, a customized network architecture dubbed the enhanced densely connected convolutional neural network (EDenseNet) is proposed for vision-based hand gesture recognition. The modified transition layer in EDenseNet further strengthens feature propagation by utilizing a bottleneck layer to propagate the reused features to all the feature maps, and the following Conv layer smooths out the unwanted features. The differences between EDenseNet and DenseNet are discussed, and the performance gains are scrutinized in an ablation study; a hedged sketch of such a transition layer follows this record. Furthermore, numerous data augmentation techniques are utilized to attenuate the effect of data scarcity, by increasing the quantity of training data and enriching its variety to further improve generalization. Experiments have been carried out on multiple datasets, namely one NUS hand gesture dataset and two American Sign Language (ASL) datasets. The proposed EDenseNet obtains 98.50% average accuracy without augmented data and 99.64% average accuracy with augmented data, outperforming other deep learning driven methods in both settings.
- Published
- 2021
- Full Text
- View/download PDF
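The sketch below shows a DenseNet-style transition layer with an extra 1x1 bottleneck followed by a 3x3 "smoothing" convolution and average-pool downsampling, in the spirit of the modified transition layer described above. The channel counts and exact layer ordering are assumptions; the paper's EDenseNet may differ in detail.

```python
# Hedged sketch of an "enhanced" DenseNet transition block in PyTorch.
import torch
import torch.nn as nn

class EnhancedTransition(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.BatchNorm2d(in_ch), nn.ReLU(inplace=True),
            nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),   # bottleneck
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, kernel_size=3,
                      padding=1, bias=False),                      # smoothing conv
            nn.AvgPool2d(kernel_size=2, stride=2),                 # downsample
        )

    def forward(self, x):
        return self.block(x)

x = torch.randn(1, 256, 28, 28)
print(EnhancedTransition(256, 128)(x).shape)  # torch.Size([1, 128, 14, 14])
```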
46. Review on Vision-Based Gait Recognition: Representations, Classification Schemes and Datasets
- Author
-
Kian Ming Lim, Chin Poo Lee, and Alan W. C. Tan
- Subjects
Multidisciplinary ,Biometrics ,business.industry ,Computer science ,010401 analytical chemistry ,Feature extraction ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Pattern recognition ,02 engineering and technology ,01 natural sciences ,Motion (physics) ,0104 chemical sciences ,Set (abstract data type) ,Range (mathematics) ,ComputingMethodologies_PATTERNRECOGNITION ,Gait (human) ,0202 electrical engineering, electronic engineering, information engineering ,Benchmark (computing) ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,Representation (mathematics) ,business - Abstract
Gait has a unique advantage at a distance, where other biometrics cannot be used because they are captured at too low a resolution or are obscured, as commonly observed in visual surveillance systems. This paper provides a survey of the technical advancements in vision-based gait recognition. A wide range of publications are discussed, embracing different perspectives of the research in this area, including gait feature extraction, classification schemes and standard gait databases. There are two major groups of state-of-the-art techniques for characterizing gait: model-based and model-free. The model-based approach obtains a set of body or motion parameters via human body or motion modeling. The model-free approach, on the other hand, derives a description of the motion without assuming any model. Each major category is further organized into several subcategories based on the nature of the gait representation. In addition, some widely used classification schemes and benchmark databases for evaluating performance are also discussed.
- Published
- 2017
- Full Text
- View/download PDF
47. Block-based histogram of optical flow for isolated sign language recognition
- Author
-
Alan W. C. Tan, Kian Ming Lim, and Shing Chiang Tan
- Subjects
business.industry ,Feature vector ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Histogram matching ,Optical flow ,02 engineering and technology ,Sign language ,020204 information systems ,Histogram ,Signal Processing ,0202 electrical engineering, electronic engineering, information engineering ,Media Technology ,020201 artificial intelligence & image processing ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Image histogram ,Sign (mathematics) ,Mathematics ,Gesture - Abstract
Highlights: a normalized histogram of optical flow as a hand representation for sign language; block-based histograms provide spatial information and local translation invariance; the block-based histogram of optical flow enables sign-language length invariance. In this paper, we propose a block-based histogram of optical flow (BHOF) to generate a hand representation for sign language recognition. The optical flow of the sign language video is computed in a region centered around the detected hand position. The hand patches of optical flow are segmented into M spatial blocks, where each block is a cuboid of a segment of a frame across the entire sign gesture video. The histogram of each block is then computed and normalized by its sum. The feature vectors of all blocks are then concatenated as the BHOF sign gesture representation. The proposed method provides a compact, scale-invariant representation of the sign language. Furthermore, the block-based histogram encodes spatial information and provides local translation invariance in the extracted optical flow. Additionally, BHOF also introduces sign-language length invariance into the representation, and thereby produces promising recognition rates in signer-independent problems. A minimal sketch of the BHOF computation follows this record.
- Published
- 2016
- Full Text
- View/download PDF
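A minimal sketch of the BHOF computation described above, using OpenCV's Farneback dense optical flow: the flow orientations in each spatial block are accumulated into a magnitude-weighted histogram across the whole clip and normalized per block. The 4x4 block grid, 9 bins and the Farneback parameters are assumptions for illustration.

```python
# Hedged sketch of a block-based histogram of optical flow (BHOF).
import cv2
import numpy as np

def bhof(hand_patches, grid=(4, 4), bins=9):
    """hand_patches: list of equally sized grayscale hand crops over the video."""
    gy, gx = grid
    hists = np.zeros((gy, gx, bins))
    for prev, curr in zip(hand_patches, hand_patches[1:]):
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
        h, w = ang.shape
        for i in range(gy):
            for j in range(gx):
                sl = (slice(i * h // gy, (i + 1) * h // gy),
                      slice(j * w // gx, (j + 1) * w // gx))
                hist, _ = np.histogram(ang[sl], bins=bins,
                                       range=(0, 2 * np.pi), weights=mag[sl])
                hists[i, j] += hist          # accumulate across the whole video
    hists = hists.reshape(gy * gx, bins)
    hists /= hists.sum(axis=1, keepdims=True) + 1e-8   # per-block normalization
    return hists.ravel()                    # the concatenated BHOF representation
```

Because each block spans the entire clip, the descriptor length is fixed regardless of how many frames the sign lasts, which is the length-invariance property the highlights mention.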
48. A feature covariance matrix with serial particle filter for isolated sign language recognition
- Author
-
Shing Chiang Tan, Alan W. C. Tan, and Kian Ming Lim
- Subjects
American Sign Language ,Computer science ,business.industry ,Covariance matrix ,Feature extraction ,General Engineering ,020207 software engineering ,02 engineering and technology ,Sign language ,language.human_language ,Computer Science Applications ,Artificial Intelligence ,Gesture recognition ,0202 electrical engineering, electronic engineering, information engineering ,Feature (machine learning) ,language ,020201 artificial intelligence & image processing ,Computer vision ,Artificial intelligence ,business ,Particle filter ,Sign (mathematics) ,Gesture - Abstract
Highlights: a fusion of median and mode filtering for a better background model; a serial particle filter that can better detect and track the object of interest; a novel covariance matrix feature for isolated sign language representation. As is widely recognized, sign language recognition is a very challenging visual recognition problem. In this paper, we propose a feature covariance matrix based serial particle filter for isolated sign language recognition. At the preprocessing stage, a fusion of the median and mode filters is employed to extract the foreground, thereby enhancing hand detection. We propose to serially track the hands of the signer, as opposed to tracking both hands at the same time, to reduce the misdirection of target objects. Subsequently, the region around the tracked hands is extracted to generate the feature covariance matrix as a compact representation of the tracked hand gesture, thereby reducing the dimensionality of the features. In addition, the proposed feature covariance matrix is able to adapt to new signs due to its ability to integrate multiple correlated features in a natural way, without any retraining process. The experimental results show that the hand trajectories obtained through the proposed serial hand tracking are closer to the ground truth. The sign gesture recognition based on the proposed methods yields an 87.33% recognition rate on American Sign Language. The proposed hand tracking and feature extraction methodology is an important milestone in the development of expert systems designed for sign language recognition, such as automated sign language translation systems. A minimal sketch of a region covariance descriptor follows this record.
- Published
- 2016
- Full Text
- View/download PDF
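To make the covariance-matrix representation concrete, the sketch below computes a classic region covariance descriptor over a hand patch from per-pixel features (coordinates, intensity and first derivatives). This particular five-feature set is an assumption for illustration; the paper's feature set may differ.

```python
# Hedged sketch of a region covariance descriptor for a tracked hand region.
import cv2
import numpy as np

def region_covariance(patch):
    """patch: grayscale hand region; returns a d x d covariance descriptor."""
    patch = patch.astype(np.float64)
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]                        # per-pixel coordinates
    dx = cv2.Sobel(patch, cv2.CV_64F, 1, 0, ksize=3)   # horizontal derivative
    dy = cv2.Sobel(patch, cv2.CV_64F, 0, 1, ksize=3)   # vertical derivative
    feats = np.stack([xs, ys, patch, np.abs(dx), np.abs(dy)])  # d = 5 features
    flat = feats.reshape(5, -1)                        # one row per feature
    return np.cov(flat)                                # 5 x 5, fixed-size, compact
```

The descriptor size depends only on the number of features (here 5x5), not on the patch size, which is why the representation stays compact; adding a new correlated feature only grows the matrix by one row and column, with no retraining of the descriptor itself.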
49. AI-based targeted advertising system
- Author
-
Kian Ming Lim, Siti Fatimah Abdul Razak, Tew Jia Yu, and Chin Poo Lee
- Subjects
Control and Optimization ,Biometrics ,Computer Networks and Communications ,Computer science ,Gender recognition ,Cognitive neuroscience of visual object recognition ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Vehicle recognition ,Object recognition ,Facial recognition system ,Object detection ,Hardware and Architecture ,Human–computer interaction ,Signal Processing ,Targeted advertising ,Age estimation ,Electrical and Electronic Engineering ,Information Systems - Abstract
The most common technologies used in targeted advertising are facial recognition and vehicle recognition. Even though there are existing systems serving targeting purposes, most offer limited functionality and their performance is normally unknown. This paper presents an intelligent targeted advertising system with multiple functionalities, namely facial recognition for gender and age, vehicle recognition, and multiple object detection. The main purpose is to improve the effectiveness of outdoor advertising through biometric approaches and machine learning technology. Machine learning algorithms are implemented for higher recognition accuracy, thereby achieving a better targeted advertising effect.
- Published
- 2019
50. Gait recognition using histograms of temporal gradients
- Author
-
Jashila Nair Mogan, Kian Ming Lim, and Chin Poo Lee
- Subjects
History ,Pixel ,Computer science ,business.industry ,Frame (networking) ,Pattern recognition ,Computer Science Applications ,Education ,Set (abstract data type) ,Euclidean distance ,Gait (human) ,Histogram ,Feature (machine learning) ,Stage (hydrology) ,Artificial intelligence ,business - Abstract
In this paper, we present a gait recognition method using convolutional features and histograms of temporal gradients. The method comprises three stages, namely the convolutional stage, the temporal gradient stage and the classification stage. In the convolutional stage, the video frames are convolved with a set of pre-learned filters. This is followed by the temporal gradient stage, in which the gradient of each convolved frame along the time axis is calculated. Unlike histograms of oriented gradients, which accumulate gradients in the spatial domain, the proposed histogram of temporal gradients encodes gradients in both the spatial and temporal domains. The histogram of temporal gradients captures the gradient patterns of every pixel over the temporal axis throughout the video sequence. By doing so, the feature encodes both spatial and temporal information in the gait cycle. Finally, in the classification stage, majority-voting classification with the Euclidean distance is performed for gait recognition. Experimental results show that the proposed method outperforms the state-of-the-art methods. A minimal sketch of this three-stage flow follows this record.
- Published
- 2020
- Full Text
- View/download PDF
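A minimal sketch of the three-stage flow described above: frames are convolved with pre-learned filters, differentiated along the time axis, and the temporal gradients are histogrammed into a clip descriptor matched by Euclidean nearest-neighbour voting. The filter bank, the global (rather than strictly per-pixel) histogram, the 16 bins and the k=3 vote are simplifying assumptions, not the paper's exact formulation.

```python
# Hedged sketch: convolutional stage -> temporal gradient stage -> voting stage.
import numpy as np
from scipy.signal import convolve2d

def temporal_gradient_histogram(frames, kernels, bins=16):
    """frames: (T, H, W) grayscale video; kernels: list of 2-D filters."""
    feats = []
    for k in kernels:
        conv = np.stack([convolve2d(f, k, mode="same") for f in frames])
        tgrad = np.diff(conv, axis=0)            # gradient along the time axis
        hist, _ = np.histogram(tgrad, bins=bins,
                               range=(tgrad.min(), tgrad.max() + 1e-8))
        feats.append(hist / (hist.sum() + 1e-8))  # normalized per filter
    return np.concatenate(feats)

def classify(query, gallery, k=3):
    """Majority vote over the k Euclidean nearest neighbours in the gallery."""
    dists = sorted((np.linalg.norm(query - h), label) for h, label in gallery)
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)
```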