47 results on '"Shuhaida Ismail"'
Search Results
2. Short-Term Forecasting of Daily Confirmed COVID-19 Cases in Malaysia Using RF-SSA Model
- Author
-
Shazlyn Milleana Shaharudin, Shuhaida Ismail, Noor Artika Hassan, Mou Leong Tan, and Nurul Ainina Filza Sulaiman
- Subjects
COVID-19 ,eigentriples ,forecasting ,recurrent forecasting ,singular spectrum analysis ,trend ,Public aspects of medicine ,RA1-1270
- Abstract
Novel coronavirus (COVID-19) was discovered in Wuhan, China in December 2019 and has affected millions of lives worldwide. On 29th April 2020, Malaysia reported more than 5,000 COVID-19 cases, the second highest in the Southeast Asian region after Singapore. A forecasting model was recently developed to measure and predict daily COVID-19 cases in Malaysia for the next 10 days using previously confirmed cases. A Recurrent Forecasting-Singular Spectrum Analysis (RF-SSA) model is proposed, with the window length (L) and the number of eigentriples (ET) established via several tests. The advantage of this forecasting model is that it separates noise from the trend in a time series and produces significant forecasting results. The RF-SSA model was assessed on the official COVID-19 data released by the World Health Organization (WHO) to predict daily confirmed cases between 30th April and 31st May, 2020. The results revealed that the parameter L = 5 (T/20) was suitable for short time series outbreak data, while choosing an appropriate number of eigentriples was essential because it influenced the forecasting results. The RF-SSA over-forecasted the cases by 0.36%. This signifies the competence of RF-SSA in predicting the impending number of COVID-19 cases. Nonetheless, an enhanced RF-SSA algorithm should be developed to capture extreme data changes more effectively.
- Published
- 2021
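The recurrent SSA forecasting idea summarized in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `ssa_recurrent_forecast` and its parameters are hypothetical, `window` plays the role of L, `rank` plays the role of the number of eigentriples, and the tests the paper uses to choose those parameters are not reproduced.

```python
import numpy as np

def ssa_recurrent_forecast(series, window, rank, steps):
    """Basic SSA reconstruction followed by recurrent forecasting (sketch)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    if not (1 <= rank < window and rank <= k):
        raise ValueError("need 1 <= rank < window and rank <= n - window + 1")

    # Embedding: column j of the trajectory matrix holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])

    # Decomposition: keep the leading `rank` eigentriples.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    u_r = u[:, :rank]
    approx = (u_r * s[:rank]) @ vt[:rank]

    # Diagonal averaging turns the low-rank matrix back into a series.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    recon /= counts

    # Linear recurrence coefficients from the last row of the kept vectors.
    last_row = u_r[-1, :]
    nu2 = float(last_row @ last_row)
    if nu2 >= 1.0:
        raise ValueError("verticality coefficient must be below 1")
    coeffs = (u_r[:-1, :] @ last_row) / (1.0 - nu2)

    # Each new value is a linear combination of the previous window - 1 values.
    values = list(recon)
    for _ in range(steps):
        recent = np.array(values[-(window - 1):])
        values.append(float(coeffs @ recent))
    return np.array(values[n:])

if __name__ == "__main__":
    # Toy data only (a straight line), not case counts from the paper.
    toy = [2 + 0.5 * t for t in range(20)]
    print(ssa_recurrent_forecast(toy, window=5, rank=2, steps=3))
```

A larger `rank` keeps more of the series (and more noise) in the reconstruction, which is one way to read the abstract's remark that the number of eigentriples influences the forecast.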
3. Predictive Modelling of Statistical Downscaling Based on Hybrid Machine Learning Model for Daily Rainfall in East-Coast Peninsular Malaysia
- Author
-
Nurul Ainina Filza Sulaiman, Shazlyn Milleana Shaharudin, Shuhaida Ismail, Nurul Hila Zainuddin, Mou Leong Tan, and Yusri Abd Jalil
- Subjects
statistical downscaling ,machine learning model ,Support Vector Classification ,Support Vector Regression ,Artificial Neural Networks ,Relevant Vector Machine ,Mathematics ,QA1-939
- Abstract
In recent years, climate change has demonstrated the volatility of unexpected events such as typhoons, flooding, and tsunamis that affect people, ecosystems and economies. As a result, predicting future climate has become even more pressing. The statistical downscaling approach was introduced as a solution to provide high-resolution climate projections. This study aims to develop an effective statistical downscaling scheme: a two-phase machine learning technique for daily rainfall projection on the east coast of Peninsular Malaysia. The proposed approaches address the emerging issues. First, Principal Component Analysis (PCA) based on a symmetric correlation matrix is applied to select predictors for the two-phase supervised model and to reduce its dimension. Second, two-phase machine learning techniques are introduced with a predictor selection mechanism. The first phase is a classification using Support Vector Classification (SVC) that determines dry and wet days. Subsequently, regression estimates the amount of rainfall based on the frequency of wet days using Support Vector Regression (SVR), Artificial Neural Networks (ANNs) and Relevant Vector Machines (RVMs). A comparison of the hybrid models’ outcomes reveals that the hybrid of SVC and RVM reproduces the most reasonable daily rainfall prediction and captures high-precipitation extremes. The hybrid model improves climate change predictions by establishing a relationship between the predictand and predictors.
- Published
- 2022
4. Statistical Modeling of RPCA-FCM in Spatiotemporal Rainfall Patterns Recognition
- Author
-
Siti Mariana Che Mat Nor, Shazlyn Milleana Shaharudin, Shuhaida Ismail, Sumayyah Aimi Mohd Najib, Mou Leong Tan, and Norhaiza Ahmad
- Subjects
principal component analysis ,robust principal component analysis ,rainfall patterns ,Tukey’s biweight correlation ,spatiotemporal ,Meteorology. Climatology ,QC851-999
- Abstract
This study was conducted to identify the spatiotemporal torrential rainfall patterns of the East Coast of Peninsular Malaysia, as it is the region most affected by the torrential rainfall of the Northeast Monsoon season. Dimension reduction, such as the classical Principal Components Analysis (PCA) coupled with a clustering approach, is often applied to reduce the dimension of the data while simultaneously performing cluster partitions. However, the classical PCA is highly sensitive to outliers, as it assigns equal weights to each set of observations. Hence, applying the classical PCA could affect the cluster partitions of the rainfall patterns. Furthermore, traditional clustering algorithms only allow each element to belong to exactly one cluster, so observations within overlapping clusters of the torrential rainfall datasets might not be captured effectively. In this study, a statistical model of torrential rainfall pattern recognition was proposed to alleviate these issues. Here, a Robust PCA (RPCA) based on Tukey’s biweight correlation was introduced and the optimum breakdown point to extract the number of components was identified. A breakdown point of 0.4 at 85% cumulative variance percentage efficiently extracted the number of components to avoid low-frequency variations or insignificant clusters on a spatial scale. Based on the extracted components, the rainfall patterns were further characterized using cluster solutions attained with Fuzzy C-means clustering (FCM), which allows data elements to belong to more than one cluster, as the rainfall data structure permits this. Lastly, data generated using a Monte Carlo simulation were used to evaluate the performance of the proposed statistical modeling. It was found that the proposed RPCA-FCM performed better than the classical PCA coupled with FCM in identifying the torrential rainfall patterns of Peninsular Malaysia’s East Coast.
- Published
- 2022
5. Support Vector Machine and Recurrent Neural Network Algorithm for Rainfall Forecasting.
- Author
-
Nur Syahira Jafri, Shuhaida Ismail, Aida Nabilah Sadon, Nur'aina A. Rahman, and Shazlyn Milleana Shaharuddin
- Published
- 2022
6. Application of Box-Jenkins, Artificial Neural Network and Support Vector Machine Model for Water Level Prediction.
- Author
-
Intan Syazwani Noorain, Shuhaida Ismail, Aida Nabilah Sadon, and Suhaila Mohd Yasin
- Published
- 2022
7. Comparative Performance of Various Imputation Methods for River Flow Data.
- Author
-
Nur Aliaa Dalila A. Muhaime, Muhammad Amirul Arifin, Shuhaida Ismail, and Shazlyn Milleana Shaharuddin
- Published
- 2022
8. An adjustment degree of fitting on fuzzy linear regression model toward manufacturing income
- Author
-
Nurfarawahida Ramly, Mohd Saifullah Rusiman, Shuhaida Ismail, Suparman, Firdaus Mohamad Hamzah, and Özlem Gürünlü Alma (MÜ, Fen Fakültesi, İstatistik Bölümü)
- Subjects
Information Systems and Management ,Artificial Intelligence ,Control and Systems Engineering ,Degree of fitting ,Mean square error ,Electrical and Electronic Engineering ,Fuzzy linear regression ,Manufacturing income ,Multiple linear regression
- Abstract
Regression analysis is a popular tool in data analysis, whereas fuzzy regression is usually used for analyzing uncertain and imprecise data. In industry, companies often have difficulty predicting future manufacturing income. Therefore, a new modelling approach is needed to predict a company's future income. This article analyzed manufacturing income using the multiple linear regression (MLR) model and the fuzzy linear regression (FLR) models proposed by Tanaka and Zolfaghari, involving 9 explanatory variables. To find the optimum FLR model, the degree of fitting (H) was adjusted between 0 and 1. The performance of the three methods was measured using mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The analysis showed that FLR with Zolfaghari’s model at a degree of fitting of 0.025 outperformed the MLR and FLR with Tanaka’s model, giving the smallest error value. In conclusion, the manufacturing income is directly proportional to 6 independent variables and inversely proportional to 3 independent variables. This model is suitable for predicting future manufacturing income.
- Published
- 2023
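The three error measures named in the abstract above (MSE, MAE and MAPE) have standard definitions, sketched below. The function names and the toy numbers are illustrative assumptions and are not taken from the paper.

```python
def mse(actual, predicted):
    """Mean square error: average of squared differences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

if __name__ == "__main__":
    # Toy values for illustration only.
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 400.0]
    print(mse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

MSE penalizes large misses more heavily than MAE, while MAPE is scale-free, which is why studies like this one often report all three.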
9. Spatial Torrential Rainfall Modelling in Pattern Analysis Based on Robust PCA Approach
- Author
-
Azman Azid, Mohd Saiful Samsudin, Siti Mariana Che Mat Nor, Mou Leong Tan, Shazlyn Milleana Shaharudin, and Shuhaida Ismail
- Subjects
symbols.namesake ,Climatology ,symbols ,Environmental Chemistry ,Pearson product-moment correlation coefficient ,General Environmental Science ,Mathematics
- Published
- 2021
10. A RPCA-Based Tukey's Biweight for Clustering Identification on Extreme Rainfall Data
- Author
-
Shuhaida Ismail, Siti Mariana Che Mat Nor, Shazlyn Milleana Shaharudin, and Kismiantini Kismiantini
- Subjects
Identification (information) ,Ecology ,business.industry ,Pattern recognition ,Artificial intelligence ,Environmental Science (miscellaneous) ,Cluster analysis ,business ,Pollution ,Nature and Landscape Conservation ,Mathematics
- Published
- 2021
11. Prediction of Epidemic Trends in COVID-19 with Mann-Kendall and Recurrent Forecasting-Singular Spectrum Analysis
- Author
-
Shuhaida Ismail, Shazlyn Milleana Shaharudin, Azman Azid, Mohd Saiful Samsudin, Muhamad Afdal Ahmad Basri, and Mou Leong Tan
- Subjects
Multidisciplinary ,Coronavirus disease 2019 (COVID-19) ,05 social sciences ,Outbreak ,World health ,Test (assessment) ,Trend analysis ,Mann kendall ,Geography ,0502 economics and business ,Statistics ,050211 marketing ,Christian ministry ,Singular spectrum analysis ,050203 business & management
- Abstract
Novel coronavirus, also known as COVID-19, was first discovered in Wuhan, China by the end of 2019. Since then, the virus has claimed millions of lives worldwide. On 29th April 2020, there were more than 5,000 outbreak cases in Malaysia as reported by the Ministry of Health Malaysia (MOHE). This study aims to evaluate the trend of the COVID-19 outbreak using the Mann-Kendall test, and to predict the future cases of COVID-19 in Malaysia using the Recurrent Forecasting-Singular Spectrum Analysis (RF-SSA) model. The RF-SSA model was developed to measure and predict daily COVID-19 cases in Malaysia for the coming 10 days using previously-confirmed cases. A Singular Spectrum Analysis-based forecasting model that discriminates noise in a time series trend is introduced. The RF-SSA model assessment is based on the World Health Organization (WHO) official COVID-19 data to predict the daily confirmed cases after 29th April until 9th May, 2020. The preliminary results of the Mann-Kendall test showed a declining trend pattern for new cases during Restricted Movement Order (RMO) 3 compared to RMO1, RMO2 and RMO4, with a dramatic increase in the COVID-19 outbreak during RMO1. Overall, the RF-SSA over-forecasted the cases by 0.36%. This indicates RF-SSA's competence to predict the impending number of COVID-19 cases. The proposed model predicted that Malaysia would hit single digits in daily confirmed cases of COVID-19 by early June 2020. These findings demonstrate the capability of the RF-SSA model in capturing the trend and predicting the cases of COVID-19 with high accuracy. Nevertheless, an enhanced RF-SSA algorithm should be developed for higher effectiveness in capturing any extreme data changes. © 2021 Penerbit Universiti Kebangsaan Malaysia. All rights reserved.
- Published
- 2021
12. K-means clustering analysis and multiple linear regression model on household income in Malaysia
- Author
-
Gan Pei Yee, Mohd Saifullah Rusiman, Shuhaida Ismail, Suparman Suparman, Firdaus Mohamad Hamzah, and Muhammad Ammar Shafi
- Subjects
Information Systems and Management ,Silhouette analysis ,Artificial Intelligence ,Control and Systems Engineering ,Household income ,K-means clustering ,Mean square error ,Electrical and Electronic Engineering ,Multiple linear regression
- Abstract
Household income plays a significant role in determining a country's socioeconomic standing. This measure is often used by the government to formulate the federal budget and policies that are most appropriate for national development. In spite of this, Malaysia's current economic circumstances continue to be characterized by income disparity. This shortcoming can be addressed by analyzing the household income survey (HIS) conducted by the Department of Statistics Malaysia (DoSM). In this study, a hybrid model is proposed in which K-means and multiple linear regression (MLR) are used for clustering and predicting household income in Malaysia. Based on the experimental results, the K-means clustering analysis in conjunction with the MLR model outperformed the MLR model without clustering, with a smaller mean square error. Clustering analysis thus gives a more accurate estimate of household income because it reduces the variation between households. It is important that household income information reflect the concern of policymakers about the impact of universal and targeted interventions on different socioeconomic groups.
- Published
- 2023
13. Comparison of singular spectrum analysis forecasting algorithms for student’s academic performance during COVID-19 outbreak
- Author
-
Shuhaida Ismail, Nor Ain Maisarah Samsudin, Shazlyn Milleana Shaharudin, Muhammad Fakhrullah Mohd Fuad, and Muhammad Fareezuan Zulfikri
- Subjects
2019-20 coronavirus outbreak ,Point (typography) ,Coronavirus disease 2019 (COVID-19) ,Computer Networks and Communications ,Mechanical Engineering ,Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ,Online learning ,Computer Graphics and Computer-Aided Design ,Computational Theory and Mathematics ,Artificial Intelligence ,Control and Systems Engineering ,Electrical and Electronic Engineering ,Spectrum analysis ,Grading (education) ,Algorithm ,Singular spectrum analysis ,Civil and Structural Engineering
- Abstract
Due to the spread of COVID-19 in Malaysia, all academic activities at educational institutions, including universities, had to be carried out via online learning. However, the effectiveness of online learning remains unanswered. Besides, online learning may have a significant impact if continued in the upcoming academic sessions. Therefore, the core of this study is to predict the academic performance of undergraduate students at one of the public universities in Malaysia by using Recurrent Forecasting-Singular Spectrum Analysis (RF-SSA) and Vector Forecasting-Singular Spectrum Analysis (VF-SSA). The key concept of the predictive model is to improve the efficiency of different types of forecast model in SSA by using two parameters, namely window length (L) and number of leading components (r). The forecasting approaches in the SSA model were based on the Grading Point Assessments (GPA) of undergraduate students from the Faculty of Science and Mathematics, UPSI, taking online classes during the COVID-19 outbreak. The experiment revealed that parameter L = 11 (T/20) gave the best prediction result for the RF-SSA model, with an RMSE value of 0.19 compared to 0.30 for VF-SSA. This signifies the competency of RF-SSA in predicting the students’ academic performance based on GPA for the upcoming semester. Nonetheless, an RF-SSA algorithm should be developed for higher effectiveness by obtaining more data sets, including more respondents from various universities in Malaysia. © 2021 Muhammad Fakhrullah Mohd Fuad et al.
- Published
- 2021
14. A hybrid model of self-organizing maps (SOM) and least square support vector machine (LSSVM) for time-series forecasting.
- Author
-
Shuhaida Ismail, Ani Shabri, and Ruhaidah Samsudin
- Published
- 2011
15. Predictive Modelling of Covid-19 Cases in Malaysia based on Recurrent Forecasting-Singular Spectrum Analysis Approach
- Author
-
Shazlyn Milleana Shaharudin, Nur Syarafina Mohamed, Shuhaida Ismail, Nurul Ainina Filza Sulaiman, and Mou Leong Tan
- Subjects
2019-20 coronavirus outbreak ,Coronavirus disease 2019 (COVID-19) ,Computer science ,Science and engineering ,Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ,Statistics ,Computer Science (miscellaneous) ,Electrical and Electronic Engineering ,Singular spectrum analysis ,Predictive modelling ,World health ,Southeast asia
- Abstract
Covid-19 (novel coronavirus) was discovered in Wuhan, China in December 2019, and has since affected millions of lives worldwide. By 10th April 2020, Malaysia reported more than 4,000 outbreak cases, the highest in Southeast Asia. Recently, a forecasting model was developed to measure and predict daily Covid-19 cases in Malaysia for the coming 10 days using previously-confirmed cases. A Singular Spectrum Analysis-based forecasting model that discriminates noise in a time series trend is introduced. The key concept of the proposed model, RF-SSA, is improving the efficiency of recurrent SSA by establishing L and r parameters via several tests. The RF-SSA model assessment is based on the World Health Organization’s official Covid-19 data to predict the daily confirmed cases after 10th April until 20th April, 2020. These results show that the parameter L = 4 (T/20) for the RF-SSA model was suitable for short time series outbreak data, and that obtaining the appropriate number of eigentriples is important as it influences the forecasting result. Evidently, the RF-SSA has over-forecasted the cases by 0.36%. This indicates RF-SSA’s competence to predict the impending number of Covid-19 cases. Nevertheless, an enhanced RF-SSA algorithm should be developed for higher effectiveness in capturing any extreme data changes. © 2020, World Academy of Research in Science and Engineering. All rights reserved.
- Published
- 2020
16. Comparison of Feedforward Neural Network with Different Training Algorithms for Bitcoin Price Forecasting
- Author
-
Azme Khamis, Eng Chuen Loh, Aida Mustapha, and Shuhaida Ismail
- Subjects
Multidisciplinary ,business.industry ,Computer science ,Training (meteorology) ,Feedforward neural network ,Artificial intelligence ,business
- Abstract
Bitcoin is the most popular cryptocurrency, with the highest market value. It has been said to have the potential to change the way trading is done in the future. However, Bitcoin price prediction is a hard task, which makes it difficult for investors to make decisions. This is caused by the nonlinearity of the Bitcoin price. Hence, a better forecasting method is essential to minimize the risk from inaccurate decisions. The aim of this paper is to compare two different training algorithms, the Levenberg-Marquardt (LM) backpropagation algorithm and the Scaled Conjugate Gradient (SCG) backpropagation algorithm, using a Feedforward Neural Network (FNN) to forecast the Bitcoin price. After obtaining the forecasting results, forecast accuracy measurement is carried out to identify the best model to forecast the Bitcoin price. The results showed that the performance of Bitcoin price forecasting increased after the application of the FNN-LM model. The Levenberg-Marquardt backpropagation algorithm performed better than Scaled Conjugate Gradient backpropagation when forecasting the Bitcoin price using FNN. The resulting model provides new insights into Bitcoin forecasting using the FNN-LM model, which directly benefits investors and economists by lowering the risk of making wrong decisions when investing in Bitcoin. Keywords: Bitcoin Price; Artificial Neural Network; Forecasting
- Published
- 2020
17. Comparative Analysis of Statistical and Machine Learning Methods for Classification of Match Outcomes in Association Football
- Author
-
Syazira Zulkifli, Aida Binti Mustapha, Shuhaida Ismail, and Nazim Razali
- Published
- 2022
18. Long Short-Term vs Gated Recurrent Unit Recurrent Neural Network For Google Stock Price Prediction
- Author
-
Aida Nabilah Sadon, Shazlyn Milleana Shaharudin, Shuhaida Ismail, and Nur Syahira Jafri
- Subjects
Recurrent neural network ,Mean absolute percentage error ,Series (mathematics) ,Mean squared error ,business.industry ,Computer science ,Deep learning ,Statistics ,Artificial intelligence ,Time series ,business ,Term (time) ,Data modeling
- Abstract
Deep Learning, a sub-field of Artificial Intelligence, has proven its powerful performance in many fields. Traditional statistical methods for forecasting time series are less practical and give less valuable predictions. The aim of this study is to propose a Recurrent Neural Network (RNN) model that is suitable for forecasting Google Stock Price time series data. In this study, RNN with Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures are proposed as predictive models known as RNN-LSTM (2), RNN-LSTM (3), RNN-GRU (2), and RNN-GRU (3). The experimental results revealed that RNN-GRU (3) was the best model with lowest error measurements of Root Mean Square Error (RMSE), Median Absolute Percentage Error (MdAPE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Mean Directional Accuracy (MDA). The proposed model showed its capability and applicability in predicting the future values of Google stock price data with good accuracy, and it can be used to predict multi-step ahead values. This analysis shows that the proposed RNN-GRU (3) provides a promising alternative technique for forecasting time-series data.
- Published
- 2021
19. Comparisons Of Data Mining Classification Algorithms For Customers' Shopping Intention In E-Commerce
- Author
-
Shuhaida Ismail, Kek Zhi Xuan, Nur Aliaa Dalila A. Muhaime, and Intan Syazwani Noorain
- Subjects
Computer science ,business.industry ,E-commerce ,computer.software_genre ,Purchasing ,Term (time) ,Random forest ,Support vector machine ,Statistical classification ,Selection (linguistics) ,Data mining ,business ,computer ,Consumer behaviour
- Abstract
Online shopping provides an excellent opportunity and platform for today's traditional businesses. The application of data mining to online shopping helps in understanding consumer behaviour and purchasing patterns and in improving customer experience. The aim of this study is to identify the potential factors that affect customers' purchases on e-commerce platforms and to classify potential customers using various single and ensemble classification algorithms. The experimental results revealed that Random Forest has the best performance in terms of accuracy, precision, recall, F1, and Receiver Operating Characteristic Curve. However, Logistic Regression has the lowest computational time compared to the other algorithms. The results offer a selection of models for classifying potential customers, which directly benefits companies in choosing a suitable model that lowers cost and time when providing a more personalized customer experience.
- Published
- 2021
20. Sentiment Analysis of Snapchat Application's Reviews
- Author
-
Shuhaida Ismail, Weng Hao Wong, Shazlyn Milleana Shaharudin, Mohd Helmy Abd Wahab, Muhammad Amirul Arifin, and Siti Salwa Abdullah Make
- Subjects
Recall ,Computer science ,business.industry ,Sentiment analysis ,Subject (documents) ,Machine learning ,computer.software_genre ,Random forest ,Statistical classification ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial intelligence ,business ,computer ,Multinomial naive bayes
- Abstract
Sentiment analysis is a process of extracting opinion and subjectivity knowledge from user-generated text content without the need to monitor the reviews manually. It can help to obtain an overview of the performance of a product or subject based on the reviews from users. The aim of this study is to classify the Snapchat application's reviews into different polarities, namely positive, neutral or negative. Next, the most frequent words are identified. Furthermore, Multinomial Naive Bayes and Random Forest classification algorithms are used to predict the user's rating. The performances of the classification models are evaluated using accuracy, precision, recall and F1-score. The results showed that the majority of the Snapchat users had a positive experience, with a total of 6037 positive reviews. Based on the performance measures, the Multinomial Naive Bayes classification algorithm performed slightly better than the Random Forest classification algorithm in predicting the rating of the Snapchat application. Overall, both classification algorithms have average performance in predicting users' ratings for the Snapchat application.
- Published
- 2021
21. Gender Voiceprint Identification using Machine Learning Algorithms
- Author
-
Shuhaida Ismail, Nan Mad Sahar, Jia Kian Ong, Kim Gaik Tay, Aida Nabilah Sadon, and Nur'Aina A Rahman
- Subjects
Computer science ,business.industry ,Feature extraction ,Machine learning ,computer.software_genre ,Term (time) ,Random forest ,Support vector machine ,Identification (information) ,Naive Bayes classifier ,Statistical classification ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial intelligence ,business ,Algorithm ,computer ,Selection (genetic algorithm)
- Abstract
Voiceprint identification is a popular technology with high development potential that is used to identify characteristics of a person or subject. However, the selection of classification algorithms behind the voiceprint identification system has to be optimized in order to maximize the performance and capability of the model. In this study, five classification algorithms were presented: Logistic Regression (LR), K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Naive Bayes (NB) and Random Forest (RF). The performances of these algorithms were evaluated before and after the application of a feature extraction technique. The results showed that SVM has the best performance in terms of accuracy and AUC, both with and without PCA. However, NB was identified as the fastest algorithm in terms of computational time. The results also showed that feature extraction techniques yield an insignificant improvement in the classification of the gender voiceprint dataset.
- Published
- 2021
22. Electricity Consumption Forecasting Using Adaptive Neuro-Fuzzy Inference System (ANFIS)
- Author
-
Kim Gaik Tay, Shuhaida Ismail, Hassan Muwafaq, and Pauline Ong
- Subjects
Consumption (economics) ,Multivariate statistics ,Adaptive neuro fuzzy inference system ,business.industry ,Computer science ,Energy management ,020209 energy ,020208 electrical & electronic engineering ,Univariate ,02 engineering and technology ,Machine learning ,computer.software_genre ,Fuzzy logic ,0202 electrical engineering, electronic engineering, information engineering ,Artificial intelligence ,Electricity ,Electrical and Electronic Engineering ,Time series ,business ,computer
- Abstract
Universiti Tun Hussein Onn Malaysia (UTHM) is a developing Malaysian Technical University. UTHM has developed greatly since its formation in 1993. Therefore, it is crucial to have accurate electricity consumption forecasts for its future energy management and saving. Although there are previous works on electricity consumption forecasting using the Adaptive Neuro-Fuzzy Inference System (ANFIS), most of their data are multivariate. In this study, we have only univariate data of UTHM electricity consumption from January 2009 to December 2018 and wish to forecast 2019 consumption. The univariate data was converted to multivariate and ANFIS was chosen as it carries the advantages of both Artificial Neural Network (ANN) and Fuzzy Inference System (FIS). ANFIS yields a MAPE between actual and predicted electricity consumption of 0.4002%, which is relatively low compared to previous works on UTHM electricity forecasting using a time series model (11.14%), first-order fuzzy time series (5.74%), and multiple linear regression (10.62%).
- Published
- 2019
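The abstract above does not say how the univariate series was converted to multivariate form. One common approach, assumed here purely for illustration, is to use the previous few observations as input columns and the next observation as the target. The function name `make_lagged_dataset` and the choice of `n_lags` are hypothetical.

```python
def make_lagged_dataset(series, n_lags):
    """Turn a univariate series into (inputs, targets) using lagged values.

    Each input row holds n_lags consecutive observations and the matching
    target is the observation that immediately follows them.
    """
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))
        targets.append(series[t])
    return inputs, targets

if __name__ == "__main__":
    # Toy values for illustration only, not the UTHM consumption data.
    rows, targets = make_lagged_dataset([10, 12, 11, 13, 14, 15], n_lags=3)
    for row, target in zip(rows, targets):
        print(row, "->", target)
```

With monthly data, a lag count of 12 would let a model see one full seasonal cycle per row, at the cost of losing the first 12 observations as targets.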
23. Empirical Mode Decomposition Couple with Artificial Neural Network for Water Level Prediction
- Author
-
Shuhaida Ismail, Azme Khamis, and Eng Chuen Loh
- Subjects
Flood myth ,Artificial neural network ,Computer science ,Mode (statistics) ,computer.software_genre ,Hilbert–Huang transform ,Variable (computer science) ,Architecture ,Decomposition (computer science) ,Decomposition method (constraint satisfaction) ,Data mining ,Time series ,computer ,Civil and Structural Engineering
- Abstract
Natural disasters bring massive destruction to property and human life, and floods are one of them. For the government to take early action to reduce the damage, accurate flood prediction is necessary. In Malaysia, Kelantan is categorized as a high flood risk area, thus this study focuses on Kelantan flood prediction. This study investigates the effect of decomposition on water level prediction by applying an Artificial Neural Network (ANN) forecasting model. In this study, Empirical Mode Decomposition (EMD) is used as the decomposition method. The best Intrinsic Mode Function (IMF) for each input variable is selected using a correlation-based selection method. The results showed that the performance of the hybrid EMD and ANN is superior to other models, especially the classic ANN model. The reason for this outcome is that through decomposition, ANN is able to capture more in-depth information from the Kelantan hydrological time series data. The resulting model provides new insights for the government and hydrologists in Kelantan to better predict flood occurrence.
- Published
- 2019
24. Electricity Consumption Forecasting Using Nonlinear Autoregressive with External (Exogeneous) Input Neural Network
- Author
-
Pauline Ong, Kim Gaik Tay, Hassan Muwafaq, and Shuhaida Ismail
- Subjects
Consumption (economics) ,Nonlinear autoregressive exogenous model ,Electric power system ,Mean absolute percentage error ,Autoregressive model ,Artificial neural network ,Computer science ,business.industry ,Econometrics ,Univariate ,Electricity ,Electrical and Electronic Engineering ,business
- Abstract
Forecasting is the prediction of future values based on historical data. Electricity consumption forecasting is crucial for utility companies to plan future power system generation. Although there are previous works on electricity consumption forecasting using Artificial Neural Network (ANN), most of their data are multivariate. In this study, we have only univariate data of electricity consumption from January 2009 to December 2018 and wish to make a prediction for a year ahead. In addition, our data contain an autoregressive component, hence the Nonlinear Autoregressive with External (Exogeneous) Input (NARX) Neural Network Time Series from Matlab R2018b was used. It gives a mean absolute percentage error (MAPE) between actual and predicted electricity consumption of 1.38%.
- Published
- 2019
- Full Text
- View/download PDF
25. Short-Term Forecasting of Daily Confirmed COVID-19 Cases in Malaysia Using RF-SSA Model
- Author
-
Noor Artika Hassan, Mou Leong Tan, Shazlyn Milleana Shaharudin, Shuhaida Ismail, and Nurul Ainina Filza Sulaiman
- Subjects
China ,2019-20 coronavirus outbreak ,Coronavirus disease 2019 (COVID-19) ,Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) ,window length ,forecasting ,Southeast asian ,World health ,03 medical and health sciences ,0302 clinical medicine ,0502 economics and business ,Statistics ,Humans ,030212 general & internal medicine ,Singular spectrum analysis ,Original Research ,Singapore ,SARS-CoV-2 ,Spectrum Analysis ,05 social sciences ,Malaysia ,Public Health, Environmental and Occupational Health ,COVID-19 ,eigentriples ,singular spectrum analysis ,Term (time) ,Geography ,trend ,050211 marketing ,Public Health ,Spectrum analysis ,Public aspects of medicine ,RA1-1270 ,recurrent forecasting - Abstract
Novel coronavirus (COVID-19) was discovered in Wuhan, China in December 2019, and has affected millions of lives worldwide. On 29th April 2020, Malaysia reported more than 5,000 COVID-19 cases, the second highest in the Southeast Asian region after Singapore. Recently, a forecasting model was developed to measure and predict COVID-19 cases in Malaysia on a daily basis for the next 10 days using previously confirmed cases. A Recurrent Forecasting-Singular Spectrum Analysis (RF-SSA) model is proposed, with the L and ET parameters established via several tests. The advantage of this forecasting model is that it discriminates noise in a time series trend and produces significant forecasting results. The RF-SSA model assessment was based on the official COVID-19 data released by the World Health Organization (WHO) to predict daily confirmed cases between 30th April and 31st May, 2020. The results revealed that parameter L = 5 (T/20) for the RF-SSA model was indeed suitable for short time series outbreak data, while the appropriate number of eigentriples was integral as it influenced the forecasting results. Evidently, the RF-SSA had over-forecasted the cases by 0.36%. This signifies the competence of RF-SSA in predicting the impending number of COVID-19 cases. Nonetheless, an enhanced RF-SSA algorithm should be developed to capture extreme data changes more effectively.
- Published
- 2021
- Full Text
- View/download PDF
26. Comparison of daily rainfall forecasting using multilayer perceptron neural network model
- Author
-
Suhaila Mohd Yasin, Mazwin Arleena Masngut, Aida Mustapha, and Shuhaida Ismail
- Subjects
Artificial neural network ,Information Systems and Management ,Coefficient of determination ,Forecast error ,Mean squared error ,Computer science ,Mean absolute error ,Autoregressive ,Autoregressive model ,Artificial Intelligence ,Control and Systems Engineering ,Daily rainfall ,Statistics ,Autoregressive integrated moving average ,Electrical and Electronic Engineering ,Forecasting performance measurement ,Multilayer perceptron neural network - Abstract
Rainfall is important in weather forecasting, particularly for the agriculture sector and the environment, which contribute greatly to the economy of the nation. Thus, it is important for hydrologists to forecast daily rainfall so that people in the agriculture sector can plan their harvesting schedules accordingly and secure satisfactory crop yields. This study forecasts future daily rainfall values using an ARIMA model and an Artificial Neural Network (ANN) model. Both methods are evaluated using Mean Absolute Error (MAE), Mean Forecast Error (MFE), Root Mean Squared Error (RMSE) and the coefficient of determination (R²). The results showed that the ANN model outperformed the ARIMA model. The results also showed that ANN under-forecast the daily rainfall data by 2.21%, compared to ARIMA with an over-forecast of -3.34%. The ANN (6,4,1) model produces better results for MAE (8.4208), MFE (2.2188), RMSE (34.6740) and R² (0.9432) than the ARIMA model. This shows that the ANN model outperforms the ARIMA model in predicting daily rainfall values.
- Published
- 2020
27. A comparative study of different imputation methods for daily rainfall data in east-coast Peninsular Malaysia
- Author
-
Shazlyn Milleana Shaharudin, Mou Leong Tan, Siti Mariana Che Mat Nor, Nurul Hila Zainuddin, and Shuhaida Ismail
- Subjects
Control and Optimization ,MCMC ,Mean squared error ,Computer Networks and Communications ,02 engineering and technology ,Root mean square ,symbols.namesake ,Missing value ,Linear regression ,Statistics ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,Imputation (statistics) ,Electrical and Electronic Engineering ,Instrumentation ,Nearest neighbor ,Mathematics ,NIPALS ,020206 networking & telecommunications ,Markov chain Monte Carlo ,Non-linear iterative partial least squares ,Missing data ,Random forest ,Hardware and Architecture ,Control and Systems Engineering ,symbols ,020201 artificial intelligence & image processing ,Replace by mean ,Information Systems - Abstract
Rainfall data are the most significant values in hydrology and climatology modelling. However, the datasets are prone to missing values due to various issues. This study aims to impute missing rainfall values using several imputation methods: Replace by Mean, Nearest Neighbor, Random Forest (RF), Non-linear Iterative Partial Least Squares (NIPALS) and Markov Chain Monte Carlo (MCMC). Daily rainfall datasets from 48 rainfall stations across east-coast Peninsular Malaysia were used in this study. The datasets were then fed into a Multiple Linear Regression (MLR) model. The performance of these methods was evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and the Nash-Sutcliffe Efficiency Coefficient (CE). The experimental results showed that RF coupled with MLR (RF-MLR) was the most suitable approach for imputing the missing data in east-coast Peninsular Malaysia.
- Published
- 2020
- Full Text
- View/download PDF
28. An efficient method to improve the clustering performance using hybrid robust principal component analysis-spectral biclustering in rainfall patterns identification
- Author
-
Shuhaida Ismail, Norhaiza Ahmad, Shazlyn Milleana Shaharudin, and Siti Mariana Che Mat Nor
- Subjects
0106 biological sciences ,Information Systems and Management ,010504 meteorology & atmospheric sciences ,Computer science ,Principal component analysis ,01 natural sciences ,Biclustering ,Correlation ,Cluster analysis ,Artificial Intelligence ,Cluster (physics) ,Electrical and Electronic Engineering ,Tukey's biweight correlation ,Robust principal component analysis ,0105 earth and related environmental sciences ,business.industry ,Pattern recognition ,Spectral biclustering ,Control and Systems Engineering ,Skewness ,Outlier ,Artificial intelligence ,business ,010606 plant biology & botany - Abstract
In this study, a hybrid RPCA-spectral biclustering model is proposed for identifying the Peninsular Malaysia rainfall pattern. The model combines Robust Principal Component Analysis (RPCA) and biclustering in order to overcome the skewness problem present in the Peninsular Malaysia rainfall data. Compared to classical PCA, Robust PCA is more resilient to outliers because it assesses every observation and downweights those that deviate from the data center. Meanwhile, two-way clustering is able to cluster along two variables simultaneously and exhibits a high correlation compared to one-way cluster analysis. The experimental results showed that the best cumulative percentage of variation is between 65% and 70% for both Robust and classical PCA. Meanwhile, the number of clusters increased from six disjoint clusters in Robust PCA-kMeans to eight disjoint clusters in the proposed model. Further analysis shows that the proposed model has a smaller variation, with a value of 0.0034 compared to 0.030 for the Robust PCA-kMeans model. This analysis indicates that the proposed RPCA-spectral biclustering model is well suited to identifying rainfall patterns in Peninsular Malaysia owing to the small variation of the clustering result.
- Published
- 2019
29. A Hybrid of Multiple Linear Regression Clustering Model with Support Vector Machine for Colorectal Cancer Tumor Size Prediction
- Author
-
Muhamad Ghazali Kamardan, Muhammad Ammar Shafi, Mohd Saifullah Rusiman, and Shuhaida Ismail
- Subjects
General Computer Science ,Tumor size ,Mean squared error ,Computer science ,Colorectal cancer ,medicine.disease ,Support vector machine ,03 medical and health sciences ,0302 clinical medicine ,030220 oncology & carcinogenesis ,Linear regression ,Statistics ,medicine ,030212 general & internal medicine ,Cluster analysis - Abstract
This study proposes a new hybrid model of Multiple Linear Regression Clustering (MLRC) combined with Support Vector Machine (SVM) to predict the tumor size of colorectal cancer (CRC). Three models, Multiple Linear Regression (MLR), MLRC and the hybrid MLRC with SVM, were compared using two statistical error measures to find the best model for predicting colorectal cancer tumor size. The proposed hybrid MLRC with SVM model found two significant clusters, with 15 and three significant variables in clusters 1 and 2, respectively. The experiments found that the proposed model was the best model, with the lowest Mean Square Error (MSE) and Root Mean Square Error (RMSE). This finding sheds light for health practitioners on the factors that contribute to colorectal cancer.
- Published
- 2019
- Full Text
- View/download PDF
30. Forecasting accuracy: a comparative study between artificial neural network and autoregressive model for streamflow
- Author
-
Wan Nur Hawa Fatihah Wan Zurey, Aida Mustapha, and Shuhaida Ismail
- Subjects
Artificial neural network ,Information Systems and Management ,Mean squared error ,Forecast error ,Computer science ,Mean absolute error ,Streamflow ,Autoregressive ,Mean absolute percentage error ,Autoregressive model ,Artificial Intelligence ,Control and Systems Engineering ,Statistics ,Electrical and Electronic Engineering ,Reliability (statistics) ,Forecasting - Abstract
Estimating the reliability of potential predictions is crucial, as our lives depend heavily on them. Thus, simulations that concern hydrological factors such as streamflow must be enhanced. In this study, Autoregressive (AR) and Artificial Neural Network (ANN) models were used. The forecasting result for each model was assessed using various performance measurements: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Mean Forecast Error (MFE) and Nash-Sutcliffe Model Efficiency Coefficient (CE). The experimental results showed that, on the Durian Tunggal reservoir datasets, ANN Model 7 with 7 hidden neurons has better forecast performance than AR (4). The ANN model has the smallest MAE (0.0116 m³/s), RMSE (0.0607 m³/s), MAPE (1.8214% m³/s) and MFE (0.0058 m³/s) and the largest CE (0.9957 m³/s), which shows its capability of fitting a nonlinear dataset. In conclusion, high predictive precision is an advantage as a proactive or precautionary measure, since it allows certain negative effects to be anticipated and avoided.
- Published
- 2020
- Full Text
- View/download PDF
31. The Combination of Autoregressive Integrated Moving Average (ARIMA) and Support Vector Machines (SVM) for Daily Rubber Price Forecasting
- Author
-
Shuhaida Ismail, Lai Jing Jong, Mohd Helmy Abd Wahab, Aida Mustapha, and Syed Zulkarnain Syed Idrus
- Subjects
Support vector machine ,Natural rubber ,Computer science ,business.industry ,visual_art ,visual_art.visual_art_medium ,Autoregressive integrated moving average ,Artificial intelligence ,Machine learning ,computer.software_genre ,business ,computer - Abstract
Natural rubber (NR) prices are easily affected by long-term and short-term developments on the supply and demand sides, as well as by the effects of exchange rates. Because monthly, quarterly, and annual data have undergone smoothing, they may miss some of the important characteristics or information describing the rubber price. Since the NR price changes daily, this study focuses on predicting future daily prices. A combination model of Autoregressive Integrated Moving Average (ARIMA) and Support Vector Machine (SVM) is proposed in order to capture the future value of NR prices. The experimental results show that the proposed model performs best, under-predicting by 6.31% with an r value of 0.9976, compared to the single ARIMA and SVM models. As a result, the combination model proves to be an effective tool for improving forecasting accuracy by reducing the model forecast error.
- Published
- 2020
- Full Text
- View/download PDF
32. Crude Oil Price Forecasting Using Hybrid Support Vector Machine
- Author
-
Mohd Helmy Abd Wahab, Lee Jo Xian, Aida Mustapha, Shuhaida Ismail, and Syed Zulkarnain Syed Idrus
- Subjects
Support vector machine ,Computer science ,Agricultural engineering ,Crude oil - Abstract
The crude oil price strongly impacts the world economy. However, it fluctuates heavily, which makes decisions difficult for investors. Hence, forecasting is one way to minimize the risks that arise from uncertainty about the future. This paper applies Support Vector Machine (SVM), Artificial Neural Network (ANN) and a proposed hybrid model named Empirical Mode Decomposition-Support Vector Machine (EMD-SVM) to forecast the crude oil price. After the forecasting results are obtained, a performance evaluation is carried out to show which method forecasts the crude oil price better. The results show that the performance of crude oil price forecasting can be significantly increased by using the proposed hybrid EMD-SVM model. This indicates that the hybrid model outperforms the individual forecasting models.
- Published
- 2020
- Full Text
- View/download PDF
33. Machine learning approach for flood risks prediction
- Author
-
Shuhaida Ismail, Aida Mustapha, and Nazim Razali
- Subjects
Support Vector Machine ,Information Systems and Management ,Computer science ,Decision Tree ,0208 environmental biotechnology ,Decision tree ,02 engineering and technology ,Machine learning ,computer.software_genre ,Artificial Intelligence ,Bayesian Network ,k-Nearest Neighbour ,Electrical and Electronic Engineering ,Natural disaster ,Flood myth ,business.industry ,Bayesian network ,021001 nanoscience & nanotechnology ,020801 environmental engineering ,Support vector machine ,Control and Systems Engineering ,Early warning system ,Preventive action ,Artificial intelligence ,0210 nano-technology ,business ,Flood Prediction ,computer ,Predictive modelling - Abstract
Floods are one of the main natural disasters that occur all around the globe, caused by the laws of nature. They have caused vast destruction of property and livestock, and even loss of life. Therefore, developing an accurate and efficient flood risk prediction as an early warning system is essential. This study aims to develop a predictive model following the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology, using a Bayesian Network (BN) and other Machine Learning (ML) techniques such as Decision Tree (DT), k-Nearest Neighbours (kNN) and Support Vector Machine (SVM) for flood risk prediction in Kuala Krai, Kelantan, Malaysia. The data cover a 5-year period from 2012 to 2016 and consist of 1,827 observations. The performance of the models was compared in terms of accuracy, precision, recall and f-measure. The results showed that DT with the SMOTE method performed best, achieving 99.92% accuracy. The SMOTE method was also found to be highly effective in dealing with the imbalanced dataset. It is hoped that the findings of this research may assist government and non-government organizations in taking preventive action against floods, which commonly occur in Malaysia due to the wet climate.
- Published
- 2020
- Full Text
- View/download PDF
34. Prediction of alcohol consumption among Portuguese secondary school students: A data mining approach
- Author
-
Nik Intan Areena Nik Azlan, Shuhaida Ismail, and Aida Mustapha
- Subjects
Normalization (statistics) ,Computer science ,business.industry ,Decision tree learning ,Decision tree ,Machine learning ,computer.software_genre ,Cross-validation ,Random forest ,Naive Bayes classifier ,Statistical classification ,Artificial intelligence ,Precision and recall ,business ,computer - Abstract
This paper performs a comparative experiment on the prediction of alcohol consumption among secondary school students. The data set used in this project, which contains 34 attributes, was gathered from two Portuguese secondary schools in the year 2005–2006. Four classification algorithms are proposed and implemented: Decision Tree, k-Nearest Neighbour (k-NN), Random Forest and Naive Bayes. These methods were trained and tested using 10-fold cross validation. The results showed that the Decision Tree algorithm produced the highest values for accuracy, recall and precision compared to the other classification algorithms. In addition, Naive Bayes combined with Interquartile normalization was observed to provide a promising alternative classification technique in this area.
- Published
- 2018
- Full Text
- View/download PDF
35. Behavioural features for mushroom classification
- Author
-
Amy Rosshaida Zainal, Aida Mustapha, and Shuhaida Ismail
- Subjects
Mushroom ,Statistical classification ,education.field_of_study ,business.industry ,Principal component analysis ,Metric (mathematics) ,Population ,Feature (machine learning) ,Decision tree ,Pattern recognition ,Artificial intelligence ,business ,education - Abstract
Mushrooms offer considerable benefits to the human body. However, not all mushrooms are edible. While some have medicinal properties to cure cancer, other types of mushrooms may contain viruses that carry infectious diseases. This paper studies mushroom behavioural features such as the shape, surface and colour of the cap, gill and stalk, as well as the odour, population and habitat of the mushrooms. The Principal Component Analysis (PCA) algorithm is used to select the best features for the classification experiment, which uses the Decision Tree (DT) algorithm. The classification accuracy, coefficient metric, and time taken to build a classification model on a standard Mushroom dataset were measured. The behavioural feature 'odour' was selected as the highest ranked feature contributing to the high classification accuracy.
- Published
- 2018
- Full Text
- View/download PDF
36. THREE-PARAMETER LOGNORMAL DISTRIBUTION: PARAMETRIC ESTIMATION USING L-MOMENT AND TL-MOMENT APPROACH
- Author
-
Ani Shabri, Basri Badyalina, Siti Sarah Abadan, Norhafizah Yusof, Nur Amalina Mat Jan, and Shuhaida Ismail
- Subjects
Ratio distribution ,Statistics ,Log-Cauchy distribution ,General Engineering ,Log-logistic distribution ,Noncentral chi-squared distribution ,Asymptotic distribution ,Compound probability distribution ,Distribution fitting ,Three-point estimation ,Mathematics - Abstract
The three-parameter lognormal (LN3) distribution has been applied to the frequency analysis of flood events. The L-moment method (η = 0) and the TL-moment method (η = 1, 2, 3, and 4) are applied to estimate the parameters of the LN3 distribution. A simulation study is conducted in this paper by fitting this distribution to generated LN3 and non-LN3 samples. Relative Root Mean Square Error (RRMSE) and relative bias are evaluated to illustrate the performance of this distribution. The performance of the TL-moments approach was compared with that of L-moments using streamflow data from Sg. Trolak and Sg. Slim, which are located in Perak, Malaysia. The results showed that the TL-moments approach produced better results for high quantile estimation than L-moments.
- Published
- 2016
- Full Text
- View/download PDF
37. A hybrid model of self organizing maps and least square support vector machine for river flow forecasting
- Author
-
Ruhaidah Samsudin, Ani Shabri, and Shuhaida Ismail
- Subjects
lcsh:GE1-350 ,Self-organizing map ,Operations research ,lcsh:T ,Computer science ,lcsh:Geography. Anthropology. Recreation ,computer.software_genre ,lcsh:Technology ,lcsh:TD1-1066 ,Support vector machine ,lcsh:G ,Streamflow ,Least squares support vector machine ,Water resource planning ,Autoregressive integrated moving average ,Performance indicator ,Data mining ,lcsh:Environmental technology. Sanitary engineering ,computer ,Hybrid model ,lcsh:Environmental sciences - Abstract
Successful river flow forecasting is a major goal and an essential procedure that is necessary in water resource planning and management. There are many forecasting techniques used for river flow forecasting. This study proposed a hybrid model based on a combination of two methods: Self Organizing Map (SOM) and Least Squares Support Vector Machine (LSSVM) model, referred to as the SOM-LSSVM model for river flow forecasting. The hybrid model uses the SOM algorithm to cluster the entire dataset into several disjointed clusters, where the monthly river flows data with similar input pattern are grouped together from a high dimensional input space onto a low dimensional output layer. By doing this, the data with similar input patterns will be mapped to neighbouring neurons in the SOM's output layer. After the dataset has been decomposed into several disjointed clusters, an individual LSSVM is applied to forecast the river flow. The feasibility of this proposed model is evaluated with respect to the actual river flow data from the Bernam River located in Selangor, Malaysia. The performance of the SOM-LSSVM was compared with other single models such as ARIMA, ANN and LSSVM. The performance of these models was then evaluated using various performance indicators. The experimental results show that the SOM-LSSVM model outperforms the other models and performs better than ANN, LSSVM as well as ARIMA for river flow forecasting. It also indicates that the proposed model can forecast more precisely, and provides a promising alternative technique for river flow forecasting.
- Published
- 2012
- Full Text
- View/download PDF
38. A hybrid model of self-organizing maps (SOM) and least square support vector machine (LSSVM) for time-series forecasting
- Author
-
Ruhaidah Samsudin, Shuhaida Ismail, and Ani Shabri
- Subjects
Self-organizing map ,Mean squared error ,Computer science ,business.industry ,General Engineering ,computer.software_genre ,Machine learning ,Least squares ,Field (computer science) ,Computer Science Applications ,Support vector machine ,Relevance vector machine ,Artificial Intelligence ,Artificial intelligence ,Data mining ,Time series ,business ,Hybrid model ,computer - Abstract
Support vector machine (SVM) is a new tool from the Artificial Intelligence (AI) field that has been successfully applied to a wide variety of problems, especially in time-series forecasting. In this paper, least square support vector machine (LSSVM), an improved algorithm based on SVM, is combined with self-organizing maps (SOM), and the resulting SOM-LSSVM model is proposed for time-series forecasting. The objective of this paper is to examine the flexibility of SOM-LSSVM by comparing it with a single LSSVM model. To assess the effectiveness of the SOM-LSSVM model, two well-known datasets, the Wolf yearly sunspot data and the Monthly unemployed young women data, are used in this study. The experiment shows that SOM-LSSVM outperforms the single LSSVM model based on the criteria of mean absolute error (MAE) and root mean square error (RMSE). It also indicates that SOM-LSSVM provides a promising alternative technique in time-series forecasting.
- Published
- 2011
- Full Text
- View/download PDF
39. Effect of Dimensionality Reductions Technique in Modelling and Forecasting River Flow
- Author
-
Shuhaida Ismail, Siraj Mohammed Pandhiani, Ani Shabri, and Aida Mustapha
- Subjects
Environmental Engineering ,Hardware and Architecture ,General Chemical Engineering ,Streamflow ,General Engineering ,Computer Science (miscellaneous) ,Environmental science ,Soil science ,Biotechnology ,Curse of dimensionality - Abstract
The ability to obtain accurate information on future river flow is fundamental to water resources planning and management. Traditionally, single models have been introduced to predict the future value of river flow. This paper investigates Principal Component Analysis (PCA) as a dimensionality reduction technique combined with single Support Vector Machine (SVM) and Least Square Support Vector Machine (LSSVM) models, referred to as PCA-SVM and PCA-LSSVM. This study also compares the proposed models with the single SVM and LSSVM models. The models are ranked based on four statistical measures, namely Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Correlation Coefficient, and Correlation of Efficiency (CE). The results show that PCA combined with LSSVM performs better than the other models. The best ranked models are then measured using Mean of Forecasting Error (MFE) to determine their forecast rate. PCA-LSSVM proves to be the better model, as it under-predicts the observed river flow values by only 0.89% for the Tualang river and over-predicts by 2.08% for the Bernam river. The study concludes by recommending PCA as a dimension reduction approach combined with LSSVM for river flow forecasting, due to better prediction results and stability than those achieved by single models.
- Published
- 2018
- Full Text
- View/download PDF
40. Empirical Analysis on Sales of Video Games: A Data Mining Approach
- Author
-
Shuhaida Ismail, Aida Mustapha, Muhammad Fakri Othman, and Amar Aziz
- Subjects
Estimation ,History ,Computer science ,Online database ,Decision tree ,Nearest neighbour ,Data mining ,computer.software_genre ,computer ,Computer Science Applications ,Education ,Random forest - Abstract
This paper studies the factors that make a video game a blockbuster in sales. The dataset used is collected from an online database maintained by VGChartz.com. Using the dataset, the Rapid Miner tool is used to select the features or factors and produce an efficient estimation of the data. The techniques used in this project include k-Nearest Neighbour (k-NN), Random Forest and Decision Tree. The factors and the differences in the results are discussed.
- Published
- 2018
- Full Text
- View/download PDF
41. Comparative Analysis of River Flow Modelling by Using Supervised Learning Technique
- Author
-
Aida Mustapha, Shuhaida Ismail, Ani Shabri, and Siraj Mohamad Pandiahi
- Subjects
Support vector machine ,History ,Artificial neural network ,Streamflow ,Wavelet regression ,Supervised learning ,Statistics ,Square (algebra) ,Supervised training ,Computer Science Applications ,Education ,Mathematics - Abstract
The goal of this research is to investigate the efficiency of three supervised learning algorithms for forecasting the monthly river flow of the Indus River in Pakistan, spread over 550 square miles or 1800 square kilometres. The algorithms are the Least Square Support Vector Machine (LSSVM), Artificial Neural Network (ANN) and Wavelet Regression (WR). Each of the three models was used individually to forecast the monthly river flow, and the accuracy of the models was then compared. The obtained results were compared and statistically analysed. This analytical comparison showed that the LSSVM model is more precise in monthly river flow forecasting. LSSVM was found to have the highest r, with a value of 0.934, compared to the other models. This indicates that LSSVM is more accurate and efficient than the ANN and WR models.
- Published
- 2018
- Full Text
- View/download PDF
42. Empirical mode decomposition coupled with least square support vector machine for river flow forecasting
- Author
-
Shuhaida Ismail, Ani Shabri, and Siti Sarah Abadan
- Subjects
Engineering ,Mean squared error ,business.industry ,Mode (statistics) ,computer.software_genre ,Measure (mathematics) ,Hilbert–Huang transform ,Support vector machine ,Set (abstract data type) ,Decomposition (computer science) ,Data mining ,Stage (hydrology) ,business ,Algorithm ,computer - Abstract
This paper investigates the ability of Empirical Mode Decomposition (EMD) coupled with a Least Square Support Vector Machine (LSSVM) model to improve the accuracy of river flow forecasting. To assess the effectiveness of this model, Bernam monthly river flow data served as the case study. The proposed model consists of three stages: decomposition, component identification and forecasting. In the first stage, the decomposition stage, EMD is employed to decompose the dataset into several Intrinsic Mode Functions (IMF) and a residue. In the second stage, the meaningful signals are identified using a statistical measure and the new dataset is obtained. The final stage applies LSSVM as a forecasting tool to perform the river flow forecasting. The performance of the EMD coupled with LSSVM model is compared with the single LSSVM model using the statistical measures Mean Absolute Error (MAE), Root Mean Square Error (RMSE), correlation coefficient (R) and Correlation of Efficiency (CE). The comparison reveals that the proposed EMD coupled with LSSVM model serves as a useful tool and a promising new method for river flow forecasting.
- Published
- 2015
- Full Text
- View/download PDF
43. Hybrid empirical mode decomposition- ARIMA for forecasting exchange rates
- Author
-
Shuhaida Ismail, Siti Sarah Abadan, and Ani Shabri
- Subjects
Mean squared error ,Econometrics ,Liberian dollar ,Mean absolute error ,Economics ,Model decomposition ,Autoregressive integrated moving average ,Random walk ,Hilbert–Huang transform - Abstract
This paper studies the forecasting of monthly Malaysian Ringgit (MYR)/United States Dollar (USD) exchange rates using a hybrid of two methods, empirical mode decomposition (EMD) and the autoregressive integrated moving average (ARIMA). MYR was pegged to USD during the Asian financial crisis, fixing the exchange rate at 3.800 from 2nd of September 1998 until 21st of July 2005. Thus, the data chosen in this paper are the post-July 2005 data, from August 2005 to July 2010. The comparative study using root mean square error (RMSE) and mean absolute error (MAE) showed that EMD-ARIMA outperformed the single ARIMA and the random walk benchmark model.
- Published
- 2015
- Full Text
- View/download PDF
44. Time Series Forecasting using Least Square Support Vector Machine for Canadian Lynx Data
- Author
-
Shuhaida Ismail and Ani Shabri
- Subjects
Engineering ,Artificial neural network ,biology ,Mean squared error ,business.industry ,General Engineering ,Canadian lynx ,SETAR ,Machine learning ,computer.software_genre ,biology.organism_classification ,Support vector machine ,Moving average ,Autoregressive integrated moving average ,Data mining ,Artificial intelligence ,Time series ,business ,computer - Abstract
Time series analysis and forecasting has been an active research area over the last few decades. Various kinds of forecasting models have been developed, and researchers have relied on statistical techniques to predict the future. This paper discusses the application of Least Square Support Vector Machine (LSSVM) models to Canadian Lynx forecasting. The objective of this paper is to examine the flexibility of LSSVM in time series forecasting by comparing it with models from previous research, such as Artificial Neural Networks (ANN), Auto-Regressive Integrated Moving Average (ARIMA), Feed-Forward Neural Networks (FNN), Self-Exciting Threshold Auto-Regression (SETAR), Zhang’s model, Aladang’s hybrid model and the Support Vector Regression (SVR) model. The experimental results show that the LSSVM model outperforms the other models based on the criteria of Mean Absolute Error (MAE) and Mean Square Error (MSE). It also indicates that LSSVM provides a promising alternative technique in time series forecasting.
- Published
- 2014
- Full Text
- View/download PDF
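For readers unfamiliar with LSSVM, the model named in record 44 can be sketched as follows. This is a minimal illustration of the standard LSSVM regression formulation, in which training reduces to solving one linear system, and not the authors' implementation. The RBF kernel, the values of `gamma` and `sigma`, the lag order of 3, and the synthetic sine series (used in place of the lynx data) are all assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve the LSSVM system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = X.shape[0]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    solution = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return solution[0], solution[1:]  # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

def lagged(series, p):
    """Turn a series into (X, y): each row of X holds the p values preceding y."""
    s = np.asarray(series, dtype=float)
    X = np.array([s[i:i + p] for i in range(len(s) - p)])
    return X, s[p:]

# Synthetic stand-in series (assumption; not the Canadian lynx data).
X_all, y_all = lagged(np.sin(0.3 * np.arange(60)), p=3)
X_tr, y_tr, X_te, y_te = X_all[:45], y_all[:45], X_all[45:], y_all[45:]

b, alpha = lssvm_fit(X_tr, y_tr, gamma=100.0, sigma=1.0)
forecast = lssvm_predict(X_tr, b, alpha, X_te, sigma=1.0)
print("hold-out MAE:", np.mean(np.abs(forecast - y_te)))
print("hold-out MSE:", np.mean((forecast - y_te) ** 2))
```

Unlike a standard SVR, which requires a quadratic programming solver, this formulation needs only `np.linalg.solve`, at the cost of a dense solution in which every training point carries a weight.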
45. River Flow Forecasting: a Hybrid Model of Self Organizing Maps and Least Square Support Vector Machine
- Author
-
Ruhaidah Samsudin, Ani Shabri, and Shuhaida Ismail
- Subjects
Self-organizing map, Engineering, Training set, Artificial neural network, business.industry, computer.software_genre, Support vector machine, Water resources, Streamflow, Data mining, Time series, business, Hybrid model, computer
- Abstract
Successful river flow time series forecasting is a major goal and an essential procedure in water resources planning and management. This study introduces a new hybrid model that combines two familiar non-linear methods of mathematical modeling, the Self Organizing Map (SOM) and the Least Square Support Vector Machine (LSSVM), referred to as the SOM-LSSVM model. The hybrid model uses the SOM algorithm to cluster the training data into several disjoint clusters, and an individual LSSVM is then used to forecast the river flow. The feasibility of the proposed model is evaluated on actual river flow data from the Bernam River in Selangor, Malaysia. The results are compared with those obtained using LSSVM and artificial neural network (ANN) models. The experimental results show that the SOM-LSSVM model outperforms the other models for forecasting river flow, indicating that the proposed model forecasts more precisely and is a promising alternative technique in river flow forecasting.
- Published
- 2010
- Full Text
- View/download PDF
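The cluster-then-forecast idea described in the abstract of record 45 can be sketched roughly as below. The abstract does not give the map size, kernel, parameters, or how forecasts are routed to clusters, so everything here is an assumption: a small 1-D SOM, one RBF-kernel LSSVM fitted per non-empty cluster, routing of a new input to the model of its nearest map unit, and a synthetic series in place of the Bernam River data. All function names are hypothetical.

```python
import numpy as np

def rbf(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    """Standard LSSVM regression: solve one bordered linear system."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def train_som(X, n_units=3, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D SOM; returns the unit weight vectors."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].copy()
    grid = np.arange(n_units)
    for e in range(epochs):
        frac = e / epochs
        lr = lr0 * (1.0 - frac)                       # decaying learning rate
        radius = max(n_units / 2.0 * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for x in X[rng.permutation(len(X))]:
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))  # best matching unit
            h = np.exp(-((grid - bmu) ** 2) / (2.0 * radius ** 2))
            W += lr * h[:, None] * (x - W)
    return W

def fit_som_lssvm(X, y, n_units=3, gamma=10.0, sigma=1.0):
    """Cluster inputs with the SOM, then fit one LSSVM per non-empty cluster."""
    W = train_som(X, n_units)
    labels = np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)
    models = {}
    for k in np.unique(labels):
        mask = labels == k
        b, alpha = lssvm_fit(X[mask], y[mask], gamma, sigma)
        models[int(k)] = (X[mask], b, alpha)
    return W, models

def predict_som_lssvm(W, models, X_new, sigma=1.0):
    """Route each input to the nearest unit that has a model and predict with it."""
    keys = np.array(sorted(models))
    d2 = ((X_new[:, None, :] - W[keys][None, :, :]) ** 2).sum(axis=2)
    nearest = keys[np.argmin(d2, axis=1)]
    out = np.empty(len(X_new))
    for i, k in enumerate(nearest):
        Xk, b, alpha = models[int(k)]
        out[i] = (rbf(X_new[i:i + 1], Xk, sigma) @ alpha)[0] + b
    return out

# Synthetic stand-in for a flow series (assumption), with 3 lagged inputs.
rng = np.random.default_rng(1)
flow = np.sin(0.2 * np.arange(80)) + 0.1 * rng.standard_normal(80)
X = np.array([flow[i:i + 3] for i in range(len(flow) - 3)])
y = flow[3:]
W, models = fit_som_lssvm(X[:60], y[:60])
print("hold-out MAE:", np.mean(np.abs(predict_som_lssvm(W, models, X[60:]) - y[60:])))
```

Fitting a separate model per cluster keeps each linear system small and lets each LSSVM specialise in one regime of the series; whether that improves accuracy depends on the data.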
46. Empirical Analysis on Sales of Video Games: A Data Mining Approach.
- Author
-
Amar Aziz, Shuhaida Ismail, Muhammad Fakri Othman, and Aida Mustapha
- Published
- 2018
- Full Text
- View/download PDF
47. Comparative Analysis of River Flow Modelling by Using Supervised Learning Technique.
- Author
-
Shuhaida Ismail, Siraj Mohamad Pandiahi, Ani Shabri, and Aida Mustapha
- Published
- 2018
- Full Text
- View/download PDF