1. Development of Basic Knowledge Construction Technique to Reduce The Volume of High-Dimensional Big Data.
- Author
- Karya, Gede, Sitohang, Benhard, Akbar, Saiful, and Moertini, Veronica S.
- Subjects
INFORMATION & communication technologies, EXTRACTION techniques, DATA warehousing, DATA reduction, HIGH technology, BIG data
- Abstract
Big data is characterized by high volume, velocity, and variety (the 3Vs) and continues to grow exponentially with the world's increasing use of information and communication technology. The main problem with big data is the data deluge. The demand for storage and processing technology that can keep pace with exponential data growth is potentially limitless, so technology requirements grow exponentially as well. The weakness of previous big data analysis approaches (batch and online real-time processing) is that they require high-end technology (large storage, memory, and processing capacity). This paper proposes a new approach to big data analysis that first constructs basic knowledge (BK) from the original data, where BK has a much smaller volume than the source (volume reduction), and then analyzes BK into final knowledge, thus requiring smaller and simpler analysis technology. The proposal includes formulating the definition and representation of BK, developing methods for constructing BK from source data, and analyzing BK into final knowledge. We propose a BK construction method based on a knowledge extraction technique that uses the BIRCH clustering algorithm for instance reduction and handles high dimensionality by parallelizing the per-dimension calculations used to compute the distance between instances. We use the Adjusted Rand Index (ARI) to measure the similarity between the final knowledge of the baseline and proposed methods. First, modifying the BIRCH baseline by parallelizing the calculations increased speed by 17% to 25%. Next, splitting the parallel BIRCH (PBIRCH) baseline into BK construction and BK analysis reduced volume by 96% or more and increased speed by 43.50%, with similar final knowledge results (ARI=1).
Based on these results, we conclude that the BK construction method and the analysis of BK into final knowledge for high-dimensional big data significantly reduce volume and speed up the analytical process without reducing the quality of the final knowledge. [ABSTRACT FROM AUTHOR]
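The two building blocks named in the abstract, BIRCH-style instance reduction and the Adjusted Rand Index, can be illustrated with a minimal Python sketch. This is not the authors' BK construction method: it assumes a flat, single-pass simplification of BIRCH (no CF tree, no parallel distance computation), and the names `build_cf_summaries`, `adjusted_rand_index`, `points`, and `threshold` are hypothetical, chosen only for illustration. Each summary stores the standard BIRCH clustering feature: a count, a linear sum, and a squared sum.

```python
from collections import Counter
from math import comb


def build_cf_summaries(points, threshold):
    """Flat, single-pass simplification of BIRCH (an assumption, not the
    paper's method): absorb each point into the nearest clustering feature
    (CF) if that CF's centroid lies within `threshold`, else open a new CF.
    A CF is [n, linear_sum, squared_sum]."""
    cfs = []         # one [n, ls, ss] entry per summary
    assignment = []  # index of the CF that absorbed each point
    for p in points:
        best, best_d2 = None, None
        for i, (n, ls, _) in enumerate(cfs):
            # squared Euclidean distance from p to the CF centroid ls / n
            d2 = sum((x - s / n) ** 2 for x, s in zip(p, ls))
            if best_d2 is None or d2 < best_d2:
                best, best_d2 = i, d2
        if best is not None and best_d2 <= threshold ** 2:
            cf = cfs[best]
            cf[0] += 1
            cf[1] = [s + x for s, x in zip(cf[1], p)]
            cf[2] += sum(x * x for x in p)
        else:
            cfs.append([1, list(p), sum(x * x for x in p)])
            best = len(cfs) - 1
        assignment.append(best)
    return cfs, assignment


def adjusted_rand_index(labels_a, labels_b):
    """ARI between two labelings of the same instances (pair-counting form)."""
    total = comb(len(labels_a), 2)
    if total == 0:
        return 1.0  # fewer than two instances: the labelings trivially agree
    sum_ij = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_ij - expected) / (max_index - expected)


# Toy data (made up for illustration): two tight groups in two dimensions.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
baseline_labels = ["a", "a", "a", "b", "b", "b"]

cfs, assignment = build_cf_summaries(points, threshold=1.0)
print(len(points), "instances reduced to", len(cfs), "summaries")
print("ARI vs. baseline:", adjusted_rand_index(assignment, baseline_labels))
```

On this toy input the six instances collapse into two summaries whose assignment matches the baseline labeling exactly, which is the situation an ARI of 1 describes. The figures reported in the abstract come from the authors' experiments and are not reproduced by this sketch.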
- Published
- 2024