1. Enterprise Level Data Warehouse System Based on Hive in Big Data Environment.
- Author
- Fan, Xiaoyun and Lu, Jianfeng
- Subjects
- DATABASES, RELATIONAL databases, ELECTRONIC data processing, BIG data, DATA management
- Abstract
- Hive can store massive data through extended clusters, far exceeding the expansion and storage capabilities of traditional databases, and has become a major tool for building data warehouses in the era of big data. Based on Hive technology, this paper studies an enterprise-level data warehouse architecture composed of a data storage layer, a Hive data warehouse layer, and an application layer. The paper then studies data warehouse tools, using HDFS for underlying storage and MapReduce as the computing engine, and compares the Hive data warehouse with a relational database. Next, the ETL process is studied: the data goes through three stages (extraction, transformation, and loading) as it flows from the source to the target end. Finally, the system functions and tests are studied: subsystems for data processing, data management, data migration, and data analysis are constructed, and the core functions are tested. The test results were consistent with the expected results and met the delivery standards. [ABSTRACT FROM AUTHOR]
- Published
- 2024