1. Optimal hyperparameter combination for improving table data recognition in a large language model-based retrieval-augmented generation system.
- Author
정민수 and 이정훈
- Subjects
LANGUAGE models, NATURAL languages, CORPORA, QUESTION answering systems
- Abstract
Large language models are highly proficient at handling unstructured data such as natural language, but their performance declines significantly on structured data such as tables. To address this limitation, this study proposes an optimal combination of hyperparameters for improving the recognition of table data in a retrieval-augmented question-answering system. Preprocessing techniques were applied so that table data could be handled effectively, and the experiments used corpora built from the preprocessed tables. The main focus was on finding the best-performing hyperparameter combination by varying chunk size and overlap. The experimental results showed that the optimal hyperparameters differed depending on the language model used. Although chunk size had little effect on overall response quality, introducing overlap consistently led to notable performance improvements. Future research will extend these findings with further experiments on structured data across various domains. [ABSTRACT FROM AUTHOR]
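The two hyperparameters named in the abstract, chunk size and overlap, can be illustrated with a minimal sketch. This is not the authors' code: the function names `chunk_with_overlap` and `hyperparameter_grid`, the token-list input, and the example values are all hypothetical, chosen only to show what "overlap" means when a corpus is split into chunks for retrieval.

```python
from itertools import product


def chunk_with_overlap(tokens, chunk_size, overlap):
    """Split a token list into windows of at most `chunk_size` tokens.

    Consecutive windows share `overlap` tokens. (Hypothetical helper,
    not taken from the paper.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window already reaches the end of the input.
        if start + chunk_size >= len(tokens):
            break
    return chunks


def hyperparameter_grid(chunk_sizes, overlaps):
    """Enumerate (chunk_size, overlap) pairs, skipping invalid ones."""
    return [(c, o) for c, o in product(chunk_sizes, overlaps) if o < c]


if __name__ == "__main__":
    tokens = list(range(10))  # stand-in for a tokenized, preprocessed table
    for size, ov in hyperparameter_grid([2, 4], [0, 2]):
        print(size, ov, chunk_with_overlap(tokens, size, ov))
```

In an actual experiment, each pair from the grid would be used to rebuild the retrieval corpus and the resulting answers would be scored per language model; that evaluation step is omitted here because the record does not describe it.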
- Published
2024