Ontology extension by online clustering with large language model agents.

Authors :
Wu G
Ling C
Graetz I
Zhao L
Source :
Frontiers in big data [Front Big Data] 2024 Oct 07; Vol. 7, pp. 1463543. Date of Electronic Publication: 2024 Oct 07 (Print Publication: 2024).
Publication Year :
2024

Abstract

An ontology is a structured framework that categorizes entities, concepts, and relationships within a domain to facilitate shared understanding, and it is important in computational linguistics and knowledge representation. In this paper, we propose a novel framework to automatically extend an existing ontology from streaming data in a zero-shot manner. Specifically, the zero-shot ontology extension framework uses online and hierarchical clustering to integrate new knowledge into existing ontologies without substantial annotated data or domain-specific expertise. Focusing on the medical field, this approach leverages Large Language Models (LLMs) for two key tasks: Symptom Typing and Symptom Taxonomy among breast and bladder cancer survivors. Symptom Typing involves identifying and classifying medical symptoms from unstructured online patient forum data, while Symptom Taxonomy organizes and integrates these symptoms into an existing ontology. The combined use of online and hierarchical clustering enables real-time and structured categorization and integration of symptoms. The dual-phase model employs multiple LLMs to ensure accurate classification and seamless integration of new symptoms with minimal human oversight. The paper details the framework's development, experiments, quantitative analyses, and data visualizations, demonstrating its effectiveness in enhancing medical ontologies and advancing knowledge-based systems in healthcare.

Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2024 Wu, Ling, Graetz and Zhao.

Details

Language :
English
ISSN :
2624-909X
Volume :
7
Database :
MEDLINE
Journal :
Frontiers in big data
Publication Type :
Academic Journal
Accession number :
39435030
Full Text :
https://doi.org/10.3389/fdata.2024.1463543