1. ENWAR: A RAG-empowered Multi-Modal LLM Framework for Wireless Environment Perception
- Authors
Nazar, Ahmad M., Celik, Abdulkadir, Selim, Mohamed Y., Abdallah, Asmaa, Qiao, Daji, and Eltawil, Ahmed M.
- Subjects
Computer Science - Networking and Internet Architecture, Computer Science - Artificial Intelligence
- Abstract
Large language models (LLMs) hold significant promise for advancing network management and orchestration in 6G and beyond networks. However, existing LLMs are limited in domain-specific knowledge and in their ability to handle multi-modal sensory data, which is critical for real-time situational awareness in dynamic wireless environments. This paper addresses this gap by introducing ENWAR, an ENvironment-aWARe retrieval-augmented generation (RAG)-empowered multi-modal LLM framework. ENWAR seamlessly integrates multi-modal sensory inputs to perceive, interpret, and cognitively process complex wireless environments and provide human-interpretable situational awareness. ENWAR is evaluated on the GPS, LiDAR, and camera modality combinations of the DeepSense6G dataset with state-of-the-art LLMs such as Mistral-7b/8x7b and LLaMa3.1-8/70/405b. Compared to the general and often superficial environmental descriptions produced by these vanilla LLMs, ENWAR delivers richer spatial analysis, accurately identifies positions, analyzes obstacles, and assesses line-of-sight between vehicles. Results show that ENWAR achieves key performance indicators of up to 70% relevancy, 55% context recall, 80% correctness, and 86% faithfulness, demonstrating its efficacy in multi-modal perception and interpretation.
- Published
- 2024
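
The abstract does not reproduce ENWAR's implementation, but the retrieval-augmented flow it describes can be illustrated. The sketch below is a minimal, hypothetical Python example of the general RAG pattern: textual descriptions distilled from GPS/LiDAR/camera data are embedded, the descriptions most relevant to a query are retrieved, and the retrieved context is prepended to the LLM prompt. All identifiers (`sensor_docs`, `retrieve`, `build_prompt`), the example sensor strings, and the embedding model choice are illustrative assumptions, not the paper's method.

```python
# Minimal, illustrative RAG sketch (not the paper's implementation).
# pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

# Hypothetical textual descriptions distilled from multi-modal sensors.
sensor_docs = [
    "GPS: ego vehicle at (37.33, -121.88), heading 045 deg, speed 12 m/s.",
    "LiDAR: truck detected 18 m ahead in the adjacent lane, height 3.5 m.",
    "Camera: two pedestrians on the right sidewalk, clear weather.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
doc_vecs = embedder.encode(sensor_docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k sensor descriptions most similar to the query."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q  # cosine similarity (vectors are L2-normalized)
    top = np.argsort(scores)[::-1][:k]
    return [sensor_docs[i] for i in top]

def build_prompt(query: str) -> str:
    """Prepend the retrieved multi-modal context to the user query."""
    context = "\n".join(retrieve(query))
    return f"Context from sensors:\n{context}\n\nQuestion: {query}"

print(build_prompt("Is there line-of-sight to the vehicle ahead?"))
```

The returned prompt would then be sent to whichever LLM backs the framework; the retrieval step is what grounds the model's answer in the sensed environment rather than in generic world knowledge.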
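
The four KPIs named in the abstract (relevancy, context recall, correctness, faithfulness) match the metric names popularized by the open-source RAGAS library. Whether the authors used RAGAS is not stated here; the sketch below only shows how such scores are typically computed with it. The API follows RAGAS's documented pre-1.0 interface and may differ across versions, and the evaluation rows are placeholder data.

```python
# Hedged sketch: scoring a RAG pipeline on the four KPIs named in the
# abstract, using the open-source RAGAS library. That ENWAR used RAGAS
# is an assumption; the interface below may differ across versions.
# pip install ragas datasets
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_correctness,
    answer_relevancy,
    context_recall,
    faithfulness,
)

# Toy evaluation set: question, retrieved contexts, generated answer,
# and a reference answer (all values are illustrative placeholders).
eval_data = Dataset.from_dict({
    "question": ["Is there line-of-sight between vehicle A and vehicle B?"],
    "contexts": [["LiDAR: truck detected 18 m ahead between A and B."]],
    "answer": ["No, a truck blocks the line-of-sight between A and B."],
    "ground_truth": ["Line-of-sight is blocked by a truck."],
})

# Requires an LLM/embeddings backend configured per RAGAS's docs
# (e.g., an OpenAI API key in the environment).
scores = evaluate(
    eval_data,
    metrics=[answer_relevancy, context_recall, answer_correctness, faithfulness],
)
print(scores)  # per-metric averages over the evaluation set
```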