
COPSRO: An Offline Empirical Game Theoretic Method With Conservative Critic.

Authors :
Shao Z
Zhuang L
Li H
Wang S
Source :
IEEE transactions on neural networks and learning systems [IEEE Trans Neural Netw Learn Syst] 2024 Oct 01; Vol. PP. Date of Electronic Publication: 2024 Oct 01.
Publication Year :
2024
Publisher :
Ahead of Print

Abstract

This article studies how to learn an approximate Nash equilibrium (NE) from static historical datasets by empirical game-theoretic analysis (EGTA), which provides a simulation-based framework to model complex multiagent interactions. Generally, EGTA requires plentiful interactions with the environment or simulator to estimate a cogent and tractable game model approximating the underlying game. However, these exploratory interactions often suffer from low data utilization efficiency and may not be feasible in risk-sensitive applications. To address these problems, this article investigates a new EGTA paradigm for offline settings and introduces a novel algorithm called conservative offline policy space response oracle (COPSRO) to identify NE from fixed datasets without active data collection. COPSRO begins by extracting a set of strategies from the offline dataset to construct an overcomplete strategy population, achieving an approximation to the policy space of the original game. Then, COPSRO integrates the conservative critic (CC) to tackle the challenge of overestimation inherent in offline learning scenarios. Additionally, it devises the offline NE solver to iteratively compute approximate NE. Consequently, COPSRO can ascertain equilibrium strategies without real-world interaction, markedly enhancing its utility in risk-averse settings. This article provides both theoretical analysis and empirical evaluation to demonstrate the effectiveness and superiority of COPSRO across various real-world tasks in the offline setting. Our method surpasses existing approaches in terms of convergence and exploitability, especially when the coverage ratio of the dataset is low (20% or 10%).
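
The abstract evaluates methods by exploitability, which measures how far a strategy profile is from an NE. As a minimal sketch only, and not the authors' implementation, the following Python computes exploitability for a two-player zero-sum matrix game. The function name `exploitability`, the payoff matrix `RPS`, and the restriction to matrix games are assumptions introduced here for illustration; the abstract does not specify how the paper computes this quantity.

```python
# Illustrative sketch: exploitability of a mixed-strategy profile in a
# two-player zero-sum matrix game. All names here are hypothetical and
# are not taken from the COPSRO paper.

def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response gains against the given profile.

    payoff[i][j] is the row player's payoff when row plays i and column
    plays j; the column player receives the negative of it.
    The result is 0 exactly when the profile is a Nash equilibrium.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])

    # Expected payoff of each pure row action against the column mix.
    row_values = [
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows)
    ]
    # Expected row payoff for each pure column action against the row mix.
    col_values = [
        sum(row_strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    ]

    # Row's best response maximizes its payoff; column's best response
    # minimizes the row payoff. The game value cancels in the sum of gains.
    return max(row_values) - min(col_values)


# Rock-paper-scissors payoffs for the row player (assumed example game).
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]

print(exploitability(RPS, uniform, uniform))          # equilibrium profile
print(exploitability(RPS, always_rock, always_rock))  # exploitable profile
```

In this example the uniform profile is the equilibrium of rock-paper-scissors, so its exploitability is zero, while the profile in which both players always play rock can be exploited by either side. Lower values indicate a profile closer to an NE.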

Details

Language :
English
ISSN :
2162-2388
Volume :
PP
Database :
MEDLINE
Journal :
IEEE transactions on neural networks and learning systems
Publication Type :
Academic Journal
Accession number :
39352820
Full Text :
https://doi.org/10.1109/TNNLS.2024.3454477