Author: "Wang, Yongzhao" / Database: OpenAIRE - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Wang, Yongzhao"' showing total 6 results

1. Regularization for Strategy Exploration in Empirical Game-Theoretic Analysis

Author: Wang, Yongzhao and Wellman, Michael P.
Subjects: FOS: Computer and information sciences, Computer Science - Computer Science and Game Theory, Computer Science - Multiagent Systems, Computer Science and Game Theory (cs.GT), Multiagent Systems (cs.MA)
Abstract: In iterative approaches to empirical game-theoretic analysis (EGTA), the strategy space is expanded incrementally based on analysis of intermediate game models. A common approach to strategy exploration, represented by the double oracle algorithm, is to add strategies that best-respond to a current equilibrium. This approach may suffer from overfitting and other limitations, leading the developers of the policy-space response oracle (PSRO) framework for iterative EGTA to generalize the target of best response, employing what they term meta-strategy solvers (MSSs). Noting that many MSSs can be viewed as perturbed or approximated versions of Nash equilibrium, we adopt an explicit regularization perspective on the specification and analysis of MSSs. We propose a novel MSS called regularized replicator dynamics (RRD), which simply truncates the process based on a regret criterion. We show that RRD is more adaptive than existing MSSs and outperforms them in various games. We extend our study to three-player games, for which the payoff matrix is cubic in the number of strategies and so exhaustively evaluating profiles may not be feasible. We propose a profile search method that can identify solutions from incomplete models, and combine this with iterative model construction using a regularized MSS. Finally, and most importantly, we reveal through experiments that the regret of best response targets has a tremendous influence on the performance of strategy exploration, which provides an explanation for the effectiveness of regularization in PSRO.
Comment: 15 pages, 12 figures
Published: 2023
Full Text: View/download PDF
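
The abstract above describes RRD as replicator dynamics truncated by a regret criterion. A minimal sketch of that idea follows, assuming a symmetric two-player game given as a payoff matrix, a discrete-time update with a fixed step size, and a uniform starting mixture. The function names, step size, and threshold are illustrative assumptions and are not taken from the paper.

```python
def regret_of(payoffs, mix):
    """Regret of a symmetric mixed profile: best deviation payoff minus own payoff."""
    n = len(mix)
    # Expected payoff of each pure strategy against the mixture.
    values = [sum(payoffs[i][j] * mix[j] for j in range(n)) for i in range(n)]
    average = sum(m * v for m, v in zip(mix, values))
    return max(values) - average, values, average


def regularized_rd(payoffs, regret_threshold, step=0.1, max_iters=10000):
    """Run replicator dynamics from uniform and stop once regret is small enough."""
    n = len(payoffs)
    mix = [1.0 / n] * n
    for _ in range(max_iters):
        regret, values, average = regret_of(payoffs, mix)
        if regret <= regret_threshold:
            break  # truncate the process: the regret criterion is met
        # Discrete replicator step: grow strategies that beat the average payoff.
        mix = [max(m + step * m * (v - average), 0.0) for m, v in zip(mix, values)]
        total = sum(mix)
        mix = [m / total for m in mix]
    return mix


# Hypothetical symmetric game (prisoner's dilemma): row 0 cooperates, row 1 defects.
game = [[3.0, 0.0], [5.0, 1.0]]
loose = regularized_rd(game, regret_threshold=1.0)   # stops at the uniform start
tight = regularized_rd(game, regret_threshold=0.1)   # moves most mass to "defect"
```

In this sketch a larger threshold stops the process earlier, so the returned mixture stays closer to uniform, while a threshold near zero approaches the behavior of untruncated replicator dynamics.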

2. Analyzing of Cooperative Locating Error and Formation Configuration of AUV Based on Geometric Interpretation

Author: Junqi Qu, Lichuan Zhang, Guang Pan, and Wang Yongzhao
Subjects: Computer science, General Engineering, TL1-4050, Ellipse, Evaluation function, formation configuration, Upper and lower bounds, Interpretation (model theory), autonomous underwater vehicles, Singular value decomposition, Fisher information, Algorithm, cooperative location, location performance, information ellipse, Motor vehicles. Aeronautics. Astronautics
Abstract: Because the evaluation of the location performance of AUVs with the lower bound of the Cramér-Rao inequality is not intuitive, a geometric interpretation method based on the geometric ellipse is proposed. The Fisher information matrix is used to replace the Cramér-Rao inequality. The prior information matrix and the measurement information matrix are synthesized into the posterior information matrix through singular value decomposition. The geometric ellipse is used to geometrically represent the posterior information matrix. The posterior information ellipse area is used to establish the location performance evaluation function. By analyzing the performance evaluation function, the optimal formation configurations of single-master, dual-master and three-master AUVs are designed. The special formation configuration proposed in the paper for the three-master AUV formation is easier to implement, while its location performance is not much inferior to that of the optimal formation configuration. The simulation results verify that the optimal formation configuration has a higher location accuracy and that the special formation configuration is effective.
Published: 2020

3. Patterning the consecutive Pd₃ to Pd₁ on Pd₂Ga surface via temperature-promoted reactive metal-support interaction

Author: Niu, Yiming, Wang, Yongzhao, Chen, Junnan, Li, Shiyan, Huang, Xing, Willinger, Marc, Zhang, Wei, Liu, Yuefeng, and Zhang, Bingsen
Abstract: Atom-by-atom control of a catalyst surface is a central yet challenging topic in heterogeneous catalysis, which enables precisely confined adsorption and oriented approach of reactant molecules. Here, exposed surfaces with either consecutive Pd trimers (Pd3) or isolated Pd atoms (Pd1) are architected for Pd2Ga intermetallic nanoparticles (NPs) using reactive metal-support interaction (RMSI). At elevated temperatures under hydrogen, in situ atomic-scale transmission electron microscopy directly visualizes the refacetting of Pd2Ga NPs from energetically favorable (013)/(020) facets to (011)/(002). Infrared spectroscopy and acetylene hydrogenation reaction complementarily confirm the evolution from consecutive Pd3 to Pd1 sites of Pd2Ga catalysts with the concurrent fingerprinting CO adsorption and featured reactivities. Through theoretical calculations and modeling, we reveal that the restructured Pd2Ga surface results from the preferential arrangement of additionally reduced Ga atoms on the surface. Our work provides previously unidentified mechanistic insight into temperature-promoted RMSI and possible solutions to control and rearrange the surface atoms of supported intermetallic catalyst.
Source: Science Advances, 8 (49), ISSN: 2375-2548
Published: 2022
Full Text: View/download PDF

4. Empirical Game-Theoretic Analysis for Mean Field Games

Author: Wang, Yongzhao and Wellman, Michael P.
Subjects: FOS: Computer and information sciences, Computer Science - Multiagent Systems, Multiagent Systems (cs.MA)
Abstract: We present a simulation-based approach for solution of mean field games (MFGs), using the framework of empirical game-theoretical analysis (EGTA). Our primary method employs a version of the double oracle, iteratively adding strategies based on best response to the equilibrium of the empirical MFG among strategies considered so far. We present Fictitious Play (FP) and Replicator Dynamics as two subroutines for computing the empirical game equilibrium. Each subroutine is implemented with a query-based method rather than maintaining an explicit payoff matrix as in typical EGTA methods due to a representation issue we highlight for MFGs. By introducing game model learning and regularization, we significantly improve the sample efficiency of the primary method without sacrificing the overall learning performance. Theoretically, we prove that a Nash equilibrium (NE) exists in the empirical MFG and show the convergence of iterative EGTA to NE of the full MFG with either subroutine. We test the performance of iterative EGTA in various games and show that it outperforms directly applying FP to MFGs in terms of iterations of strategy introduction.
Comment: 11 pages, 6 figures
Published: 2021

5. Evaluating Strategy Exploration in Empirical Game-Theoretic Analysis

Author: Wang, Yongzhao, Ma, Qiurui, and Wellman, Michael P.
Subjects: FOS: Computer and information sciences, Computer Science - Multiagent Systems, Multiagent Systems (cs.MA)
Abstract: In empirical game-theoretic analysis (EGTA), game models are extended iteratively through a process of generating new strategies based on learning from experience with prior strategies. The strategy exploration problem in EGTA is how to direct this process so as to construct effective models with minimal iteration. A variety of approaches have been proposed in the literature, including methods based on classic techniques and novel concepts. Comparing the performance of these alternatives can be surprisingly subtle, depending sensitively on criteria adopted and measures employed. We investigate some of the methodological considerations in evaluating strategy exploration, defining key distinctions and identifying a few general principles based on examples and experimental observations. In particular, we emphasize the fact that empirical games create a space of strategies that should be evaluated as a whole. Based on this fact, we suggest that the minimum regret constrained profile (MRCP) provides a particularly robust basis for evaluating a space of strategies, and propose a local search method for MRCP that outperforms previous approaches. However, the computation of MRCP is not always feasible, especially in large games. In this scenario, we highlight consistency considerations for comparing across different approaches. Surprisingly, we find that recent works violate these considerations that are necessary for evaluation, which may result in misleading conclusions on the performance of different approaches. For proper evaluation, we propose a new evaluation scheme and demonstrate that our scheme can reveal the true learning performance of different approaches compared to previous evaluation methods.
Comment: 23 pages, 4 figures, 8 tables
Published: 2021
Full Text: View/download PDF
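
To make the MRCP idea in the abstract above concrete: a profile is restricted to the strategies explored so far, but its regret is measured against deviations to any strategy in the full game. The sketch below assumes a symmetric two-player game given as a payoff matrix and uses a simple step-halving pattern search over the restricted simplex. This search is an illustrative stand-in, not the local search method proposed in the paper, and all names are hypothetical.

```python
from itertools import permutations


def full_game_regret(payoffs, mix):
    """Regret of a symmetric mixture, allowing deviation to any full-game strategy."""
    n = len(payoffs)
    values = [sum(payoffs[i][j] * mix[j] for j in range(n)) for i in range(n)]
    return max(values) - sum(m * v for m, v in zip(mix, values))


def mrcp_pattern_search(payoffs, restricted, step=0.25, tol=1e-4):
    """Search mixtures supported on `restricted` for low full-game regret."""
    n = len(payoffs)
    mix = [0.0] * n
    for s in restricted:
        mix[s] = 1.0 / len(restricted)  # start uniform over the restricted set
    best = full_game_regret(payoffs, mix)
    while step > tol:
        improved = False
        # Try shifting probability mass between each ordered pair of strategies.
        for src, dst in permutations(restricted, 2):
            moved = min(step, mix[src])
            if moved <= 0.0:
                continue
            candidate = list(mix)
            candidate[src] -= moved
            candidate[dst] += moved
            regret = full_game_regret(payoffs, candidate)
            if regret < best - 1e-12:
                mix, best, improved = candidate, regret, True
        if not improved:
            step /= 2  # refine the search once no shift of this size helps
    return mix, best


# Hypothetical full game: rock-paper-scissors, with only rock (0) and paper (1) explored.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
profile, regret = mrcp_pattern_search(rps, restricted=[0, 1])
```

For this example the regret of a rock/paper mixture with rock probability x is max(x, 1 - 2x), so the search should settle near x = 1/3. In general the regret surface need not be convex, so a search like this one may stop at a local minimum.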

6. Learning to Play against Any Mixture of Opponents

Author: Smith, Max Olan, Anthony, Thomas, Wang, Yongzhao, and Wellman, Michael P.
Subjects: FOS: Computer and information sciences, Computer Science - Multiagent Systems, Multiagent Systems (cs.MA)
Abstract: Intuitively, experience playing against one mixture of opponents in a given domain should be relevant for a different mixture in the same domain. We propose a transfer learning method, Q-Mixing, that starts by learning Q-values against each pure-strategy opponent. Then a Q-value for any distribution of opponent strategies is approximated by appropriately averaging the separately learned Q-values. From these components, we construct policies against all opponent mixtures without any further training. We empirically validate Q-Mixing in two environments: a simple grid-world soccer environment, and a complicated cyber-security game. We find that Q-Mixing is able to successfully transfer knowledge across any mixture of opponents. We next consider the use of observations during play to update the believed distribution of opponents. We introduce an opponent classifier -- trained in parallel to Q-learning, using the same data -- and use the classifier results to refine the mixing of Q-values. We find that Q-Mixing augmented with the opponent classifier function performs comparably to, and with lower variance than, training directly against a mixed-strategy opponent.
Published: 2020
Full Text: View/download PDF
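
The averaging step described in the abstract above can be sketched as a probability-weighted sum of per-opponent Q-values for a single state. This is only an illustration under simplifying assumptions: the weights are taken to be the given opponent distribution, the Q-values are plain dictionaries keyed by action, and the function names and numbers are hypothetical rather than taken from the paper.

```python
def mix_q_values(q_per_opponent, opponent_probs):
    """Weight each opponent-specific Q-value by that opponent's probability."""
    actions = q_per_opponent[0].keys()
    return {
        action: sum(p * q[action] for p, q in zip(opponent_probs, q_per_opponent))
        for action in actions
    }


def greedy_action(q_values):
    """Pick the action with the highest mixed Q-value."""
    return max(q_values, key=q_values.get)


# Hypothetical Q-values for one state, learned separately against two opponents.
q_vs_a = {"left": 1.0, "right": 0.0}
q_vs_b = {"left": 0.0, "right": 2.0}

even_mix = mix_q_values([q_vs_a, q_vs_b], [0.5, 0.5])
mostly_a = mix_q_values([q_vs_a, q_vs_b], [0.9, 0.1])
```

With an even mixture the greedy action here is "right", and when opponent A is much more likely it switches to "left", without any retraining of the per-opponent Q-values.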
