Anytime Performance Assessment in Blackbox Optimization Benchmarking
- Authors
Nikolaus Hansen, Anne Auger, Dimo Brockhoff, Tea Tusar
- Affiliations
Randomized Optimisation team (RANDOPT); Centre de Mathématiques Appliquées (CMAP), École polytechnique (X), Centre National de la Recherche Scientifique (CNRS); Inria Saclay - Île-de-France, Institut National de Recherche en Informatique et en Automatique (Inria); Jozef Stefan Institute, Ljubljana (IJS)
- Funding
Slovenian Research Agency grants P2-0209 and N2-0254; ANR-12-MONU-0009 NumBBO: Analyse, Amélioration, Évaluation d'algorithmes numériques pour l'optimisation boîte-noire (2012)
- Subjects
blackbox optimization, [INFO.INFO-CE] Computer Science [cs]/Computational Engineering, Finance, and Science [cs.CE], [INFO.INFO-NE] Computer Science [cs]/Neural and Evolutionary Computing [cs.NE], [INFO.INFO-NA] Computer Science [cs]/Numerical Analysis [cs.NA], Theoretical Computer Science, [INFO.INFO-PF] Computer Science [cs]/Performance [cs.PF], [INFO.INFO-LG] Computer Science [cs]/Machine Learning [cs.LG], Computational Theory and Mathematics, anytime optimization, quality indicator, performance assessment, benchmarking, Software, [INFO.INFO-MS] Computer Science [cs]/Mathematical Software [cs.MS]
- Abstract
We present concepts and recipes for the anytime performance assessment when benchmarking optimization algorithms in a blackbox scenario. We consider runtime, oftentimes measured in the number of blackbox evaluations needed to reach a target quality, to be a universally measurable cost for solving a problem. Starting from the graph that depicts the solution quality versus runtime, we argue that runtime is the only performance measure with a generic, meaningful, and quantitative interpretation. Hence, our assessment is solely based on runtime measurements. We discuss proper choices for solution quality indicators in single- and multiobjective optimization, as well as in the presence of noise and constraints. We also discuss the choice of the target values, budget-based targets, and the aggregation of runtimes by using simulated restarts, averages, and empirical cumulative distributions, which generalize convergence graphs of single runs. The presented performance assessment is to a large extent implemented in the comparing continuous optimizers (COCO) platform, freely available at https://github.com/numbbo/coco.
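The two central quantities described in the abstract, the runtime to reach a quality target and the empirical cumulative distribution of runtimes under simulated restarts, can be illustrated with a minimal sketch. This is not the COCO API; the data layout (each run recorded as a pair of evaluation counts and best-so-far quality values, with quality to be minimized) and all function names are assumptions made for illustration only.

```python
import random

def runtime_to_target(evals, qualities, target):
    """First evaluation count at which the best-so-far quality reaches
    the target (quality is minimized), or None if the run never does."""
    best = float("inf")
    for n_evals, q in zip(evals, qualities):
        best = min(best, q)
        if best <= target:
            return n_evals
    return None

def simulated_restart_runtimes(runs, target, n_samples=100, rng=None):
    """Runtimes of a conceptually restarted algorithm: an unsuccessful
    run contributes its full budget, then another recorded run drawn
    uniformly at random continues, until a drawn run hits the target."""
    rng = rng or random.Random(1)
    if all(runtime_to_target(e, q, target) is None for e, q in runs):
        return []  # no run reaches the target: the runtime is undefined
    samples = []
    for _ in range(n_samples):
        used = 0
        while True:
            evals, qualities = rng.choice(runs)
            rt = runtime_to_target(evals, qualities, target)
            if rt is not None:
                samples.append(used + rt)
                break
            used += evals[-1]  # unsuccessful run: count its whole budget
    return samples

def ecdf(samples):
    """Empirical cumulative distribution as (runtime, proportion) pairs."""
    xs = sorted(samples)
    return [(x, (i + 1) / len(xs)) for i, x in enumerate(xs)]

# Toy usage: two runs of a hypothetical minimizer on the same problem.
runs = [
    ([10, 50, 200, 1000], [3.0, 1.2, 0.4, 0.09]),  # reaches 0.1 at 1000 evals
    ([10, 50, 200, 1000], [2.5, 0.8, 0.5, 0.3]),   # never reaches 0.1
]
print(ecdf(simulated_restart_runtimes(runs, target=0.1)))
```

In this sketch, aggregating such runtime samples over several targets or problems and plotting their empirical cumulative distribution yields the kind of anytime performance profile the abstract refers to.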
- Published
- 2022