
Making Robust Generalizers Less Rigid with Soft Ascent-Descent

Authors:
Holland, Matthew J.
Hamada, Toma
Publication Year:
2024

Abstract

While the traditional formulation of machine learning tasks is in terms of performance on average, in practice we are often interested in how well a trained model performs on rare or difficult data points at test time. To achieve more robust and balanced generalization, methods applying sharpness-aware minimization to a subset of worst-case examples have proven successful for image classification tasks, but only with deep neural networks, and only in a scenario where the most difficult points are also the least common. In this work, we show how such a strategy can break down dramatically under more diverse models. As a more robust alternative, instead of typical sharpness we propose and evaluate a training criterion that penalizes poor loss concentration, and which can easily be combined with loss transformations such as CVaR or DRO that control tail emphasis.
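To make the abstract's proposal concrete, the following is a minimal sketch, in PyTorch, of what a loss-concentration penalty combined with a CVaR-style tail transformation might look like. The function names (soft_ascent_descent, cvar_transform), the pseudo-Huber smoothing, the threshold and quantile parameters, and the additive combination are all illustrative assumptions, not the authors' exact formulation.

import torch

def cvar_transform(losses, alpha=0.95):
    # CVaR-style tail emphasis (assumption: a simple empirical
    # top-k version): keep only the worst (1 - alpha) fraction of
    # the per-example losses and average them.
    k = max(1, int(round((1.0 - alpha) * losses.numel())))
    return torch.topk(losses, k).values.mean()

def soft_ascent_descent(losses, threshold=0.1, scale=1.0):
    # Hypothetical loss-concentration penalty: losses above the
    # threshold are pushed down (descent) and losses below it are
    # pushed up (ascent), using a smooth pseudo-Huber surrogate in
    # place of a hard absolute value around the threshold.
    shifted = (losses - threshold) / scale
    rho = torch.sqrt(1.0 + shifted ** 2) - 1.0  # smooth, even, convex
    return scale * rho.mean()

def training_criterion(losses, alpha=0.95, threshold=0.1, scale=1.0):
    # One plausible (assumed) combination: sum the concentration
    # penalty with the CVaR-transformed tail loss.
    return soft_ascent_descent(losses, threshold, scale) + cvar_transform(losses, alpha)

Here the two terms are simply summed for illustration; how the concentration penalty and the tail-emphasizing transformation are actually composed is part of what the paper proposes and evaluates.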

Details

Database:
arXiv
Publication Type:
Report
Accession number:
edsarx.2408.03619
Document Type:
Working Paper