
Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks

Authors :
Wang, Yizhong
Mishra, Swaroop
Alipoormolabashi, Pegah
Kordi, Yeganeh
Mirzaei, Amirreza
Arunkumar, Anjana
Ashok, Arjun
Dhanasekaran, Arut Selvan
Naik, Atharva
Stap, David
Pathak, Eshaan
Karamanolakis, Giannis
Lai, Haizhi Gary
Purohit, Ishan
Mondal, Ishani
Anderson, Jacob
Kuznia, Kirby
Doshi, Krima
Patel, Maitreya
Pal, Kuntal Kumar
Moradshahi, Mehrad
Parmar, Mihir
Purohit, Mirali
Varshney, Neeraj
Kaza, Phani Rohitha
Verma, Pulkit
Puri, Ravsehaj Singh
Karia, Rushang
Sampat, Shailaja Keyur
Doshi, Savan
Mishra, Siddhartha
Reddy, Sujan
Patro, Sumanta
Dixit, Tanay
Shen, Xudong
Baral, Chitta
Choi, Yejin
Smith, Noah A.
Hajishirzi, Hannaneh
Khashabi, Daniel
Publication Year :
2022

Abstract

How well can NLP models generalize to a variety of unseen tasks when provided with task instructions? To address this question, we first introduce Super-NaturalInstructions, a benchmark of 1,616 diverse NLP tasks and their expert-written instructions. Our collection covers 76 distinct task types, including but not limited to classification, extraction, infilling, sequence tagging, text rewriting, and text composition. This large and diverse collection of tasks enables rigorous benchmarking of cross-task generalization under instructions -- training models to follow instructions on a subset of tasks and evaluating them on the remaining unseen ones. Furthermore, we build Tk-Instruct, a transformer model trained to follow a variety of in-context instructions (plain language task definitions or k-shot examples). Our experiments show that Tk-Instruct outperforms existing instruction-following models such as InstructGPT by over 9% on our benchmark despite being an order of magnitude smaller. We further analyze generalization as a function of various scaling parameters, such as the number of observed tasks, the number of instances per task, and model sizes. We hope our dataset and model facilitate future progress towards more general-purpose NLP models.

Comment: Accepted to EMNLP 2022, 25 pages
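The abstract describes prompting Tk-Instruct with in-context instructions: a plain-language task definition, optionally followed by k-shot examples. The sketch below illustrates how such a prompt could be assembled and sent to a released checkpoint via the Hugging Face transformers library. It is a minimal illustration, not the paper's exact template; the checkpoint name, prompt wording, and example task are assumptions.

```python
# Illustrative sketch of a "definition + k-shot examples" prompt, assuming a
# publicly released Tk-Instruct checkpoint (name below is an assumption).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "allenai/tk-instruct-3b-def"  # assumed checkpoint identifier


def build_prompt(definition, examples, new_input):
    """Combine a task definition, k in-context examples, and a new instance
    into a single instruction-following prompt (approximate format)."""
    parts = [f"Definition: {definition}"]
    for i, (x, y) in enumerate(examples, start=1):
        parts.append(f"Positive Example {i} -\nInput: {x}\nOutput: {y}")
    parts.append(f"Now complete the following example -\nInput: {new_input}\nOutput:")
    return "\n\n".join(parts)


tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Hypothetical sentiment-classification task used purely for illustration.
prompt = build_prompt(
    definition="Given a sentence, label its sentiment as Positive or Negative.",
    examples=[("The movie was a delight.", "Positive")],
    new_input="The plot dragged on forever.",
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Evaluating cross-task generalization then amounts to holding out entire tasks: the model is trained on prompts like the one above for a subset of the 1,616 tasks and scored on tasks whose definitions it has never seen.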

Details

Database :
arXiv
Publication Type :
Report
Accession Number :
edsarx.2204.07705
Document Type :
Working Paper