Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors :
Srivastava, Aarohi Rastogi, Abhinav Rao, Abhishek Shoeb, Abu Awal Md Abid, Abubakar Fisch, Adam Brown, Adam R. Santoro, Adam Gupta, Aditya Garriga-Alonso, Adrià Kluska, Agnieszka Lewkowycz, Aitor Agarwal, Akshat Power, Alethea Ray, Alex Warstadt, Alex Kocurek, Alexander W. Safaya, Ali Tazarv, Ali Xiang, Alice Parrish, Alicia Nie, Allen Hussain, Aman Askell, Amanda Dsouza, Amanda Slone, Ambrose Rahane, Ameet Iyer, Anantharaman S. Andreassen, Anders Madotto, Andrea Santilli, Andrea Stuhlmüller, Andreas Dai, Andrew La, Andrew Lampinen, Andrew Zou, Andy Jiang, Angela Chen, Angelica Vuong, Anh Gupta, Animesh Gottardi, Anna Norelli, Antonio Venkatesh, Anu Gholamidavoodi, Arash Tabassum, Arfa Menezes, Arul Kirubarajan, Arun Mullokandov, Asher Sabharwal, Ashish Herrick, Austin Efrat, Avia Erdem, Aykut Karakaş, Ayla Roberts, B. Ryan Loe, Bao Sheng Zoph, Barret Bojanowski, Bartłomiej Özyurt, Batuhan Hedayatnia, Behnam Neyshabur, Behnam Inden, Benjamin Stein, Benno Ekmekci, Berk Lin, Bill Yuchen Howald, Blake Orinion, Bryan Diao, Cameron Dour, Cameron Stinson, Catherine Argueta, Cedrick Ramírez, César Ferri Singh, Chandan Rathkopf, Charles Meng, Chenlin Baral, Chitta Wu, Chiyu Callison-Burch, Chris Waites, Chris Voigt, Christian Manning, Christopher D. Potts, Christopher Ramirez, Cindy Rivera, Clara E. 
Siro, Clemencia Raffel, Colin Ashcraft, Courtney Garbacea, Cristina Sileo, Damien Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Dan Freeman, Daniel Khashabi, Daniel Levy, Daniel González, Daniel Moseguí Perszyk, Danielle Hernandez, Danny Chen, Danqi Ippolito, Daphne Gilboa, Dar Dohan, David Drakard, David Jurgens, David Datta, Debajyoti Ganguli, Deep Emelin, Denis Kleyko, Denis Yuret, Deniz Chen, Derek Tam, Derek Hupkes, Dieuwke Misra, Diganta Buzan, Dilyar Mollo, Dimitri Coelho Yang, Diyi Lee, Dong-Ho Schrader, Dylan Shutova, Ekaterina Cubuk, Ekin Dogus Segal, Elad Hagerman, Eleanor Barnes, Elizabeth Donoway, Elizabeth Pavlick, Ellie Rodola, Emanuele Lam, Emma Chu, Eric Tang, Eric Erdem, Erkut Chang, Ernie Chi, Ethan A. Dyer, Ethan Jerzak, Ethan Kim, Ethan Manyasi, Eunice Engefu Zheltonozhskii, Evgenii Xia, Fanyue Siar, Fatemeh Martínez-Plumed, Fernando Happé, Francesca Chollet, Francois Rong, Frieda Mishra, Gaurav Winata, Genta Indra de Melo, Gerard Kruszewski, Germán Parascandolo, Giambattista Mariani, Giorgio Wang, Gloria Jaimovitch-López, Gonzalo Betz, Gregor Gur-Ari, Guy Galijasevic, Hana Kim, Hannah Rashkin, Hannah Hajishirzi, Hannaneh Mehta, Harsh Bogar, Hayden Shevlin, Henry Schütze, Hinrich Yakura, Hiromu Zhang, Hongming Wong, Hugh Mee Ng, Ian Noble, Isaac Jumelet, Jaap Geissinger, Jack Kernion, Jackson Hilton, Jacob Lee, Jaehoon Fisac, Jaime Fernández Simon, James B. Koppel, James Zheng, James Zou, James Kocoń, Jan Thompson, Jana Wingfield, Janelle Kaplan, Jared Radom, Jarema Sohl-Dickstein, Jascha Phang, Jason Wei, Jason Yosinski, Jason Novikova, Jekaterina Bosscher, Jelle Marsh, Jennifer Kim, Jeremy Taal, Jeroen Engel, Jesse Alabi, Jesujoba Xu, Jiacheng Song, Jiaming Tang, Jillian Waweru, Joan Burden, John Miller, John Balis, John U. Batchelder, Jonathan Berant, Jonathan Frohberg, Jörg Rozen, Jos Hernandez-Orallo, Jose Boudeman, Joseph Guerr, Joseph Jones, Joseph Tenenbaum, Joshua B. Rule, Joshua S. 
Chua, Joyce Kanclerz, Kamil Livescu, Karen Krauth, Karl Gopalakrishnan, Karthik Ignatyeva, Katerina Markert, Katja Dhole, Kaustubh D. Gimpel, Kevin Omondi, Kevin Mathewson, Kory Chiafullo, Kristen Shkaruta, Ksenia Shridhar, Kumar McDonell, Kyle Richardson, Kyle Reynolds, Laria Gao, Leo Zhang, Li Dugan, Liam Qin, Lianhui Contreras-Ochando, Lidia Morency, Louis-Philippe Moschella, Luca Lam, Lucas Noble, Lucy Schmidt, Ludwig He, Luheng Colón, Luis Oliveros Metz, Luke Şenel, Lütfi Kerem Bosma, Maarten Sap, Maarten ter Hoeve, Maartje Farooqi, Maheen Faruqui, Manaal Mazeika, Mantas Baturan, Marco Marelli, Marco Maru, Marco Quintana, Maria Jose Ramírez Tolkiehn, Marie Giulianelli, Mario Lewis, Martha Potthast, Martin Leavitt, Matthew L. Hagen, Matthias Schubert, Mátyás Baitemirova, Medina Orduna Arnaud, Melody McElrath, Melvin Yee, Michael A. Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michael Swędrowski, Michał Bevilacqua, Michele Yasunaga, Michihiro Kale, Mihir Cain, Mike Xu, Mimee Suzgun, Mirac Walker, Mitch Tiwari, Mo Bansal, Mohit Aminnaseri, Moin Geva, Mor Gheini, Mozhdeh T, Mukund Varma Peng, Nanyun Chi, Nathan A. Lee, Nayeon Krakover, Neta Gur-Ari Cameron, Nicholas Roberts, Nicholas Doiron, Nick Martinez, Nicole Nangia, Nikita Deckers, Niklas Muennighoff, Niklas Keskar, Nitish Shirish Iyer, Niveditha S. Constant, Noah Fiedel, Noah Wen, Nuan Zhang, Oliver Agha, Omar Elbaghdadi, Omar Levy, Omer Evans, Owain Casares, Pablo Antonio Moreno Doshi, Parth Fung, Pascale Liang, Paul Pu Vicol, Paul Alipoormolabashi, Pegah Liao, Peiyuan Liang, Percy Chang, Peter Eckersley, Peter Htut, Phu Mon Hwang, Pinyu Miłkowski, Piotr Patil, Piyush Pezeshkpour, Pouya Oli, Priti Mei, Qiaozhu Lyu, Qing Chen, Qinlang Banjade, Rabin Rudolph, Rachel Etta Gabriel, Raefer Habacker, Rahel Risco, Ramon Millière, Raphaël Garg, Rhythm Barnes, Richard Saurous, Rif A. 
Arakawa, Riku Raymaekers, Robbe Frank, Robert Sikand, Rohan Novak, Roman Sitelew, Roman LeBras, Ronan Liu, Rosanne Jacobs, Rowan Zhang, Rui Salakhutdinov, Ruslan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Ryan Yang, Rylan Singh, Sahib Mohammad, Saif M. Anand, Sajant Dillavou, Sam Shleifer, Sam Wiseman, Sam Gruetter, Samuel Bowman, Samuel R. Schoenholz, Samuel S. Han, Sanghyun Kwatra, Sanjeev Rous, Sarah A. Ghazarian, Sarik Ghosh, Sayan Casey, Sean Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sebastian Sadeghi, Sepideh Hamdan, Shadi Zhou, Sharon Srivastava, Shashank Shi, Sherry Singh, Shikhar Asaadi, Shima Gu, Shixiang Shane Pachchigar, Shubh Toshniwal, Shubham Upadhyay, Shyam Shyamolima Debnath Shakeri, Siamak Thormeyer, Simon Melzi, Simone Reddy, Siva Makini, Sneha Priscilla Lee, Soo-Hwan Torene, Spencer Hatwar, Sriharsha Dehaene, Stanislas Divic, Stefan Ermon, Stefano Biderman, Stella Lin, Stephanie Prasad, Stephen Piantadosi, Steven T. Shieber, Stuart M. Misherghi, Summer Kiritchenko, Svetlana Mishra, Swaroop Linzen, Tal Schuster, Tal Li, Tao Yu, Tao Ali, Tariq Hashimoto, Tatsu Wu, Te-Lin Desbordes, Théo Rothschild, Theodore Phan, Thomas Wang, Tianle Nkinyili, Tiberius Schick, Timo Kornev, Timofei Tunduny, Titus Gerstenberg, Tobias Chang, Trenton Neeraj, Trishala Khot, Tushar Shultz, Tyler Shaham, Uri Misra, Vedant Demberg, Vera Nyamai, Victoria Raunak, Vikas Ramasesh, Vinay Prabhu, Vinay Uday Padmakumar, Vishakh Srikumar, Vivek Fedus, William Saunders, William Zhang, William Vossen, Wout Ren, Xiang Tong, Xiaoyu Zhao, Xinran Wu, Xinyi Shen, Xudong Yaghoobzadeh, Yadollah Lakretz, Yair Song, Yangqiu Bahri, Yasaman Choi, Yejin Yang, Yichi Hao, Yiding Chen, Yifu Belinkov, Yonatan Hou, Yu Hou, Yufang Bai, Yuntao Seid, Zachary Zhao, Zhuoye Wang, Zijian Wang, Zijie J. Wang, Zirui Wu, Ziyi
Publication Year :
2022
Abstract
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Comment: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench
Details
Database :
OAIster
Publication Type :
Electronic Resource
Accession number :
edsoai.on1333777343
Document Type :
Electronic Resource