
PaLM 2 Technical Report

Authors :
Anil, Rohan
Dai, Andrew M.
Firat, Orhan
Johnson, Melvin
Lepikhin, Dmitry
Passos, Alexandre
Shakeri, Siamak
Taropa, Emanuel
Bailey, Paige
Chen, Zhifeng
Chu, Eric
Clark, Jonathan H.
Shafey, Laurent El
Huang, Yanping
Meier-Hellstern, Kathy
Mishra, Gaurav
Moreira, Erica
Omernick, Mark
Robinson, Kevin
Ruder, Sebastian
Tay, Yi
Xiao, Kefan
Xu, Yuanzhong
Zhang, Yujing
Abrego, Gustavo Hernandez
Ahn, Junwhan
Austin, Jacob
Barham, Paul
Botha, Jan
Bradbury, James
Brahma, Siddhartha
Brooks, Kevin
Catasta, Michele
Cheng, Yong
Cherry, Colin
Choquette-Choo, Christopher A.
Chowdhery, Aakanksha
Crepy, Clément
Dave, Shachi
Dehghani, Mostafa
Dev, Sunipa
Devlin, Jacob
Díaz, Mark
Du, Nan
Dyer, Ethan
Feinberg, Vlad
Feng, Fangxiaoyu
Fienber, Vlad
Freitag, Markus
Garcia, Xavier
Gehrmann, Sebastian
Gonzalez, Lucas
Gur-Ari, Guy
Hand, Steven
Hashemi, Hadi
Hou, Le
Howland, Joshua
Hu, Andrea
Hui, Jeffrey
Hurwitz, Jeremy
Isard, Michael
Ittycheriah, Abe
Jagielski, Matthew
Jia, Wenhao
Kenealy, Kathleen
Krikun, Maxim
Kudugunta, Sneha
Lan, Chang
Lee, Katherine
Lee, Benjamin
Li, Eric
Li, Music
Li, Wei
Li, YaGuang
Li, Jian
Lim, Hyeontaek
Lin, Hanzhao
Liu, Zhongtao
Liu, Frederick
Maggioni, Marcello
Mahendru, Aroma
Maynez, Joshua
Misra, Vedant
Moussalem, Maysam
Nado, Zachary
Nham, John
Ni, Eric
Nystrom, Andrew
Parrish, Alicia
Pellat, Marie
Polacek, Martin
Polozov, Alex
Pope, Reiner
Qiao, Siyuan
Reif, Emily
Richter, Bryan
Riley, Parker
Ros, Alex Castro
Roy, Aurko
Saeta, Brennan
Samuel, Rajkumar
Shelby, Renee
Slone, Ambrose
Smilkov, Daniel
So, David R.
Sohn, Daniel
Tokumine, Simon
Valter, Dasha
Vasudevan, Vijay
Vodrahalli, Kiran
Wang, Xuezhi
Wang, Pidong
Wang, Zirui
Wang, Tao
Wieting, John
Wu, Yuhuai
Xu, Kelvin
Xu, Yunhan
Xue, Linting
Yin, Pengcheng
Yu, Jiahui
Zhang, Qiao
Zheng, Steven
Zheng, Ce
Zhou, Weikang
Zhou, Denny
Petrov, Slav
Wu, Yonghui
Publication Year :
2023

Abstract

We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language tasks, as well as reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on downstream tasks across different model sizes, while simultaneously exhibiting faster and more efficient inference compared to PaLM. This improved efficiency enables broader deployment while also allowing the model to respond faster, for a more natural pace of interaction. PaLM 2 demonstrates robust reasoning capabilities, exemplified by large improvements over PaLM on BIG-Bench and other reasoning tasks. PaLM 2 also exhibits stable performance on a suite of responsible AI evaluations, and enables inference-time control over toxicity without additional overhead or impact on other capabilities. Overall, PaLM 2 achieves state-of-the-art performance across a diverse set of tasks and capabilities.

When discussing the PaLM 2 family, it is important to distinguish between pre-trained models (of various sizes), fine-tuned variants of these models, and the user-facing products that use these models. In particular, user-facing products typically include additional pre- and post-processing steps. Additionally, the underlying models may evolve over time. Therefore, one should not expect the performance of user-facing products to exactly match the results reported in this report.

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2305.10403
Document Type :
Working Paper