On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model

Authors :: Shin, Seongjin
Lee, Sang-Woo
Ahn, Hwijeen
Kim, Sungdong
Kim, HyoungSeok
Kim, Boseop
Cho, Kyunghyun
Lee, Gichang
Park, Woomyoung
Ha, Jung-Woo
Sung, Nako
Publication Year :: 2022
Abstract: Many recent studies on large-scale language models have reported successful in-context zero- and few-shot learning ability. However, the in-depth analysis of when in-context learning occurs is still lacking. For example, it is unknown how in-context learning performance changes as the training corpus varies. Here, we investigate the effects of the source and size of the pretraining corpus on in-context learning in HyperCLOVA, a Korean-centric GPT-3 model. From our in-depth investigation, we introduce the following observations: (1) in-context learning performance heavily depends on the corpus domain source, and the size of the pretraining corpus does not necessarily determine the emergence of in-context learning, (2) in-context learning ability can emerge when a language model is trained on a combination of multiple corpora, even when each corpus does not result in in-context learning on its own, (3) pretraining with a corpus related to a downstream task does not always guarantee the competitive in-context learning performance of the downstream task, especially in the few-shot setting, and (4) the relationship between language modeling (measured in perplexity) and in-context learning does not always correlate: e.g., low perplexity does not always imply high in-context few-shot learning performance.
Comment: Accepted to NAACL2022 as a long paper. Camera-ready version

Subjects :: Computer Science - Computation and Language

Details

Database :: arXiv
Publication Type :: Report
Accession number :: edsarx.2204.13509
Document Type :: Working Paper
