When can transformers compositionally generalize in-context?
- Publication Year :
- 2024
Abstract
- Many tasks can be composed from a few independent components. This gives rise to a combinatorial explosion of possible tasks, only some of which might be encountered during training. Under what circumstances can transformers compositionally generalize from a subset of tasks to all possible combinations of tasks that share similar components? Here we study a modular multitask setting that allows us to precisely control compositional structure in the data generation process. We present evidence that transformers learning in-context struggle to generalize compositionally on this task despite being in principle expressive enough to do so. Compositional generalization becomes possible only when introducing a bottleneck that enforces an explicit separation between task inference and task execution.
- Comment : ICML 2024 workshop on Next Generation of Sequence Modeling Architectures
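To make the compositional setup concrete, the sketch below is a minimal, hypothetical illustration (not the authors' code or data): tasks are built by composing independent component functions, so the task space grows combinatorially while training covers only a subset of compositions and the remainder is held out to probe compositional generalization. The specific component functions, episode format, and split sizes are assumptions made for illustration.

```python
import itertools
import random

# Hypothetical component functions; composing any two of them defines a task.
COMPONENTS = {
    "negate":  lambda x: -x,
    "double":  lambda x: 2 * x,
    "square":  lambda x: x * x,
    "add_one": lambda x: x + 1,
}

def make_task(names):
    """Compose component functions (applied left to right) into a single task."""
    def task(x):
        for name in names:
            x = COMPONENTS[name](x)
        return x
    return task

# All length-2 compositions; train on a subset, hold out the rest.
all_combos = list(itertools.permutations(COMPONENTS, 2))
random.seed(0)
random.shuffle(all_combos)
train_combos, test_combos = all_combos[:8], all_combos[8:]

def sample_episode(combo, n_examples=4):
    """An in-context episode: (input, output) pairs generated by one composed task."""
    task = make_task(combo)
    xs = [random.randint(-5, 5) for _ in range(n_examples)]
    return [(x, task(x)) for x in xs]

print("train task:", train_combos[0], sample_episode(train_combos[0]))
print("held-out task:", test_combos[0], sample_episode(test_combos[0]))
```

Under such a split, a model that infers which components an in-context episode uses, and executes them separately, could in principle handle the held-out compositions; the abstract reports that this only emerges when a bottleneck explicitly separates task inference from task execution.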
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2407.12275
- Document Type :
- Working Paper