Learning without training: The implicit dynamics of in-context learning
Jul 21, 2025
- Benoit Dherin* · Google Research
- Michael Munn* · Google Research
- Hanna Mazzawi* · Google Research
- Michael Wunder · Google Research
- Javier Gonzalvo · Google Research
Abstract
In-context learning (ICL) has emerged as a powerful paradigm in which language models learn new tasks simply by observing examples in their prompt, without any parameter updates. This paper explores the mechanisms behind ICL, examining how models leverage their pre-trained representations to adapt to new tasks through few-shot prompting. We investigate the relationship between pre-training objectives, model architecture, and ICL performance, revealing how models generalize from limited examples. Our analysis suggests that ICL effectiveness stems from the model's ability to recognize patterns in the prompt and apply learned representations in novel contexts. This work contributes to understanding the fundamental capabilities of large language models and their potential for flexible, sample-efficient learning across diverse domains.
Introduction
Traditional machine learning approaches require extensive training data and parameter updates to adapt to new tasks. In-context learning represents a paradigm shift, where models can learn new capabilities simply by observing a few examples in their input context. This ability has profound implications for AI systems, enabling rapid adaptation without retraining and opening new possibilities for flexible, sample-efficient learning.
What is In-Context Learning?
In-context learning occurs when a language model demonstrates improved performance on a task after seeing a few examples, without any parameter updates. The model essentially "learns" the task pattern from the examples provided in its input context.
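To make the setup concrete, here is a minimal sketch of few-shot prompting. The `build_few_shot_prompt` helper, the prompt format, and the `generate` callable are illustrative assumptions standing in for any prompt template and any text-completion model; nothing in this snippet updates model weights.

```python
# Hypothetical illustration of in-context learning via few-shot prompting.
# `generate` stands in for any text-completion model; no weights are updated.

def build_few_shot_prompt(examples, query, instruction="Translate English to French."):
    """Format (input, output) example pairs followed by the unanswered query."""
    lines = [instruction]
    for x, y in examples:
        lines.append(f"Input: {x}\nOutput: {y}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

examples = [
    ("cheese", "fromage"),
    ("dog", "chien"),
    ("house", "maison"),
]
prompt = build_few_shot_prompt(examples, query="bread")
# completion = generate(prompt)  # the model infers the task purely from the prompt
print(prompt)
```

Dropping the example pairs from the prompt turns the same call into a zero-shot query that relies only on knowledge acquired during pre-training.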
Key Characteristics
No Parameter Updates
ICL requires no gradient updates or fine-tuning; the model adapts purely through the information present in the input sequence.
Few-Shot Learning
Models can learn from just a handful of examples, making them highly sample-efficient compared to traditional approaches.
Task Flexibility
The same model can adapt to diverse tasks—from translation to reasoning—without architectural changes.
Zero-Shot Capability
Some models can even perform tasks without any examples, relying solely on their pre-trained knowledge.
Theoretical Framework
Our analysis suggests that ICL effectiveness stems from the model's ability to recognize and apply patterns learned during pre-training. The key insight is that language models develop rich internal representations that can be flexibly recombined to solve new tasks.
Pattern Recognition
Models learn to identify task-specific patterns in the input context and apply similar patterns to new examples.
Representation Reuse
Pre-trained representations encode knowledge that can be flexibly applied across different domains and tasks.
Attention Mechanisms
Transformer attention allows models to focus on relevant parts of the context and establish connections between examples and target outputs.
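As a simplified illustration (a toy sketch, not the architecture studied in the paper), the snippet below computes scaled dot-product attention for a single query token over a handful of context tokens; the attention weights show how much each in-context token contributes to the query's updated representation.

```python
import numpy as np

def scaled_dot_product_attention(q, K, V):
    """Single-query attention: weight context values by similarity to the query."""
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)              # similarity of each context token to the query
    weights = np.exp(scores - scores.max())  # softmax over context positions
    weights /= weights.sum()
    return weights @ V, weights              # context-dependent update for the query token

rng = np.random.default_rng(0)
d_model = 8
K = rng.normal(size=(5, d_model))   # keys for 5 context (example) tokens
V = rng.normal(size=(5, d_model))   # values for the same tokens
q = rng.normal(size=(d_model,))     # query token: the new input to be completed

out, weights = scaled_dot_product_attention(q, K, V)
print(weights.round(3))              # which in-context tokens the query attends to
```

Because the output is a weighted mixture of context values, swapping in different in-context examples changes the computation performed for the query without touching any model weights.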
Experimental Results
Our experiments demonstrate that ICL performance scales with model size and is influenced by the quality and quantity of pre-training data. We observe consistent improvements across diverse tasks, from language understanding to mathematical reasoning.
Scaling Laws
ICL performance improves predictably with model size, suggesting that larger models develop richer, more flexible representations.
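To illustrate how such a trend is typically summarized, the sketch below fits a simple power law relating model size to few-shot error in log-log space; the sizes and error rates are synthetic placeholders, not measurements from this work.

```python
import numpy as np

# Synthetic, hypothetical data: model sizes (parameters) and few-shot error rates.
sizes = np.array([1e8, 3e8, 1e9, 3e9, 1e10])
errors = np.array([0.42, 0.35, 0.28, 0.22, 0.17])

# Fit error ~ a * params^slope by linear regression in log-log space.
slope, log_a = np.polyfit(np.log(sizes), np.log(errors), deg=1)
a = np.exp(log_a)
print(f"error ≈ {a:.1f} * params^({slope:.3f})")  # negative exponent: error falls with scale
```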
Task Diversity
Models show strong performance across a wide range of tasks, indicating that ICL is a general capability rather than task-specific adaptation.
Applications and Implications
- Rapid prototyping and experimentation with new tasks
- Sample-efficient learning in data-scarce domains
- Flexible AI systems that can adapt without retraining
- Broader access to AI capabilities without task-specific training pipelines
Limitations and Future Work
- Performance variability across different task types
- Dependence on high-quality pre-training data
- Limited interpretability of ICL mechanisms
- Susceptibility to prompt injection and adversarial examples
Conclusion
In-context learning represents a fundamental shift in how we think about machine learning and AI adaptation. By enabling models to learn new tasks without parameter updates, ICL opens new possibilities for flexible, efficient AI systems. Understanding the mechanisms behind ICL is crucial for developing more capable and accessible AI technologies.
For the full details, see the original whitepaper: Learning without training: The implicit dynamics of in-context learning.