Learning without training: The implicit dynamics of in-context learning
July 2025
Authors:
- Benoit Dherin* · Google Research
- Michael Munn* · Google Research
- Hanna Mazzawi* · Google Research
- Michael Wunder · Google Research
- Javier Gonzalvo · Google Research
Introduction
Traditional machine learning approaches require extensive training data and parameter updates to adapt to new tasks. In-context learning represents a paradigm shift, where models can learn new capabilities simply by observing a few examples in their input context. This ability has profound implications for AI systems, enabling rapid adaptation without retraining and opening new possibilities for flexible, sample-efficient learning.
What is In-Context Learning?
In-context learning occurs when a language model demonstrates improved performance on a task after seeing a few examples, without any parameter updates. The model essentially "learns" the task pattern from the examples provided in its input context.
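To make this concrete, here is a minimal sketch of how a few-shot prompt might be assembled. The task, examples, and template below are illustrative assumptions, not a prescribed format; the point is that the model sees them only as input tokens.

```python
# A minimal, illustrative few-shot prompt for a sentiment task.
# The examples and template are hypothetical; the model adapts from
# these input tokens alone, with no weight updates.
examples = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I regret buying this blender.", "negative"),
    ("Best customer support I have ever had.", "positive"),
]
query = "The battery died after two days."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in examples:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"

print(prompt)  # The model completes the final label in-context.
```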
Key Characteristics
No Parameter Updates
ICL requires no gradient updates or fine-tuning; the model adapts purely through the information present in the input sequence.
Few-Shot Learning
Models can learn from just a handful of examples, making them highly sample-efficient compared to traditional approaches.
Task Flexibility
The same model can adapt to diverse tasks—from translation to reasoning—without architectural changes.
Zero-Shot Capability
Some models can even perform tasks without any examples, relying solely on their pre-trained knowledge.
Mechanisms of In-Context Learning
Our research investigates the underlying mechanisms that enable in-context learning in large language models.
Implicit Parameter Updates
We demonstrate that ICL can be understood as an implicit form of gradient descent, where the model's activations adapt to the task without explicit parameter updates.
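The NumPy sketch below illustrates the algebraic identity behind this view: an extra activation contributed by the context can be absorbed into a rank-1 update of a downstream weight matrix, so processing the context looks like an implicit weight change. The shapes and the way the context vector is produced here are illustrative assumptions, not the exact construction used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.standard_normal((d, d))   # weights of a downstream linear layer
x = rng.standard_normal(d)        # query token representation
c = rng.standard_normal(d)        # context contribution added to x (illustrative)

# With context: the layer sees the query shifted by the context contribution.
out_with_context = W @ (x + c)

# Without context, but with an implicit rank-1 weight update derived from c and x.
delta_W = np.outer(W @ c, x) / (x @ x)   # rank-1 "implicit update"
out_implicit = (W + delta_W) @ x

print(np.allclose(out_with_context, out_implicit))  # True
```

In a transformer, the analogue of c would be the contribution that attention adds to the residual stream for the query token, so the effective weight change depends on the context and exists only for the duration of the forward pass.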
Representation Reuse
Models leverage pre-trained representations, recombining them in ways that match the patterns observed in the few-shot examples.
Attention Dynamics
The attention mechanism plays a crucial role in ICL, enabling the model to selectively focus on relevant examples and extract task-specific patterns.
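As a rough illustration of this selection effect, the toy example below computes softmax attention weights of a query over a handful of context-example representations. The vectors are random stand-ins assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 16
context = rng.standard_normal((4, d))               # 4 in-context example representations
query = context[2] + 0.1 * rng.standard_normal(d)   # query resembling example 2

scores = context @ query / np.sqrt(d)   # scaled dot-product scores
weights = np.exp(scores - scores.max())
weights /= weights.sum()                # softmax attention weights

print(weights.round(3))        # weight typically concentrates on the most similar example
readout = weights @ context    # the readout mixes the context toward that example
```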
Experimental Results
We conducted extensive experiments to understand the factors that influence in-context learning performance.
Pre-training Objectives
Models trained with diverse objectives and data sources demonstrate stronger ICL capabilities, suggesting that exposure to varied tasks during pre-training enhances adaptability.
Model Scale
ICL ability scales with model size: larger models show dramatically improved few-shot performance, indicating that model capacity plays a central role in this capability.
Example Selection
The choice and ordering of examples significantly impact ICL performance, with diverse, representative examples yielding the best results.
Theoretical Framework
We propose a theoretical framework for understanding in-context learning that connects it to traditional learning paradigms.
- ICL can be formalized as implicit meta-learning during pre-training (see the sketch following this list)
- The transformer architecture inherently supports rapid adaptation through its attention mechanism
- Pre-training creates a rich representation space that enables generalization from few examples
- ICL performance is bounded by the diversity and coverage of the pre-training distribution
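As an illustrative sketch of the first two points (the notation and the exact form of the update are assumptions for exposition, not the paper's verbatim statement), the context can be viewed as inducing a temporary, low-rank change to a layer's weights, mirroring the numerical example given earlier:

```latex
% Illustrative formalization (a sketch, not the exact theorem statement).
% T_W denotes a transformer block whose feed-forward part has weights W,
% C the in-context examples, x the query representation, and a(C, x) the
% contribution that self-attention over C adds to x.
\[
  T_W(C, x) \;=\; T_{W + \Delta W(C, x)}(x),
  \qquad
  \Delta W(C, x) \;=\; \frac{\bigl(W\, a(C, x)\bigr)\, x^{\top}}{\lVert x \rVert^{2}} .
\]
% The implicit update \Delta W is rank-1 and exists only during the forward
% pass: reading the context behaves like a small, query-specific,
% gradient-descent-style adjustment of W, with no parameters ever stored.
```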
Applications and Implications
In-context learning has significant implications for AI applications and future research directions.
- Rapid prototyping of new AI capabilities without specialized training
- Personalization of models to individual users through examples
- Low-resource adaptation to specialized domains
- Flexible multi-task systems that can switch between capabilities on demand
- Sample-efficient learning for rare or novel tasks
Limitations and Future Work
While in-context learning represents a powerful paradigm, it has several limitations that present opportunities for future research.
- Context length constraints limit the number of examples that can be provided
- Performance varies significantly across tasks and example selections
- ICL remains less effective than fine-tuning for some complex tasks
- Theoretical understanding of ICL remains incomplete
- Limited ability to retain learned patterns across separate inference sessions
Conclusion
In-context learning represents a fundamental shift in how we think about machine learning, moving beyond the traditional paradigm of explicit parameter updates. Our research provides insights into the mechanisms behind this capability, demonstrating that it emerges from the interplay between pre-training, model architecture, and the rich representational capacity of large language models. As we continue to explore and enhance ICL capabilities, we can expect increasingly flexible, sample-efficient AI systems capable of rapid adaptation to new tasks and domains.
For more details, see the original paper: Learning without training: The implicit dynamics of in-context learning