Learning without training: The implicit dynamics of in-context learning

Jul 21, 2025

  • Benoit Dherin* · Google Research
  • Michael Munn* · Google Research
  • Hanna Mazzawi* · Google Research
  • Michael Wunder · Google Research
  • Javier Gonzalvo · Google Research

Abstract

In-context learning (ICL) has emerged as a powerful paradigm in which language models learn new tasks simply by observing examples in their prompt, without any parameter updates. This paper explores the mechanisms behind ICL, examining how models leverage their pre-trained representations to adapt to new tasks through few-shot prompting. We investigate the relationship between pre-training objectives, model architecture, and ICL performance, revealing insights into how models generalize from limited examples. Our analysis suggests that ICL effectiveness stems from the model's ability to recognize patterns in the prompt and to apply learned representations in novel contexts. This work contributes to understanding the fundamental capabilities of large language models and their potential for flexible, sample-efficient learning across diverse domains.

Introduction

Traditional machine learning approaches require extensive training data and parameter updates to adapt to new tasks. In-context learning represents a paradigm shift, where models can learn new capabilities simply by observing a few examples in their input context. This ability has profound implications for AI systems, enabling rapid adaptation without retraining and opening new possibilities for flexible, sample-efficient learning.

What is In-Context Learning?

In-context learning occurs when a language model demonstrates improved performance on a task after seeing a few examples, without any parameter updates. The model essentially "learns" the task pattern from the examples provided in its input context.
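
To make this concrete, the sketch below assembles a few-shot prompt for a toy sentiment-labeling task. The task, the example reviews, and the prompt format are hypothetical choices for illustration; the post does not prescribe a particular prompting setup or model interface.

```python
# Minimal sketch of few-shot prompting for in-context learning.
# The task (sentiment labeling) and the prompt format are illustrative
# placeholders, not the setup used in the paper.

FEW_SHOT_EXAMPLES = [
    ("The movie was a delight from start to finish.", "positive"),
    ("I want my two hours back.", "negative"),
    ("A serviceable but forgettable sequel.", "negative"),
]

def build_prompt(query: str) -> str:
    """Concatenate labeled examples and the unlabeled query into one prompt."""
    lines = ["Label each review as positive or negative.", ""]
    for text, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Review: {text}\nLabel: {label}\n")
    lines.append(f"Review: {query}\nLabel:")
    return "\n".join(lines)

prompt = build_prompt("An unexpectedly moving story.")
print(prompt)
# The prompt would then be passed to a language model whose weights stay frozen;
# the "learning" happens entirely in how the model conditions on these examples.
```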

Key Characteristics

ICL requires no gradient updates or fine-tuning—the model adapts purely through the information present in the input sequence.

Few-Shot Learning

Models can learn from just a handful of examples, making them highly sample-efficient compared to traditional approaches.

Task Flexibility

The same model can adapt to diverse tasks—from translation to reasoning—without architectural changes.

Zero-Shot Capability

Some models can even perform tasks without any examples, relying solely on their pre-trained knowledge.

Theoretical Framework

Our analysis suggests that ICL effectiveness stems from the model's ability to recognize and apply patterns learned during pre-training. The key insight is that language models develop rich internal representations that can be flexibly recombined to solve new tasks.
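
One way to make the "implicit dynamics" of the title concrete is to note that when a frozen linear layer receives a query representation plus an attention readout of the context, the context's contribution can be rewritten exactly as a rank-1 update to the layer's weights. The numpy sketch below checks this identity numerically; the dimensions, the random vectors, and the single linear layer are illustrative simplifications rather than the full setting studied in the paper.

```python
import numpy as np

# Sketch: a context vector delivered by attention can be absorbed into a
# rank-1 update of the next layer's weight matrix, so the frozen layer
# behaves as if its weights had been (implicitly) updated by the context.
# Dimensions and random values are arbitrary and purely illustrative.

rng = np.random.default_rng(0)
d_out, d_in = 8, 16

W = rng.normal(size=(d_out, d_in))   # frozen layer weights
x = rng.normal(size=d_in)            # representation of the query token
a = rng.normal(size=d_in)            # attention readout of the context

# Usual forward pass: the context enters additively through the residual stream.
out_with_context = W @ (x + a)

# Equivalent view: feed x alone, but apply a rank-1 weight update delta_W
# chosen so that delta_W @ x == W @ a.
delta_W = np.outer(W @ a, x) / (x @ x)
out_implicit_update = (W + delta_W) @ x

assert np.allclose(out_with_context, out_implicit_update)
print("max difference:", np.max(np.abs(out_with_context - out_implicit_update)))
```

Under this reading, conditioning on the prompt and applying a small, structured update to frozen weights are two descriptions of the same computation, which is the sense in which the model can be said to learn without training.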

Pattern Recognition

Models learn to identify task-specific patterns in the input context and apply similar patterns to new examples.

Representation Reuse

Pre-trained representations encode knowledge that can be flexibly applied across different domains and tasks.

Attention Mechanisms

Transformer attention allows models to focus on relevant parts of the context and establish connections between examples and target outputs.
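
As a reference point, the sketch below implements single-head scaled dot-product attention in numpy. Masking, multiple heads, and learned query/key/value projections are omitted for brevity, so this is a simplified stand-in for the attention used in practice.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: each query attends over all keys and mixes values.

    Q: (n_queries, d_k), K: (n_keys, d_k), V: (n_keys, d_v).
    Returns (n_queries, d_v). No masking or multi-head projections,
    which real transformer layers would include.
    """
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # similarity of queries to keys
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over context positions
    return weights @ V                               # weighted mix of context values

rng = np.random.default_rng(1)
context = rng.normal(size=(5, 16))   # e.g. representations of in-context examples
query = rng.normal(size=(1, 16))     # representation of the new input
readout = scaled_dot_product_attention(query, context, context)
print(readout.shape)  # (1, 16)
```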

Experimental Results

Our experiments demonstrate that ICL performance scales with model size and is influenced by the quality and quantity of pre-training data. We observe consistent improvements across diverse tasks, from language understanding to mathematical reasoning.

Scaling Laws

ICL performance improves predictably with model size, suggesting that larger models develop richer, more flexible representations.
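
Such a trend is typically summarized by fitting a power law in log-log space. The sketch below demonstrates the procedure on synthetic numbers generated inside the script; neither the model sizes nor the fitted exponent corresponds to any measurement reported here.

```python
import numpy as np

# Fit a power law error(N) ~ c * N**(-alpha) in log-log space.
# The "observations" are synthetic, generated from an assumed power law plus
# noise, purely to demonstrate the fitting procedure.

rng = np.random.default_rng(2)
model_sizes = np.array([1e8, 3e8, 1e9, 3e9, 1e10, 3e10])   # parameter counts (assumed)
true_alpha, true_c = 0.12, 5.0
errors = true_c * model_sizes ** (-true_alpha) * np.exp(rng.normal(0, 0.02, size=6))

slope, intercept = np.polyfit(np.log(model_sizes), np.log(errors), deg=1)
print(f"fitted exponent alpha ~ {-slope:.3f}, prefactor c ~ {np.exp(intercept):.2f}")
# A negative slope in log-log space corresponds to error decreasing
# (performance improving) as model size grows.
```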

Task Diversity

Models show strong performance across a wide range of tasks, indicating that ICL is a general capability rather than task-specific adaptation.

Applications and Implications

  • Rapid prototyping and experimentation with new tasks
  • Sample-efficient learning in data-scarce domains
  • Flexible AI systems that can adapt without retraining
  • Improved accessibility to AI capabilities

Limitations and Future Work

  • Performance variability across different task types
  • Dependence on high-quality pre-training data
  • Limited interpretability of ICL mechanisms
  • Potential for prompt injection and adversarial examples

Conclusion

In-context learning represents a fundamental shift in how we think about machine learning and AI adaptation. By enabling models to learn new tasks without parameter updates, ICL opens new possibilities for flexible, efficient AI systems. Understanding the mechanisms behind ICL is crucial for developing more capable and accessible AI technologies.

For full details, see the original paper: Learning without training: The implicit dynamics of in-context learning.