In-Context Learning
How models learn from examples in their prompt without weight updates
What it is
In-context learning (ICL) is the ability of an LLM to adapt its behavior based on examples provided in the prompt, without any gradient updates to its weights. Provide five examples of a new classification task, and the model generalizes to new instances, even for tasks it never saw during training.
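The pattern above can be sketched as a prompt-construction routine. This is a minimal illustration, not any particular library's API; the sentiment task, labels, and `build_icl_prompt` helper are all assumptions chosen for the example.

```python
# Hypothetical few-shot ICL prompt. Task and labels are illustrative only.
examples = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I want those two hours of my life back.", "negative"),
    ("A masterpiece of quiet, patient storytelling.", "positive"),
    ("The dialogue felt wooden and unconvincing.", "negative"),
    ("Easily the best film I've seen this year.", "positive"),
]

def build_icl_prompt(examples, query):
    """Format labeled demonstrations followed by an unlabeled query.

    The model must infer the input-to-label mapping purely from the
    pattern in the prompt; no weights are updated.
    """
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

prompt = build_icl_prompt(examples, "An instant classic.")
print(prompt)
```

The model's continuation after the final "Sentiment:" is its prediction for the new instance; everything it "learned" about the task lives in the prompt itself.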
This is fundamentally different from traditional machine learning, where adaptation requires re-training. ICL appears to be an emergent capability of sufficiently large models; smaller models do not show robust in-context learning.
The mechanism is still actively studied. Leading theories suggest that models either perform a form of implicit gradient descent in the forward pass, or learn general pattern-matching algorithms during training that they then apply at inference time.
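The implicit-gradient-descent theory has a concrete toy form worth seeing: for linear regression, a single softmax-free attention operation over in-context pairs produces the same prediction as one gradient-descent step from zero weights. The sketch below is a simplified numerical check of that equivalence under those stated assumptions, not a claim about what real transformers do.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 4, 8
X = rng.normal(size=(n, d))   # in-context inputs x_i
y = rng.normal(size=n)        # in-context targets y_i
x_q = rng.normal(size=d)      # query input
lr = 0.1                       # learning rate (any 1/n factor folded in)

# One gradient-descent step on squared loss, starting from w = 0:
# grad at w=0 is -sum_i y_i * x_i, so w_1 = lr * sum_i y_i * x_i.
w1 = lr * X.T @ y
gd_pred = w1 @ x_q

# Linear (softmax-free) attention with keys = x_i, values = y_i,
# query = x_q: output = lr * sum_i y_i * (x_i . x_q).
attn_pred = lr * (y @ (X @ x_q))

# The two predictions coincide exactly.
print(gd_pred, attn_pred)
```

Both expressions reduce to lr * sum_i y_i (x_i . x_q), which is why the attention output matches the one-step regression prediction term for term.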