Model Types

Base Models

Pure autocomplete, powerful but raw and hard to direct

What it is

A base model is the direct output of pre-training: a model trained only to predict the next token, with no instruction fine-tuning or safety training applied. Base models have internalized an enormous amount of knowledge and capability from their training data, but they express it inconsistently.

If you prompt a base model to translate text to Spanish, it might do the translation, or it might continue the text as if it were a forum thread, or invent a different translation exercise of its own. It has the capability but can't reliably surface it on demand.
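To make "just predicts the next token" concrete, here is a toy sketch: a bigram model trained on a scrap of text, with greedy decoding. This is not a real LLM, just an illustration of pure continuation — given a prompt, it extends the statistical pattern rather than following an instruction. All names and the tiny corpus are invented for the example.

```python
def train_bigram(text):
    """Count which token follows which in the training text."""
    tokens = text.split()
    model = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        model.setdefault(prev, []).append(nxt)
    return model

def complete(model, prompt, n=5):
    """Continue the prompt greedily: pick the most common next token each step.
    Ties are broken alphabetically so the output is deterministic."""
    tokens = prompt.split()
    for _ in range(n):
        candidates = model.get(tokens[-1])
        if not candidates:
            break
        tokens.append(max(sorted(set(candidates)), key=candidates.count))
    return " ".join(tokens)

corpus = "translate to Spanish : hello -> hola translate to French : hello -> bonjour"
model = train_bigram(corpus)
print(complete(model, "translate to"))
# The model doesn't "answer" — it continues the document pattern it was trained on.
```

A real base model does the same thing at vastly larger scale: it completes documents, which is why prompting one feels like steering a very knowledgeable autocomplete.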

Base models are valuable for researchers who want to fine-tune models for specific tasks from a clean starting point, or who want to study what capabilities emerge purely from scale and data.

Why it matters

Understanding base models clarifies why post-training matters so much, and why "the model knows X" doesn't mean "the model will tell you X." It also gives context for evaluating open-weight model releases: when Meta releases a Llama base model vs. an instruct model, the two serve very different purposes. A base model needs additional fine-tuning before it can be deployed in a product.

Resources

Deep Dive into LLMs like ChatGPT (Llama 3.1 base model inference)
youtube.com· Live demo of a base model (Llama 3.1) showing how it just completes text rather than answering questions. Demonstrates exactly why base models are "document completers," not assistants, and why post-training matters. The examples of raw Llama 3.1 base completions are eye-opening.
17 min
Deep Dive into LLMs like ChatGPT (Base model demos, later segment)
youtube.com· Additional base model demonstration later in the video, showing more examples of base model behavior in context. Useful companion to segment #1 above.
9 min
Let's Build GPT: from scratch (pre-training output section)
youtube.com· Shows a tiny base model generating Shakespeare-like text. Viscerally demonstrates what "next token prediction" produces before any fine-tuning.
10 min
Base LLM vs. Instruction-Tuned LLM
toloka.ai· Clear side-by-side comparison of base vs. instruction-tuned models. Covers training methods, capabilities, use cases, and when you'd use each. Good reference for recruits.
8 min
Foundation vs. Instruct vs. Thinking Models
blog.alexewerlof.com· Excellent mental model: base = library, instruct = app, thinking = operator. Shows the actual chat template markup that converts instruct models into conversations under the hood.
10 min