Practical Decision-Making

Model Selection Frameworks

When to fine-tune vs. prompt, self-host vs. API, and which model family to use

What it is

A practical framework for model selection:

Prompting first: For most tasks, start with prompt engineering against a strong API model. It's fastest to iterate and often sufficient.

RAG if knowledge is the gap: If the model lacks specific knowledge (company data, recent events, proprietary information), add RAG before considering fine-tuning.

Fine-tune if behavior is the gap: If the model reliably understands the task but consistently produces the wrong format or style, fine-tuning is appropriate.

Self-host if privacy or cost at scale is the constraint: Use open-weight models for data-sensitive applications or very high-volume deployments.

Model family selection: Claude excels at instruction following and agentic tasks; GPT-4 is strong across the board and has a large ecosystem; Gemini offers very long context windows and integrates with Google's ecosystem.
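The first four steps of the framework can be read as an ordered checklist. The sketch below is one minimal way to encode it in Python. The `TaskProfile` fields and the `recommend` function are hypothetical names introduced for illustration, not part of any library, and the boolean flags stand in for judgments you would make by evaluating the model on real examples.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical description of a project; every field is an assumption."""
    lacks_knowledge: bool = False          # model is missing company data, recent events, etc.
    wrong_format_or_style: bool = False    # model understands the task but output shape is off
    privacy_or_scale_bound: bool = False   # data sensitivity or very high volume


def recommend(profile: TaskProfile) -> list[str]:
    """Return the approaches to try, in the order the framework suggests."""
    steps = ["prompting"]                  # always start with prompting
    if profile.lacks_knowledge:
        steps.append("rag")                # knowledge gap: add RAG before fine-tuning
    if profile.wrong_format_or_style:
        steps.append("fine-tuning")        # behavior gap: fine-tune
    if profile.privacy_or_scale_bound:
        steps.append("self-hosting")       # privacy or cost at scale: open-weight models
    return steps


if __name__ == "__main__":
    print(recommend(TaskProfile(lacks_knowledge=True, wrong_format_or_style=True)))
```

Keeping the result as an ordered list rather than a single answer reflects that the steps stack: a project may need both RAG and fine-tuning, with prompting still as the starting point.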

Why it matters

This decision tree is something you'll apply on every project. Getting it right, neither over-engineering nor under-engineering, is a large part of what makes AI projects succeed. Being able to walk a client through this framework demonstrates genuine expertise beyond "we'll just use ChatGPT."

Related concepts

Cost and Deployment Tradeoffs

Fine-tuning

Resources

RAG vs Fine-Tuning vs Prompt Engineering: Optimizing AI Models

youtube.com · IBM Technology's comparison of RAG, fine-tuning, and prompt engineering for different use cases. Covers when each approach is best, with a clear decision framework.

10 min

Stanford CS230 | Autumn 2025 | Lecture 8: Agents, Prompts, and RAG

youtube.com · Stanford's 2025 lecture covering agents, prompting strategies, and RAG as part of the model selection landscape. Academic authority with practical framing.

60 min