Beginner · 3 min read

When Would You Choose RAG Over Fine-Tuning?

Understand the tradeoffs between RAG and fine-tuning — and learn a decision framework for choosing the right approach for your use case.


Why This Is Asked

This is a foundational AI engineering decision. Interviewers ask it to see if you have a clear mental model of what each approach solves and when each is appropriate.

Key Concepts to Cover

  • RAG strengths — up-to-date knowledge, source attribution, no training required
  • Fine-tuning strengths — style/format consistency, specialized behavior, latency
  • Knowledge vs. behavior — RAG adds knowledge; fine-tuning changes behavior
  • Data freshness — RAG handles updates trivially; fine-tuning requires retraining
  • Cost — RAG has per-query retrieval cost; fine-tuning has upfront training cost

How to Approach This

1. The Core Distinction

RAG: Augments the model's knowledge at inference time by retrieving relevant documents. Knowledge is stored externally and can be updated instantly.

Fine-tuning: Updates model weights to change its behavior or embed specialized knowledge. Knowledge/behavior is baked in and requires retraining to change.
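The distinction can be made concrete with a toy sketch (plain Python, not a real retrieval system): RAG keeps knowledge in an external store you can edit at any time, while fine-tuned knowledge lives in frozen weights.

```python
# Toy illustration of the core distinction. The knowledge store is a plain
# dict here; in practice it would be a vector or keyword index.
knowledge_store = {
    "refund_policy": "Refunds are accepted within 30 days.",
    "pricing": "The Pro plan costs $20/month.",
}

def rag_answer(question: str) -> str:
    # Inference-time retrieval: find the entry whose topic appears in the question.
    for key, doc in knowledge_store.items():
        if key.replace("_", " ") in question.lower():
            return f"Based on [{key}]: {doc}"
    return "No supporting document found."

# Updating knowledge is an instant data edit -- no retraining:
knowledge_store["pricing"] = "The Pro plan costs $25/month."
print(rag_answer("What is your pricing?"))
```

With fine-tuning, the equivalent update would mean assembling new training data and retraining, which is why frequently changing facts favor RAG.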

2. Choose RAG When:

  • Data changes frequently: Product docs, news, pricing, policies
  • You need source attribution: Cite exactly which document supported the answer
  • Reducing hallucination is critical: Grounding in retrieved text limits invention
  • Limited labeled training data: RAG works with any existing document corpus
  • Knowledge needs to be auditable and updatable: Fine-tuning is an unreliable way to inject factual knowledge — it risks catastrophic forgetting, facts baked into weights cannot be audited or updated individually, and models recall fine-tuned facts inconsistently. RAG keeps knowledge external, queryable, and correctable
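Source attribution in RAG typically comes from how the prompt is assembled: retrieved passages are labeled with source IDs so the model can cite them. A minimal sketch (the `retrieved` list stands in for real search results, and the field names are assumptions):

```python
# Sketch of grounded prompt assembly with source attribution.
# In a real system `retrieved` would come from a vector or keyword search.
retrieved = [
    {"id": "docs/pricing.md", "text": "The Pro plan costs $25/month."},
    {"id": "docs/refunds.md", "text": "Refunds are accepted within 30 days."},
]

def build_grounded_prompt(question: str, passages: list) -> str:
    # Label each passage with its source id so answers can cite it.
    context = "\n".join(f"[{p['id']}] {p['text']}" for p in passages)
    return (
        "Answer using ONLY the sources below and cite the source id.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt("How much is the Pro plan?", retrieved)
```

Because the answer is constrained to the supplied passages, you can both trace a claim back to a document and limit hallucination.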

3. Choose Fine-Tuning When:

  • Consistent output format or style: Teaching a model to always output valid SQL
  • Narrow and well-defined task: A classifier, code formatter, specialized extractor
  • Latency is critical: Fine-tuned smaller models are faster
  • The task is a behavior pattern: Not facts, but how to do something
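Fine-tuning for a behavior pattern starts with supervised examples of the desired input/output mapping. A sketch of dataset preparation for the NL-to-SQL case (the exact JSONL schema varies by provider; this prompt/completion shape is an assumption for illustration):

```python
import json

# Hypothetical training pairs teaching a behavior pattern:
# natural language in, valid SQL out.
examples = [
    {"prompt": "Users who signed up in the last 7 days",
     "completion": "SELECT * FROM users WHERE signup_date >= date('now', '-7 days');"},
    {"prompt": "Count orders per customer",
     "completion": "SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id;"},
]

# Serialize one JSON object per line (the common JSONL training format).
jsonl = "\n".join(json.dumps(ex) for ex in examples)
```

Note what the data encodes: not facts about users or orders, but the *skill* of producing well-formed SQL, which is exactly what retrieval alone cannot teach.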

4. The "Both" Option

RAG and fine-tuning are not mutually exclusive:

  • Fine-tune a model on your domain to improve reasoning style and format
  • Use RAG to supply up-to-date factual knowledge
  • Example: Fine-tune on internal engineering style guide → RAG over current codebase docs

5. Practical Decision Heuristic

Start with a base model + RAG. Only add fine-tuning if:

  1. RAG alone does not achieve needed accuracy or behavior
  2. You have enough data to fine-tune without overfitting
  3. The task is stable enough that a fine-tuned model will not go stale quickly
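The heuristic above can be sketched as a small decision function (the flags and thresholds are illustrative assumptions, not a formal rule):

```python
def choose_approach(data_changes_often: bool,
                    needs_citations: bool,
                    needs_strict_format: bool,
                    has_training_data: bool) -> str:
    """Rough sketch of the decision heuristic: default to RAG,
    add fine-tuning only when behavior shaping demands it."""
    wants_rag = data_changes_often or needs_citations
    wants_ft = needs_strict_format and has_training_data
    if wants_rag and wants_ft:
        return "both"
    if wants_ft:
        return "fine-tune"
    return "RAG"  # base model + RAG is the default starting point

print(choose_approach(True, True, False, False))   # fresh, cited facts -> RAG
print(choose_approach(False, False, True, True))   # strict output format -> fine-tune
```

The asymmetry is deliberate: RAG is the low-risk default, and fine-tuning is added only when conditions 1–3 above are met.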

Common Follow-ups

  1. "What about continued pre-training vs. fine-tuning vs. RAG?" Continued pre-training teaches domain vocabulary and concepts. Fine-tuning adapts behavior. RAG provides specific facts. Each serves a different role.

  2. "Can RAG ever replace fine-tuning entirely?" For knowledge-intensive tasks, largely yes. For behavior-shaping tasks (output format, tone, task-specific reasoning), fine-tuning achieves more reliable results.

  3. "How do you decide which documents to include in the RAG corpus?" Start with all documents that could plausibly answer user queries. Measure retrieval quality. Remove low-signal sources. Apply filters: recency, quality, domain relevance.
