The failure modes in your production data should determine the intervention, not what you’ve tried before or what a vendor recommended. A RAG refactor on a problem in the model weights, or a LoRA adapter for a task that needed better retrieval: both are failures of sequence, not execution.
Every AI services company has a specialty. Fine-tuning shops recommend fine-tuning. RAG consultancies recommend RAG. The recommendation follows from their default, not your production failures.
At $3–5k, the diagnostic classifies which failure modes are actually present, maps each one to the right intervention, and defines what it will cost, before any build commitment. You go into the execution decision with a written analysis, not a guess made before anyone’s seen your data.
The first mistake is usually the technique. The second is the evals built around it.