What is fine-tuning?
Large language models start out as general-purpose systems trained on enormous, broad datasets. Fine-tuning is how you specialise one of those models without building it from scratch. You feed it a curated set of examples — say, thousands of support tickets paired with ideal replies — and continue training so the model's weights shift toward your desired behaviour. The result is a model that reliably writes in a particular voice, follows a niche format, or handles a domain the base model only knew vaguely.
Fine-tuning sits between two other techniques: prompting (giving instructions and examples in the request itself) and retrieval-augmented generation, or RAG (fetching relevant documents at query time and inserting them into the context). Knowing when to reach for each is one of the most practical decisions in applied AI.
How does fine-tuning work?
Fine-tuning continues the training process on a model that already understands language, using a dataset that is small compared to the original but rich in the patterns you care about. Each example nudges the model's weights so the behaviour generalises beyond the exact cases you supplied. A typical workflow looks like this:
- Collect a clean, labelled dataset of input-output pairs that demonstrate the behaviour you want (often hundreds to tens of thousands of examples).
- Choose a base model and a fine-tuning method — full fine-tuning updates every weight; lightweight methods like LoRA update only a small set of added parameters, which is cheaper and faster.
- Run the training job, monitoring for overfitting (the model memorising examples instead of learning the pattern).
- Evaluate the tuned model against held-out examples, then deploy it and keep measuring quality on real traffic.
A plain example: a base model can summarise a legal contract competently, but a model fine-tuned on a law firm's past summaries will match that firm's house style, flag the clauses they care about, and use their terminology — without you having to spell all of that out in every prompt.
Fine-tuning vs RAG vs prompting: which should you use?
Pick the lightest tool that solves your problem. Prompting is instant and free to change. RAG is best when the model needs current or private facts. Fine-tuning is best when you need to change how the model behaves, not what it knows. They are not mutually exclusive — many production systems use all three together.
| Technique | Best for | Changes model weights? | Keeps knowledge fresh? | Cost to iterate |
|---|---|---|---|---|
| Prompting | Steering tone, format, and one-off tasks | No | Only via context you paste in | Free / instant |
| RAG | Injecting current or private facts at query time | No | Yes — re-index your data | Low / medium |
| Fine-tuning | Locking in a consistent style, format, or task behaviour | Yes | No — retrain to update | Medium / high |
- Use prompting first. It is the fastest to change and handles a surprising amount with good instructions and a few examples.
- Use RAG when answers depend on facts that change or that the base model never saw — documentation, product catalogues, your own notes. Updating data does not require retraining.
- Use fine-tuning when you need consistent behaviour at scale: a fixed output format, a brand voice, or a specialised task where prompting alone is inconsistent.
- RAG keeps knowledge current; fine-tuning keeps behaviour consistent. Combining a tuned model with RAG retrieval is common in production.
What are the tradeoffs of fine-tuning?
Fine-tuning is powerful but heavier than the alternatives. It costs compute and engineering time, it produces a model snapshot that goes stale as the world changes, and updating it means running another training job. A fine-tuned model can also overfit — performing brilliantly on your examples but worse on edge cases. And because the new behaviour is baked into weights, it is harder to audit or roll back than a prompt or a retrieval index you can edit in seconds.
Fine-tuning also does not reliably teach a model new facts the way RAG does. It is far better at adjusting form, style, and task framing than at injecting fresh, verifiable knowledge. For most teams, prompting plus RAG covers the majority of needs, and fine-tuning is reserved for high-volume tasks where a small, consistent improvement compounds.
How does MiyoMind personalise without per-user fine-tuning?
MiyoMind does not fine-tune a separate model for each user — that would be slow, expensive, and hard to keep current. Instead, it personalises at request time using two mechanisms that give you the feel of a tailored assistant while staying fresh and editable.
- Persona: you describe how Miyo should sound and behave, and that guidance is woven into the system instructions on every message — closer to structured prompting than to fine-tuning.
- Long-term memory: MiyoMind remembers the facts and preferences that matter to you and surfaces the relevant ones when they apply, similar in spirit to RAG. Memories are encrypted at rest with AES-256-GCM.
Under the hood, MiyoMind runs the open-source OpenClaw agent runtime, a model router called Hermes, and our own orchestration, memory, billing, safety, and routing code. It routes to frontier models from OpenAI, Anthropic, Google, xAI, and Alibaba rather than tying you to a single fine-tuned model — so you get the right model for each task, with your persona and memory applied on top. Because personalisation lives in editable persona and memory rather than frozen weights, you can change how Miyo works instantly, without retraining anything.
Frequently asked questions
What is fine-tuning in AI?
Fine-tuning is taking a pre-trained AI model and training it further on a smaller, focused dataset so it adapts to a specific task, tone, or domain. It updates the model's internal weights, permanently changing how it behaves rather than just steering it for a single request.
What is the difference between fine-tuning and RAG?
Fine-tuning changes how a model behaves by retraining its weights on examples, while RAG (retrieval-augmented generation) fetches relevant documents at query time and adds them to the context. Fine-tuning is best for consistent style or task behaviour; RAG is best for keeping answers current and grounded in specific facts.
When should I fine-tune instead of prompting?
Prompt first, because it is free and instant to change. Fine-tune when prompting alone gives inconsistent results on a high-volume task — for example when you need a fixed output format or a reliable brand voice across thousands of requests. The consistency gain has to justify the training cost.
Does fine-tuning teach a model new facts?
Not reliably. Fine-tuning is much better at adjusting form, style, and task framing than at injecting fresh, verifiable knowledge. For facts that change or that the base model never saw, retrieval-augmented generation (RAG) is usually the better tool because you can update the data without retraining.
Does MiyoMind fine-tune a model for me?
No. MiyoMind personalises through an editable persona and long-term memory that are applied at request time, not by training a separate model per user. This keeps your assistant current and lets you change its behaviour instantly, while routing each task to the best available frontier model.
Related
Meet your new assistant
Already in WhatsApp, Telegram, Discord, and the web. 100 free credits every month — no card required.