What is a foundation model?
The term "foundation model" was coined by researchers at Stanford's Center for Research on Foundation Models in 2021 to describe a shift in how AI is built. Instead of training a separate model for every task, you train one large model on a huge, general dataset and then reuse it everywhere. That single base becomes the foundation other applications are built on top of.
Most foundation models are large language models (LLMs) trained on text, but the idea is broader. A foundation model can be multimodal, meaning it handles more than one kind of data, such as text, images, audio or code. What unites them is the recipe: broad pre-training first, task-specific adaptation second.
How does a foundation model work?
A foundation model works in two stages: it is pre-trained broadly on enormous datasets, then adapted to specific jobs. Pre-training is the expensive step, done once per base model, that produces general knowledge and reasoning ability. Adaptation is the cheap, repeatable step that points those general capabilities at a particular problem. Once adapted, the model is deployed inside a product.
- Pre-training: the model learns patterns from a massive, diverse dataset (web text, books, code, sometimes images) by predicting missing or next tokens. This is where general language, reasoning and world knowledge come from.
- Adaptation: the same base model is steered toward a task. The lightest method is prompting (just describing the task in plain language); heavier methods include fine-tuning on a smaller, focused dataset or attaching retrieval and tools.
- Deployment: the adapted model is wired into a product, where it answers questions, drafts content, writes code or analyses files in response to real user input.
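To make "predicting next tokens" concrete, here is a minimal sketch of the idea at toy scale. It is not how a real foundation model is trained: it counts word pairs instead of training a neural network, and the corpus and the function names `pretrain` and `predict_next` are invented for illustration. What it shares with the real process is the objective: given what came before, guess what comes next.

```python
from collections import Counter, defaultdict

def pretrain(corpus):
    """Count which token follows which: a toy stand-in for next-token pre-training."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        # Pair every token with the token that immediately follows it.
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts

def predict_next(model, token):
    """Return the most frequent follower of `token`, or None if it was never seen."""
    followers = model.get(token.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Illustrative three-sentence "dataset"; real pre-training uses vastly more text.
corpus = [
    "the model predicts the next token",
    "the model learns patterns from text",
    "a token depends on context",
]
model = pretrain(corpus)
print(predict_next(model, "the"))    # "model" follows "the" twice, "next" only once
print(predict_next(model, "zebra"))  # None: the word never appeared in the corpus
```

A real model replaces the lookup table with billions of learned parameters, which is what lets it generalise to text it has never seen instead of returning `None`.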
Because the costly pre-training is done once, developers can build dozens of distinct applications, such as a coding assistant, a customer-support bot or a research tool, on the same foundation. That reuse is exactly why these models reshaped the AI industry: capability that once took a dedicated team and dataset is now a prompt away.
What are some examples of foundation models?
The best-known foundation models are the large language and multimodal models released by major AI labs. They share the broad-pre-training, then-adapt recipe but differ in size, training data, modality and strengths.
| Lab | Model family | Primary modality |
|---|---|---|
| OpenAI | GPT family | Text, with multimodal variants |
| Anthropic | Claude family | Text and vision |
| Google | Gemini family | Natively multimodal |
| xAI | Grok family | Text, with multimodal variants |
| Alibaba | Qwen family | Text, code and multimodal |
Beyond language, foundation models exist for images (text-to-image generators), audio (speech recognition and synthesis) and even protein structure. The label applies whenever one broadly trained model is adapted to many downstream uses rather than purpose-built for a single one.
Why do foundation models matter?
Foundation models matter because they turned AI from a bespoke, per-task engineering effort into a shared, general-purpose layer. A small team can now build a capable product on top of an existing model instead of collecting data and training one from scratch, which lowers cost, shortens timelines and broadens who can ship AI features.
Adoption has been rapid largely because foundation models are reusable. But the model is only one ingredient. On its own, a foundation model has no memory of you, no access to your tools, no safety boundary and no billing. The orchestration around the model, the part that decides which model to use, remembers context, calls tools and enforces safety, is what turns a raw model into a useful product.
How does MiyoMind use foundation models?
MiyoMind builds on frontier foundation models from several labs rather than betting on one. Its model router, Hermes, picks an appropriate model from OpenAI, Anthropic, Google, xAI or Alibaba for each request, so you benefit from each lab's strengths without choosing one yourself. MiyoMind is not a wrapper around a single model.
The foundation models do the raw reasoning and generation. Everything that makes Miyo, MiyoMind's assistant, genuinely useful is MiyoMind's own layer on top:
- Orchestration on the open-source OpenClaw runtime plus Hermes routing, so the right model handles each task.
- Long-term memory of what matters to you, encrypted at rest with AES-256-GCM.
- Live web search with citations, reminders that fire across your chat apps, image generation, voice transcription and document analysis.
- Secure OAuth connectors to tools like Gmail, Google Calendar, Drive, Notion, Slack, GitHub and Linear, around 30 in total.
- Safety and isolation: every paid user gets a dedicated, sandboxed container, and a 10-layer prompt-injection defence runs on every message.
You reach all of it inside WhatsApp, Telegram, Discord or the web dashboard at miyomind.com, with nothing extra to install for chat. The free tier runs on a shared direct-agent path; paid plans add a dedicated, sandboxed container. In short, foundation models supply the intelligence, and MiyoMind supplies the memory, tools, safety and routing that turn that intelligence into an assistant you can actually live in.
Frequently asked questions
What is the difference between a foundation model and an LLM?
A large language model (LLM) is a foundation model that works specifically with text. Foundation model is the broader category: it includes LLMs but also image, audio and multimodal models. Put simply, every LLM is a foundation model, but not every foundation model is an LLM.
How is a foundation model trained?
It is pre-trained once on a very large, diverse dataset by predicting missing or next tokens, which builds general knowledge and reasoning. After that broad pre-training, it is adapted to specific tasks through prompting, fine-tuning on focused data, or by attaching retrieval and tools.
Are foundation models the same as generative AI?
They overlap but are not identical. Most generative AI products are powered by foundation models, and most modern foundation models can generate text or images. Foundation model describes how the model is built and reused, while generative AI describes what it does: producing new content.
Does MiyoMind train its own foundation model?
No. MiyoMind uses frontier foundation models from OpenAI, Anthropic, Google, xAI and Alibaba, selected per request by its Hermes router. MiyoMind's proprietary work is the orchestration, memory, billing, safety and routing layered on top of those models, not the models themselves.
Why use multiple foundation models instead of one?
Different models have different strengths in reasoning, speed, cost and multimodal handling. By routing each request to a suitable model across several labs, MiyoMind avoids being limited to one provider's trade-offs and can adapt as the underlying models improve.
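The trade-off behind routing can be sketched in a few lines. This is an illustration of the general idea only, not MiyoMind's Hermes router, whose logic is not described here. The model names, scores and costs are made up, and the rule "pick the cheapest model that meets the request's needs" is an assumption chosen for simplicity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelProfile:
    name: str             # hypothetical label, not a real model ID
    reasoning: int        # made-up score from 1 (weak) to 5 (strong)
    cost_per_call: float  # made-up relative cost
    multimodal: bool      # whether the model accepts images

# Invented catalogue standing in for models from several labs.
CATALOGUE = [
    ModelProfile("fast-small", reasoning=2, cost_per_call=0.1, multimodal=False),
    ModelProfile("deep-reasoner", reasoning=5, cost_per_call=1.0, multimodal=False),
    ModelProfile("vision-general", reasoning=4, cost_per_call=0.6, multimodal=True),
]

def route(needs_images: bool, min_reasoning: int) -> ModelProfile:
    """Pick the cheapest model that satisfies the request's requirements."""
    candidates = [
        m for m in CATALOGUE
        if m.reasoning >= min_reasoning and (m.multimodal or not needs_images)
    ]
    if not candidates:
        raise ValueError("no model satisfies the request")
    return min(candidates, key=lambda m: m.cost_per_call)

print(route(needs_images=False, min_reasoning=2).name)  # fast-small
print(route(needs_images=True, min_reasoning=3).name)   # vision-general
print(route(needs_images=False, min_reasoning=5).name)  # deep-reasoner
```

With a single model, every request pays that model's cost and inherits its limits. With a catalogue, a simple question can go to a cheap, fast model while an image or a hard reasoning task goes to one built for it.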
Can you adapt a foundation model without retraining it?
Yes. The lightest form of adaptation is prompting: you describe the task in plain language and the model responds. You can also add retrieval or tools to extend it. Full fine-tuning is only needed when prompting and retrieval are not enough for a specialised task.
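As a rough sketch of adaptation by prompting, the helper below builds two different task prompts that could be sent to the same base model. The function `build_prompt` and the "Task / Input / Output" layout are assumptions for illustration, not a format any particular model requires. Including a few worked examples (often called few-shot prompting) goes one step beyond a plain-language description. No model is called; the code only assembles text.

```python
def build_prompt(task, examples, user_input):
    """Assemble a prompt: task description, worked examples, then the new input."""
    lines = [f"Task: {task}", ""]
    for source, target in examples:
        lines += [f"Input: {source}", f"Output: {target}", ""]
    # End on an open "Output:" so the model's continuation is the answer.
    lines += [f"Input: {user_input}", "Output:"]
    return "\n".join(lines)

# Same helper, same underlying model, two unrelated tasks.
sentiment_prompt = build_prompt(
    task="Classify the sentiment as positive or negative.",
    examples=[
        ("I love this phone", "positive"),
        ("The battery died in an hour", "negative"),
    ],
    user_input="Setup took five minutes and everything worked",
)
translation_prompt = build_prompt(
    task="Translate English to French.",
    examples=[("good morning", "bonjour")],
    user_input="thank you",
)
print(sentiment_prompt)
```

Nothing about the model changes between the two prompts. Only the text it is asked to continue changes, which is why prompting is the cheapest form of adaptation.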