
What is a large language model (LLM)?

A large language model (LLM) is an AI system trained on enormous amounts of text to predict the next word (more precisely, the next token) in a sequence. By learning statistical patterns across billions of words, it can answer questions, write, summarise, translate and reason in plain language. GPT, Claude and Gemini are well-known examples.
Last updated June 2, 2026

Large language models are the engines behind most modern AI assistants, chatbots and writing tools. They turned the abstract idea of "talking to a computer in plain English" into something millions of people now do daily. But the term gets thrown around loosely. This page explains what an LLM actually is, how it learns, what it can and cannot do, and where it fits into a real product like MiyoMind.

What is a large language model, exactly?

A large language model is a type of neural network trained on a vast corpus of text to predict the most likely next chunk of language given everything before it. "Large" refers to two things: the size of the training data (often a meaningful slice of the public internet, books and code) and the number of internal parameters — the adjustable weights the model learns, frequently numbering in the billions or trillions. Those parameters store the statistical patterns of how language is used.

Most current LLMs are built on the transformer architecture, introduced by Google researchers in 2017. Transformers use a mechanism called attention, which lets the model weigh how relevant every earlier word is to the word it is currently producing. This is what allows an LLM to keep track of context across long passages rather than treating each word in isolation.

  • Neural network: a layered mathematical model loosely inspired by how neurons connect.
  • Parameters: the billions of learned weights that encode patterns in language.
  • Transformer: the architecture, using attention, that powers nearly all modern LLMs.
  • Tokens: the small word-pieces an LLM actually reads and predicts.
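
To make the attention idea above more concrete, here is a minimal sketch of scaled dot-product attention for a single query, written in plain Python. The vectors are made up for illustration, and the sketch leaves out much of what real transformers add (learned projection matrices, multiple attention heads, masking), so treat it as an illustration of the weighting step only.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    # Relevance of each earlier token: dot product of the query with its key.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Output: the values blended together according to those weights.
    dim = len(values[0])
    mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, mixed

# Three earlier tokens, each with a made-up 2-dimensional key and value.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 2.0]  # points in the same direction as the third key

weights, mixed = attend(query, keys, values)
```

The third token receives the largest weight because its key lines up best with the query, which is the sense in which attention "weighs relevance".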

How are LLMs trained and how do they predict text?

LLMs learn by playing a giant game of fill-in-the-blank across their training data. The text is first broken into tokens — small units that may be whole words, parts of words, or punctuation. During pre-training, the model repeatedly tries to predict the next token; when it guesses wrong, the error is used to nudge its parameters slightly. Repeated trillions of times, this process teaches the model grammar, facts, styles and rough reasoning patterns purely from prediction.
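
The "guess, measure the error, nudge the parameters" loop can be sketched with a toy model whose only parameters are three scores, one per token in a hypothetical three-word vocabulary. This assumes the standard cross-entropy loss with plain gradient descent; real pre-training uses the same principle with billions of parameters and more elaborate optimisers.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def training_step(logits, target_index, learning_rate=0.5):
    """One gradient-descent step on cross-entropy loss for a single example."""
    probs = softmax(logits)
    loss = -math.log(probs[target_index])  # large when the right token looked unlikely
    # For softmax with cross-entropy, the gradient for each score is
    # its probability minus 1 for the correct token (minus 0 otherwise).
    grads = [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]
    updated = [x - learning_rate * g for x, g in zip(logits, grads)]
    return updated, loss

vocab = ["mat", "moon", "sofa"]  # hypothetical three-token vocabulary
logits = [0.0, 0.0, 0.0]         # the toy model's only parameters
target = vocab.index("mat")      # the token that actually came next

losses = []
for _ in range(20):
    logits, loss = training_step(logits, target)
    losses.append(loss)
```

Each step lowers the loss a little, and the probability assigned to the correct token rises, which is the small-scale version of what happens across a full training run.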

After pre-training, models are usually refined further. Instruction tuning teaches them to follow requests, and techniques like reinforcement learning from human feedback (RLHF) align their answers with what people find helpful and safe. At inference time, when you actually use the model, it generates one token at a time: it picks a likely continuation, then feeds its own output back in to produce the next one.

  1. Tokenise: split your input into tokens the model understands.
  2. Predict: estimate a probability for every possible next token.
  3. Sample: choose a token (often with controlled randomness for variety).
  4. Repeat: append the chosen token and predict again until the response is complete.
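
The four steps above can be sketched as a short loop. The "model" here is a hypothetical lookup table of scores rather than a neural network, and the tokeniser simply splits on spaces, so this shows only the shape of the process, not how any particular LLM is implemented.

```python
import math
import random

# Hypothetical "model": for each token, raw scores for what might come next.
NEXT_TOKEN_SCORES = {
    "the": {"cat": 2.0, "dog": 1.5},
    "cat": {"sat": 2.0, "slept": 1.0},
    "dog": {"sat": 1.0, "barked": 2.0},
    "sat": {"<end>": 3.0},
    "slept": {"<end>": 3.0},
    "barked": {"<end>": 3.0},
}

def tokenise(text):
    """Step 1: split input into tokens (real tokenisers use sub-word pieces)."""
    return text.lower().split()

def predict(tokens, temperature=1.0):
    """Step 2: a probability for every candidate next token."""
    last = tokens[-1] if tokens else ""
    scores = NEXT_TOKEN_SCORES.get(last, {"<end>": 0.0})
    # Lower temperature sharpens the distribution; higher flattens it.
    scaled = {tok: s / temperature for tok, s in scores.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def generate(prompt, max_new_tokens=10, temperature=1.0, seed=None):
    rng = random.Random(seed)
    tokens = tokenise(prompt)
    for _ in range(max_new_tokens):
        probs = predict(tokens, temperature)
        # Step 3: sample one token in proportion to its probability.
        choice = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        if choice == "<end>":
            break
        tokens.append(choice)  # Step 4: feed the output back in and repeat.
    return " ".join(tokens)

sentence = generate("The", seed=0)
```

Passing a fixed `seed` makes the output repeatable; without one, the same prompt can produce different sentences, which is the "controlled randomness" mentioned in step 3.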

What can LLMs do, and where do they fall short?

Because they learned from such broad text, LLMs are general-purpose. They can draft and edit writing, answer questions, summarise long documents, translate between languages, write and explain code, extract structured data from messy text, and hold a coherent conversation. Connected to external tools, they can also take actions — searching the live web, calling APIs, or reading files — which is the basis of agentic AI.

The limits matter just as much. An LLM's built-in knowledge is frozen at its training cutoff, so it can be out of date. It can hallucinate — produce plausible-sounding but false statements — because it is optimised for likely text, not verified truth. It has a finite context window (how much it can read at once), can reflect biases present in its training data, and has no genuine understanding or intent. These are reasons to pair LLMs with retrieval, citations and human oversight rather than trusting them blindly.
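
One practical consequence of a finite context window is that older parts of a conversation may have to be dropped. The sketch below shows one simple policy, keeping the most recent messages that fit a token budget. The word-count "tokeniser" and the budget of 15 are assumptions chosen for illustration; real systems count tokens with the model's own tokeniser and often summarise or retrieve older content instead of discarding it.

```python
def count_tokens(text):
    """Rough stand-in for a tokeniser: one token per whitespace-separated word."""
    return len(text.split())

def fit_to_context(messages, max_tokens):
    """Keep the most recent messages whose combined size fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

history = [
    "My favourite colour is green.",
    "Please remember that for later.",
    "What is the capital of France?",
    "Paris.",
    "And what is my favourite colour?",
]
visible = fit_to_context(history, max_tokens=15)
```

With this budget only the last three messages stay visible, so the colour mentioned in the first message is no longer available to the model when the final question arrives.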

65% of organisations reported regularly using generative AI, nearly double the prior year. Source: McKinsey, The State of AI, 2024

What are some examples of large language models?

The best-known LLM families come from a handful of frontier AI labs. They differ in size, training approach, speed, cost and strengths — some are tuned for reasoning, others for long context, low latency or coding.

LLM family | Built by | Often noted for
GPT | OpenAI | Broad general capability and a large ecosystem
Claude | Anthropic | Long context and a focus on safety and helpfulness
Gemini | Google | Multimodal input and tight Google integration
Grok | xAI | Real-time-leaning, conversational style
Qwen | Alibaba | Strong open-weight multilingual options
A few widely-used LLM families and who builds them

How does MiyoMind use large language models?

MiyoMind is a personal AI assistant — named Miyo by default — that you talk to inside WhatsApp, Telegram, Discord or a web dashboard. Rather than tying you to a single LLM, MiyoMind routes across several frontier models from OpenAI, Anthropic, Google, xAI and Alibaba using a model router called Hermes. Hermes can send a quick question to a fast, cheaper model and a hard reasoning task to a more capable one, so each request is handled by an appropriate engine.
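
The general idea of routing can be illustrated with a deliberately simple sketch. This is not how Hermes works; its actual logic is not described here. The model labels, cue words and length threshold below are all hypothetical, chosen only to show how a router might send easy requests to a cheaper model and harder ones to a more capable one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelOption:
    name: str           # hypothetical label, not a real model identifier
    relative_cost: int  # higher means more expensive per request

FAST = ModelOption("fast-small", relative_cost=1)
CAPABLE = ModelOption("capable-large", relative_cost=10)

# Hypothetical cue words suggesting a request needs multi-step reasoning.
REASONING_HINTS = ("prove", "step by step", "debug", "analyse", "compare", "plan")

def route(request: str, long_request_words: int = 60) -> ModelOption:
    """Pick a model using two crude signals: cue words and request length."""
    text = request.lower()
    looks_hard = any(hint in text for hint in REASONING_HINTS)
    is_long = len(text.split()) > long_request_words
    return CAPABLE if (looks_hard or is_long) else FAST

quick = route("What time is it in Tokyo?")
hard = route("Compare these two contracts and plan a negotiation strategy.")
```

A production router would likely rely on richer signals than keywords, such as a classifier, past performance or cost limits, but the decision it makes has the same shape: one request in, one chosen model out.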

Importantly, MiyoMind is not "a wrapper around one model". The raw LLMs supply the language ability; the orchestration that turns that into a dependable assistant — long-term memory, secure tool connections, billing, prompt-injection defences and routing — is MiyoMind's own code, built on the open-source OpenClaw agent runtime. That layer is what lets one conversation search the live web with citations, set reminders that fire across your chat apps, read documents, and remember what matters to you.

Frequently asked questions

What does LLM stand for?

LLM stands for large language model. It is an AI system trained on huge amounts of text to understand and generate human-like language by predicting the most likely next word, or token, in a sequence.

What is the difference between an LLM and a chatbot?

An LLM is the underlying model that generates language. A chatbot is a product built on top of an LLM, adding a conversation interface, memory, tools and safety controls. MiyoMind, for example, is an assistant powered by several LLMs, not an LLM itself.

Why do large language models make mistakes or hallucinate?

LLMs predict statistically likely text rather than looking up verified facts, so they can produce fluent answers that are wrong. Their knowledge is also frozen at a training cutoff. Pairing them with live web search and citations, as MiyoMind does, reduces this risk.

Are GPT, Claude and Gemini all large language models?

Yes. GPT (OpenAI), Claude (Anthropic) and Gemini (Google) are all families of large language models. They differ in size, speed, cost and strengths, which is why a router like MiyoMind's Hermes can choose the most suitable one for each task.

Do I need to choose which LLM to use with MiyoMind?

No. MiyoMind routes your request across several frontier LLMs automatically using Hermes, matching the model to the task. You simply chat in WhatsApp, Telegram, Discord or the web dashboard, and the system selects an appropriate engine behind the scenes.

What is a token in a large language model?

A token is the small unit of text an LLM reads and predicts — often a word, part of a word, or punctuation. Models process and generate one token at a time, and usage is typically measured in tokens, which is part of how AI tools meter cost.
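
As a rough illustration of token-based metering, the sketch below splits text into words and punctuation marks and multiplies the count by a price. Both the splitting rule and the price are assumptions for illustration: real tokenisers break text into learned sub-word pieces, so actual counts will differ, and real services often price input and output tokens separately.

```python
import re

def rough_tokens(text):
    """Split into words and punctuation marks; real tokenisers split further into sub-words."""
    return re.findall(r"\w+|[^\w\s]", text)

def estimate_cost(prompt, reply, price_per_1k_tokens=0.002):
    """Estimate usage with a hypothetical flat price per 1,000 tokens."""
    total = len(rough_tokens(prompt)) + len(rough_tokens(reply))
    return total, total * price_per_1k_tokens / 1000

tokens = rough_tokens("Don't panic, it's fine.")
total, cost = estimate_cost("What is an LLM?", "A large language model.")
```

Note that even this crude splitter produces more tokens than words, because punctuation and contractions are counted separately; that gap is why token counts usually run higher than word counts.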
