What are tokens in AI?

Tokens are the small chunks of text that AI language models read and generate. A token is roughly four characters or three-quarters of a word, so "unbelievable" might be three tokens. Models process text as sequences of tokens, and pricing, context limits and speed are all measured in tokens rather than words.

Last updated June 2, 2026

When you type a message to an AI assistant, the model never sees your letters directly. It first breaks your text into tokens, the units it actually processes. Understanding tokens explains almost everything about how large language models (LLMs) behave: why they cost what they cost, why they have a memory limit, and why a long PDF can suddenly become expensive to analyse.

What is a token, exactly?

A token is a piece of text, usually a short word, part of a word, or a punctuation mark. A tokenizer is the component that splits raw text into these pieces using a learned vocabulary. Common words like "the" or "and" are a single token, while rarer or longer words get split into several. Spaces, line breaks and emojis count too.

As a rough rule for English text, one token is about four characters or three-quarters of a word. So 100 tokens is roughly 75 words, and a single page of prose is often 500 to 800 tokens. The exact split depends on the model's tokenizer, so the same sentence can map to slightly different token counts across OpenAI, Anthropic and Google models.

"cat" → 1 token
"unbelievable" → often 3 tokens (un + believ + able)
"MiyoMind" → typically 2 to 3 tokens, because it is not a common word
A number like "2026" → 1 to 2 tokens
Code, JSON and non-English text → usually more tokens per character than plain English
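The rule of thumb above can be turned into a quick estimator. This is a minimal sketch, assuming plain English text; the function names are illustrative and not part of any provider's API, and a real tokenizer will give somewhat different counts.

```python
import math

# Rule-of-thumb ratios from the text; they only hold roughly, and only for English.
CHARS_PER_TOKEN = 4.0
WORDS_PER_TOKEN = 0.75


def estimate_tokens_from_chars(text: str) -> int:
    """Estimate the token count as characters / 4, rounded up."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text: str) -> int:
    """Estimate the token count as words / 0.75, rounded up."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def estimate_words_from_tokens(tokens: int) -> int:
    """Estimate how many English words a token budget covers."""
    return round(tokens * WORDS_PER_TOKEN)
```

For example, estimate_tokens_from_chars("unbelievable") gives 3 and estimate_words_from_tokens(100) gives 75, matching the figures above. Use it for budgeting only; for exact counts, run the tokenizer that belongs to the model you are calling.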

How does text become tokens?

Most modern LLMs use a method called byte-pair encoding (BPE) or a close variant. Instead of memorising every possible word, the tokenizer learns a fixed vocabulary of frequently occurring character sequences. Frequent chunks become their own token; anything unusual is assembled from smaller pieces. This lets a model handle made-up words, typos and new slang without ever meeting a word it cannot represent.

Your raw text is normalised (whitespace and encoding are cleaned up).
The tokenizer splits it into chunks from its vocabulary; with BPE, it starts from single bytes or characters and merges adjacent pieces by following its learned merge rules.
Each matched chunk is replaced by a numeric token ID.
The model processes that sequence of IDs and predicts the next token, one at a time.
The output token IDs are converted back into readable text for you.
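The steps above can be sketched with a toy byte-level BPE tokenizer. This is an illustration under simplifying assumptions, not any provider's actual tokenizer: it skips normalisation and special tokens, learns its merges from a tiny corpus, and all function names are made up for the example. It reaches its chunks by replaying learned merges in order, which is how BPE arrives at long, frequent tokens.

```python
from collections import Counter

Pair = tuple[bytes, bytes]


def merge_pair(symbols: list[bytes], pair: Pair) -> list[bytes]:
    """Fuse every adjacent occurrence of `pair` into a single symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def train(corpus: str, num_merges: int) -> list[Pair]:
    """Learn merge rules by repeatedly fusing the most frequent adjacent pair."""
    symbols = [bytes([b]) for b in corpus.encode("utf-8")]
    merges: list[Pair] = []
    for _ in range(num_merges):
        counts = Counter(zip(symbols, symbols[1:]))
        if not counts:
            break
        best, freq = counts.most_common(1)[0]
        if freq < 2:  # nothing repeats any more, so stop early
            break
        merges.append(best)
        symbols = merge_pair(symbols, best)
    return merges


def build_vocab(merges: list[Pair]) -> dict[bytes, int]:
    """Base vocabulary is all 256 byte values; each merge adds one token."""
    vocab = {bytes([i]): i for i in range(256)}
    for left, right in merges:
        vocab.setdefault(left + right, len(vocab))
    return vocab


def encode(text: str, merges: list[Pair], vocab: dict[bytes, int]) -> list[int]:
    """Text -> bytes -> merged chunks -> numeric token IDs."""
    symbols = [bytes([b]) for b in text.encode("utf-8")]
    for pair in merges:  # replay merges in the order they were learned
        symbols = merge_pair(symbols, pair)
    return [vocab[s] for s in symbols]


def decode(ids: list[int], vocab: dict[bytes, int]) -> str:
    """Token IDs -> chunks -> readable text."""
    id_to_token = {i: t for t, i in vocab.items()}
    return b"".join(id_to_token[i] for i in ids).decode("utf-8")
```

Because the base vocabulary covers every byte, any input, including emoji or invented words, can be encoded and decoded back unchanged; unfamiliar text simply costs more tokens.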

Why are AI pricing and limits measured in tokens?

Because tokens are the unit of work. A model's effort scales with how many tokens it reads (input) and writes (output), so providers price API access per token, usually per million tokens, with input and output charged at different rates. Tokens also define the context window, the maximum number of tokens a model can consider at once, covering your prompt, any attached files and the running conversation.
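The per-million pricing described above is simple arithmetic. The sketch below assumes hypothetical rates (3 USD per million input tokens, 15 USD per million output tokens) purely for illustration; they are not any provider's published prices, and the class and function names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1,000,000 input tokens (hypothetical)
    output_per_million: float  # USD per 1,000,000 output tokens (hypothetical)


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost in USD of one request, with input and output billed at separate rates."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


# Illustrative rates only; real prices vary by provider and model.
EXAMPLE_PRICING = Pricing(input_per_million=3.0, output_per_million=15.0)
```

At these assumed rates, a request that reads 30,000 tokens and writes 800 costs about 0.102 USD, and roughly 88% of that is the input.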

This is why a long document can be costly to summarise: feeding a 50-page report might mean tens of thousands of input tokens before the model writes a single word of its answer. It is also why an assistant can "forget" earlier parts of a very long chat. Once a conversation exceeds the context window, the oldest tokens have to be dropped or summarised to make room.
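The "forgetting" described above can be sketched as a trimming step that keeps only the newest messages that fit in a token budget. This is one simple strategy under stated assumptions, not how any particular assistant works: token counts come from the rough four-characters-per-token rule, and the function names are hypothetical. Real systems may summarise old messages instead of dropping them.

```python
import math
from typing import Callable


def rough_token_count(text: str) -> int:
    """Approximate token count using the four-characters-per-token rule."""
    return math.ceil(len(text) / 4)


def fit_to_context(
    messages: list[str],
    max_tokens: int,
    count: Callable[[str], int] = rough_token_count,
) -> list[str]:
    """Keep the most recent messages whose combined tokens fit in max_tokens.

    Walks from newest to oldest and stops at the first message that would
    overflow the budget, so everything older than that is dropped. If even
    the newest message is too large, the result is an empty list.
    """
    kept: list[str] = []
    used = 0
    for message in reversed(messages):
        cost = count(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

With three 40-character messages (about 10 tokens each) and a 25-token budget, only the last two survive, which is exactly the effect a user sees as lost early context.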

Content	Approximate tokens	Why it matters
A one-line question	10 to 30 tokens	Cheap, fast input
A page of text (~500 words)	650 to 800 tokens	Adds up across long chats
A 20-page PDF	10,000+ tokens	Large input cost before any reply
A detailed written answer	300 to 1,000 tokens	Output tokens often cost more than input

How token counts translate to text

15 trillion+ tokens of text data used to train a leading open model, illustrating the scale at which tokens are counted. Source: Meta Llama 3 model card, 2024

How do tokens relate to MiyoMind credits?

MiyoMind hides raw tokens behind a single, simpler unit: credits. You do not have to learn each model's per-token price or do tokenizer maths in your head. Behind the scenes MiyoMind's billing layer measures the actual tokens consumed by the model plus any tools it used, converts that real usage into credits, and bills only what your request genuinely cost. One credit is worth roughly $0.005 of value.

Because MiyoMind routes across frontier models from OpenAI, Anthropic, Google, xAI and Alibaba through its Hermes router, the token price varies by model and task. Metering on actual usage means a short "set a reminder" message costs far less than asking the assistant to read and analyse a long PDF. You see your balance and history in the dashboard rather than a wall of token figures.

Free plan: $0/mo, 100 credits every month, no card required.
Plus: $14.99/mo, 6,000 credits per month, with a dedicated sandboxed container.
Pro: $39.99/mo, 18,000 credits per month, with a dedicated sandboxed container.
Top-up packs are also available (for example 600 credits for $3 or 10,000 for $50) when you need more.

The practical upshot: you talk to Miyo in plain language inside WhatsApp, Telegram, Discord or the web dashboard, and credits quietly meter the underlying tokens and tools. Tokens are still doing all the work under the hood, but you reason about your usage in credits, which map cleanly to dollars.
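To make the credit arithmetic concrete, here is a minimal sketch that converts a request's dollar cost into credits at the stated value of roughly $0.005 per credit. The exact conversion MiyoMind applies (rounding, minimums, any margin) is not described here, so the straight division below is an assumption, and the function names are illustrative.

```python
import math

USD_PER_CREDIT = 0.005  # "roughly $0.005 of value" per credit, from the text


def usd_to_credits(cost_usd: float) -> float:
    """Convert a metered dollar cost into credits (assumed: plain division)."""
    return cost_usd / USD_PER_CREDIT


def requests_covered(monthly_credits: int, credits_per_request: float) -> int:
    """How many identical requests a monthly credit allowance would cover."""
    if credits_per_request <= 0:
        raise ValueError("credits_per_request must be positive")
    return math.floor(monthly_credits / credits_per_request)
```

Under this assumption, a request whose underlying usage costs about 0.10 USD would consume about 20 credits, so the Free plan's 100 credits would cover around five such requests, while a request costing a fraction of a cent would barely register.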

Frequently asked questions

How many tokens are in a word?

For typical English text, one word is roughly 1.3 tokens, and one token is about three-quarters of a word or four characters. So 1,000 words is usually around 1,300 to 1,500 tokens. The ratio shifts for code, numbers and non-English languages, which tend to use more tokens per character.

Why does my AI assistant cost more for long documents?

Because the entire document is converted into input tokens that the model must read before it can answer. A long PDF can be tens of thousands of tokens, and you pay for those input tokens even before the reply is generated. Shorter prompts and attachments keep usage, and therefore cost, lower.

What is a context window in terms of tokens?

The context window is the maximum number of tokens a model can consider at once, including your prompt, any files, and the conversation so far. When a chat grows beyond that limit, older tokens get dropped or summarised, which is why an assistant can lose track of very early details in a long conversation.

Do input and output tokens cost the same?

Usually not. Most providers charge separate rates for input tokens (what the model reads) and output tokens (what it writes), and output is often priced higher. MiyoMind abstracts this away with credits, metering the real input and output usage of whichever model handled your request.

How does MiyoMind charge for tokens?

MiyoMind does not bill you in raw tokens. It measures the actual tokens and tools a request consumes, converts that to credits, and deducts only the real cost. One credit is worth about $0.005 of value, and you track everything as a simple credit balance in the dashboard instead of per-token figures.

Are tokens the same across all AI models?

No. Each model family uses its own tokenizer and vocabulary, so the same sentence can produce slightly different token counts on OpenAI, Anthropic and Google models. The general scale, roughly four characters per token, is similar, but exact counts and per-token prices vary by model.
