Feature

Voice messages and transcription

Send MiyoMind a voice note on WhatsApp, Telegram, Discord or the web dashboard, and Miyo transcribes it with OpenAI Whisper, then acts on it like any typed message. Speak a reminder, a question, or a draft request, and you get a real answer back in seconds, hands-free.
Last updated June 2, 2026

Typing isn't always practical. You're walking, driving, cooking, or just thinking out loud faster than your thumbs can keep up. MiyoMind lets you talk to Miyo, its assistant, by sending a plain voice note inside the chat apps you already use. There's no separate voice app to open and no special mode to enable: record, send, and Miyo handles the rest.

Behind the scenes, every voice note is transcribed to text using OpenAI's Whisper speech-recognition model. That transcript then flows through exactly the same pipeline as a typed message, so anything you can ask MiyoMind to do in text, you can ask by voice: search the live web, set a recurring reminder, draft an email, read a document, generate an image, or recall something from a past conversation.

How does voice transcription work?

When you send a voice message, MiyoMind downloads the audio from the platform, transcribes it with Whisper, and treats the resulting text as your turn in the conversation. You don't trigger transcription manually; it happens automatically whenever the incoming message is audio.

  1. Record a voice note in WhatsApp, Telegram, or Discord (or attach audio in the web dashboard) and send it to Miyo.
  2. MiyoMind fetches the audio and sends it to Whisper for speech-to-text transcription.
  3. The transcript is processed exactly like a typed message, including web search, reminders, memory and connector actions.
  4. Miyo replies in the chat. By default the reply is text, and you can ask Miyo to reply with a voice note when you'd rather listen.
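
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MiyoMind's actual code: `IncomingMessage`, `to_text_turn`, and `fake_transcribe` are hypothetical names, and the transcriber is passed in as a plain function so the example runs offline. In a real deployment that function would wrap a call to a Whisper speech-to-text service.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IncomingMessage:
    """Hypothetical chat message: either typed text or downloaded audio bytes."""
    platform: str                   # e.g. "whatsapp", "telegram", "discord", "web"
    text: Optional[str] = None
    audio: Optional[bytes] = None   # audio already fetched from the platform


def to_text_turn(msg: IncomingMessage, transcribe: Callable[[bytes], str]) -> str:
    """Return the user's turn as text, transcribing only when the message is audio."""
    if msg.audio is not None:
        transcript = transcribe(msg.audio).strip()
        if not transcript:
            raise ValueError("empty transcript; the clip may be too short or noisy")
        return transcript
    if msg.text is not None:
        return msg.text
    raise ValueError("message has neither text nor audio")


def fake_transcribe(audio: bytes) -> str:
    """Stand-in for a Whisper call (assumption): decodes bytes as UTF-8 text."""
    return audio.decode("utf-8")


voice = IncomingMessage(platform="telegram", audio=b"remind me to call the dentist")
typed = IncomingMessage(platform="telegram", text="remind me to call the dentist")

# Both messages produce the same text turn, so everything downstream
# (search, reminders, memory) can treat them identically.
print(to_text_turn(voice, fake_transcribe) == to_text_turn(typed, fake_transcribe))
```

The design point is that transcription happens once, at the edge, and nothing after it needs to know whether the turn was spoken or typed.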

What languages does it support?

Whisper is a multilingual model trained on a broad range of spoken languages, so you can dictate to Miyo in many languages besides English and still get a usable transcript, although accuracy varies by language. It also copes well with accents, casual phrasing, and natural pauses, which means you can talk normally instead of speaking like a robot.

If a clip is noisy, mumbled, or very short, transcription quality can drop, the same way it would for any speech-recognition system. A quick rule of thumb: speak clearly, keep background noise low, and front-load the actual request so the most important words are captured cleanly.

What can I actually do with voice notes?

Because the transcript runs through MiyoMind's full agent pipeline, a voice note isn't limited to dictation; it can trigger real work. Here are realistic examples you could speak into a single message:

  • "Remind me to call the dentist every Monday at 9am." — Miyo creates a recurring reminder that fires in your chat app.
  • "What's the latest on the EU AI Act, with sources?" — Miyo searches the live web and replies with cited findings.
  • "Draft a polite follow-up email to Priya about the invoice." — Miyo writes the draft for you to review and send.
  • "Summarise the PDF I just sent in three bullet points." — Miyo reads the document and returns a tight summary.
  • "What did we decide about the launch date last week?" — Miyo recalls the relevant past conversation.
  • "Make me an image of a minimalist desk setup for a blog header." — Miyo generates the image and delivers it.

Voice is especially handy for hands-free moments: capturing a thought before you forget it, queuing a reminder while you're between meetings, or asking a quick research question on the move.

How much does it cost?

MiyoMind runs on a single credit balance rather than per-feature fees. Sending a voice note uses a small amount of credits for the transcription step, plus whatever the rest of the task costs, for example a web search or image generation. Credits meter actual model and tool usage, so a short "remind me" voice note costs very little, while a research-heavy request costs more.

Plan    Price / mo    Credits / mo    Dedicated container
Free    $0            100             No (shared direct-agent)
Plus    $14.99        6,000           Yes (1)
Pro     $39.99        18,000          Yes (1)
MiyoMind plans (USD)

All tiers can send voice notes. You can also buy top-up credit packs (600 for $3, 2,000 for $10, 5,000 for $25, 10,000 for $50) if you run through your monthly allowance. At top-up prices, one credit costs $0.005.
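
The per-credit price follows directly from the figures listed above. As a quick check, this small sketch divides each price by its credit amount. The numbers come from this page; the variable and function names are only illustrative.

```python
# Monthly plans: name -> (price in USD, credits per month), from the plan table.
PLANS = {"Plus": (14.99, 6000), "Pro": (39.99, 18000)}

# Top-up packs: credits -> price in USD, from the list above.
TOP_UPS = {600: 3.00, 2000: 10.00, 5000: 25.00, 10000: 50.00}


def usd_per_credit(price_usd: float, credits: int) -> float:
    """Price of a single credit in US dollars."""
    return price_usd / credits


for name, (price, credits) in PLANS.items():
    print(f"{name} plan: ${usd_per_credit(price, credits):.4f} per credit")

for credits, price in TOP_UPS.items():
    print(f"{credits}-credit pack: ${usd_per_credit(price, credits):.4f} per credit")
```

Every top-up pack works out to $0.0050 per credit, while the credits bundled with the Plus and Pro plans come to about $0.0025 and $0.0022 each, so the monthly allowance is the cheaper way to buy credits.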

Is it private?

Voice notes are handled with the same safeguards as everything else on MiyoMind. Every paid user gets a dedicated, sandboxed container with no public internet egress, a read-only root filesystem, and zero stored external API keys. Your integrations and long-term memories are encrypted at rest with AES-256-GCM, and a 10-layer prompt-injection defence plus output scrubbing runs on every message, transcribed audio included. The free tier runs on a shared direct-agent path instead of a dedicated container.

Tips for better results

  • Say the action first, then the detail: "Remind me Friday at 5 to send the report" beats burying the request mid-sentence.
  • Keep clips reasonably short and focused: one task per voice note transcribes more reliably than a rambling monologue.
  • Speak in a quiet spot when you can; background noise is the biggest enemy of accurate transcription.
  • Ask for a voice reply when you want to listen instead of read, which is useful when your hands or eyes are busy.
  • If a transcript looks slightly off, just send a quick text correction; Miyo keeps the context from the same conversation.

Frequently asked questions

Which apps can I send voice messages to MiyoMind on?

You can send voice notes to Miyo on WhatsApp, Telegram, and Discord, and attach audio in the web dashboard at miyomind.com. There's nothing extra to install for chat: you use MiyoMind inside the apps you already have.

How does MiyoMind transcribe my voice notes?

MiyoMind transcribes audio using OpenAI's Whisper speech-recognition model. The transcript is then processed exactly like a typed message, so your spoken request can trigger web search, reminders, drafting, memory recall, and more.

Can MiyoMind understand languages other than English?

Yes. Whisper is a multilingual model, so you can dictate in many languages and still get an accurate transcript. It also handles accents and natural, casual speech well, so you can talk the way you normally would.

Does sending a voice note cost extra credits?

Voice notes use a small amount of credits for the transcription step, plus whatever the rest of the task costs. Credits meter actual usage, so a short reminder costs very little while a research-heavy request costs more. All plans, including Free, can send voice notes.

Can Miyo reply with a voice note instead of text?

Yes. By default Miyo replies in text, but you can ask it to respond with audio, and MiyoMind's text-to-speech turns the answer into a voice note you can listen to. That makes it a true hands-free AI voice assistant.

Are my voice messages kept private?

Voice notes get the same protections as every message. Paid users run in a dedicated sandboxed container with no public internet egress, integrations and memories are encrypted at rest with AES-256-GCM, and a 10-layer prompt-injection defence plus output scrubbing runs on every message.

Meet your new assistant

Already in WhatsApp, Telegram, Discord, and the web. 100 free credits every month — no card required.