Audio never leaves your device.
Full Whisper transcription and speaker diarization run on the Apple Neural Engine. No API. No cloud. No account. Works offline after the model downloads on first launch.
Capture on iPhone. Transcribe on the Apple Neural Engine. Route the markdown to iCloud, Obsidian, your n8n watch-folder, your webhook — wherever your automation already lives.
VoiceToMD doesn't replace your vault. It feeds it. Voice notes are context, not prompts.
Bind the Action Button. Press it on a locked phone. Face ID, recording, then a lock-screen pill with Pause, Stop, and Send, all without opening the app. Siri and Control Center work too.
iCloud Drive, Email, Webhook, SFTP, Shortcut. Fan out one capture to multiple destinations. Markdown with YAML frontmatter — drop it into anything that already reads markdown.
The whole capture happens on the lock screen. Press, talk, send. The Live Activity walks you from recording to vault without ever opening the app.
Face ID resolves, the mic opens, and recording is running before you finish raising the phone.
Red indicator, profile name, destination subtitle, self-ticking timer. Pause and Stop are 44pt pills.
Audio length, ETA, animated progress. Sub-state "Identifying speakers" when diarization runs.
Final filename, transcribed pill. Hold sends to History to triage later. Send goes straight to your destination.
Configure as many destinations as you need, group them into profiles, and fan one capture out to multiple targets simultaneously.
iCloud Drive: Drop into any folder, including inside an Obsidian vault.
Email: Pre-filled compose sheet, ready for tap-to-send.
Webhook: POST JSON to any URL. Headers, secret token, retries.
SFTP: Key-based auth into your home server, NAS, or VPS.
Shortcut: Hand off to any iOS Shortcut. Your automation, your call.
Free includes iCloud Drive and Email. Pro adds Webhook, SFTP, Shortcut, and multi-destination fan-out.
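On the receiving end, a webhook destination with a secret token and retries suggests two checks: verify the token, and tolerate the same note arriving twice. The sketch below is one way a receiver might do that. The header name `X-Webhook-Token` and the payload fields `filename` and `markdown` are assumptions made for illustration, not VoiceToMD's documented schema, so adjust them to whatever your configured webhook actually sends.

```python
import hmac
import json


def handle_webhook(headers, body, secret, seen_ids):
    """Return (status_code, message) for one incoming delivery."""
    # Assumption: the secret token arrives in a header named X-Webhook-Token.
    token = headers.get("X-Webhook-Token", "")
    # compare_digest avoids leaking the secret through timing differences.
    if not hmac.compare_digest(token.encode(), secret.encode()):
        return 401, "bad token"

    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, "invalid JSON"

    # Assumption: the payload carries "filename" and "markdown" fields.
    if (not isinstance(payload, dict)
            or "filename" not in payload
            or "markdown" not in payload):
        return 400, "missing fields"

    # Retries can deliver the same note twice, so the filename
    # doubles as an idempotency key here.
    filename = payload["filename"]
    if filename in seen_ids:
        return 200, "duplicate ignored"
    seen_ids.add(filename)
    return 200, "stored"
```

Returning 200 for a duplicate, rather than an error, keeps a retrying sender from trying again after the first delivery already succeeded.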
---
title: Notes from the project review
created: 2026-05-09T14:32:11-05:00
source: voice
profile: voice-memo
duration_sec: 184
destination: anzel-vault-inbox
device: iPhone
model: openai_whisper-small_216MB
audio_retained: true
audio_path: 2026-05-09-notes.m4a
speakers: 2
speaker_labels: [Chris, Hannah]
tags: [voice-note]
---

**Chris:** The screenshot spec needs to land before the App Store description. They share vocabulary.

**Hannah:** Agreed. Let's sequence it that way.
Drop into Obsidian, Foam, Logseq, Quartz, your static-site generator, or any pipeline that reads YAML. Nothing to preprocess. Nothing to clean up.
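For a pipeline that wants the metadata without pulling in a YAML library, the flat `key: value` frontmatter in the sample note can be read with a few lines of standard-library Python. This is a minimal sketch that assumes the frontmatter stays as simple as the sample (one key per line, inline `[a, b]` lists, no nesting); it keeps every scalar as a string, so a real YAML parser is the better choice if you need typed values.

```python
def parse_note(text):
    """Split a note into (frontmatter dict, body string)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text

    # Find the closing '---' fence that ends the frontmatter.
    end = next(
        (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
        None,
    )
    if end is None:
        return {}, text

    meta = {}
    for line in lines[1:end]:
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [Chris, Hannah].
            items = value[1:-1].split(",")
            meta[key.strip()] = [v.strip() for v in items if v.strip()]
        else:
            meta[key.strip()] = value

    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

With the sample note, `meta["speaker_labels"]` comes back as a two-item list and `meta["duration_sec"]` as the string `"184"`.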
**Speaker:** labels appear only when more than one voice is in the room. One-speaker takes stay clean.
Filename tokens: {date}, {slug}, {profile}. Auto-suffixes on collision.
No account. No subscription manager. Apple handles billing.
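To make the `{date}`, `{slug}`, `{profile}` tokens and the collision suffix concrete, here is a rough sketch of how such a template could be expanded. The slug rule and the `-2`, `-3` suffix format are assumptions chosen for illustration; the app's actual naming rules may differ.

```python
import re


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def render_filename(template, date, title, profile, existing):
    """Fill the tokens, then add a numeric suffix if the name is taken."""
    stem = (
        template.replace("{date}", date)
        .replace("{slug}", slugify(title))
        .replace("{profile}", profile)
    )
    candidate = f"{stem}.md"
    n = 2
    # Assumption: collisions resolve as -2, -3, ... before the extension.
    while candidate in existing:
        candidate = f"{stem}-{n}.md"
        n += 1
    return candidate
```

For example, with the template `{date}-{slug}` and a folder that already holds `2026-05-09-notes.md`, a second note titled "Notes" on the same date would come out as `2026-05-09-notes-2.md` under these assumptions.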
| | VoiceToMD | Plaud | Otter / Fireflies | Apple Voice Memos |
|---|---|---|---|---|
| Where audio is processed | On-device | Their cloud | Their cloud | On-device (no transcription) |
| Built for | PKM / vault users with own LLM stack | Knowledge workers, no existing system | Meeting professionals | Anyone |
| Output | Markdown with YAML frontmatter | Their app + 27+ exports | SaaS dashboard + PDF | Audio file in iCloud |
| Destination | Wherever you want | Plaud cloud (lock-in) | Their dashboard | Manual export |
| Pricing | $30/yr or $2.99/mo | $99–240/yr | $17–30/seat/mo | Free |
| Hardware | iPhone you already own | Separate $159 device | None | None |
Small (216 MB), Distil-Large (594 MB), and Large-Turbo (954 MB). Pick during onboarding. Switch any time in Settings. Re-transcribe past notes with a different model whenever you want.
.md files with YAML frontmatter, ready for the Dataview / Templater stack you already have. The same profile can fan out to a webhook simultaneously — your vault and your n8n flow both get a copy.
No account. No cloud transcription. Just press the button and talk.
Download on the App Store