Voice to polished text,
effortlessly
Yark combines high-quality speech recognition with AI polish to deliver accurate, ready-to-use text. Choose any model you trust — we supercharge it.
Speed
Speak faster. Do more.
The average person types about 40 WPM but speaks about 150 WPM. That's a 3.75x speed advantage waiting to be unlocked.
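To make the arithmetic concrete, here is a small sketch that computes the ratio and the time needed to produce a prompt by typing versus speaking. The 40 and 150 WPM figures come from the text above; the 300-word prompt length and the function name are assumptions chosen purely for illustration.

```python
TYPING_WPM = 40     # average typing speed quoted above
SPEAKING_WPM = 150  # average speaking speed quoted above


def minutes_to_produce(words: int, wpm: float) -> float:
    """Minutes needed to produce `words` words at `wpm` words per minute."""
    return words / wpm


speedup = SPEAKING_WPM / TYPING_WPM
typed = minutes_to_produce(300, TYPING_WPM)     # 300 words is an illustrative length
spoken = minutes_to_produce(300, SPEAKING_WPM)

print(f"speedup: {speedup:.2f}x")
print(f"300 words: {typed:.1f} min typed vs {spoken:.1f} min spoken")
```

These are raw production times only; they ignore pauses, corrections, and the polish step.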
From a power user managing multiple AI sessions:
"When I type, I can manage 2 Claude Code sessions with brief prompts. With voice input, I manage 4-5 sessions simultaneously, reply to work messages, chat with friends, query ChatGPT — all while giving each AI longer, more detailed prompts that produce higher-quality responses."
Where voice input shines
AI conversations
Provide richer context, get better answers. Manage multiple AI sessions effortlessly.
Email & messages
Draft replies in seconds instead of minutes. Natural tone, faster throughput.
Notes & documentation
Capture thoughts at the speed of thinking. Perfect for meetings, brainstorms, journals.
How It Works
No black box. Here's exactly what happens.
We don't hide behind mysterious "AI magic." Yark is transparent about how it works.
You Speak
Press the hotkey and speak naturally into your microphone
Transcription
Local or cloud models convert your voice to text
AI Polish
An LLM refines grammar, punctuation, and formatting
Ready Text
Clean, polished text is inserted at your cursor
You choose every model. You own every key. We just make them work together beautifully.
Using non-autoregressive models like SenseVoice, local transcription of a 40-second clip takes about 0.5 seconds. Combined with a fast LLM provider like Groq, total latency is around 1 second. Quality is comparable to Wispr Flow or Typeless, with lower latency than either on some model combinations.
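The two-stage flow described above (transcription, then AI polish) can be sketched as a pair of swappable functions with per-stage timing. This is a minimal sketch, not Yark's actual code: the `Transcriber` and `Polisher` types, `run_pipeline`, and the stand-in stages are all hypothetical names, and the fake stages exist only so the example runs without a model or API key.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures: any STT or LLM backend could be wrapped to fit.
Transcriber = Callable[[bytes], str]  # audio bytes -> raw transcript
Polisher = Callable[[str], str]       # raw transcript -> polished text


@dataclass
class PipelineResult:
    raw_text: str
    polished_text: str
    stt_seconds: float
    polish_seconds: float

    @property
    def total_seconds(self) -> float:
        return self.stt_seconds + self.polish_seconds


def run_pipeline(audio: bytes, transcribe: Transcriber, polish: Polisher) -> PipelineResult:
    """Run transcription, then polish, timing each stage separately."""
    t0 = time.perf_counter()
    raw = transcribe(audio)
    t1 = time.perf_counter()
    polished = polish(raw)
    t2 = time.perf_counter()
    return PipelineResult(raw, polished, t1 - t0, t2 - t1)


# Stand-in stages (assumptions) so the sketch runs offline.
def fake_transcribe(audio: bytes) -> str:
    return "hello world this is a test"


def fake_polish(text: str) -> str:
    return text[0].upper() + text[1:] + "."


result = run_pipeline(b"\x00" * 16, fake_transcribe, fake_polish)
print(result.polished_text)
print(f"total latency: {result.total_seconds:.4f}s")
```

Because each stage is just a callable, a local STT model and a cloud LLM can be mixed freely, and the per-stage timings show which one dominates the total latency.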
Features
Everything you need for voice-to-text
High-quality transcription with AI polish, designed for maximum flexibility and privacy.
High-Quality Transcription
Combines fast local STT with AI polish. Get accurate, formatted text in seconds, so you can focus on expressing yourself.
- Fast local processing
- AI-enhanced output
- Works offline
- Multiple model options
AI-Powered Polish
Every transcription is refined by an LLM of your choice. Fix grammar, add punctuation, improve readability — automatically.
- Grammar correction
- Smart punctuation
- Custom prompts supported
- BYOK (Bring Your Own Key)
Your Models, Your Control
Run entirely local for maximum privacy, or use cloud APIs for ultimate accuracy. Your keys, your choice, your data.
- Fully local supported
- Choose your providers
- No data collection
- Full transparency
150+ Languages
From English to Cantonese, Hindi to Arabic. Supports major world languages and dialects for all your scenarios.
- Auto language detection
- Multiple dialects
- Mixed language support
- Continuously expanding
With the Omnilingual model: 1,600+ languages supported
And much more
Global Hotkey
Start and stop transcription from anywhere with a fully customizable keyboard shortcut. Works in any app.
Sensitive Voice Pickup
In a quiet office, library, or late at night? Yark's tuned audio pipeline captures your voice clearly even when whispering.
Bring Your Own Everything
Use local models for free transcription, or cloud services like Groq. Customize the polish prompt to match your writing style.
Flexible Model Support
Choose your STT model (SenseVoice, FunASR, etc.) and LLM provider (OpenAI, Anthropic, Groq, local). Mix and match for your needs.
Audio Ducking
Automatically lowers other audio while you record to improve transcription quality.
AI Rewrite Mode
Select any text and let AI rewrite, polish, or transform its style with one click.
Custom Dictionary
Add technical terms, names, and jargon for improved accuracy in your domain.
Auto-detect dictionary terms (Coming soon)
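How the dictionary is applied internally is not described here. One plausible approach, sketched below as an assumption, is to pass the terms to the polish LLM as spelling hints alongside a custom style instruction. The function name, prompt wording, and example terms are all hypothetical.

```python
def build_polish_prompt(transcript: str, style: str, terms: list[str]) -> str:
    """Assemble one prompt string from a style instruction, term hints, and the transcript."""
    lines = [
        "Fix grammar, punctuation, and formatting in the transcript below.",
        f"Style: {style}",
    ]
    if terms:
        # Dictionary terms are offered as preferred spellings (an assumed mechanism).
        lines.append("Preferred spellings: " + ", ".join(terms))
    lines.append("Transcript:")
    lines.append(transcript)
    return "\n".join(lines)


prompt = build_polish_prompt(
    "we deployed sense voice on the kubernetes cluster",
    "concise and professional",
    ["SenseVoice", "Kubernetes"],
)
print(prompt)
```

The resulting string would then be sent to whichever LLM provider is configured.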
Text Snippets
Pre-define common phrases or templates (like email signatures, addresses) and insert them instantly by saying a trigger word.
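Trigger-word expansion of this kind can be sketched as a whole-phrase, case-insensitive text substitution on the transcript. This is an illustrative sketch under assumptions, not Yark's implementation: the `SNIPPETS` table, its entries, and `expand_snippets` are hypothetical.

```python
import re

# Hypothetical snippet table: trigger phrase -> text to insert.
SNIPPETS = {
    "my signature": "Best regards,\nAlex",
    "office address": "1 Example Street, Springfield",
}


def expand_snippets(text: str, snippets: dict[str, str]) -> str:
    """Replace each trigger phrase (whole words, case-insensitive) with its snippet."""
    for trigger, replacement in snippets.items():
        pattern = r"\b" + re.escape(trigger) + r"\b"
        # A function replacement inserts the snippet literally, with no escape processing.
        text = re.sub(pattern, lambda _m, r=replacement: r, text, flags=re.IGNORECASE)
    return text


print(expand_snippets("Send it to office address", SNIPPETS))
```

The word-boundary anchors keep a trigger from firing inside a longer word, so "my signatures" is left untouched.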
Cost
The real cost of transcription
Stop paying $15-30/month for services that hide how they work.
Yark: $29 one-time
- Free APIs (Groq or OpenRouter free tier): $0/month
- Premium APIs (xAI Grok, heavy usage): ~$0.80/month
Subscription Tools: ongoing subscription
- Wispr Flow: $15/mo monthly, $12/mo annual
- Typeless: $30/mo monthly, $12/mo annual
Save up to $348 every year
Just $29 for the first year with free APIs
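The first-year totals behind this comparison can be worked out from the prices listed above. The sketch below uses only those quoted figures; it assumes twelve months of use, treats the annual-plan prices as per-month rates, and the option labels and function name are illustrative.

```python
YARK_LICENSE = 29.00  # one-time price quoted above, in USD

# Monthly prices quoted above, in USD.
MONTHLY = {
    "Yark + free APIs": 0.00,
    "Yark + premium APIs": 0.80,
    "Wispr Flow (monthly plan)": 15.00,
    "Wispr Flow (annual plan)": 12.00,
    "Typeless (monthly plan)": 30.00,
    "Typeless (annual plan)": 12.00,
}


def first_year_cost(option: str) -> float:
    """Twelve months of fees, plus the one-time license for Yark options."""
    one_time = YARK_LICENSE if option.startswith("Yark") else 0.0
    return round(one_time + 12 * MONTHLY[option], 2)


for option in MONTHLY:
    print(f"{option}: ${first_year_cost(option):.2f}")
```

Actual API spending depends on usage, so the Yark rows are estimates rather than fixed prices.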
Privacy
Privacy on your terms
Unlike subscription services where your voice goes to unknown servers, with Yark you know exactly where your data flows.
Fully Local Mode
Option to run everything on your Mac. Zero data leaves your device. Perfect for sensitive content.
BYOK Cloud
Use your own API keys. Choose providers with no-retention policies for peace of mind.
Hybrid Mode
Local STT + Cloud LLM. Best of both worlds — fast, private speech capture with powerful AI polish.
You're always in control. No hidden data collection. No mysterious cloud processing.
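The three modes differ in which stage runs where, and therefore in what leaves the device. The sketch below models that as two flags per mode. The `Mode` class and its field names are hypothetical, and treating BYOK Cloud as running both stages in the cloud is an assumption rather than something stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    stt_local: bool  # True if speech-to-text runs on the device
    llm_local: bool  # True if the polish LLM runs on the device

    def leaves_device(self) -> list[str]:
        """List which kinds of data are sent to a remote provider."""
        sent = []
        if not self.stt_local:
            sent.append("audio")
        if not self.llm_local:
            sent.append("transcript text")
        return sent


MODES = [
    Mode("Fully Local", stt_local=True, llm_local=True),
    Mode("BYOK Cloud", stt_local=False, llm_local=False),  # assumption: both stages remote
    Mode("Hybrid", stt_local=True, llm_local=False),
]

for mode in MODES:
    print(mode.name, "->", mode.leaves_device() or "nothing")
```

Under these assumptions, Hybrid Mode sends only the transcript text to the LLM provider, while the recorded audio stays on the Mac.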
Pricing
Simple, transparent pricing
Pay once, own forever. Plus your own API costs (which can be $0).
One-time payment • Own it forever
- Unlimited transcription
- Bring Your Own API Keys
- AI-powered polish
- 150+ languages supported
- Full privacy control
- 1 year of updates ($19/year to extend)
14-day free trial • No credit card required
API costs are separate and depend on your usage. With free tiers, your monthly cost can be $0.
Ready to transform how you type?
Download Yark and experience voice-to-text that just works. Free 14-day trial, no credit card required.