Speech Recognition

Configure local and cloud speech-to-text options

Overview

Yark supports multiple speech recognition engines, so you can choose the balance of privacy, accuracy, and speed that suits you.

Local Models (Recommended)

Local models run entirely on your Mac, ensuring your voice data never leaves your device.

Available Models

| Model | Size | Languages | Best For |
|-------|------|-----------|----------|
| SenseVoice Small | ~200MB | 50+ | Fast transcription |
| Whisper Small | ~500MB | 99+ | Balanced accuracy |
| Whisper Medium | ~1.5GB | 99+ | High accuracy |
| Omnilingual | ~300MB | 100+ | Multilingual |
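Before downloading, it can help to check which models fit comfortably in your free disk space. The following is a minimal sketch, not part of Yark: the sizes are the approximate figures from the table above, and the `LocalModel` class, the `models_that_fit` helper, and the 2x `headroom` factor are hypothetical names and values chosen for illustration.

```python
import shutil
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class LocalModel:
    name: str
    size_mb: int  # approximate download size taken from the table above


MODELS = [
    LocalModel("SenseVoice Small", 200),
    LocalModel("Whisper Small", 500),
    LocalModel("Whisper Medium", 1500),
    LocalModel("Omnilingual", 300),
]


def models_that_fit(free_mb: float, headroom: float = 2.0) -> list[LocalModel]:
    """Return the models whose size, multiplied by `headroom`, fits in `free_mb`.

    `headroom` is an assumed safety margin, not a documented Yark requirement.
    """
    return [m for m in MODELS if m.size_mb * headroom <= free_mb]


def free_space_mb(path: Path = Path.home()) -> float:
    """Free space, in megabytes, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 1_000_000


# Example usage:
#   for model in models_that_fit(free_space_mb()):
#       print(f"{model.name} (~{model.size_mb} MB)")
```

The check uses the volume that holds your home folder; where Yark actually stores downloaded models is not stated here, so adjust the path if your setup differs.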

Downloading Models

  1. Go to Settings > Speech Recognition > Local Models
  2. Click Download next to your preferred model
  3. Wait for the download to complete
  4. Select the model as your default

Cloud Providers (BYOK)

Yark supports "Bring Your Own Key" (BYOK) for cloud providers:

  • OpenAI Whisper API - High accuracy, requires API key
  • Groq - Fast inference, requires API key

Setting Up Cloud Providers

  1. Go to Settings > Speech Recognition > Cloud Providers
  2. Enter your API key for the desired provider
  3. Select the provider as your default
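If a provider does not seem to work after setup, you can check the API key outside Yark. The sketch below builds a transcription request against the providers' publicly documented OpenAI-compatible endpoints using only the Python standard library. How Yark itself sends requests is not documented here, so treat this as an independent check under that assumption; the `build_transcription_request` helper and the `ENDPOINTS` table are hypothetical names, and the model identifiers (`whisper-1`, `whisper-large-v3`) are the providers' own, not necessarily the ones Yark selects.

```python
import urllib.request
import uuid

# Provider -> (transcription endpoint, model name). These are the providers'
# public API values, assumed here; Yark may use different settings.
ENDPOINTS = {
    "openai": ("https://api.openai.com/v1/audio/transcriptions", "whisper-1"),
    "groq": ("https://api.groq.com/openai/v1/audio/transcriptions", "whisper-large-v3"),
}


def build_transcription_request(provider, api_key, audio_bytes, filename="sample.wav"):
    """Build (but do not send) a multipart/form-data transcription request."""
    url, model = ENDPOINTS[provider]
    boundary = uuid.uuid4().hex

    # Form field carrying the model name.
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()

    # Header of the file field; the raw audio bytes follow it.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()

    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio_bytes + closing

    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Example usage (performs a real, billable network call):
#   import os
#   with open("sample.wav", "rb") as f:
#       req = build_transcription_request("openai", os.environ["OPENAI_API_KEY"], f.read())
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode())
```

A rejected key typically surfaces as an HTTP 401 error from `urlopen`. Reading the key from an environment variable, as in the usage comment, keeps it out of your shell history and source files.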

Choosing the Right Option

| Priority | Recommendation |
|----------|----------------|
| Privacy | Use local models |
| Speed | SenseVoice or Groq |
| Accuracy | Whisper Medium or OpenAI |
| Multilingual | Omnilingual or Whisper |
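The recommendations above can also be read as a simple lookup. The sketch below encodes the table as a dictionary and optionally filters out cloud providers for users who want everything to stay on-device. The `recommend` function and its `local_only` flag are hypothetical and not a Yark API; treating "Whisper" in the Multilingual row as a local model is likewise an assumption.

```python
# Priority -> recommended engines, copied from the table above.
RECOMMENDATIONS = {
    "privacy": ["local models"],
    "speed": ["SenseVoice", "Groq"],
    "accuracy": ["Whisper Medium", "OpenAI"],
    "multilingual": ["Omnilingual", "Whisper"],
}

# Engines that send audio to a third-party service.
CLOUD_PROVIDERS = {"OpenAI", "Groq"}


def recommend(priority: str, local_only: bool = False) -> list[str]:
    """Return the recommended engines for a priority.

    With `local_only=True`, cloud providers are dropped from the result.
    Raises KeyError for a priority that is not in the table.
    """
    options = RECOMMENDATIONS[priority.lower()]
    if local_only:
        options = [o for o in options if o not in CLOUD_PROVIDERS]
    return options


# Example usage:
#   recommend("Speed")                      # both the local and cloud option
#   recommend("accuracy", local_only=True)  # only the on-device option
```

Combining two priorities this way, for example accuracy with privacy, narrows each row to its on-device option.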