cjpais/Handy: A free, open source, and extensible speech-to-text application that works completely offline.

A free, open source and extensible speech-to-text application that works completely offline.

Handy is a cross-platform desktop application built with Tauri (Rust + React/TypeScript) that provides simple, privacy-focused speech transcription. Press a shortcut, speak, and have your words appear in any text field, all without sending your voice to the cloud.

Handy was created to fill the gap left by the lack of open source, extensible speech-to-text tools. As stated on handy.computer:

  • Free: Accessibility tooling belongs in everyone’s hands, not behind a paywall
  • Open source: Together we can build further. Extend Handy for yourself and contribute to something bigger
  • Private: Your voice stays on your computer. Get transcriptions without sending audio to the cloud
  • Simple: One tool, one job. Transcribe what you say and put it into a text box

Handy isn’t trying to be the best speech-to-text app — it’s trying to be the most forkable app.

How it works

  1. Press a configurable keyboard shortcut to start/stop recording (or use push-to-talk mode)
  2. Speak your words while the shortcut is active
  3. Release, and Handy processes your speech using Whisper
  4. Get your transcribed text pasted directly into whatever app you’re using

This process is completely local:

  • Silence is filtered using VAD (Voice Activity Detection) with Silero
  • Transcription uses the model of your choice:
    • Whisper models (Small/Medium/Turbo/Large) with GPU acceleration when available
    • Parakeet V3: CPU-optimized model with excellent performance and automatic language detection
  • Works on Windows, macOS and Linux
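
To illustrate the silence-filtering step, here is a minimal sketch in Python. It is not Silero (which is a neural VAD model) and it is not Handy's code: it substitutes a plain RMS energy threshold, and the names `drop_silence`, `frame_len`, and `threshold` are assumptions chosen for the example.

```python
import math


def frame_rms(frame):
    """Root-mean-square amplitude of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def drop_silence(samples, frame_len=480, threshold=0.01):
    """Keep only the frames whose RMS energy reaches the threshold.

    frame_len=480 corresponds to 30 ms at 16 kHz (an assumed sample rate).
    """
    voiced = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_rms(frame) >= threshold:
            voiced.extend(frame)
    return voiced
```

Only the frames that pass the filter would then be handed to the transcription model, which is why long pauses do not slow transcription down.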

Quick start

  1. Download the latest release from the releases page or the website
  2. Install the application following platform-specific instructions
  3. Launch Handy and grant the necessary system permissions (Microphone, Accessibility)
  4. Configure your favorite keyboard shortcuts in Settings
  5. Start transcribing!

For detailed build instructions, including platform-specific requirements, see BUILD.md.

Handy is built as a Tauri application combining:

  • Frontend: React + TypeScript with Tailwind CSS for the settings UI
  • Backend: Rust for system integration, audio processing, and ML inference
  • Core libraries:
    • whisper-rs: local speech recognition with Whisper models
    • transcribe-rs: CPU-optimized speech recognition with Parakeet models
    • cpal: cross-platform audio I/O
    • vad-rs: voice activity detection
    • rdev: global keyboard shortcuts and system events
    • rubato: audio resampling

Handy includes an advanced debug mode for development and troubleshooting. Access it by pressing:

  • macOS: Cmd+Shift+D
  • Windows/Linux: Ctrl+Shift+D

Known issues and current limitations

This project is under active development and has some known issues. We believe in being transparent about its current state:

Major issues (help wanted)

Whisper model crashes:

  • Whisper models crash on certain system configurations (Windows and Linux)
  • Not all systems are affected; the issue is configuration-dependent
  • If you experience crashes and are a developer, please help fix them and provide debug logs!

Wayland Support (Linux):

  • Limited support for the Wayland display server
  • Requires wtype or dotool for text input to work correctly (see the Linux notes below for installation)

Text input tools:

For reliable text input on Linux, install the appropriate tools for your display server:

Display server   Recommended tool   Install command
X11              xdotool            sudo apt install xdotool
Wayland          wtype              sudo apt install wtype
Both             dotool             sudo apt install dotool (requires the input group)

  • X11: Install xdotool for both direct typing and clipboard paste shortcuts
  • Wayland: Install wtype (preferred) or dotool for text input to work correctly
  • dotool setup: Add your user to the input group: sudo usermod -aG input $USER (then log out and back in)

Without these tools, Handy falls back to enigo, which may have limited compatibility, especially on Wayland.
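
To check which tool applies on a given machine, the table above can be expressed as a small Python sketch. This is not Handy's detection logic: the function name `pick_input_tool` and the preference order within each list are assumptions, and the session type is read from the standard `XDG_SESSION_TYPE` environment variable.

```python
import os
import shutil

# Preference order per display server, following the table above.
# Putting dotool second is an assumption, based on it working on both.
PREFERENCES = {
    "x11": ["xdotool", "dotool"],
    "wayland": ["wtype", "dotool"],
}


def pick_input_tool(session_type, is_installed):
    """Return the first installed tool for this session, else the enigo fallback."""
    for tool in PREFERENCES.get(session_type, ["dotool"]):
        if is_installed(tool):
            return tool
    return "enigo"


if __name__ == "__main__":
    session = os.environ.get("XDG_SESSION_TYPE", "").lower()
    print(pick_input_tool(session, lambda t: shutil.which(t) is not None))
```

If the script prints `enigo`, none of the recommended tools was found on `PATH`, and installing one from the table is worth trying first.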

Other notes:

  • The recording overlay is disabled by default on Linux (Overlay Position: None) because some compositors treat it as the active window. When the overlay appears it can steal focus, which prevents Handy from pasting back into the application that triggered transcription. If you enable the overlay anyway, be aware that clipboard-based pasting may fail or end up in the wrong window.

  • If you have trouble running the app, launching it with the environment variable WEBKIT_DISABLE_DMABUF_RENDERER=1 may help.

  • You can manage global shortcuts outside of Handy and still control the app via signals. Sending SIGUSR2 to the Handy process toggles recording on/off, which lets Wayland window managers or other hotkey daemons keep ownership of keybindings. Example (sway):

    bindsym $mod+o exec pkill -USR2 -n handy

    Here pkill only delivers the signal; it does not terminate the process.
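
The signal-based toggle can be pictured with a short Python sketch. Handy itself is written in Rust, so this only illustrates the pattern on a Unix system; the `Recorder` class and its `toggle` method are hypothetical names, not part of Handy.

```python
import signal


class Recorder:
    """Hypothetical stand-in for an app that can start and stop recording."""

    def __init__(self):
        self.recording = False

    def toggle(self, signum=None, frame=None):
        # Each SIGUSR2 flips the state: off -> on, on -> off.
        self.recording = not self.recording


recorder = Recorder()

# Install the handler. Once a handler is registered, receiving SIGUSR2
# runs it instead of the default action, which would end the process.
signal.signal(signal.SIGUSR2, recorder.toggle)
```

Because the handler replaces the default action, a hotkey daemon can send the signal repeatedly and the process keeps running, which matches the note about pkill above.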

Platform support

  • macOS (both Intel and Apple Silicon)
  • x64 Windows
  • x64 Linux

System Requirements/Recommendations

The following are recommendations for running Handy on your own machine. If you do not meet the system requirements, application performance may be impaired. We are working to improve performance across all types of computers and hardware.

For Whisper models:

  • macOS: M-series Mac or Intel Mac
  • Windows: Intel, AMD, or NVIDIA GPU
  • Linux: Intel, AMD, or NVIDIA GPU

For Parakeet V3 models:

  • CPU-only operation: runs on a wide variety of hardware
  • Minimum: Intel Skylake (6th generation) or equivalent AMD processor
  • Performance: ~5x real-time speed on mid-range hardware (tested on i5)
  • Automatic language detection: no manual language selection required

Roadmap and active development

We are actively working on many features and improvements. Contributions and feedback are welcome!

Debug logging:

  • Adding debug logging to a file to help diagnose problems

macOS keyboard improvements:

  • Support for the Globe key as a transcription trigger
  • A rewrite of global shortcut handling for macOS, and potentially other operating systems as well

Opt-in Analytics:

  • Collect anonymous usage data to help improve Handy
  • Privacy-first approach with explicit opt-in

Settings refactoring:

  • Clean up and refactor the settings system, which is becoming bloated and messy
  • Implement better abstractions for settings management

Tauri commands cleanup:

  • Abstract and organize the Tauri command patterns
  • Investigate tauri-specta for improved type safety and organization

Manual model installation (for proxy users or network restrictions)

If you are behind a proxy or firewall, or in a restricted network environment where Handy cannot download models automatically, you can download and install them manually. The URLs are publicly accessible from any browser.

Step 1: Find your app data directory

  1. Open Handy Settings
  2. Navigate to the About section
  3. Copy the “App Data Directory” path shown there, or use the shortcut:
    • macOS: Cmd+Shift+D to open the debug menu
    • Windows/Linux: Ctrl+Shift+D to open the debug menu

Typical paths are:

  • macOS: ~/Library/Application Support/com.pais.handy/
  • Windows: C:\Users\{username}\AppData\Roaming\com.pais.handy\
  • Linux: ~/.config/com.pais.handy/
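
For scripting the later steps, the typical paths above can be captured in a small Python helper. This is a sketch that only encodes the defaults listed here; the path shown in Handy's About section remains authoritative, and the name `default_app_data_dir` is an assumption.

```python
from pathlib import PurePosixPath, PureWindowsPath

APP_ID = "com.pais.handy"


def default_app_data_dir(system, home, appdata=None):
    """Map a platform.system() value to the typical Handy data directory.

    `home` is the user's home directory; `appdata` is the value of the
    APPDATA environment variable and is only used on Windows.
    """
    if system == "Darwin":
        return PurePosixPath(home) / "Library" / "Application Support" / APP_ID
    if system == "Windows":
        return PureWindowsPath(appdata) / APP_ID
    # Treat everything else as Linux.
    return PurePosixPath(home) / ".config" / APP_ID
```

On a real machine the arguments would come from `platform.system()`, `Path.home()`, and `os.environ.get("APPDATA")`; the models folder used in the next step is the `models` subdirectory of the returned path.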

Step 2: Create the models directory

Inside your app data directory, create a models folder if it doesn’t already exist:

# macOS
mkdir -p ~/Library/Application\ Support/com.pais.handy/models

# Linux
mkdir -p ~/.config/com.pais.handy/models

# Windows (PowerShell)
New-Item -ItemType Directory -Force -Path "$env:APPDATA\com.pais.handy\models"

Step 3: Download Model Files

Download the models you want from the list below.

Whisper models (single .bin files):

  • Small (487 MB): https://blob.handy.computer/ggml-small.bin
  • Medium (492 MB): https://blob.handy.computer/whisper-medium-q4_1.bin
  • Turbo (1600 MB): https://blob.handy.computer/ggml-large-v3-turbo.bin
  • Large (1100 MB): https://blob.handy.computer/ggml-large-v3-q5_0.bin

Parakeet models (compressed archives):

  • V2 (473 MB): https://blob.handy.computer/parakeet-v2-int8.tar.gz
  • V3 (478 MB): https://blob.handy.computer/parakeet-v3-int8.tar.gz
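
If a browser is not convenient, the downloads can also be scripted. The following Python sketch uses only the standard library; the helper names `filename_from_url` and `download` are assumptions for illustration. Python's `urllib` picks up proxy settings from the `HTTP_PROXY` and `HTTPS_PROXY` environment variables by default, which may help on proxied networks.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def filename_from_url(url):
    """Use the last path segment of the URL as the local file name."""
    return Path(urlparse(url).path).name


def download(url, dest_dir):
    """Stream the file at `url` into `dest_dir`, keeping its original name."""
    dest = Path(dest_dir) / filename_from_url(url)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

For example, `download("https://blob.handy.computer/ggml-small.bin", models_dir)` would save the Small Whisper model under its exact file name in `models_dir`, a placeholder for the models folder created in Step 2.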

Step 4: Install the models

For Whisper models (.bin files):

Simply place the .bin file directly in the models directory:

{app_data_dir}/models/
├── ggml-small.bin
├── whisper-medium-q4_1.bin
├── ggml-large-v3-turbo.bin
└── ggml-large-v3-q5_0.bin

For Parakeet models (.tar.gz archives):

  1. Extract the .tar.gz file
  2. Place the extracted directory in the models folder
  3. The directory must be named exactly as follows:
    • Parakeet V2: parakeet-tdt-0.6b-v2-int8
    • Parakeet V3: parakeet-tdt-0.6b-v3-int8

The final structure should look like this:

{app_data_dir}/models/
├── parakeet-tdt-0.6b-v2-int8/     (directory with model files inside)
│   ├── (model files)
│   └── (config files)
└── parakeet-tdt-0.6b-v3-int8/     (directory with model files inside)
    ├── (model files)
    └── (config files)
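
The Parakeet extraction steps can be sketched in Python as follows. The sketch assumes each archive unpacks to a single top-level directory with the name listed above, which is what the steps imply but is not verified here; `install_parakeet` and `EXPECTED_DIRS` are hypothetical names.

```python
import tarfile
from pathlib import Path

# Archive file name -> directory name Handy expects (from the list above).
EXPECTED_DIRS = {
    "parakeet-v2-int8.tar.gz": "parakeet-tdt-0.6b-v2-int8",
    "parakeet-v3-int8.tar.gz": "parakeet-tdt-0.6b-v3-int8",
}


def install_parakeet(archive, models_dir):
    """Extract a Parakeet archive into models_dir and check the result."""
    archive = Path(archive)
    expected = EXPECTED_DIRS[archive.name]
    with tarfile.open(archive, "r:gz") as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe archive members.
            tar.extractall(models_dir, filter="data")
        else:
            tar.extractall(models_dir)
    target = Path(models_dir) / expected
    if not target.is_dir():
        raise FileNotFoundError(f"expected directory {target} after extraction")
    return target
```

If the function raises `FileNotFoundError`, the archive unpacked to a differently named directory, and it needs to be renamed by hand to the exact name shown above.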

Important notes:

  • For Parakeet models, the extracted directory name must match exactly as shown above
  • Do not rename the .bin files for Whisper models; use the exact file names from the download URLs
  • After placing the files, restart Handy to detect new models
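
Before restarting, the folder layout can be checked with a short Python sketch. It only compares names against the ones listed in this guide, and whether Handy accepts the contents is still confirmed in the app itself; `installed_models` is a hypothetical helper name.

```python
from pathlib import Path

WHISPER_FILES = [
    "ggml-small.bin",
    "whisper-medium-q4_1.bin",
    "ggml-large-v3-turbo.bin",
    "ggml-large-v3-q5_0.bin",
]
PARAKEET_DIRS = [
    "parakeet-tdt-0.6b-v2-int8",
    "parakeet-tdt-0.6b-v3-int8",
]


def installed_models(models_dir):
    """List the expected model names that are present in models_dir."""
    root = Path(models_dir)
    found = [name for name in WHISPER_FILES if (root / name).is_file()]
    found += [name for name in PARAKEET_DIRS if (root / name).is_dir()]
    return found
```

A renamed file such as `small.bin` will not show up in the result, which makes naming mistakes easy to spot.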

Step 5: Verify Installation

  1. Restart Handy
  2. Open Settings → Models
  3. Your manually installed models should now appear as “Downloaded”
  4. Choose the model you want to use and test transcription

How to contribute

  1. Check existing issues at github.com/cjpais/Handy/issues
  2. Fork the repository and create a feature branch
  3. Test thoroughly on your target platform
  4. Submit a pull request with a clear description of the changes
  5. Join the discussion: reach out at contact@handy.computer

The goal is to create a useful tool and a foundation for others – a well-designed, simple codebase that serves the community.

We are grateful for the support of our sponsors who help make Handy possible:

Wordcab

Epicenter

License

MIT License. See the LICENSE file for details.

Acknowledgments

  • Whisper by OpenAI for the speech recognition model
  • whisper.cpp and ggml for amazing cross-platform Whisper inference/acceleration
  • Silero for great lightweight VAD
  • The Tauri team for the excellent Rust-based app framework
  • Community contributors for helping make Handy better

“Your search for the perfect speech-to-text tool can end here – not because Handy is perfect, but because you can make it perfect for you.”


