Speech Recognition
Coming Soon

VoiceMe

Real-time subtitles. No audio leaves your machine.

On-device Whisper ASR turns live audio into live subtitles, overlaid on any application — meetings, calls, screen recordings. Three model sizes for any hardware. Zero audio leaves your machine, ever.

  • On-device Whisper inference
  • 3 model sizes
  • Live subtitle stream
  • 0 audio transmitted
WHISPER ON-DEVICE · REAL-TIME SUBTITLES · TINY BASE SMALL · NO AUDIO UPLOAD · WIN + MAC · SENSITIVE MEETING SAFE · LIVE STREAM · GPU ACCELERATED

Available on

  • Windows
    10 / 11 / Server
  • macOS
    12 Monterey+
Core capabilities

What it does.

01
Local Whisper

Speech recognition that never touches the cloud.

OpenAI's Whisper architecture, running on your CPU or GPU. Three model sizes: Tiny for ambient transcription, Base for general use, Small for accuracy-critical work. Multi-language support with automatic language detection. The kind of tool you can run during legally privileged or commercially sensitive meetings without sending audio to a third party.

  • Whisper Tiny, Base, Small models bundled
  • CPU and GPU inference paths (CUDA, Metal, ROCm)
  • 99 languages with automatic detection
  • Streaming inference — words appear as spoken
02
Always-on overlay

Subtitles float over whatever you're doing.

A non-blocking overlay window stays on top of any application — Zoom, Teams, browser tabs, documents. Move, resize, recolour and reposition without interrupting the underlying app. Configurable hide-on-idle so the subtitles disappear gracefully when no one's talking.

  • Always-on-top overlay across any app
  • Per-application positioning memory
  • Customisable font, size, colour, opacity
  • Hide on idle, reappear on speech
  • Global hot-keys for start / pause / hide / clear
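
The hide-on-idle behaviour can be pictured as a small timer: the overlay stays visible while speech is recent and hides once a silence window has passed. The sketch below is an illustration under assumptions, not VoiceMe's overlay code. The `IdleHider` class, its method names and the five-second default are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IdleHider:
    """Decides whether the overlay should be shown (hypothetical helper)."""

    idle_timeout: float = 5.0  # seconds of silence before hiding (assumed default)
    last_speech: float = float("-inf")  # time of the most recent speech, if any

    def on_speech(self, now: float) -> None:
        """Record that speech was detected at time `now` (in seconds)."""
        self.last_speech = now

    def visible(self, now: float) -> bool:
        """Show the overlay only while the last speech is within the timeout."""
        return (now - self.last_speech) < self.idle_timeout
```

In this sketch the caller passes the clock in explicitly (for example from `time.monotonic()`), which keeps the logic easy to test and independent of any particular UI toolkit.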
Live subtitles
Meeting mode
Hotkey: ⌘⇧V
03
Export transcripts

Save what was said. As plain text, SRT or JSON.

Optional transcript saving. Plain text for notes, SRT for video subtitling, JSON for downstream pipelines. Speaker diarisation (separating distinct voices) on the Small model. Useful for compliance teams who need a record without paying a vendor per minute of audio.

  • Plain text, SRT, and JSON export
  • Speaker diarisation on the Small model
  • Per-segment timestamps preserved
  • Optional automatic save to chosen folder
  • Off by default — audio is processed in memory unless you opt in
$ voiceme export --format srt
▸ Detecting speakers...
▸ Diarising: 3 voices found.
▸ Writing meeting_2026-05-19.srt
▸ Done. 412 segments.
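
For anyone feeding exported transcripts into a subtitling pipeline, the SRT layout with per-segment timestamps can be sketched as below. This is a minimal illustration rather than VoiceMe's exporter: the `Segment` fields and the `to_srt` and `srt_timestamp` helpers are hypothetical names, and prefixing the text with a speaker label is only one common convention for diarised output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """One transcript segment (hypothetical shape, not VoiceMe's schema)."""

    start: float  # seconds from the start of the session
    end: float
    text: str
    speaker: Optional[str] = None  # set only when diarisation is available


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        label = f"{seg.speaker}: " if seg.speaker else ""
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{label}{seg.text}\n"
        )
    return "\n".join(cues)
```

Each cue is a sequence number, a `start --> end` line and the text, which is the structure video editors and players expect from an `.srt` file.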
More capabilities

Everything else it does.

Privacy by default

No audio is recorded unless you explicitly enable transcript saving. Even then, files stay on your machine.

Hot-key control

Bind start, pause, hide and clear to global hot-keys. Useful for impromptu transcription during calls.

Multi-language

Whisper's full language coverage — 99 languages with varying accuracy. Use automatic detection or pin a specific language.

GPU-aware

Detects an available GPU (Apple Silicon via Metal, NVIDIA CUDA, AMD ROCm). Falls back gracefully to CPU on machines without a compatible GPU.
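
The fallback can be thought of as a preference list with CPU as the guaranteed last resort. The sketch below is an assumption about how such selection might look, not VoiceMe's detection code: `pick_backend`, the backend names and the preference order are hypothetical, and real detection would query the GPU drivers rather than accept a set of strings.

```python
# Hypothetical preference order for GPU backends; CPU is the fallback.
GPU_PREFERENCE = ("metal", "cuda", "rocm")


def pick_backend(available: set[str]) -> str:
    """Return the first preferred GPU backend that is available, else "cpu".

    `available` stands in for whatever real hardware probing would report.
    """
    for backend in GPU_PREFERENCE:
        if backend in available:
            return backend
    return "cpu"
```

Because the function always returns a usable backend, a caller in this sketch never has to handle a "no device" error path.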

Frequently asked

Questions we hear often.

Does any audio leave my machine?

No. All speech recognition runs on your machine via local Whisper inference. Audio is processed in memory and discarded unless you explicitly enable transcript saving.

Talk to the team that actually builds the software.

Pilot deployments, volume licensing, product demos, security questionnaires — all handled by engineers and product leads, not a routing layer. We respond within one business day.

Schedule a discovery call
Half-hour walkthrough with someone who built the product — no sales script.
Run a pilot deployment
Full-feature evaluation with guided install, configured for your environment.
Email us directly
sales@royalsoftworks.com — we respond within one business day.
