Question 1

What is VoiceMe?

Accepted Answer

VoiceMe is a desktop application by Royal Softworks that runs OpenAI Whisper speech recognition locally and overlays real-time subtitles on any application. It supports three model sizes (Tiny, Base, Small), CPU and GPU inference, 99 languages with automatic detection, and transcript export as SRT, TXT, or JSON. No audio is ever sent to a server; all processing happens on your machine.
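To make the three export formats concrete, here is a minimal Python sketch of how timed transcript segments could be written as SRT, TXT, and JSON. It is an illustration only, not VoiceMe's actual exporter: the `Segment` structure and the function names are assumptions made for this example. Only the SRT timestamp layout (`HH:MM:SS,mmm`) follows the standard SubRip convention.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class Segment:
    """One recognised chunk of speech (hypothetical structure for this sketch)."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the HH:MM:SS,mmm timestamp used by SRT."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Segment]) -> str:
    """Numbered cues, each with a time range and text, separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}"
        blocks.append(f"{index}\n{time_range}\n{seg.text.strip()}\n")
    return "\n".join(blocks)


def to_txt(segments: List[Segment]) -> str:
    """Plain text: one line per segment, no timing information."""
    return "\n".join(seg.text.strip() for seg in segments)


def to_json(segments: List[Segment]) -> str:
    """JSON: a list of objects with start, end, and text fields."""
    return json.dumps([asdict(seg) for seg in segments], ensure_ascii=False, indent=2)
```

SRT keeps timing so the file can be loaded as subtitles, TXT keeps only the words, and JSON keeps both in a form other tools can parse.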

Question 2

Does VoiceMe send audio to a server?

Accepted Answer

No. All speech recognition runs locally on your machine via Whisper inference. Audio is processed in memory and discarded unless you explicitly enable transcript saving. No audio is transmitted anywhere.

Question 3

Which Whisper model should I use?

Accepted Answer

VoiceMe includes three Whisper models. Tiny is the fastest and is suited to ambient note-taking. Base balances speed and accuracy for most general use. Small is the most accurate of the three and is recommended for meeting transcription and any use case where words must be correct. Small also supports speaker diarisation.
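The guidance above can be summarised as a small lookup. The Python sketch below is only a way of restating the recommendation: the `UseCase` names and the `pick_model` helper are hypothetical and are not part of VoiceMe.

```python
from enum import Enum


class UseCase(Enum):
    """Hypothetical use-case labels, taken from the answer above."""
    AMBIENT_NOTES = "ambient_notes"
    GENERAL = "general"
    MEETING = "meeting"


# Recommended model per use case, as described in the answer.
MODEL_FOR_USE_CASE = {
    UseCase.AMBIENT_NOTES: "tiny",  # fastest
    UseCase.GENERAL: "base",        # balanced
    UseCase.MEETING: "small",       # most accurate of the three
}


def pick_model(use_case: UseCase, needs_diarisation: bool = False) -> str:
    """Return the recommended model name for a use case.

    Speaker diarisation is only described for Small, so it overrides
    the use-case default.
    """
    if needs_diarisation:
        return "small"
    return MODEL_FOR_USE_CASE[use_case]
```

For example, `pick_model(UseCase.GENERAL)` gives `"base"`, while asking for diarisation always gives `"small"`.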

Question 4

What languages does VoiceMe support?

Accepted Answer

VoiceMe supports all 99 languages that OpenAI Whisper supports, with varying accuracy. You can use automatic language detection or pin a specific language for more consistent results.
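The difference between automatic detection and a pinned language can be sketched in Python as follows. This assumes the convention used by the open-source `whisper` Python package, where `language=None` asks the model to detect the language and a code such as `"de"` pins it. How VoiceMe stores this setting internally is not documented here, and the `language_option` helper is hypothetical.

```python
from typing import Optional


def language_option(setting: Optional[str]) -> dict:
    """Turn a user-facing language setting into transcription options.

    None, an empty string, or "auto" means automatic detection, which is
    expressed as language=None. Anything else is treated as a language
    code and normalised to lower case.
    """
    if setting is None:
        return {"language": None}
    code = setting.strip().lower()
    if code in ("", "auto"):
        return {"language": None}
    return {"language": code}
```

Pinning a language skips the detection step, which is why it tends to give more consistent results on short or noisy audio where detection might guess wrong.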

VoiceMe

What it does.

Speech recognition that never touches the cloud.

Subtitles float over whatever you're doing.

Save what was said. As plain text, SRT, or JSON.

Everything else it does.

Privacy by default

Hot-key control

Multi-language

GPU-aware

Questions we hear often.
