Context-Aware Dictation: Why Your App Matters
Say “hey Sarah the migration PR is up can you take a look when you get a chance.” That sentence should produce different text depending on where you say it.
In Gmail, it should be a formatted email with a greeting and sign-off. In Slack, it should be a casual message with no formality. In VS Code, it should probably be a code comment. Most dictation tools ignore this entirely — they give you the same raw text everywhere.
Resonant captures where you are when you speak.
What Resonant sees
Every dictation captures context from the macOS Accessibility API:
- Active app — Slack, VS Code, Gmail, Notion, etc.
- Window title — the file, the chat recipient, the document name
- Browser URL — the page you're on in Chrome, Safari, or Arc
- Selected text — whatever you highlighted before dictating
- Visible text — snippets of on-screen content for additional context
This context is captured locally via the Accessibility API — no screen recording, no pixel analysis.
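To make the shape of that context concrete, here is a minimal sketch of what a captured snapshot might look like. The field names and values are illustrative assumptions, not Resonant's actual internal schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DictationContext:
    """Hypothetical snapshot of where the user is when dictation starts."""
    app_name: str                # e.g. "Slack", "Code", "Mail"
    window_title: Optional[str]  # file, chat recipient, or document name
    browser_url: Optional[str]   # set only when the frontmost app is a browser
    selected_text: Optional[str] # highlighted text, if any
    visible_text: Optional[str]  # on-screen snippets for extra context

# Example: dictating into a Slack DM
ctx = DictationContext(
    app_name="Slack",
    window_title="Sarah Chen (DM)",
    browser_url=None,
    selected_text=None,
    visible_text="earlier messages in the thread",
)
```

Everything here comes straight from the Accessibility API's attribute tree, which is why no screen recording is needed.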
Same words, three outputs
Here's what happens when you say the same thing in three different apps:
In Gmail:

Hey Sarah,

The migration PR is up — can you take a look when you get a chance? I added the fallback validator like we discussed.

Thanks!

In Slack:

hey Sarah the migration PR is up — can you take a look when you get a chance? added the fallback validator like we discussed

In VS Code:

// TODO: Sarah to review — migration PR adds fallback validator for legacy tokens
Same spoken input. Three different outputs. The difference is context.
How it works
When cloud cleanup is enabled, Resonant sends the raw transcript along with the captured context to the cleanup model. The context tells the model: this is an email reply, this is a Slack message, this is a code file. The model adapts its formatting accordingly.
There are dedicated processing modes for emails, messages, and general text — each with its own prompt that produces app-appropriate output. The mode is selected automatically based on the detected app.
Even without cloud cleanup, the local text pipeline adjusts behavior based on context. Bullet-point detection activates in note-taking apps. Code formatting applies in IDEs. Casual punctuation is preserved in messaging apps.
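The local adjustments could be simple rule-based transforms along these lines. The app lists and the "bullet" trigger word are hypothetical, meant only to show the shape of context-dependent rules, not Resonant's real pipeline:

```python
import re

# Illustrative app lists — not Resonant's actual detection tables.
NOTE_APPS = {"Notion", "Obsidian", "Notes"}
IDE_APPS = {"Code", "Xcode", "IntelliJ IDEA"}

def local_adjust(text: str, app_name: str) -> str:
    """Apply context-dependent tweaks without any cloud call (illustrative)."""
    if app_name in NOTE_APPS:
        # Turn spoken "bullet ..." items into markdown bullets.
        return re.sub(r"(?im)^bullet\s+", "- ", text)
    if app_name in IDE_APPS:
        # Prefix free-form dictation with a comment marker.
        return "// " + text
    # Messaging and other apps: leave the casual text untouched.
    return text

local_adjust("bullet buy milk", "Notion")  # → "- buy milk"
```

Because these are deterministic rules rather than model calls, they run instantly and entirely on-device.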
Why this matters
Dictation that ignores context creates work. You dictate, then you edit. You add formatting. You adjust tone. The dictation saved you typing time but created editing time.
Context-aware dictation produces text that's ready to send. The formatting is right. The tone is right. The structure is right. You dictate and you're done.
That's the difference between dictation as a typing replacement and dictation as a thinking tool.