feat: add text replacements feature #455

schmurtzm · 2025-12-14T22:37:18Z

✨ New Feature: Text Replacements

This new feature introduces a simple and intuitive interface for performing text replacements within Handy.

To help better understand this feature in practice, here is a tutorial video (with sound):

Handy.Replacements.mp4

🎯 Purpose and Capabilities

The interface is designed to be simple, offline, powerful, and intuitive.
It allows users to:

Replace one term with another
Perform text expansion
Manage punctuation cleanly
Insert the current date or time
Execute command lines from a spoken trigger

🛠️ Replacement Management

It supports both basic and advanced usage through:

Graphical options (trimming punctuation and/or whitespace before and after the matched text)
Regular expression support
Magic commands (currently a limited set is available, but they can be easily extended thanks to a simple and scalable code-side implementation)

The Replacements interface also provides tools to efficiently manage rules:

Filter existing replacements
Export and import replacement lists to enable easy sharing within the Handy community
For each replacement, users can:
- Enable or disable it
- Edit it
- Duplicate it
- Delete it

The UI visually highlights:

Regular expression enabled or not
Trim punctuation or spaces enabled or not
Magic command used or not
Disabled replacements
Duplicate rules
Newly imported replacements

🔮 About Magic Commands

Available Magic Commands

[lowercase]
Converts the entire current phrase to lowercase
- Replacement rule example:
  "transform in lowercase" → [lowercase]
- Result example (voice → text result):
  "Hello World transform in lowercase" → "hello world"
[uppercase]
Converts the entire phrase to UPPERCASE
Example:
"Hello" → "HELLO"
[capitalize]
Capitalizes the first letter of each word
Example:
"jean dupont" → "Jean Dupont"
[nospace]
Removes all spaces from the phrase
Example:
"a test" → "atest"
[date]
Inserts the current date YYYY-MM-DD (depending on your current region settings)
[time]
Inserts the current time (HH:MM)
[run]"command…"
Executes the specified command and prevents any transcription output for that trigger
The following placeholders can be used inside [run] commands:
{text} — full phrase
{text_nospace} — text without spaces
{text_nopunctuation} — text without punctuation
{text_nospace_nopunctuation} — text without spaces or punctuation

Example:
[run]"cmd /k echo {text_nospace_nopunctuation}"
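For illustration only (this is not the code from the PR), the placeholder expansion described above could be sketched in Rust like this. The function name `expand_placeholders` is hypothetical, and treating "punctuation" as ASCII punctuation is an assumption; the real implementation may define it differently.

```rust
/// Hypothetical helper: expands the `[run]` placeholders in a command template.
/// Assumption: "punctuation" means ASCII punctuation only.
fn expand_placeholders(template: &str, text: &str) -> String {
    let no_space: String = text.chars().filter(|c| !c.is_whitespace()).collect();
    let no_punct: String = text
        .chars()
        .filter(|c| !c.is_ascii_punctuation())
        .collect();
    let no_space_no_punct: String = text
        .chars()
        .filter(|c| !c.is_whitespace() && !c.is_ascii_punctuation())
        .collect();

    // Naive sequential substitution: the spoken text is inserted verbatim,
    // with no escaping or quoting of any kind.
    template
        .replace("{text_nospace_nopunctuation}", &no_space_no_punct)
        .replace("{text_nospace}", &no_space)
        .replace("{text_nopunctuation}", &no_punct)
        .replace("{text}", text)
}
```

With the template from the example, the spoken phrase "Hello, world!" would expand to `cmd /k echo Helloworld`.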

French punctuation sample: handy-french-punctuation-v1.0.json

Introduces a new 'replacements' setting for accent-insensitive, punctuation-aware, and capitalization-controlled text substitutions in transcriptions. Adds UI for managing replacements, import/export functionality, backend support, and updates settings/types to support the feature.

(?i)ouvr(ir|ez|e) la parenthèse

This will match:
"ouvrez la parenthèse"
"Ouvre la parenthèse"
"OUVRIR LA PARENTHÈSE"

Introduces magic tags (e.g., [lowercase], [date]) to the replacement system in both backend and frontend. Backend parses and applies transformations based on tags:
- '[lowercase]': 'Converts the entire text to lowercase'
- '[uppercase]': 'Converts the entire text to uppercase'
- '[capitalize]': 'Capitalizes the first letter of each word'
- '[nospace]': 'Removes all spaces from the text'
- '[date]': 'Inserts current date (YYYY-MM-DD)'
- '[time]': 'Inserts current time (HH:MM)'

Frontend provides tags autocomplete, tooltips, and visual indicators for magic tags in the replacements UI.
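A minimal sketch of how the backend side of these text-transforming tags might look, using only the Rust standard library. The enum `MagicTag` and the functions `parse_tag` and `apply_tag` are hypothetical names chosen for this example, not the PR's actual types. `[date]` and `[time]` are left out because they depend on the system clock and locale.

```rust
/// Hypothetical representation of the text-transforming magic tags.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MagicTag {
    Lowercase,
    Uppercase,
    Capitalize,
    NoSpace,
}

/// Maps a tag string such as "[lowercase]" to a variant, if it is known.
fn parse_tag(tag: &str) -> Option<MagicTag> {
    match tag {
        "[lowercase]" => Some(MagicTag::Lowercase),
        "[uppercase]" => Some(MagicTag::Uppercase),
        "[capitalize]" => Some(MagicTag::Capitalize),
        "[nospace]" => Some(MagicTag::NoSpace),
        _ => None,
    }
}

/// Uppercases the first character of a word and keeps the rest unchanged.
fn capitalize_word(word: &str) -> String {
    let mut chars = word.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
        None => String::new(),
    }
}

/// Applies one transformation to the whole phrase.
fn apply_tag(tag: MagicTag, text: &str) -> String {
    match tag {
        MagicTag::Lowercase => text.to_lowercase(),
        MagicTag::Uppercase => text.to_uppercase(),
        MagicTag::Capitalize => text
            .split(' ')
            .map(capitalize_word)
            .collect::<Vec<_>>()
            .join(" "),
        MagicTag::NoSpace => text.chars().filter(|c| !c.is_whitespace()).collect(),
    }
}
```

For example, `apply_tag(MagicTag::Capitalize, "jean dupont")` yields "Jean Dupont", which matches the example given in the PR description.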

- You can now trim spaces and punctuation separately, before and/or after the match.
- Editing a replacement no longer requires scrolling.
- Possibility to clone a Replacement item
- Better sizing for long texts / large fields

Introduces a global 'replacements_enabled' setting and per-replacement 'enabled' flags to control text replacements. Updates backend, bindings, and UI to support toggling replacements globally and individually, improving flexibility.

New [run] magic tag to allow running shell commands with the text as a parameter.
Example:
[run]"C:\Windows\notepad.exe"
[run]"start cmd /k echo {text_nospace_nopunctuation}"
{text}: The original text.
{text_nospace}: The text without spaces.
{text_nopunctuation}: The text without punctuation.
{text_nospace_nopunctuation}: The text without spaces or punctuation.

Replaced by power on / off icon instead of an eye

English and French added

Add Replacements keys to the German, Spanish, Japanese, Vietnamese and Chinese locale files. Add tab naming support.

cjpais · 2025-12-15T02:47:34Z

I'm going to be honest this adds a lot of UI for a marginal feature in my opinion. There is already a feature which does this, in addition to post-processing. I'm not sure we need a 3rd option as well. The magic commands are quite interesting. I will have to give a deeper look soon. I'm not quite decided on this overall and I will come back with some more in depth feedback

The UI is generally well laid out but seems to add a lot of complexity which I'm not sure is a good tradeoff

Again paging @dannysmith for an opinion.

jamaggs · 2025-12-15T06:27:00Z

Hello, just to say this feature is exactly what I need! It would be really great even just to be able to add new lines and other punctuation. Post processing with LLM doesn't really work for me. Thanks for creating it and I really hope that it makes it into Handy.

cjpais · 2025-12-15T06:37:11Z

@dannysmith maybe this is something like "developer mode" or something like I suggested in #454? im just very skeptical of something this complex making it into the main ui for mainstream users. ive also not had a chance to review the code yet, so not sure what the implementation looks like. it's possible developer/power mode could open extra keyboard shortcuts up too..

paging @VirenMohindra for feedback as well

jamaggs · 2025-12-15T07:14:41Z

I wonder if some standard users could benefit from a "standard" set of pre-set (punctuation) replacements, with more advanced users being able to change the configuration. For certain use cases Handy needs a lot of manual editing afterwards for punctuation - for instance writing sections of prose with speech - because there is no way to insert punctuation.

cjpais · 2025-12-15T07:22:18Z

perhaps @jamaggs, I generally favor having good defaults that work for 90% of people out of the box. I haven't considered the prose use-case a ton but makes a lot of sense. If you could provide which ones would help you the most that would be great

fwiw post-processing largely solves this for me I believe, using qwen3-a30b has generally been a good model from my experience. but this is a much heavier handed way of doing things.

jamaggs · 2025-12-15T07:41:43Z

Hello

I think that for out of the box options, mimicking the replacements that the Microsoft dictation has under the punctuation heading would be good. It probably also eases usability (I'm forced to use MS at work and others may be too, making the consistency good).

https://support.microsoft.com/en-gb/office/dictate-your-documents-in-word-3876e05f-3fcc-418f-b8ab-db7ce0d11d3c

Name	Symbol
period, full stop	.
comma	,
question mark	?
exclamation mark/point	!
new line	new line
apostrophe-s	's
colon	:
semicolon	;
open quotes / close quotes	" "
hyphen	-
ellipsis, dot dot dot	...
begin/open single quote / end/close single quote	' '
left/open parentheses / right/close parentheses	( )
left/open bracket / right/close bracket	[ ]
left/open brace / right/close brace	{ }

If there is the option for power users to refine then great. For instance, a notable omission from the Microsoft list is the em dash which is used quite a bit in prose.

I am coming at this from the perspective of an English speaker but it would probably be a different list in different languages.

schmurtzm · 2025-12-15T11:11:02Z

Thanks for your first feedback :)

A lot of UI

Do you mean on the user-facing side, or on the code/implementation side?

From the user’s perspective, if it doesn't deserve a dedicated tab in the main settings menu, this UI could easily live behind a “Replacements” button inside the Post‑Processing tab. I deliberately tried to keep the default experience lightweight: when no rules exist, the form is essentially empty.

The intent was to make something approachable. A user can start with just two fields: “word to replace” and “replacement” and ignore everything else.

marginal feature

To be honest I don't get this point. I’m not sure how Handy is typically used on your side, but in my own usage it’s hard to imagine using it without some form of replacement layer. Otherwise, users must manually edit nearly every sentence to add punctuation, line breaks, parentheses, etc.

Yes, some LLMs can manage punctuation reasonably well, but:

they require CPU, disk space, processing time and model management
results are not always deterministic
they do not address very simple needs like consistently replacing one word or symbol with another

There is already a feature which does this, in addition to post-processing. I'm not sure we need a 3rd option as well.

When you say this, do you mean there is already a way to define explicit replacements today? If so, could you point me to where this is currently possible?

I'm not quite decided on this overall and I will come back with some more in depth feedback

Alright 🙂. My feeling is that rule-based replacements are easier to grasp than LLM-based post‑processing, especially for non‑technical users.
They are also instant, with low resources requirement and fully deterministic (same input → same output). This makes them complementary rather than competing with LLM post‑processing.

The magic commands are quite interesting

Yes and this is where I think a lot of the long‑term value lies.

Magic commands open the door to many workflows. They already offer a few possibilities, for example:

saying “hello please capitalize” → HELLO
saying “open notepad” → launches Notepad

More importantly, they allow users to pipe their text into any external binary for custom processing. Offering this level of openness will almost certainly lead to unexpected and creative use‑cases from advanced users.

something this complex

Do you really feel it is complex from the user’s point of view?

I understand that the PR itself may look intimidating, but in practice the UI can be used at a very basic level with just two fields. Everything else is optional and intended for advanced usage.

If the feature is primarily aimed at advanced users, alternatives could be:

A dedicated settings tab (like currently) hidden behind an Advanced / Developer toggle
Or behind a "Replacements" button in the Post‑Processing tab

Conceptually, this is simply post‑processing without an LLM, which is why I created a dedicated tab.

I wonder if some standard users could benefit from a "standard" set of pre-set (punctuation) replacements

I’m confident that this is already true. I will provide an English punctuation preset soon. The current UI already makes it quite easy to create such rules (as shown in the video), and we could initially base them on Microsoft’s punctuation guidelines (even if they’re not always perfect imo).

This would also allow you to concretely test the import/export feature 😁.

About current post processing, I may be underestimating the current post‑processing feature, but for users who are not already familiar with LLMs, Handy does not yet really hold their hand:

discovering which model to use (e.g. qwen3‑a30b)
installing it
configuring it correctly for using it with Handy

In that context, replacements provide immediate value with almost zero friction, while also enabling things that LLMs simply do not offer.

By the way, there is one additional idea I deliberately did not include yet:
Parakeet V3 currently spells out numbers in words. I almost added a magic command to convert written numbers into digits using the Rust library text2num-rs. However, since many PRs tend to be rejected for being out of scope, I chose not to propose it yet.

That said, this illustrates the broader idea: Magic Commands are meant to let users send their text through any kind of processing pipeline, very easily without hardcoding every possible use‑case into Handy itself.

cjpais · 2025-12-15T18:12:05Z

Okay thank you for a detailed response and also a genuine and interesting position. I think there are a lot of PR's that honestly are not always as well thought out. A lot are entirely AI slop, with a glimmer of an idea behind them, but not necessarily full rationale. So I often am coming at things from a defensive side as the app's initial purpose is quite slim, and there are a lot of outstanding issues that I typically put as priority in front of features.

Do you mean on the user-facing side, or on the code/implementation side?

mostly on the user facing side. to be precise it's around:
'trim punctuation', 'trim spaces', 'regex', and 'next word'. when I look at it, it's just visually overwhelming for me. it's functional, but it's a lot of buttons to press. maybe keeping it to just the replacement itself makes sense, with a dropdown for advanced or something. again open to feedback, but generally I prefer less UI and better defaults where possible.

To be honest I don't get this point. I’m not sure how Handy is typically used on your side

For me Handy is used primarily for programming with LLM's or building up very specific context windows where spelling, punctuation, and precise language is not necessary because a human is not reading it. I do occasionally use it for other things, like typing parts of this message, but my speaking voice is also quite different than my written voice, so there's times at which I choose to use one over the other intentionally. The times that I'm writing with my voice I often have to do a lot of editing because my spoken word is a very conversational style that doesn't always lend itself well to reading. So the minor errors for me are even more minor in comparison to the actual syntactical structure of what I'm saying.

When you say this, do you mean there is already a way to define explicit replacements today? If so, could you point me to where this is currently possible?

I guess this is more the 'dictionary' rather than replacements itself. And upon reading your full message I'm honestly fine to drop/move that feature away from the primary modality potentially.

My feeling is that rule-based replacements are easier to grasp than LLM-based post‑processing, especially for non‑technical users.

you are correct.

More importantly, they allow users to pipe their text into any external binary for custom processing. Offering this level of openness will almost certainly lead to unexpected and creative use‑cases from advanced users.

this is the main reason I support this feature and you overall. you are genuinely thinking, and getting me excited to read the code.

Do you really feel it is complex from the user’s point of view?

its the ui point from above. "in practice the UI can be used at a very basic level". this is true, but again it's making the "basic level" the primary use case. Not adding anything more than is explicitly necessary, and hiding advanced things in creative ways so power users can still do the things they want. I think the app still has much to grow in this way as I'm just figuring everything out. I'm trying to find the best balance for both people using the app without knowing a thing about computers, and also giving power users the tools they want. (theres a reason, and more than one, that A LOT of stuff is in the debug menu). And to be honest with you, I'm the maintainer of a much bigger repo than I ever had imagined in much less time than I expected. Every PR, GitHub Issue, Discussion, is a learning moment for me still, and this one certainly is in that category.

I could address even more, but I just want you to know I support you and I appreciate the contributions you've made. I think there's some small things we can change and I'll sleep on it with more concrete feedback for the PR. I also need to review the code myself still, and really see how things are working. I love the extensibility you're talking about and intrigued.

The standout feedback piece is just minimizing the amount of UI immediately seen. I don't 100% know if this is a sidebar feature or not, and generally it might be another thing that needs to be tackled in discussion #449. I think a UI overhaul is long overdue, and really would like to get it into a place where we genuinely feel there is a 1.0.0 release at some point. We are not there yet, but I think as things get more stable, we will start to finalize on all the things that will go into that release.

schmurtzm · 2025-12-15T21:36:12Z

UI complexity

That’s true, there are quite a few buttons 😄.

Initially, the interface had only the two text fields. Very quickly, though, when I first used it for punctuation, recurring needs emerged:

removing punctuation incorrectly injected by the model
trimming surrounding spaces, which vary by language (for example, in French there is a space before :, unlike in English)

It could probably be handled with regular expressions. However, in practice, editing regex patterns is more complex than toggling a few simple switches. The typical workflow becomes:

define a replacement
test it by speaking
make a small adjustment (e.g. enable trim punctuation or trim spaces)
Doing this via simple toggles is fast, intuitive, and pleasant to use, compared to constantly rewriting regexes.
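To make the effect of these toggles concrete, here is a rough Rust sketch of a literal replacement with separate "trim punctuation" and "trim spaces" options before and after the match. It is only an illustration under stated assumptions: `TrimOptions` and `replace_with_trim` are hypothetical names, matching is a plain case-sensitive substring search, and "punctuation" means ASCII punctuation.

```rust
/// Hypothetical per-rule options mirroring the four UI toggles.
#[derive(Clone, Copy, Default)]
struct TrimOptions {
    punctuation_before: bool,
    spaces_before: bool,
    punctuation_after: bool,
    spaces_after: bool,
}

/// Returns true if `c` should be removed according to the given toggles.
fn should_trim(c: char, punctuation: bool, spaces: bool) -> bool {
    (punctuation && c.is_ascii_punctuation()) || (spaces && c.is_whitespace())
}

/// Replaces every occurrence of `search` and trims the neighbouring
/// punctuation and/or whitespace as requested by `opts`.
fn replace_with_trim(text: &str, search: &str, replacement: &str, opts: TrimOptions) -> String {
    if search.is_empty() {
        return text.to_string();
    }
    let mut out = String::new();
    let mut rest = text;
    while let Some(pos) = rest.find(search) {
        // Text before the match, with trailing characters trimmed if enabled.
        let before = rest[..pos].trim_end_matches(|c: char| {
            should_trim(c, opts.punctuation_before, opts.spaces_before)
        });
        out.push_str(before);
        out.push_str(replacement);
        // Text after the match, with leading characters trimmed if enabled.
        rest = rest[pos + search.len()..].trim_start_matches(|c: char| {
            should_trim(c, opts.punctuation_after, opts.spaces_after)
        });
    }
    out.push_str(rest);
    out
}
```

With punctuation and spaces trimmed before the match and punctuation trimmed after it, the rule "comma" → "," turns "In any case, comma. It" into "In any case, It". With every toggle off, "a comma b" simply becomes "a , b", which is the kind of result that motivated the toggles.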

A good compromise for the UI could be:

only showing the two main fields by default
revealing advanced options behind a small dropdown or arrow

(maybe the arrow should be above the button)

How Handy is used

Understood and it’s interesting because my own usage differs somewhat.
I’ve developed the habit of using Handy primarily to respond to instant messaging. As a result, my spoken diction gradually adapts to resemble my written style. Well, I can assure you that it doesn't affect the times when I'm not in front of my computer... At least, I hope not... 😄
That’s probably a form of laziness when it comes to talking to other humans 😅, but it also means I strongly need my text to be formatted and much more so than inside a prompt.

Dictionary usage

Ah, OK I hadn’t fully understood how this part of Handy worked.
Back in the days of Windows speech recognition (when it was fairly poor), weighted word matching was used: you would provide alternative spellings, each with a weight, to bias the recognizer toward a preferred result.
I assumed something similar was happening here: providing a word so that, when something close is spoken, it would be selected over another.

This is probably complementary. For example, to manage French punctuation, I’ve sometimes added deliberately odd spellings because the model tends to randomly choose between multiple valid forms (for instance: tiret bas | tiré-bas | tiré bas | tiré-ba).

Magic Commands

Don’t expect too much at this stage 😅 for now, it’s more of a proof of concept.

You can already transform text using advanced functions or send the text to an external application but not much more yet.

The underlying idea is to give the community the ability to extend Handy’s functionality without having to modify the core codebase. The current [run] mechanism (executing an external command with the text as a parameter) may not be sufficient on its own to fully achieve this goal. It’s a first step, but probably too limited if we want something truly flexible and future-proof? The real value is that this mechanism is designed to be extended by the community, and in this context, perhaps the first functions made available (uppercase, lowercase, insert date, etc.) should already be outsourced as well...

Speaking about the code itself, in full transparency: I would not have attempted this PR without the help of AI.
That said, I tried to respect the existing structure as much as possible. I won’t take it personally if the structure turns out not to be optimal, or if it doesn’t fully align with the original architecture and needs to be rewritten in some parts.

This PR is not the result of weeks of rigorous, manual coding. So it’s an ambivalent feeling:

pride in being able to bring ideas to life more easily
and a bit of impostor syndrome (especially when touching other projects that have existed for 10 years 😅)

At the very least, I will be happy if it helped seed an idea and produce inspiration for future features / UI.

In any case, it’s genuinely a pleasure to read your thoughtful and reasoned feedback, and I fully respect the defensive stance you take toward the project’s original goals when evaluating new features. What’s certain is that Handy doesn’t leave people indifferent, and it’s great to see contributors genuinely engaged.
I’ve also participated in a fairly large GitHub project myself, so I’m very aware of the pressure that comes with community expectations. Keeping some distance, preserving the fun, keeping free time and enjoying the process is essential to avoid burning out 😉

schmurtzm · 2025-12-15T22:47:29Z

@jamaggs, this is a first version of English punctuation rules: handy-replacements-english punctuation-v1.0.json

You just have to click the "import" button and select the file to start using it.

To illustrate the action of these replacement rules:

I made this English voice sound file (it makes no sense and deliberately overuses punctuation).

Then I fed this sound file into Handy (tested here with Parakeet v3):

Without rules:

He was there, ellipsis, line break. It with him, exclamation mark, or him, question mark, line break. In any case, comma. It with colon. Me, underscore comma. I say open quotation marks. Toad. Close quotation marks. Line break. Add the date and time. Test. Add the date. Test. Add the time. Line break. A slash. And at sign. And underscore an eighteen euro sign. Line break. Open parenthesis. It's over. Close parenthesis. Line break.

With rules imported:

He was there...
It with him! Or him?
In any case, it with: me_i say "toad"
2025-12-15 23:43 test, 2025-12-15 test, 23:43
A/and@and_an eighteen€
(it's over)

jamaggs · 2025-12-16T06:38:23Z

@schmurtzm looks amazing! I really hope this feature makes it in so that I can make use of this.

kbingoel · 2025-12-16T08:13:37Z

just expressing that I would also see big usability improvements in Handy with this feature. After that, I really don't see anything else missing. What this pull request adds would replace lots of tedious manual corrections I had to make in the past.
The only question is how this can be done with other languages as well?

cjpais · 2025-12-16T09:41:50Z

@schmurtzm

A good compromise for the UI could be:

only showing the two main fields by default
revealing advanced options behind a small dropdown or arrow

yeah I think let's try to do this, that's roughly what I was imagining as well

Speaking about the code itself, in full transparency: I would not have attempted this PR without the help of AI.
That said, I tried to respect the existing structure as much as possible. I won’t take it personally if the structure turns out not to be optimal, or if it doesn’t fully align with the original architecture and needs to be rewritten in some parts.

Totally understandable and it's still appreciated and welcome! Hell most of the code in the codebase is written by AI! But I think there is definitely some uses of it which are more tasteful than others. I actually welcome AI generated code especially if a human has already reviewed it.

and a bit of impostor syndrome

totally understand, and just know we are both in the same boat!

Keeping some distance, preserving the fun, keeping free time and enjoying the process is essential to avoid burning out 😉

thats the goal :)

It will probably take me a few days before I can give this a proper review as a heads up

schmurtzm · 2025-12-16T09:52:53Z

@cjpais I tried to make the interface less overwhelming: by default I hide the options in an "Advanced Options" section and I revised the layout of these options to make it more conventional.

Put replacement options in an "Advanced Options" section for cleaner UI

yongkangc · 2025-12-17T11:48:20Z

that actually looks pretty neat

VirenMohindra · 2025-12-18T11:29:56Z

src-tauri/src/managers/transcription.rs

+                #[cfg(not(target_os = "windows"))]
+                {
+                    std::process::Command::new("sh")
+                        .arg("-c")
+                        .arg(&cmd_str)
+                        .spawn()
+                        .ok();
+                }


i believe the [run] command magic tag allows arbitrary shell command execution. this is a pretty significant security risk. some possible issues~

voice input could trigger unintended commands (ie "run delete everything")

no sanitization of {text} placeholders - command injection possible

no user confirmation before execution

.ok() silently swallows errors

if we want to add this we should look into a confirmation dialog before running commands (which seems to add a lot of friction and is the opposite of what this PR intends to do) or a separate setting to enable / disable [run] functionality

regardless, we should sanitize the input for placeholder values

oooh yeah we might want to put a pin in this for now, I think we can possibly pick this up in another PR. Largely this is something im curious about, but I would love to be executing in a sandbox where it makes sense.

Obvious point: allowing this even with sanitisation etc introduces a future maintenance burden re ensuring new features don't accidentally introduce code which can potentially exploit it.
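As a rough illustration of the "sanitize the input for placeholder values" point, one well-known technique for `sh -c` is to wrap the substituted text in single quotes and escape any embedded single quote. The sketch below assumes a POSIX shell only (it does not apply to `cmd.exe`), assumes the template does not put its own quotes around `{text}`, and uses hypothetical function names. It reduces injection risk but does not address the other concerns listed above, such as unintended triggers or missing confirmation.

```rust
/// Quotes `value` so that a POSIX shell treats it as one literal word.
/// A single quote inside the value is emitted as: close quote, escaped
/// quote, reopen quote ('\'').
fn posix_shell_quote(value: &str) -> String {
    let mut quoted = String::with_capacity(value.len() + 2);
    quoted.push('\'');
    for c in value.chars() {
        if c == '\'' {
            quoted.push_str("'\\''");
        } else {
            quoted.push(c);
        }
    }
    quoted.push('\'');
    quoted
}

/// Hypothetical helper: substitutes the quoted text into a command template.
fn build_command(template: &str, text: &str) -> String {
    template.replace("{text}", &posix_shell_quote(text))
}
```

With this, a dictated phrase such as `hello; rm -rf ~` is passed to the shell as the single literal word `'hello; rm -rf ~'` instead of two commands. Passing the text as a separate argument to `std::process::Command` rather than through a shell string would avoid the quoting problem altogether, at the cost of changing how templates are written.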

VirenMohindra

think unit tests are needed before we can land something of this magnitude, some good examples are

for accent-insensitive pattern building
magic tag parsing
edge cases (empty text, overlapping matches)

we definitely need to add documentation AND a note in the UI warning users about the [run] command's security implications

VirenMohindra · 2025-12-18T11:31:48Z

src-tauri/src/managers/transcription.rs

+            let re = match regex::Regex::new(&search_pattern) {
+                Ok(re) => re,
+                Err(_) => continue, // Skip invalid regex
+            };


this would compile on every transcription, we should probably cache it in the Replacement struct right
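A dependency-free sketch of the caching idea, for illustration: patterns are compiled once and reused on later transcriptions. `Compiled` is a stand-in for `regex::Regex` so the example needs no external crate, and `PatternCache` is a hypothetical name; the suggestion above is to store the compiled value on the Replacement struct itself, which would work along the same lines.

```rust
use std::collections::HashMap;

/// Stand-in for a compiled regex (the PR uses `regex::Regex`).
#[derive(Debug, Clone, PartialEq)]
struct Compiled(String);

/// Hypothetical cache keyed by the pattern source string.
struct PatternCache {
    compiled: HashMap<String, Compiled>,
    /// Counts how often the (expensive) compile step actually ran.
    compile_calls: usize,
}

impl PatternCache {
    fn new() -> Self {
        PatternCache {
            compiled: HashMap::new(),
            compile_calls: 0,
        }
    }

    /// Returns the compiled pattern, compiling it only on first use.
    fn get(&mut self, pattern: &str) -> &Compiled {
        if !self.compiled.contains_key(pattern) {
            self.compile_calls += 1;
            self.compiled
                .insert(pattern.to_string(), Compiled(pattern.to_string()));
        }
        &self.compiled[pattern]
    }
}
```

Requesting the same pattern twice increments `compile_calls` only once. A real version would also need to invalidate an entry when the user edits the rule.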

VirenMohindra · 2025-12-18T11:32:32Z

src-tauri/src/managers/transcription.rs

+        let mut replaced_result = corrected_result.trim().to_string();
+        let mut global_transformations = Vec::new();
+
+        if settings.replacements_enabled {


we should extract this into an appropriately named function apply_replacements()

the contract could look something like

fn apply_replacements(text: &str, settings: &AppSettings) -> String
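A minimal sketch of what such an extracted function could look like, assuming a heavily simplified `AppSettings`. Only `replacements_enabled` and the per-rule `enabled` flag come from the PR description; the `search` and `replace` field names are hypothetical, and regex, trimming, and magic tags are left out. Being a pure function from text and settings to a string, it can be unit tested without the transcription pipeline.

```rust
/// Simplified rule; `search` and `replace` are assumed field names.
struct Replacement {
    search: String,
    replace: String,
    enabled: bool,
}

/// Simplified stand-in for the application's settings struct.
struct AppSettings {
    replacements_enabled: bool,
    replacements: Vec<Replacement>,
}

/// Applies every enabled literal replacement, in order, to the trimmed text.
fn apply_replacements(text: &str, settings: &AppSettings) -> String {
    let mut result = text.trim().to_string();
    if !settings.replacements_enabled {
        return result;
    }
    for rule in settings
        .replacements
        .iter()
        .filter(|r| r.enabled && !r.search.is_empty())
    {
        result = result.replace(&rule.search, &rule.replace);
    }
    result
}
```

Rules are applied sequentially, so a later rule sees the output of an earlier one; whether that is the desired behaviour for overlapping matches is a design decision the real implementation would need to make explicit.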

VirenMohindra · 2025-12-18T11:33:12Z

src-tauri/src/managers/transcription.rs

+                    const CREATE_NO_WINDOW: u32 = 0x08000000;
+                    std::process::Command::new("cmd")
+                        .args(["/C", &cmd_str])
+                       // .creation_flags(CREATE_NO_WINDOW)


can we ✂️ ?

Suggested change

// .creation_flags(CREATE_NO_WINDOW)

VirenMohindra · 2025-12-18T11:33:50Z

src-tauri/src/managers/transcription.rs

+                }
+                *text = result;
+            }
+            MagicTransformation::Run(cmd_template) => {


might make sense to log invalid regex and surface this to users. generally regex seems scary and not UI-friendly in my opinion

VirenMohindra · 2025-12-18T11:35:14Z