ken-morel/respell

respell

Respell is a small C program that does word-level spelling correction. I made it to use from Helix, but it is perfectly usable anywhere else.

How to use it?

You call respell with the path to the word list as its argument and pipe in the text to correct.

echo "Helo wold" | respell wordlists/english.txt

On the first run, respell creates a /path/to/wordlist.bin file if it does not already exist. Its size depends on the size of the word list and the maximum number of deletes.

How does it work?

Using deletes! The word list contains about 1 million words, of which respell actually uses 200k (configurable at compile time). Instead of comparing the input against each of those hundreds of thousands of words:

  • Respell takes each individual word in the word list and computes the hash of every string that can be formed by removing 0 to max_deletes letters from that word.
  • It stores the word list, the hashes of the deletes, and a mapping from those hashes to the actual words (like a hashmap), so lookups are fast instead of comparing line by line.
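The delete generation described above can be sketched as follows. This is only an illustration, not respell's actual code: the FNV-1a hash, the MAX_WORD bound, the callback type, and every function name here are assumptions made for the example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD 64 /* assumed bound on word length, including the NUL */

/* FNV-1a, 64-bit. An assumed stand-in: respell's real hash may differ. */
uint64_t fnv1a(const char *s, size_t len) {
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)s[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical callback: receives one variant and how many letters it lost. */
typedef void (*delete_fn)(const char *variant, size_t len, int removed, void *ctx);

/* Visit `s`, then every string made by removing more letters at positions
 * >= start, so each set of removed positions is generated exactly once.
 * Repeated letters can still yield the same string twice ("hello" -> "helo"). */
static void emit_deletes(const char *s, size_t len, size_t start, int removed,
                         int max_deletes, delete_fn fn, void *ctx) {
    fn(s, len, removed, ctx);
    if (removed == max_deletes) return;
    for (size_t i = start; i < len; i++) {
        char shorter[MAX_WORD];
        memcpy(shorter, s, i);                       /* letters before i */
        memcpy(shorter + i, s + i + 1, len - i - 1); /* letters after i  */
        shorter[len - 1] = '\0';
        emit_deletes(shorter, len - 1, i, removed + 1, max_deletes, fn, ctx);
    }
}

/* Call `fn` on the word itself and on all of its deletes. */
void for_each_delete(const char *word, int max_deletes, delete_fn fn, void *ctx) {
    size_t len = strlen(word);
    assert(len < MAX_WORD);
    emit_deletes(word, len, 0, 0, max_deletes, fn, ctx);
}

/* Example callback: print each variant with its hash. */
void print_delete(const char *v, size_t len, int removed, void *ctx) {
    (void)ctx;
    printf("%-8s removed=%d hash=%016" PRIx64 "\n", v, removed, fnv1a(v, len));
}

/* Example callback plus helper: count how many variants are visited. */
static void count_one(const char *v, size_t len, int removed, void *ctx) {
    (void)v; (void)len; (void)removed;
    ++*(int *)ctx;
}

int count_deletes(const char *word, int max_deletes) {
    int n = 0;
    for_each_delete(word, max_deletes, count_one, &n);
    return n;
}
```

Calling for_each_delete("helo", 1, print_delete, NULL) visits helo, elo, hlo, heo and hel, in that order. A real index would store each hash next to the word it came from.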

And that's it. When you correct a word, respell also computes the deletes of that word and looks them up in the dictionary. Among the hashes that match exactly, the one that needed the fewest removals is the winner.
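Here is a minimal sketch of that lookup, under several assumptions: a toy in-memory index scanned linearly (the real program uses a hashmap loaded from the .bin file), FNV-1a as the hash, no handling of hash collisions, and hypothetical names throughout. The cost of a match is taken to be the removals on the query side plus the removals on the dictionary side.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_WORD    32   /* assumed bound on word length, including the NUL */
#define MAX_ENTRIES 1024 /* assumed capacity of this toy index */
#define MAX_DELETES 1    /* matches the documented default of 1 delete */

typedef struct { uint64_t hash; int word; int removed; } Entry;
typedef struct { const char **words; Entry entries[MAX_ENTRIES]; int n; } Index;
typedef void (*visit_fn)(const char *s, size_t len, int removed, void *ctx);

/* FNV-1a, 64-bit: an assumed hash, not necessarily the one respell uses. */
static uint64_t fnv1a(const char *s, size_t len) {
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) { h ^= (unsigned char)s[i]; h *= 1099511628211ULL; }
    return h;
}

/* Visit `s` and every string made by removing up to MAX_DELETES letters. */
static void deletes(const char *s, size_t len, size_t start, int removed,
                    visit_fn fn, void *ctx) {
    fn(s, len, removed, ctx);
    if (removed == MAX_DELETES) return;
    for (size_t i = start; i < len; i++) {
        char tmp[MAX_WORD];
        memcpy(tmp, s, i);
        memcpy(tmp + i, s + i + 1, len - i - 1);
        tmp[len - 1] = '\0';
        deletes(tmp, len - 1, i, removed + 1, fn, ctx);
    }
}

typedef struct { Index *ix; int word; } BuildCtx;

/* Record one (hash, word, removals) entry for a dictionary word. */
static void add_entry(const char *s, size_t len, int removed, void *p) {
    BuildCtx *b = p;
    assert(b->ix->n < MAX_ENTRIES);
    b->ix->entries[b->ix->n++] = (Entry){ fnv1a(s, len), b->word, removed };
}

/* Precompute the index: every word plus the hashes of all its deletes. */
void index_build(Index *ix, const char **words, int n_words) {
    ix->words = words;
    ix->n = 0;
    for (int w = 0; w < n_words; w++) {
        size_t len = strlen(words[w]);
        assert(len < MAX_WORD);
        BuildCtx b = { ix, w };
        deletes(words[w], len, 0, 0, add_entry, &b);
    }
}

typedef struct { const Index *ix; int best_word; int best_cost; } QueryCtx;

/* Compare one query variant against the index, keeping the cheapest match.
 * On a tie, the match found first wins. */
static void probe(const char *s, size_t len, int removed, void *p) {
    QueryCtx *q = p;
    uint64_t h = fnv1a(s, len);
    for (int i = 0; i < q->ix->n; i++) {
        const Entry *e = &q->ix->entries[i];
        int cost = removed + e->removed;
        if (e->hash == h && (q->best_word < 0 || cost < q->best_cost)) {
            q->best_word = e->word;
            q->best_cost = cost;
        }
    }
}

/* Return the best dictionary word, or the input itself if nothing matches. */
const char *correct(const Index *ix, const char *word) {
    size_t len = strlen(word);
    if (len >= MAX_WORD) return word;
    QueryCtx q = { ix, -1, 0 };
    deletes(word, len, 0, 0, probe, &q);
    return q.best_word >= 0 ? ix->words[q.best_word] : word;
}
```

With a word list of hello, world and help, the query helo matches hello at a cost of 1 (one removal on the dictionary side) and help at a cost of 2 (both reduce to hel), so hello wins.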

This approach makes respell fast and gives pleasant results, though it corrects word by word, without context.

Startup time

Since respell precomputes the hashes of the word list and the hashmap itself, startup only involves loading that binary into RAM. The bigger the word list and the higher the number of deletes, the larger the binary file. For example, 200k words with 2 deletes produce a file of more than 100 MB.
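To get a rough feel for how the file grows, here is a back-of-the-envelope estimate. A word of n letters has C(n, k) ways to lose k letters, so it contributes the sum of C(n, k) for k from 0 to the maximum number of deletes. The average word length and the bytes per entry used below are pure assumptions for illustration; the real file layout is not described here and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Binomial coefficient C(n, k); every intermediate division is exact. */
uint64_t binomial(unsigned n, unsigned k) {
    if (k > n) return 0;
    uint64_t r = 1;
    for (unsigned i = 1; i <= k; i++)
        r = r * (n - k + i) / i;
    return r;
}

/* Variants generated for one word of `len` letters (duplicates included):
 * C(len, 0) + C(len, 1) + ... + C(len, max_deletes). */
uint64_t deletes_per_word(unsigned len, unsigned max_deletes) {
    uint64_t total = 0;
    for (unsigned k = 0; k <= max_deletes; k++)
        total += binomial(len, k);
    return total;
}

/* Rough index size. `avg_len` and `bytes_per_entry` are assumptions chosen
 * by the caller, not values taken from respell. */
uint64_t estimate_index_bytes(uint64_t n_words, unsigned avg_len,
                              unsigned max_deletes, unsigned bytes_per_entry) {
    assert(bytes_per_entry > 0);
    return n_words * deletes_per_word(avg_len, max_deletes) * bytes_per_entry;
}
```

Assuming 8-letter words and 16 bytes per entry, estimate_index_bytes(200000, 8, 2, 16) gives 118,400,000 bytes, the same order of magnitude as the figure above. With 1 delete the same assumptions give 28,800,000 bytes.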

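I believe 2 deletes is plenty, and arguably too much: up to 2 letters can be removed from the typed word and up to 2 from the dictionary word, so it will match strings that differ by as many as 2+2=4 characters. For example, "abcd" and "cdef" both reduce to "cd".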

Installation

By default, respell uses 1 delete and 150k words. To install, you can tweak the install.fish script at the repository root. It compiles respell, runs it over the word lists in ./lists, and for each list generates a respell-{lang} alias in /usr/bin that calls the respell executable.

You can add options when compiling, like -DN_WORDS=50000 and -DN_DELETES=2 to override the defaults.
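A common way to make such -D options work is to guard the defaults with #ifndef. The sketch below assumes respell does something similar. Only the macro names N_WORDS and N_DELETES and their default values come from this README; the rest, including the file name in the comment, is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Defaults apply only when the compiler command line did not set a value,
 * e.g.: cc -DN_WORDS=50000 -DN_DELETES=2 config_sketch.c   (file name assumed) */
#ifndef N_WORDS
#define N_WORDS 150000
#endif

#ifndef N_DELETES
#define N_DELETES 1
#endif

/* Reject nonsensical overrides at compile time (C11 static_assert). */
static_assert(N_WORDS > 0, "N_WORDS must be positive");
static_assert(N_DELETES >= 0, "N_DELETES must not be negative");

/* Print the configuration that was baked into this build. */
void print_config(FILE *out) {
    fprintf(out, "words=%d deletes=%d\n", N_WORDS, N_DELETES);
}
```

Because the values are fixed at compile time, changing them means recompiling.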

By default, if you run fish install from the repository root, the install script installs respell at /usr/bin, installs the word lists and binaries at /usr/share/respell, and sets up an alias at /usr/bin/respell-en.

Keybindings

You will probably want to create keybindings to use it from Helix. I usually just create shell aliases, but I recently had to use it more often, so I added a keybinding:

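[keys.select."A-space"]
s = "@|respell-en" # Prefills the pipe prompt; you can edit the command to change the language

# OR use one of these instead (a key can only be bound once, so keep a single `s` line active)
# s = "@|respell-en<ret>"
# s = ":pipe respell-en"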

But if you prefer using shell aliases, you may like:

# nu shell
alias ren = respell-en

And just type |ren
