Skip to content

Spellchecking#186

Open
tirix wants to merge 3 commits intometamath:mainfrom
tirix:spellcheck
Open

Spellchecking#186
tirix wants to merge 3 commits intometamath:mainfrom
tirix:spellcheck

Conversation

@tirix
Copy link
Copy Markdown
Collaborator

@tirix tirix commented Jun 1, 2025

This adds spell checking for theorem comments and section headers.

The new function is added in metamath-knife only, there are a few supporting changes to metamath-rs.
The function is conditional to the feature spell_check (but maybe this is not necessary?)

Spell checking goes through each statement and parses the comments, ignoring HTML and italics, and stopping at the parentheticals.
The zspell rust library is used for spellchecking.
The add_word $j command allows to append custom words to the dictionary.
The dictionary included has been taken from https://github.com/wooorm/dictionaries/tree/main/dictionaries/en.
The dictionary format is described here: https://manpages.ubuntu.com/manpages/bionic/man5/hunspell.5.html.

@tirix tirix mentioned this pull request Jun 1, 2025
@@ -0,0 +1,110 @@
//! Spell checking in theorem comments and section headers.
//!
//! This goes through each statement and parses the comments, ignoring HTML and italics, and stopping at the parentheticals.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible for words to appear between parentheticals, for example with revision reasons. Would it make more sense to just ignore parentheticals?

(Note to self: parentheticals refers to standard Contributed by Revised by etc statements)

Copy link
Copy Markdown
Collaborator Author

@tirix tirix Jun 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this way I miss those revision reasons, which appear between parentheticals. I went for the easy solution here.
Ideally parenthetical would be left off or treated as a special CommentItem in the CommentParser iterator, this might be added at a later time, and would simplify this treatment.

Similarly, the lines of =-=-=-= in section headers are currently treated the same way and parsed for spellchecking, but it's not a big issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants