-
Notifications
You must be signed in to change notification settings - Fork 41
Spellchecker CI action #98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
preview available: https://docs.tds.cscs.ch/98 |
| concretise | ||
| concretizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that I haven't made any judgment about e.g. z vs s in words like these. Feel free to bikeshed as much as you want about what goes into this file, but let's keep it practical.
|
preview available: https://docs.tds.cscs.ch/98 |
| ### FirecREST | ||
|
|
||
| Bristen can also be accessed using [FircREST][ref-firecrest] at the `https://api.cscs.ch/ml/firecrest/v1` API endpoint. | ||
| Bristen can also be accessed using [FirecREST][ref-firecrest] at the `https://api.cscs.ch/ml/firecrest/v1` API endpoint. |
Check failure
Code scanning / check-spelling
Unrecognized Spelling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FirecREST is a word that apparently it will not accept when it's in allow.txt, because it treats it as two words, Firec and REST. The first is not a recognized word.
patterns.txt allows regexes as a whitelist, and I've added FirecREST into that file instead.
| ### FirecREST | ||
|
|
||
| Clariden can also be accessed using [FircREST][ref-firecrest] at the `https://api.cscs.ch/ml/firecrest/v1` API endpoint. | ||
| Clariden can also be accessed using [FirecREST][ref-firecrest] at the `https://api.cscs.ch/ml/firecrest/v1` API endpoint. |
Check failure
Code scanning / check-spelling
Unrecognized Spelling
docs/software/uenv.md
Outdated
|
|
||
| Uenv are user environments that provide scientific applications, libraries and tools. | ||
| This page will explain how to find, dowload and use uenv on the command line, and how to enable them in SLURM jobs. | ||
| Uenv are user enviroments that provide scientific applications, libraries and tools. |
Check failure
Code scanning / check-spelling
Unrecognized Spelling
|
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation. |
|
preview available: https://docs.tds.cscs.ch/98 |
1 similar comment
|
preview available: https://docs.tds.cscs.ch/98 |
|
preview available: https://docs.tds.cscs.ch/98 |
|
More details about available configuration options: |
|
I am a little bit confused about the output. And does it show only spelling mistakes in the modified/added text, or for the whole document? |
Sort of both... let me expand: The action currently checks all files, and that's why the "Check Spelling" action fails. There are more words that I haven't whitelisted, that it doesn't like. This is a bit ugly I admit. On the other hand, the SARIF output only shows the inline "review" comments on files that have changed, like All the typos that the action found are visible on https://github.com/eth-cscs/cscs-docs/security/code-scanning?query=pr%3A98+is%3Aopen, but those include files that haven't been changed. One peculiarity is that one can enable SARIF output, or check only changed files, but they can't unfortunately be enabled at the same time. If I enable the only-check-changed-files option I'll need to look at the other options for reporting errors. TL;DR: it detects typos in all files and outside changed areas within a changed file, but the inline comments only show up on changed parts. It's definitely not ideal, but decent? 🤷 Edit: with the current setup, I could also set the action to always succeed, despite typos in other files, since they're not really relevant for the changes in the PR and the typos are anyway just suggestions. |
|
One problem is that it will generate a lot of false-positive comments about typos for larger changes, for example it doesn't like the second ... it really shouldn't be flagging filenames, link names, and other markdown/html artifacts. We can turn it on, to see how it works. |
I do worry about noise as well. My suggestion would be to mostly ignore the full "security scanning" output, and only care about what's reported for the changed lines. The tool does have a feature to ignore words with regexes as well ( |
|
Also, there are a bunch of additional lists/dictionaries that one can add to ignore things like URLs etc. that I haven't added right now. |
|
preview available: https://docs.tds.cscs.ch/98 |
|
preview available: https://docs.tds.cscs.ch/98 |
2 similar comments
|
preview available: https://docs.tds.cscs.ch/98 |
|
preview available: https://docs.tds.cscs.ch/98 |
|
preview available: https://docs.tds.cscs.ch/98 |
|
preview available: https://docs.tds.cscs.ch/98 |
|
I've added a few more patterns just for testing, and it did get rid of more false positives. If you're happy with the approach I'd stick to this and see how much noise it generates. I'm very happy to be pinged whenever there's too much noise and I can try to add more exclusions as we find the worst offenders, and as said, if it's too much noise we disable it eventually. |
bcumming
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try this out for a while.
We can continue to improve the filters to reduce noise, and if it causes too much noise of confusion we can remove it.
|
preview available: https://docs.tds.cscs.ch/98 |
This fixes more typos (following #57), and adds a spellchecking action (https://github.com/marketplace/actions/check-spelling) that warns about suspicious spelling.
I've been testing this on another PR, and it's not perfect, but I think the typos it catches are still better than the false positives it reports.
Some notes:
.github/actions/spelling/allow.txt.github/actions/spelling/only.txt(regex, only matching.mdfiles right now)github-advanced-security. Sounds a bit scary, but I find that was the nicest reporting. There are also alternatives to make the action add regular comments but this requires setting up additional workflows. There are examples of doing this, so if it's something someone wants I can look into it, but I was too lazy to set it up right now.Finally: if this is too noisy, we disable it. Remember that the typos it reports are never blocking, they can always be ignored.