Commit b6484a7

feat: add built-in help system with fuzzy search for INPUT parameters (#6935)
* feat: add built-in help system with fuzzy search for INPUT parameters

- Add -h/--help flag for parameter documentation
- Add -s/--search flag for keyword search
- Implement fuzzy matching for typo suggestions (Levenshtein distance)
- Case-insensitive parameter lookup
- Auto-generate help data from markdown at build time
- 458 parameters documented with comprehensive metadata

Usage: abacus -h <param>, abacus -s <keyword>

refactor(help): improve help system with critical fixes and performance optimizations

docs(help): integrate help system documentation into quick start guide

Add concise help system section to docs/quick_start/input.md covering:
- Basic commands (-h, -s flags)
- Example output for parameter lookup
- Fuzzy matching for typo suggestions
- Case-insensitive lookup behavior

Remove standalone documentation files that are now redundant:
- docs/HELP_SYSTEM.md
- docs/FUZZY_SEARCH.md

* fix(test): fix help system unit tests

- Add input_help.cpp to MODULE_IO_parse_args test sources to fix linker error
- Fix input_help_test.cpp include path and use value syntax instead of pointer
- Fix parse_args_test.cpp death tests to use EXPECT_EXIT regex matching instead of CaptureStderr which doesn't work properly with fork()

* refactor(help): address PR review feedback and improve fuzzy matching

This commit addresses all code review comments and fixes a critical UX issue in the help system's fuzzy matching algorithm.

## Documentation Fixes
- Fix incorrect "case-sensitive" documentation (should be "case-insensitive")
- Update API documentation for find_similar_parameters() with new algorithm details
- Add inline comments explaining 3-tier matching strategy

## Architecture Improvements
- Add std::ostream& parameter to show_general_help() and show_parameter_help()
- Enable flexible stream redirection (stdout for success, stderr for errors)
- Maintain backward compatibility with default parameters
- Eliminate code duplication by reusing show_general_help() in error handler

## UX Fix: Fuzzy Matching Algorithm

### Problem
Pure Levenshtein distance suggested semantically irrelevant parameters:
- "abacus -h relax" incorrectly suggested "dmax" and "nelec"

### Solution
Implement 3-tier semantic matching strategy:
1. Prefix matches (e.g., "relax" → "relax_new") - Priority 0
2. Substring matches (e.g., "cut" → "ecutwfc") - Priority 1
3. Levenshtein distance for typos - Priority 10+

* feat(docs): auto-generate INPUT docs from C++ source with built-in help system

Migrate INPUT parameter documentation from a manually-maintained markdown file to auto-generation from C++ source. Documentation fields (category, type, description, default_value, unit, availability) are now embedded directly in Input_Item registrations in read_input_item_*.cpp files, serving as the single source of truth for both the built-in --help system and the Sphinx documentation.

Key changes:
- Add docs/generate_docs_from_source.py to parse C++ source and generate input-main.md (492 parameters across 34 categories)
- Hook generator into Sphinx build via builder-inited event in conf.py
- Add [NOTE] documentation to 36 parameters preserving notes from the previous manually-maintained docs (version compat, usage caveats)
- Fix generator regex to handle comments before Input_Item declarations
- Rename tmp_item to item in smearing_sigma_temp block for generator compatibility
- Remove old tools/generate_help_data.py and input_help_data.h pipeline

* feat(docs): replace regex-based doc generation with YAML pipeline

The old generate_docs_from_source.py parsed C++ source files with regex to extract Input_Item metadata, which was fragile and broke on code refactoring. Replace it with a pipeline that uses the binary's own parameter registry:

    abacus --generate-parameters-yaml > docs/parameters.yaml
    python docs/generate_input_main.py docs/parameters.yaml

- Add ParameterHelp::generate_yaml() with YAML serialization helpers
- Add --generate-parameters-yaml CLI flag to parse_args.cpp
- Add POST_BUILD command in CMakeLists.txt to auto-regenerate docs/parameters.yaml on every build
- Create docs/generate_input_main.py (YAML-to-markdown converter)
- Update docs/conf.py Sphinx hook to use YAML pipeline
- Track docs/parameters.yaml in the repository
- Delete docs/generate_docs_from_source.py
- Add 5 unit tests for YAML generation

* fix(docs): quote numeric YAML values, remove auto-generation, update contributing guide

- Quote numeric-looking default values in yaml_quote_if_needed() so PyYAML parses them as strings (fixes 92 parameters losing defaults)
- Also quote .inf, -.inf, .nan YAML special values
- Fix Python generate_input_main.py to use != '' instead of truthiness checks, preventing falsy-but-valid values (e.g. 0) from being dropped
- Replace sys.exit(1) with FileNotFoundError in generate() for safe use as a library from conf.py
- Remove POST_BUILD command from CMakeLists.txt (not portable across CMake generators, breaks cross-compilation)
- Update CONTRIBUTING.md "Documenting INPUT Parameters" section to describe the YAML-based workflow and remind developers to regenerate docs/parameters.yaml when adding or modifying parameters
- Regenerate docs/parameters.yaml with all values properly quoted

* fix(docs): reorder add_item() calls in read_input_item_*.cpp to match legacy doc order

The built-in help system generates input-main.md from C++ source, with parameter order determined by add_item() call order. After introducing the YAML-based doc pipeline, 19 of 32 documentation sections had different parameter ordering compared to the old hand-maintained input-main.md.

Reorder item blocks within each of the 13 read_input_item_*.cpp files to restore the original documentation order. This is a pure mechanical reordering with no logic changes. 28 of 32 sections now match; the 4 remaining differences are due to cross-file constraints (constructor call order) that cannot be resolved without moving items between files.

Add a comment to each item_*() function noting that add_item() call order determines generated documentation order.

* Add pyyaml as requirements for documentation generation

* Commit updated input-main.md and parameters.yaml

* Commit the changes to the parameters.yaml

* fix(input): restore develop-parity token count handling for read_input output items
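
The 3-tier matching strategy described in the commit message (prefix, then substring, then Levenshtein distance) can be illustrated with a small Python sketch. This is not the C++ code from the commit: the function names, the `max_distance` cutoff, the `limit`, and the choice to score typo matches as `10 + distance` are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def find_similar(query, names, max_distance=2, limit=5):
    """Rank candidate names: prefix (0), substring (1), typo (10 + distance).

    max_distance and limit are hypothetical tuning knobs, not values
    taken from the commit.
    """
    q = query.lower()  # lookup is case-insensitive
    scored = []
    for name in names:
        n = name.lower()
        if n.startswith(q):
            priority = 0
        elif q in n:
            priority = 1
        else:
            distance = levenshtein(q, n)
            if distance > max_distance:
                continue  # too far away to be a plausible typo
            priority = 10 + distance
        scored.append((priority, name))
    scored.sort()  # lower priority first, ties broken alphabetically
    return [name for _, name in scored[:limit]]


names = ["relax_new", "dmax", "nelec", "ecutwfc"]
print(find_similar("relax", names))
print(find_similar("cut", names))
print(find_similar("ecutwfx", names))
```

Because prefix and substring hits are scored below any edit-distance hit, a query such as "relax" surfaces "relax_new" and, under this sketch's cutoff of 2, drops "dmax" and "nelec" entirely.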
1 parent 64d7e62 commit b6484a7

30 files changed, +12738 -2795 lines changed

docs/CONTRIBUTING.md

Lines changed: 67 additions & 0 deletions
@@ -10,6 +10,7 @@ For more non-technical aspects, please refer to the [ABACUS Contribution Guide](
 - [Structure of the package](#structure-of-the-package)
 - [Submitting an Issue](#submitting-an-issue)
 - [Comment style for documentation](#comment-style-for-documentation)
+- [Documenting INPUT parameters](#documenting-input-parameters)
 - [Code formatting style](#code-formatting-style)
 - [Generating code coverage report](#generating-code-coverage-report)
 - [Adding a unit test](#adding-a-unit-test)
@@ -156,6 +157,72 @@ An practical example is class [LCAO_Deepks](https://github.com/deepmodeling/abac
 \f}
 ```
 
+## Documenting INPUT Parameters
+
+ABACUS includes a built-in help system that allows users to query INPUT parameters directly from the command line (e.g., `abacus -h ecutwfc`). Parameter metadata is defined inline in the C++ source files (`source/source_io/module_parameter/read_input_item_*.cpp`) using `Input_Item` registrations.
+
+A checked-in file `docs/parameters.yaml` contains a YAML dump of all parameter metadata, generated from the binary itself. This file is used by Sphinx to produce the online documentation page `input-main.md`.
+
+### When to Update `docs/parameters.yaml`
+
+You **must** regenerate `docs/parameters.yaml` whenever you:
+
+- Add a new INPUT parameter
+- Remove an existing INPUT parameter
+- Change a parameter's description, type, default value, unit, category, or availability
+
+### How to Regenerate
+
+After building ABACUS, run:
+
+```bash
+./build/abacus --generate-parameters-yaml > docs/parameters.yaml
+```
+
+Then verify the YAML is valid:
+
+```bash
+python3 -c "import yaml; d=yaml.safe_load(open('docs/parameters.yaml')); print(len(d['parameters']), 'parameters')"
+```
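
The one-liner above only confirms that the file parses and reports how many entries it holds. Since the commit message notes that numeric-looking defaults are quoted so that PyYAML reads them as strings, a follow-up check could flag any default that came back as a non-string. The sketch below works on already-loaded data and assumes that `parameters` is a list of mappings with `name` and `default_value` keys; that schema is not shown in this diff, so treat both key names as hypothetical.

```python
def non_string_defaults(parameters):
    """Return (name, value) pairs whose default_value is not a str.

    An unquoted YAML scalar such as 0, 1e-6, true or .inf is turned into
    an int, float or bool by the loader, which is the symptom looked for.
    """
    offenders = []
    for entry in parameters:
        value = entry.get("default_value", "")
        if not isinstance(value, str):
            offenders.append((entry.get("name", "<unnamed>"), value))
    return offenders


# Hand-written stand-in for yaml.safe_load(...)["parameters"].
sample = [
    {"name": "my_parameter", "default_value": "0"},      # quoted in YAML: stays a string
    {"name": "other_parameter", "default_value": 1e-6},  # unquoted in YAML: became a float
]
print(non_string_defaults(sample))
```

An empty result would suggest that every default survived the round trip as text; any listed entry points at a value that may need quoting in the generator.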
+
+You can also regenerate the markdown documentation locally:
+
+```bash
+python3 docs/generate_input_main.py docs/parameters.yaml --output docs/advanced/input_files/input-main.md
+```
+
+**Important:** Include the updated `docs/parameters.yaml` in your commit when submitting a PR that modifies INPUT parameters. Reviewers should verify the YAML changes match the C++ source changes.
+
+### Parameter Documentation Format
+
+When adding or modifying INPUT parameters in C++ source, set the following fields on the `Input_Item`:
+
+```cpp
+{
+    Input_Item item("my_parameter");
+    item.category = "System variables";
+    item.type = "Integer";
+    item.description = "Description of what this parameter does.";
+    item.default_value = "0";
+    item.unit = "Ry"; // Optional, empty string if no unit
+    item.availability = ""; // Optional, empty string if always available
+    // ... read_value, reset_value, check_value functions ...
+    this->add_item(item);
+}
+```
+
+Supported types: `Integer`, `Real`, `String`, `Boolean`
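
The field list and supported types above lend themselves to a mechanical check on each parsed entry. The following is a minimal sketch, not part of the commit: it assumes the YAML keys mirror the `Input_Item` field names, and it treats `category`, `type` and `description` as required while allowing `unit` and `availability` to be empty, which is a guess based on the "Optional" comments in the C++ example. It compares against the empty string explicitly, in the spirit of the `!= ''` fix mentioned in the commit message.

```python
SUPPORTED_TYPES = {"Integer", "Real", "String", "Boolean"}
REQUIRED_FIELDS = ("category", "type", "description")  # assumption, see lead-in


def check_entry(entry):
    """Return a list of human-readable problems for one parameter mapping."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Compare with "" explicitly; a truthiness test would also reject 0 or False.
        if entry.get(field, "") == "":
            problems.append("missing " + field)
    declared = entry.get("type", "")
    if declared != "" and declared not in SUPPORTED_TYPES:
        problems.append("unsupported type " + repr(declared))
    return problems


good = {"category": "System variables", "type": "Integer",
        "description": "Description of what this parameter does."}
bad = {"category": "System variables", "type": "Float", "description": ""}
print(check_entry(good))
print(check_entry(bad))
```

If the real registry accepts more type names than the four listed here, `SUPPORTED_TYPES` would need to be extended accordingly.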
+
+### Format Validation
+
+After regenerating the YAML, you can spot-check a specific parameter:
+
+```bash
+./build/abacus -h my_parameter
+```
+
+This uses the same runtime registry that generates the YAML, so if the help output looks correct, the YAML will be correct too.
+
 ## Code formatting style
 
 We use `clang-format` as our code formatter. The `.clang-format` file in root directory describes the rules to conform with.
