Fix output path argument bug in build_sequences_per_dataset script #4143
Summary
The script build_sequences_per_dataset.py defines the CLI argument --per-dataset-sequences-path, but later reads the output path from a different attribute name that argparse never defines. This causes an AttributeError at runtime when saving output.
Problem
In build_sequences_per_dataset.py, the script uses args.path_to_sequences_per_dataset_json, but that field is never defined by argparse.
The defined argument is:
--per-dataset-sequences-path
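The mismatch can be reproduced in isolation. The sketch below is not the script itself: it builds a stand-in parser with only the flag named in this issue (the real script's other arguments and this flag's type, default, and help text are not reproduced and are assumptions). It shows that argparse derives the attribute name from the long flag by dropping the leading dashes and converting the remaining dashes to underscores, so only per_dataset_sequences_path exists on the parsed namespace.

```python
import argparse

# Stand-in parser (assumption): only the flag named in this issue is defined.
parser = argparse.ArgumentParser()
parser.add_argument("--per-dataset-sequences-path")

args = parser.parse_args(["--per-dataset-sequences-path", "out.json"])

# argparse derives the attribute name from the long flag:
# "--per-dataset-sequences-path" -> "per_dataset_sequences_path".
defined_path = args.per_dataset_sequences_path

# The name the script currently uses was never registered, so the
# lookup on the namespace raises AttributeError.
try:
    args.path_to_sequences_per_dataset_json
    missing_attr_error = None
except AttributeError as exc:
    missing_attr_error = exc

print(defined_path)
print(type(missing_attr_error).__name__)
```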
Steps to Reproduce
- Run the script with valid data arguments and:
--per-dataset-sequences-path out.json
- Let the script reach the output-writing section.
- Observe an AttributeError for the missing attribute path_to_sequences_per_dataset_json.
Expected Behavior
The script should write JSON output to the path provided by --per-dataset-sequences-path and finish successfully.
Proposed Fix
- Replace:
  args.path_to_sequences_per_dataset_json
  with:
  args.per_dataset_sequences_path
  in:
  build_sequences_per_dataset.py
- Optional cleanup:
  - Fix malformed help text so the train, valid, and test key names are formatted correctly.
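A minimal sketch of the output-writing step after the rename, for illustration only. The helper name save_sequences_per_dataset, the success message wording, and the placeholder mapping are hypothetical and not taken from the script; only the flag name and the attribute name come from this issue. The example writes to a temporary directory so it can be run without touching real data.

```python
import argparse
import json
import os
import tempfile


def save_sequences_per_dataset(args, sequences_per_dataset):
    """Hypothetical helper: write the mapping as JSON to the CLI-provided path."""
    # Use the attribute argparse actually defines for --per-dataset-sequences-path
    # (previously: args.path_to_sequences_per_dataset_json, which does not exist).
    out_path = args.per_dataset_sequences_path
    with open(out_path, "w") as f:
        json.dump(sequences_per_dataset, f, indent=2)
    # The success message reports the same path that was written.
    print(f"Saved sequences per dataset to {out_path}")
    return out_path


# Stand-in parser (assumption): only the flag named in this issue is defined.
parser = argparse.ArgumentParser()
parser.add_argument("--per-dataset-sequences-path", required=True)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_file = os.path.join(tmp_dir, "out.json")
    args = parser.parse_args(["--per-dataset-sequences-path", out_file])

    # Placeholder payload (assumption): the real structure is not shown in the issue.
    placeholder = {"train": {}, "valid": {}, "test": {}}
    written_path = save_sequences_per_dataset(args, placeholder)

    # Read the file back to confirm it was created at the requested path.
    with open(written_path) as f:
        round_trip = json.load(f)
```

Reading the path from a single local variable and reusing it in the success message keeps the written file and the printed path from drifting apart again.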
Acceptance Criteria
- Script runs without AttributeError.
- Output JSON file is created at the path passed to --per-dataset-sequences-path.
- Console success message prints the correct output path.