Description
TL;DR
Pre-commit hooks are failing with ~24 pydocstyle errors and ~600+ COCO/BBOB compliance violations across the codebase. Additionally, pyproject.toml contains an excessive global Ruff ignore list (23+ codes) that masks code quality issues and must be reduced by 50% within 2 sprints.
CRITICAL: This issue blocks clean CI and requires complete remediation across 60-80 optimizer files, not just a subset.
Problem
Current state prevents clean CI runs and masks widespread documentation quality issues:
- ~24 pydocstyle violations (D107, D103, D417) across swarm_intelligence, constrained, multi_objective, probabilistic, and metaheuristic categories
- ~600+ COCO/BBOB compliance violations across 60-80 files in all 10 optimizer categories
- 23+ global Ruff ignore codes in pyproject.toml with no per-file justifications
- D107 and D103 are globally ignored in Ruff yet still fail the pre-commit pydocstyle checks (configuration conflict: the standalone pydocstyle hook does not read Ruff's ignore list)
We need to validate that the batch script and template work correctly before processing the remaining 107+ files. This issue creates 10 exemplar files spanning all categories, then scales remediation to complete coverage.
Solution
Phase 1: Validation (10 Exemplar Files)
Manually complete 10 high-priority optimizers with full COCO/BBOB compliance to serve as validation checkpoints and reference implementations.
Phase 2: Complete Remediation (60-80 Files)
Apply validated approach to ALL affected files for comprehensive CI/documentation quality restoration.
Files to Update
Phase 1: Validation Checkpoints (10 files)
- `opt/swarm_intelligence/particle_swarm.py` (validates swarm intelligence category)
- `opt/gradient_based/adamw.py` (validates gradient-based category)
- `opt/classical/simulated_annealing.py` (validates classical category)
- `opt/evolutionary/genetic_algorithm.py` (validates evolutionary category)
- `opt/metaheuristic/harmony_search.py` (validates metaheuristic category)
- `opt/swarm_intelligence/ant_colony.py` (validates swarm diversity)
- `opt/gradient_based/sgd_momentum.py` (validates gradient diversity)
- `opt/classical/nelder_mead.py` (validates classical diversity)
- `opt/evolutionary/differential_evolution.py` (validates evolutionary diversity)
- `opt/physics_inspired/gravitational_search.py` (validates physics-inspired category)
Phase 2: Complete Coverage (50-70 additional files)
Organized by failure type for systematic remediation:
Tier 1: pydocstyle + COCO/BBOB Failures (~24 files - HIGHEST PRIORITY)
Files with both D107/D103/D417 violations AND missing COCO/BBOB sections:
swarm_intelligence/ (15 files)
- glowworm_swarm_optimization, wild_horse, orca_predator, sand_cat, african_buffalo_optimization, brown_bear, barnacles_mating, coati_optimizer, mountain_gazelle, artificial_hummingbird, mayfly_optimizer, honey_badger, black_widow, moth_search, dingo_optimizer
constrained/ (3 files)
- barrier_method, penalty_method, sequential_quadratic_programming
multi_objective/ (1 file)
- spea2
probabilistic/ (1 file)
- parzen_tree_stimator
metaheuristic/ (1 file)
- forensic_based
Tier 2: COCO/BBOB Only Failures (~40-50 files)
Files missing Algorithm Metadata, COCO/BBOB Benchmark Settings, seed=42 examples, or seed attribute documentation:
swarm_intelligence/ (~30 files)
- salp_swarm_algorithm, emperor_penguin, slime_mould, pelican_optimizer, marine_predators_algorithm, osprey_optimizer, firefly_algorithm, zebra_optimizer, ant_lion_optimizer, giant_trevally, chimp_optimization, artificial_fish_swarm_algorithm, grey_wolf_optimizer, golden_eagle, tunicate_swarm, cat_swarm_optimization, african_vultures_optimizer, dragonfly_algorithm, grasshopper_optimization, fennec_fox, harris_hawks_optimization, spotted_hyena, aquila_optimizer, starling_murmuration, seagull_optimization, moth_flame_optimization, bat_algorithm, dandelion_optimizer, manta_ray, cuckoo_search, flower_pollination, snow_geese, pathfinder, whale_optimization_algorithm, artificial_rabbits, reptile_search, bee_algorithm, artificial_gorilla_troops, squirrel_search
gradient_based/ (9 files)
- adadelta, amsgrad, adagrad, stochastic_gradient_descent, rmsprop, adaptive_moment_estimation, nesterov_accelerated_gradient, nadam, adamax
classical/ (7 files)
- bfgs, lbfgs, hill_climbing, trust_region, conjugate_gradient, powell, tabu_search
metaheuristic/ (9 files)
- cross_entropy_method, sine_cosine_algorithm, arithmetic_optimization, variable_neighbourhood_search, eagle_strategy, colliding_bodies_optimization, stochastic_fractal_search, very_large_scale_neighborhood_search, stochastic_diffusion_search, variable_depth_search, shuffled_frog_leaping_algorithm, particle_filter
evolutionary/ (5 files)
- cultural_algorithm, estimation_of_distribution_algorithm, imperialist_competitive_algorithm, cma_es
physics_inspired/ (4 files)
- rime_optimizer, atom_search, equilibrium_optimizer
social_inspired/ (3 files)
- teaching_learning, political_optimizer, soccer_league_optimizer
probabilistic/ (4 files)
- adaptive_metropolis, linear_discriminant_analysis, sequential_monte_carlo, bayesian_optimizer
constrained/ (2 files)
- augmented_lagrangian_method, successive_linear_programming
multi_objective/ (2 files)
- nsga_ii, moead
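The tier rule above (Tier 1 has both failure types, Tier 2 has COCO/BBOB failures only) can be sketched as a small triage helper. This is a minimal illustration, not part of the repository: the function name `assign_tier` and its inputs are hypothetical, and it assumes the violations for a file have already been collected from the two hooks.

```python
# Hypothetical triage helper; names and inputs are assumptions for illustration.
PYDOCSTYLE_CODES = {"D107", "D103", "D417"}


def assign_tier(pydocstyle_codes: set, missing_sections: set) -> str:
    """Map one file's violations to the remediation tier used in this issue."""
    has_pydocstyle = bool(pydocstyle_codes & PYDOCSTYLE_CODES)
    has_coco = bool(missing_sections)
    if has_pydocstyle and has_coco:
        return "tier1"  # pydocstyle + COCO/BBOB failures
    if has_coco:
        return "tier2"  # COCO/BBOB-only failures
    if has_pydocstyle:
        return "pydocstyle-only"  # not a tier named in this issue
    return "clean"
```

For example, a file reporting D107 plus a missing `Algorithm Metadata:` section lands in Tier 1, while a file missing only `seed=42` in its example lands in Tier 2.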
Detailed Pre-commit Failure Analysis
1. pydocstyle Failures (~24 violations)
Common patterns:
- D107: Missing docstring in `__init__` methods (14 files)
- D103: Missing docstring in public functions (6 files)
- D417: Missing argument descriptions in docstrings (2 files)
Example violations:
opt/swarm_intelligence/glowworm_swarm_optimization.py:144 in private method `_compute_fitness`:
D417: Missing argument descriptions in the docstring (argument(s) population are missing descriptions)
opt/swarm_intelligence/wild_horse.py:68 in public method `__init__`:
D107: Missing docstring in __init__
opt/constrained/barrier_method.py:227 in public function `constraint`:
D103: Missing docstring in public function
opt/multi_objective/spea2.py:430 in public function `f1`:
D103: Missing docstring in public function
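For reference, the three patterns above are typically resolved as in the sketch below. `ToyOptimizer` and its bodies are hypothetical and are not taken from the repository; only the docstring layout is the point, and Google-style `Args:` sections are an assumption about the project's convention.

```python
"""Minimal docstring patterns for D107, D103, and D417 (illustrative only)."""


class ToyOptimizer:
    """Hypothetical optimizer used only to illustrate docstring layout."""

    def __init__(self, dim: int, seed: int = 42) -> None:
        """Initialize the optimizer (addresses D107).

        Args:
            dim: Number of decision variables.
            seed: Random seed used for reproducible runs.
        """
        self.dim = dim
        self.seed = seed

    def _compute_fitness(self, population: list) -> list:
        """Evaluate the sphere function for each candidate (addresses D417).

        Args:
            population: Candidate solutions, one list of coordinates per row.

        Returns:
            One fitness value per candidate.
        """
        return [sum(x * x for x in row) for row in population]


def constraint(x: list) -> float:
    """Return the constraint value g(x) = sum(x) - 1 (addresses D103)."""
    return sum(x) - 1.0
```

D417 is cleared by describing every argument in the signature, so the `population` entry under `Args:` is the part that matters for the `_compute_fitness` case.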
2. COCO/BBOB Compliance Failures (~600+ violations)
The custom pre-commit hook validate-optimizer-docs found widespread missing sections across 60-80 optimizer files. Each file typically has 5-7 violations.
Required sections missing:
- ❌ Algorithm Metadata: (author, year, DOI)
- ❌ COCO/BBOB Benchmark Settings: (recommended hyperparameters)
- ❌ Args/Attributes: (missing or incomplete)
- ❌ Notes: (complexity analysis)
- ❌ References: (original paper citations)
- ❌ seed=42 in examples (BBOB reproducibility requirement)
- ❌ seed attribute documentation (BBOB compliance)
Example violations:
opt/swarm_intelligence/salp_swarm_algorithm.py: Missing required section 'Algorithm Metadata:' in class docstring
opt/swarm_intelligence/emperor_penguin.py: Example section should include 'seed=42' for reproducibility
opt/evolutionary/cultural_algorithm.py: Args section should document 'seed' parameter for BBOB compliance
opt/gradient_based/adadelta.py: Missing required section 'COCO/BBOB Benchmark Settings:' in class docstring
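To reproduce these checks locally without running the full hook, something like the following could be used. It is a rough sketch and not the actual `validate-optimizer-docs` implementation, whose internals are not shown in this issue: the function name `find_violations`, the exact section strings, and the regular expression for the `seed` entry are all assumptions.

```python
import ast
import re

# Section names follow the list above; the real hook may match them differently.
REQUIRED_SECTIONS = (
    "Algorithm Metadata:",
    "COCO/BBOB Benchmark Settings:",
    "Args:",
    "Notes:",
    "References:",
)
# Matches an argument entry such as "seed: ..." or "seed (int): ...".
SEED_ENTRY = re.compile(r"^\s*seed\s*(\([^)]*\))?\s*:", re.MULTILINE)


def find_violations(source: str) -> list:
    """Return one message per missing item for every class docstring in source."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        doc = ast.get_docstring(node) or ""
        for section in REQUIRED_SECTIONS:
            if section not in doc:
                violations.append(
                    f"{node.name}: Missing required section '{section}' in class docstring"
                )
        if "seed=42" not in doc:
            violations.append(
                f"{node.name}: Example section should include 'seed=42' for reproducibility"
            )
        if not SEED_ENTRY.search(doc):
            violations.append(
                f"{node.name}: Args section should document 'seed' parameter for BBOB compliance"
            )
    return violations
```

A class with no docstring yields seven messages under these assumptions, which is in line with the 5-7 violations per file reported above.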
3. Excessive Ruff Global Ignore List (23+ codes)
Current state (pyproject.toml excerpt):
[tool.ruff.lint]
select = ["ALL"]
ignore = [
"PLR0913", # Too many arguments - should use config objects
"PLR1704", # Redefining argument - needs refactor
"N803", # Argument name not lowercase
"N806", # Variable name not lowercase
"E741", # Ambiguous variable name
"E501", # Line too long
"T201", # Print statements - used in example scripts
"COM812", # Missing trailing comma
"NPY002", # Legacy numpy random calls - used extensively
"D107", # Missing docstring in __init__
"D103", # Missing docstring in public function
"PLR2004", # Magic value comparisons - common in optimization
"PLR0912", # Too many branches
"PLR0915", # Too many statements
"C901", # Too complex
"SIM109", # Use in tuple comparison
"SIM110", # Use all() generator
"PLR1714", # Consider merging comparisons
"PERF401", # Use list comprehension
"RET504", # Unnecessary assignment before return
"S112", # try-except-continue
"BLE001", # Blind exception
"B007", # Loop control variable not used
"B023", # Function doesn't bind loop variable
"PLC0415", # Import not at top level
"F841", # Unused variable
]
Problems:
- D107 and D103 are globally ignored yet still failing pre-commit pydocstyle checks
- 23+ global exceptions mask code quality issues across 120 optimizer files
- No per-file justifications - unclear which exceptions are temporary vs. permanent
- Complexity codes (PLR0912, PLR0915, C901) are blanket-ignored instead of being documented per-algorithm
Goal: Reduce global ignore list by 50% within 2 sprints with measurable per-category targets.
Implementation Steps
Phase 1: Validation (10 Exemplar Files)
- Run batch script to generate FIXME templates
- For each file:
  - Research original paper for metadata (authors, year, DOI)
  - Document mathematical formulation with LaTeX equations
  - Add BBOB-recommended hyperparameters
  - Create working doctest with `seed=42`
  - Add complexity analysis
  - Document BBOB performance characteristics if literature available
- Verify all 10 files pass validation criteria
- Extract lessons learned and refine batch script
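As a rough picture of what a completed exemplar might look like, the sketch below shows a class docstring with the required sections and a `seed=42` doctest. `RandomSearch` is a toy written for this illustration and is not one of the repository's optimizers; the metadata and reference fields are deliberately left as descriptions of what belongs there rather than filled with invented values.

```python
import random


class RandomSearch:
    """Toy random search, used here only to illustrate the docstring layout.

    Algorithm Metadata:
        Author, year, and DOI of the original paper belong here.

    COCO/BBOB Benchmark Settings:
        Recommended hyperparameters belong here.

    Args:
        dim: Number of decision variables.
        max_iter: Number of random samples to draw.
        seed: Random seed; BBOB-style examples use seed=42.

    Attributes:
        seed: The seed passed at construction, kept so runs can be logged.

    Example:
        >>> a = RandomSearch(dim=2, max_iter=50, seed=42).search()
        >>> b = RandomSearch(dim=2, max_iter=50, seed=42).search()
        >>> a == b
        True

    Notes:
        Time complexity is O(max_iter * dim).

    References:
        The original paper citation belongs here.
    """

    def __init__(self, dim: int, max_iter: int = 100, seed: int = 42) -> None:
        """Store the settings and create a seeded random generator."""
        self.dim = dim
        self.max_iter = max_iter
        self.seed = seed
        self._rng = random.Random(seed)

    def search(self) -> tuple:
        """Return (best_point, best_value) for the sphere function on [-5, 5]^dim."""
        best_x, best_f = None, float("inf")
        for _ in range(self.max_iter):
            x = [self._rng.uniform(-5.0, 5.0) for _ in range(self.dim)]
            f = sum(v * v for v in x)
            if f < best_f:
                best_x, best_f = x, f
        return best_x, best_f
```

The doctest compares two seeded runs instead of printing floating-point values, which keeps it stable across platforms; saved to a file, it can be run with `python -m doctest <file>`.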
Phase 2: Systematic Remediation (All Files)
- Tier 1 Remediation (Weeks 1-2):
  - Fix all 24 files with pydocstyle + COCO/BBOB failures
  - Target: 100% of Tier 1 files compliant
- Tier 2 Remediation (Weeks 2-4):
  - Batch process 40-50 files with COCO/BBOB only failures
  - Use validated batch script approach from Phase 1
  - Target: 100% of Tier 2 files compliant
- Ruff Ignore Reduction (Weeks 2-4 - Parallel):
  - Remove D107, D103 from global ignores
  - Reduce PLR2004 (already migrated in 4 files, expand to 20 more)
  - Fix minor style issues (SIM109, SIM110, PLR1714, PERF401, RET504) across 15 files
  - Document permanent exceptions with per-algorithm justifications
  - Target: Global ignore list reduced from 23 → 12 codes (50% reduction)
- Automation & Prevention (Week 4):
  - Update batch docstring script with all COCO/BBOB sections
  - Add pre-commit hook examples to `.github/copilot-instructions.md`
  - Create docstring template checklist in `.github/PULL_REQUEST_TEMPLATE.md`
Acceptance Criteria
Phase 1: Validation Checkpoints
- All 10 files completed with full COCO/BBOB compliance
- All files pass `uv run ruff check opt/`
- All doctests execute successfully with seeds logged
- No FIXME markers remain in any file
- Metadata validation passes:
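for path in swarm_intelligence/particle_swarm gradient_based/adamw classical/simulated_annealing evolutionary/genetic_algorithm metaheuristic/harmony_search swarm_intelligence/ant_colony gradient_based/sgd_momentum classical/nelder_mead evolutionary/differential_evolution physics_inspired/gravitational_search; do
  file=${path##*/}
  # Assumes each class is named after its file in CamelCase (e.g. particle_swarm -> ParticleSwarm)
  class=$(echo "$file" | sed 's/_/ /g' | awk '{for(i=1;i<=NF;i++) $i=toupper(substr($i,1,1)) substr($i,2)}1' | sed 's/ //g')
  uv run python -c "
import importlib, inspect
# Modules live under opt/<category>/, so import by the full dotted path
module = importlib.import_module('opt.${path//\//.}')
doc = inspect.getdoc(getattr(module, '${class}'))
assert 'Algorithm Metadata' in doc
assert 'COCO/BBOB' in doc
assert 'seed' in doc.lower()
print(f'✅ ${file} validated')
"
done
Phase 2: Complete Coverage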
- `pre-commit run -a` passes on main branch (CRITICAL)
- Zero pydocstyle D107/D103/D417 violations across all 120 optimizer files
- Zero COCO/BBOB compliance violations across all 60-80 affected files
- Global Ruff ignore list ≤ 12 codes (50% reduction from 23)
- All remaining global ignores have documented justifications in pyproject.toml comments
- Per-file ignores tracked in separate issue for phased removal
- Batch docstring script updated and documented
Quality Metrics
- 100% of Tier 1 files (24 files) fully compliant
- 100% of Tier 2 files (40-50 files) fully compliant
- 100% of validation files (10 files) serve as working reference implementations
- Documentation coverage: 120/120 optimizer files with complete docstrings
Complexity
High - Requires systematic research + careful implementation across 60-80 diverse algorithms spanning 10 categories. Estimated 4-6 weeks for complete remediation.
Dependencies
Depends on: #3 (batch script generates initial templates)
Blocks: #5-#14 (validates approach before scaling to all categories), #96 (documentation dependency migration - needs clean CI first)
Related: #52 (documentation quality tracking), #83 (CI/documentation improvements)
Context: Manually Deleted Commits
Three commits were proposed but manually deleted by maintainer:
- `scripts/__init__.py` + `check_google_docstring_inline_descriptions.py` changes
- `docs/package-lock.json` + `.npmrc` changes (legacy-peer-deps workaround)
Content from deleted commits preserved in issue comments for reference.