PR Part 1: Add sanity test harness and new plotter/dashboard module #4
base: master
Conversation
… for env. test_sanity_env.py is a standalone sanity-check harness that builds the env the same way train.py does and runs invariant, Gym-compliance, and determinism checks, with CLI options and optional external data. test_env.py is a minimal smoke test that instantiates the env with fixed weights and runs random actions, printing rewards; it has no invariant checks, determinism checks, or CLI control.
…raining. Add plotter.py with reusable plotting helpers for per-hour dashboards and cumulative savings analysis. Update train.py to import the plotter utilities and add CLI flags to drive dashboard/cumulative plots. Hook plotting into the evaluation and training flows, including saving figures to session plots. Addendum: added plotter logic in train_iter.py.
📝 Walkthrough

Adds a new plotting module and comprehensive environment sanity tests; updates training scripts to optionally produce dashboards and timing info; and allows a test-time override for the external price start index in environment resets.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Trainer as Training Loop
    participant Env as Environment
    participant Plot as Plotter
    participant FS as File System
    Trainer->>Env: reset(seed, options)
    Env-->>Trainer: initial_obs, info
    loop episode steps
        Trainer->>Env: step(action)
        Env-->>Trainer: obs, reward, done, info
        Trainer->>Trainer: accumulate per-step costs -> episode_costs
    end
    Trainer->>Plot: plot_dashboard(env, num_hours, max_nodes, episode_costs, ...)
    Plot->>Env: query metrics (prices, nodes, queue, rewards)
    Plot->>Plot: compute series, cumulative savings, monthly pct
    Plot->>FS: save figures (PNG)
    Plot-->>Trainer: return summary metrics
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
train_iter.py (1)
84-108: Critical: `plot_dashboard` and `dashboard_hours` are undefined in the `run()` function.

The function references `plot_dashboard` and `dashboard_hours` at lines 100-101, but these are neither function parameters nor global variables. Additionally, line 164 passes `plot_dashboard` and `dashboard_hours` as keyword arguments, but the function signature doesn't accept them; this will raise a `TypeError`.

🐛 Proposed fix: Add missing parameters to the function signature

```diff
-def run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, iter_limit_per_step, session, prices, job_durations, jobs, hourly_jobs):
+def run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, iter_limit_per_step, session, prices, job_durations, jobs, hourly_jobs, plot_dashboard=False, dashboard_hours=24*14):
     python_executable = sys.executable
     command = [
         python_executable,
         "train.py",
```
🤖 Fix all issues with AI agents
In `@src/plotter.py`:
- Around lines 152-161: The conditions for showing the reward panels are inverted. Change the checks that use `if not getattr(env, "plot_eff_reward", False)` (and similarly `plot_price_reward`, `plot_idle_penalty`, `plot_job_age_penalty`, `plot_total_reward`) so they evaluate the flag directly (remove the leading `not`) before calling `add_panel`, i.e., use `getattr(env, "plot_eff_reward", False)` etc. This makes the `add_panel` calls occur only when the corresponding `env.plot_*` flag is True; see the sketch below.
In `@test/test_sanity_env.py`:
- Line 164: Replace the inconsistent CLI option name in `p.add_argument` by using `"--drop-weight"` instead of `"--drop_weight"`, and confirm that any places reading the parsed value (e.g., `make_env_from_args`) use `args.drop_weight` (argparse converts hyphens to underscores), updating references if they currently use a different name; see the example below.
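For reference, a self-contained example of argparse's hyphen-to-underscore conversion (a standalone illustration, not the project's actual parser):

```python
import argparse

p = argparse.ArgumentParser()
# Hyphenated option on the command line...
p.add_argument("--drop-weight", type=float, default=0.0)

args = p.parse_args(["--drop-weight", "0.5"])
# ...becomes an underscored attribute on the parsed namespace.
print(args.drop_weight)  # 0.5
```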
- Around lines 1-10: Move the shebang line (`#!/usr/bin/env python3`) to the very first line of the file so it appears before the module-level docstring/triple-quoted string; update the top of test/test_sanity_env.py so the shebang is line 1 and the existing docstring immediately follows, as sketched below.
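A minimal sketch of the expected file top (the docstring contents are abbreviated here):

```python
#!/usr/bin/env python3
"""Standalone sanity/check harness for the environment (usage examples go here)."""
import argparse
```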
🧹 Nitpick comments (7)
test/test_sanity_env.py (3)
126-132: Determinism test fingerprint relies on info keys not returned by the environment.

The `rollout()` function records values from the `info` dict (`step_cost`, `num_unprocessed_jobs`, `num_on_nodes`, `dropped_this_episode`), but `ComputeClusterEnv.step()` returns an empty info dict `{}`. The test will pass (both rollouts get the same default values), but the fingerprint won't actually verify these metrics. Consider either:
- Enhancing `ComputeClusterEnv.step()` to populate the info dict with these values (see the sketch below), or
- Removing these from the fingerprint, since they don't add verification value currently.
54-70: Consider using imported constants instead of hardcoded values.

The file imports `MAX_JOB_DURATION`, `CORES_PER_NODE`, etc. at lines 21-26, but `check_invariants` uses hardcoded values (170, 96, 168, 16). The comment at line 82 mentions avoiding imports, but they're already imported.

♻️ Suggested improvement

```diff
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
 ...
-assert np.all((ca >= 0) & (ca <= 96)), f"cores_available out of bounds..."
+assert np.all((ca >= 0) & (ca <= CORES_PER_NODE)), f"cores_available out of bounds..."
```
203-207: Fix indentation and remove commented code.

Line 204 has inconsistent indentation (an extra space), and line 231 has commented-out code.

🔧 Proposed fix

```diff
-     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
+    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
     # To be safe: normalize "" -> None here.
```

src/plotter.py (1)
289-291: Rename ambiguous variable `l` to `line`.

Single-letter `l` is easily confused with `1` or `I`.

🔧 Proposed fix

```diff
 # Combine legends
 lines = line1 + line1b + line2 + line2b
-labels = [l.get_label() for l in lines]
+labels = [line.get_label() for line in lines]
 ax1.legend(lines, labels, loc="center right", fontsize=9)
```

train.py (3)
156-157: Remove unnecessary f-string prefix.

The f-string at line 156 has no placeholders.

🔧 Proposed fix

```diff
-    print(f"Starting a new model training...")
+    print("Starting a new model training...")
```
315-315: Convert suffix to string for consistency.

`suffix` is passed as an integer here but as a string at line 274. While Python's f-string formatting handles this, it's cleaner to be explicit.

🔧 Proposed fix

```diff
-    suffix=STEPS_PER_ITERATION * iters,
+    suffix=f"train_{STEPS_PER_ITERATION * iters}",
```
288-289: Progress message is confusing.

The message mixes iteration count and step count in a confusing way. `iters + 1` is the iteration number, but the message implies it's a step count.

🔧 Proposed fix

```diff
 if (iters+1)%10==0:
-    print(f"Running... at {iters + 1} of {STEPS_PER_ITERATION * (iters + 1)} steps")
+    print(f"Running iteration {iters + 1}, total steps so far: {STEPS_PER_ITERATION * iters}")
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/environment.py
src/plotter.py
test/test_sanity_env.py
train.py
train_iter.py
🧰 Additional context used
📓 Path-based instructions (2)
train.py
📄 CodeRabbit inference engine (CLAUDE.md)
train.py: Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Files:
train.py
{train.py,weights.py}
📄 CodeRabbit inference engine (CLAUDE.md)
Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Files:
train.py
🧠 Learnings (6)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
src/plotter.py
train.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
train_iter.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Implement stable-baselines3 PPO integration in the training script with tensorboard logging and model checkpointing every 100K steps
Applied to files:
train.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Set environment constraints as: 2-week episodes (336 hours), max 1000 jobs in queue, max 1500 new jobs per hour, max 170 hours job runtime
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to environment.py : Configure cluster parameters as: 335 nodes maximum, 96 cores per node, up to 16 nodes per job, 150W idle power, 450W active power per node
Applied to files:
src/environment.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to sampler_hourly.py : Sample job characteristics independently (job count, duration, nodes, cores-per-node) from hour-of-day distributions in the Hourly Sampler
Applied to files:
src/environment.py
🧬 Code graph analysis (2)
train_iter.py (1)
src/plotter.py (1)
plot_dashboard(81-222)
train.py (1)
src/plotter.py (2)
plot_dashboard(81-222), plot_cumulative_savings(225-316)
🪛 Ruff (0.14.11)
src/plotter.py
81-81: Unused function argument: episode_costs
(ARG001)
290-290: Ambiguous variable name: l
(E741)
train_iter.py
100-100: Undefined name plot_dashboard
(F821)
101-101: Undefined name dashboard_hours
(F821)
test/test_sanity_env.py
10-10: Shebang should be at the beginning of the file
(EXE005)
124-124: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
241-241: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
241-241: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
297-297: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
300-300: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
339-339: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
339-339: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
train.py
156-156: f-string without any placeholders
Remove extraneous f prefix
(F541)
276-276: Do not catch blind exception: Exception
(BLE001)
294-294: f-string without any placeholders
Remove extraneous f prefix
(F541)
317-317: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (10)
train_iter.py (1)
138-140: LGTM on CLI arguments.

The new `--plot-dashboard` and `--dashboard-hours` arguments are properly defined and will forward correctly to train.py once the `run()` function signature is fixed.

src/environment.py (2)
19-21: LGTM on import additions.

Importing `MIN_CORES_PER_JOB` and `MIN_NODES_PER_JOB` from `src.config` makes these constants available for environment configuration and aligns with the learnings specifying environment constraints. Based on learnings, these should support job constraints.
222-224: LGTM on `price_start_index` override.

The optional `price_start_index` override via `options` is a clean approach for testing determinism. The modulo operation ensures the index stays within bounds.
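A rough sketch of what such a reset-time override typically looks like; this is a minimal stand-in class, not the real env:

```python
import numpy as np

class PriceEnv:
    """Minimal stand-in showing the reset-time override pattern."""
    def __init__(self, prices):
        self.prices = np.asarray(prices)
        self.rng = np.random.default_rng()

    def reset(self, seed=None, options=None):
        if seed is not None:
            self.rng = np.random.default_rng(seed)
        # Test-time override: pin where the external price series starts,
        # wrapping with modulo so the index always stays in bounds.
        if options and "price_start_index" in options:
            self.price_start_index = int(options["price_start_index"]) % len(self.prices)
        else:
            self.price_start_index = int(self.rng.integers(len(self.prices)))
        return self.prices[self.price_start_index]

env = PriceEnv(prices=range(100))
assert env.reset(options={"price_start_index": 250}) == 50  # 250 % 100
```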
277-359: LGTM on main test harness structure.

The test harness provides good coverage:
- Basic determinism comparison (lines 290-315)
- Optional Gymnasium API compliance check
- Invariant validation loop
- Optional extended determinism test
The unused variable warnings from static analysis are acceptable for test code readability.
src/plotter.py (2)
8-16: LGTM on `_as_series` helper.

Clean utility function that handles None inputs gracefully and pads shorter arrays with NaN values.
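A rough reimplementation of what such a helper does, under the stated assumptions (None handling plus NaN padding); the real `_as_series` may differ in details:

```python
import numpy as np

def as_series(values, length):
    """Return a float array of `length`; None becomes all-NaN, short input is NaN-padded."""
    out = np.full(length, np.nan, dtype=float)
    if values is None:
        return out
    arr = np.asarray(values, dtype=float)[:length]
    out[:arr.size] = arr
    return out

print(as_series([1, 2, 3], 5))  # [ 1.  2.  3. nan nan]
print(as_series(None, 3))       # [nan nan nan]
```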
19-78: LGTM on `_compute_cumulative_savings`.

The function correctly computes cumulative savings against two baselines with proper handling of empty inputs and array length alignment.

train.py (4)
train.py (4)
48-52: LGTM on default weights.

The default weights align with the coding guidelines: efficiency=0.7, price=0.2, idle=0.1, job-age=0.0.
276-277: Broad exception handling is acceptable here.

Catching `Exception` for non-fatal plotting operations is reasonable; it prevents visualization failures from interrupting training/evaluation. The error is logged for debugging.
214-251: LGTM on evaluation output structure.

The evaluation output aligns well with the coding guidelines, including:
- Per-episode metrics (cost, savings, completion rate, wait time, queue size)
- Cumulative analysis (total savings, monthly reduction, annual projections)
- Job processing metrics for both agent and baseline
152-159: Verify PPO hyperparameter choices for training consistency.

`n_steps=64` is significantly smaller than the stable-baselines3 PPO default (2048). With `STEPS_PER_ITERATION` set to 100,000 steps per checkpoint, this results in approximately 1,562 policy updates per checkpoint cycle instead of the typical ~49 updates. While this enables more frequent gradient updates, it can increase training variance. Confirm this aggressive update frequency aligns with your episode structure and training objectives.
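The update-count arithmetic behind that comparison, as a quick sanity check (assuming one rollout collection per `n_steps` environment steps):

```python
STEPS_PER_ITERATION = 100_000  # steps per checkpoint cycle, as stated above

for n_steps in (64, 2048):  # PR value vs. stable-baselines3 default
    rollouts = STEPS_PER_ITERATION // n_steps
    print(f"n_steps={n_steps}: ~{rollouts} rollout collections per checkpoint")
# n_steps=64: ~1562 rollout collections per checkpoint
# n_steps=2048: ~48 rollout collections per checkpoint
```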
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@train_iter.py`:
- Around lines 119-122: The `subprocess.run` call stores its result in `result` but didn't request captured output, so `result.stdout` will always be None. Update the `subprocess.run(...)` invocation in train_iter.py (the call that assigns to `result`) to include `capture_output=True` (or `stdout=subprocess.PIPE`) if you intend to return the subprocess output, and then return `result.stdout`; otherwise remove the misleading `return result.stdout` (or explicitly return None) and adjust callers accordingly. A sketch of the capture variant follows.
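A minimal sketch of the capture variant, if returning the output is the intent (a standalone example, not the project's actual command):

```python
import subprocess
import sys

# capture_output=True populates result.stdout / result.stderr;
# without it, both attributes are None.
result = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    capture_output=True,
    text=True,
)
if result.returncode != 0:
    print(f"Error occurred: {result.stderr}")
print(result.stdout)  # -> "hello from child"
```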
🧹 Nitpick comments (4)
test/test_sanity_env.py (2)
47-100: Use imported constants instead of hardcoded magic numbers.

The file imports `MAX_JOB_DURATION`, `MIN_NODES_PER_JOB`, `MAX_NODES_PER_JOB`, `MIN_CORES_PER_JOB`, and `CORES_PER_NODE` but then uses hardcoded values (170, 96, 16, 168) in the invariant checks. This creates maintenance risk if the constants change.

♻️ Suggested refactor to use imported constants

```diff
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
 ...
 # ---- cores_available invariants (from env, not obs) ----
 ca = env.cores_available
 assert ca.shape == nodes.shape
-assert np.all((ca >= 0) & (ca <= 96)), f"cores_available out of bounds min={ca.min()} max={ca.max()}"
+assert np.all((ca >= 0) & (ca <= CORES_PER_NODE)), f"cores_available out of bounds min={ca.min()} max={ca.max()}"
 ...
-assert np.all(ca[idle] == 96), "idle nodes must have full cores_available"
+assert np.all(ca[idle] == CORES_PER_NODE), "idle nodes must have full cores_available"
 ...
-assert np.all((dur >= 0) & (dur <= 170)), f"duration out of bounds min={dur.min()} max={dur.max()}"
+assert np.all((dur >= 0) & (dur <= MAX_JOB_DURATION)), f"duration out of bounds min={dur.min()} max={dur.max()}"
 ...
 if np.any(active):
-    assert np.all((nn[active] >= 1) & (nn[active] <= 16)), f"job nnodes out of bounds nn={nn[active]}"
-    assert np.all((cpn[active] >= 1) & (cpn[active] <= 96)), f"cores_per_node out of bounds cpn={cpn[active]}"
+    assert np.all((nn[active] >= MIN_NODES_PER_JOB) & (nn[active] <= MAX_NODES_PER_JOB)), f"job nnodes out of bounds nn={nn[active]}"
+    assert np.all((cpn[active] >= MIN_CORES_PER_JOB) & (cpn[active] <= CORES_PER_NODE)), f"cores_per_node out of bounds cpn={cpn[active]}"
```

You'll also need to import `MAX_JOB_AGE` for the age bound check at line 85.
205-208: Fix inconsistent indentation.

Line 205 has 5 spaces of indentation instead of 4, which is inconsistent with the rest of the codebase and could cause issues with strict linters.

🔧 Proposed fix

```diff
-     # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
-     # To be safe: normalize "" -> None here.
+    # Train.py passes strings; the env treats "" as falsy in some places and truthy in others.
+    # To be safe: normalize "" -> None here.
```

src/plotter.py (1)
288-291: Rename ambiguous variable `l` to `line`.

The variable name `l` is flagged as ambiguous (easily confused with `1` or `I`). Consider renaming for clarity.

🔧 Proposed fix

```diff
 # Combine legends
 lines = line1 + line1b + line2 + line2b
-labels = [l.get_label() for l in lines]
+labels = [line.get_label() for line in lines]
 ax1.legend(lines, labels, loc="center right", fontsize=9)
```

train_iter.py (1)
178-178: Consider breaking up the long function call for readability.

The line is quite long. While functional, breaking it across multiple lines would improve readability.

♻️ Suggested format

```diff
-run(efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight, args.iter_limit_per_step, args.session, args.prices, args.job_durations, args.jobs, args.hourly_jobs,plot_dashboard=args.plot_dashboard,dashboard_hours=args.dashboard_hours)
+run(
+    efficiency_weight, price_weight, idle_weight, job_age_weight, drop_weight,
+    args.iter_limit_per_step, args.session,
+    args.prices, args.job_durations, args.jobs, args.hourly_jobs,
+    plot_dashboard=args.plot_dashboard,
+    dashboard_hours=args.dashboard_hours,
+)
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/plotter.py
test/test_sanity_env.py
train_iter.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
test/test_sanity_env.py
train_iter.py
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to train.py : Structure evaluation output to include per-episode metrics (cost, savings, completion rate, wait time, queue size) and cumulative analysis (total savings, monthly reduction, annual projections)
Applied to files:
src/plotter.py
🧬 Code graph analysis (2)
test/test_sanity_env.py (4)
src/environment.py (3)
reset(202-256), step(258-456), render(51-52)
src/weights.py (1)
Weights(5-16)
src/workloadgen.py (2)
WorkloadGenerator(56-110), WorkloadGenConfig(35-53)
test/test_determinism_workloadgen.py (1)
check_determinism(8-19)
train_iter.py (1)
src/plotter.py (1)
plot_dashboard(81-222)
🪛 Ruff (0.14.11)
test/test_sanity_env.py
125-125: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
src/plotter.py
81-81: Unused function argument: episode_costs
(ARG001)
290-290: Ambiguous variable name: l
(E741)
train_iter.py
119-119: subprocess call: check for execution of untrusted input
(S603)
🔇 Additional comments (16)
test/test_sanity_env.py (8)
1-10: LGTM!

The shebang is now correctly placed at line 1, and the docstrings provide clear usage examples.
12-27: LGTM!

Imports are well-organized, with standard library, third-party, and local imports properly grouped.
30-36: LGTM!

The function is straightforward for its test utility purpose.
102-108: LGTM!

Good defensive check to ensure observations are copies and don't alias internal environment state.
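For illustration, such an aliasing check might be written as follows; it assumes the observation and `env.nodes` are numpy arrays and that the field names match the real env, which is an assumption here:

```python
import numpy as np

def check_obs_is_copy(env, obs):
    """Mutating the returned observation must not change env internals."""
    before = env.nodes.copy()      # snapshot internal state
    np.asarray(obs).flat[0] += 1   # mutate the observation in place
    assert np.array_equal(env.nodes, before), "observation aliases env state!"
```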
110-141: LGTM with a minor lint suggestion.

The determinism test logic is solid. The unused `obs` variable at line 125 could be prefixed with `_` to satisfy linters.

🔧 Optional lint fix

```diff
-obs, r, term, trunc, info = env.step(actions[i])
+_obs, r, term, trunc, info = env.step(actions[i])
```
147-177: LGTM!

The CLI argument naming is now consistent with hyphens throughout, and the argument structure mirrors train.py appropriately.
236-274: LGTM with a minor suggestion.

The debug printing utility is well-structured. Consider prefixing unused unpacked variables with `_` at line 242.

🔧 Optional lint fix

```diff
-nodes, q, prices = _extract(obs)
+_nodes, q, _prices = _extract(obs)
```
278-361: LGTM overall. Consider uncommenting or removing check_obs_is_copy.

The main function structure is sound with good test coverage. Line 334 has `check_obs_is_copy` commented out; if this check is no longer needed, consider removing it entirely, and if it's temporarily disabled for debugging, add a comment explaining why.

The unused variables from tuple unpacking (lines 298-302, 340) could be prefixed with `_` to satisfy linters, though this is optional for test code.

src/plotter.py (6)
1-16: LGTM!

Imports are appropriate, and `_as_series` is a clean utility for normalizing array data with NaN padding.
19-78: LGTM!

The cumulative savings computation is well-structured with proper handling of edge cases. The 2-week episode assumption and the 4.33 weeks/month approximation are clearly documented.
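For reference, the arithmetic behind those documented assumptions (a sketch using the stated constants):

```python
EPISODE_WEEKS = 2          # one episode spans 2 weeks (336 hours)
WEEKS_PER_MONTH = 52 / 12  # ≈ 4.33, the documented approximation

episodes = 10
months = episodes * EPISODE_WEEKS / WEEKS_PER_MONTH
print(f"{episodes} episodes ≈ {months:.2f} months")  # ≈ 4.62 months
```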
81-86: Acknowledged: unused parameter kept for backwards compatibility.

The `episode_costs` parameter is documented as kept for backwards compatibility. Consider adding a `# noqa: ARG001` comment or renaming it to `_episode_costs` to suppress linter warnings more explicitly.
131-161: LGTM!

The panel display logic correctly implements both patterns:
- `skip_plot_*` flags: `if not getattr(...)` (skip=False → show panel)
- `plot_*` flags: `if getattr(...)` (plot=True → show panel)

This resolves the previously flagged inverted-logic issue.
163-222: LGTM!

The dynamic grid layout, panel rendering, and figure save/show/close handling are well implemented.
225-316: LGTM overall!

The cumulative savings visualization with dual y-axes, summary box, and combined legends is well implemented. The return dictionary provides useful metrics for downstream analysis.
train_iter.py (2)
84-98: LGTM!

The function signature extension cleanly adds dashboard forwarding with sensible defaults.
152-154: LGTM!

The new CLI arguments are properly defined with appropriate types and defaults, matching the `run()` function signature.
```diff
 result = subprocess.run(command, text=True, env=current_env)
 if result.returncode != 0:
-    print(f"Error occurred: {result.stderr}")
+    print("Error occurred: train.py returned a non-zero exit code.")
 return result.stdout
```
🧩 Analysis chain
🏁 Script executed:
```sh
cat -n train_iter.py | sed -n '110,130p'
```

Repository: FairRootGroup/powersched
Length of output: 937
🏁 Script executed:
```sh
# Find where run() is called and its return value is used
rg -n "\.run\(" train_iter.py -A 5 -B 2
```

Repository: FairRootGroup/powersched
Length of output: 423
🏁 Script executed:
```sh
# Search for assignments or usages of the run() function call results
rg -n "result\s*=" train_iter.py
```

Repository: FairRootGroup/powersched
Length of output: 136
🏁 Script executed:
```sh
# Check if run() return value is used in function calls
rg -n "run\(" train_iter.py
```

Repository: FairRootGroup/powersched
Length of output: 413
🏁 Script executed:
```sh
# Search for all calls to the run() function across the codebase
rg -n "run\(" . --type py -B 2 -A 2 | grep -E "(run\(|=.*run\()" | head -30
```

Repository: FairRootGroup/powersched
Length of output: 461
🏁 Script executed:
```sh
# More specifically, search for assignments where run() is called
rg -n "=\s*run\(" . --type py
```

Repository: FairRootGroup/powersched
Length of output: 50
`result.stdout` is always None when not capturing output.

Since `capture_output` was not set in the `subprocess.run()` call at line 119, `result.stdout` will always be None. The return statement at line 122 is misleading, since the return value is not used by callers. Consider changing it to `return None` or removing the return statement entirely.
🧰 Tools
🪛 Ruff (0.14.11)
119-119: subprocess call: check for execution of untrusted input
(S603)
🤖 Prompt for AI Agents
In `@train_iter.py` around lines 119-122: the `subprocess.run` call stores its result in `result` but didn't request captured output, so `result.stdout` will always be None. Update the `subprocess.run(...)` invocation in train_iter.py (the call that assigns to `result`) to include `capture_output=True` (or `stdout=subprocess.PIPE`) if you intend to return the subprocess output, and then return `result.stdout`; otherwise remove the misleading `return result.stdout` (or explicitly return None) and adjust callers accordingly.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@test/test_sanity_env.py`:
- Around lines 125-133: The test relies on info keys that the environment currently doesn't populate. Update the environment's `step()` implementation (the function named `step` in environment.py) to include the keys "step_cost", "num_unprocessed_jobs", "num_on_nodes", and "dropped_this_episode" in the returned info dict with sensible values (defaults when not applicable) so `info.get(...)` in the test yields real data, and in the test (where `obs` is unused) rename `obs` to `_obs` to satisfy static analysis. Alternatively, if you prefer changing the test instead of the env, simplify the fingerprint to only use `float(r)`, drop all `info.get(...)` calls, and rename `obs` to `_obs`.
♻️ Duplicate comments (1)
test/test_sanity_env.py (1)
17-17: Import `Weights` from its source module.

`Weights` is defined in `src/weights.py`, not `src/environment`. Import it directly:

```diff
-from src.environment import ComputeClusterEnv, Weights
+from src.environment import ComputeClusterEnv
+from src.weights import Weights
```

Based on past review comments indicating EPISODE_HOURS and Weights should come from their corresponding source files.
🧹 Nitpick comments (6)
test/test_sanity_env.py (6)
55-91: Use imported constants instead of hardcoded magic numbers.

The comment at line 83 says "avoid importing env constants," but you're already importing `MAX_JOB_DURATION`, `CORES_PER_NODE`, etc. on lines 22-27. These hardcoded values (170, 96, 168, 16) could drift out of sync with the actual config.

♻️ Suggested refactor

```diff
+from src.config import (
+    MAX_JOB_DURATION, MAX_JOB_AGE, MAX_QUEUE_SIZE,
+    MIN_NODES_PER_JOB, MAX_NODES_PER_JOB,
+    MIN_CORES_PER_JOB, CORES_PER_NODE, EPISODE_HOURS
+)
+
 # ---- Node bounds ----
 assert np.all(nodes >= -1), f"nodes < -1 exists, min={nodes.min()}"
-assert np.all(nodes <= 170), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
+assert np.all(nodes <= MAX_JOB_DURATION), f"nodes > MAX_JOB_DURATION exists, max={nodes.max()}"
 # ---- cores_available invariants (from env, not obs) ----
 ca = env.cores_available
 assert ca.shape == nodes.shape
-assert np.all((ca >= 0) & (ca <= 96)), f"cores_available out of bounds"
+assert np.all((ca >= 0) & (ca <= CORES_PER_NODE)), f"cores_available out of bounds"
-assert np.all(ca[idle] == 96), "idle nodes must have full cores_available"
+assert np.all(ca[idle] == CORES_PER_NODE), "idle nodes must have full cores_available"
```
232-232: Remove commented-out code or add a TODO.

```diff
-    # plot_total_reward=False,
```

If this parameter is planned for future use, add a TODO comment; otherwise remove it.
242-242: Prefix unused variables with an underscore.

Per static analysis, `nodes` and `prices` are unpacked but not used:

```diff
-nodes, q, prices = _extract(obs)
+_nodes, q, _prices = _extract(obs)
```
291-316: Consider consolidating or documenting the inline determinism check.

This section (lines 291-316) performs a quick determinism comparison that partially overlaps with `determinism_test()`. Additionally:
- Line 293 uses a hardcoded `seed = 123` instead of `args.seed`
- There are multiple unused variables (`o1`, `o2`, `t1`, `t2`, `tr1`, `tr2`) per static analysis

If this is intentional debug code, consider gating it behind a flag or documenting its purpose. Otherwise, consolidate with the `--check-determinism` path:

```diff
-seed = 123
+seed = args.seed
```

For unused variables:

```diff
-o1, _ = env.reset(seed=seed)
-o1s, r1, t1, tr1, i1 = env.step(action)
+_, _ = env.reset(seed=seed)
+o1s, r1, _t1, _tr1, i1 = env.step(action)
```
334-334: Enable or remove the commented-out copy check.

`check_obs_is_copy` is defined but never called. Either enable it to improve test coverage or remove it with an explanation:

```diff
-#check_obs_is_copy(env, obs)
+check_obs_is_copy(env, obs)
```
340-340: Prefix unused loop variables.

Per static analysis, `r` and `info` are unused in the loop:

```diff
-obs, r, term, trunc, info = env.step(action)
+obs, _r, term, trunc, _info = env.step(action)
```
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/environment.py
test/test_sanity_env.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-04T08:51:13.779Z
Learnt from: CR
Repo: FairRootGroup/powersched PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-04T08:51:13.779Z
Learning: Applies to {train.py,weights.py} : Use weighted reward components for training: efficiency-weight (default 0.7), price-weight (default 0.2), idle-weight (default 0.1), and job-age-weight (default 0.0)
Applied to files:
test/test_sanity_env.py
🧬 Code graph analysis (1)
test/test_sanity_env.py (3)
src/environment.py (4)
ComputeClusterEnv(46-456), reset(202-256), step(258-456), render(51-52)
src/weights.py (1)
Weights(5-16)
src/workloadgen.py (2)
WorkloadGenerator(56-110), WorkloadGenConfig(35-53)
🪛 Ruff (0.14.11)
test/test_sanity_env.py
125-125: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable nodes is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
242-242: Unpacked variable prices is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
298-298: Unpacked variable o1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable t1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
299-299: Unpacked variable tr1 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
301-301: Unpacked variable o2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable t2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
302-302: Unpacked variable tr2 is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable r is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
340-340: Unpacked variable info is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🔇 Additional comments (3)
src/environment.py (1)
222-223: LGTM! Test-time override for price start index.

The logic is correct: using modulo ensures the index stays within bounds. Minor nit: "Purposes" should be lowercase "purposes" for consistency.
test/test_sanity_env.py (2)
30-36: LGTM!

Clean price-loading helper with an appropriate fallback for a missing path.
102-107: LGTM!

Good defensive check for observation buffer aliasing. Consider enabling this check in `main()` (it is currently commented out at line 334).
```python
obs, r, term, trunc, info = env.step(actions[i])
# record a small fingerprint
traj.append((
    float(r),
    float(info.get("step_cost", 0.0)),
    int(info.get("num_unprocessed_jobs", -1)),
    int(info.get("num_on_nodes", -1)),
    int(info.get("dropped_this_episode", -1)),
))
```
Info dict keys are not populated by the environment.
The environment's `step()` method returns an empty info dict `{}` (line 456 in environment.py), so `info.get("step_cost", 0.0)`, `info.get("num_unprocessed_jobs", -1)`, etc. will always return default values. This makes the fingerprint comparison less meaningful; only `float(r)` varies.

Either populate these keys in the environment's `step()` return, or simplify the fingerprint:
```diff
 traj.append((
     float(r),
-    float(info.get("step_cost", 0.0)),
-    int(info.get("num_unprocessed_jobs", -1)),
-    int(info.get("num_on_nodes", -1)),
-    int(info.get("dropped_this_episode", -1)),
 ))
```

Also, prefix unused `obs` with an underscore per static analysis:
```diff
-obs, r, term, trunc, info = env.step(actions[i])
+_obs, r, term, trunc, info = env.step(actions[i])
```

🧰 Tools
🪛 Ruff (0.14.11)
125-125: Unpacked variable obs is never used
Prefix it with an underscore or any other dummy variable pattern
(RUF059)
🤖 Prompt for AI Agents
In `@test/test_sanity_env.py` around lines 125-133: the test relies on info keys that the environment currently doesn't populate. Update the environment's `step()` implementation (the function named `step` in environment.py) to include the keys "step_cost", "num_unprocessed_jobs", "num_on_nodes", and "dropped_this_episode" in the returned info dict with sensible values (defaults when not applicable) so `info.get(...)` in the test yields real data, and in the test (where `obs` is unused) rename `obs` to `_obs` to satisfy static analysis. Alternatively, if you prefer changing the test instead of the env, simplify the fingerprint to only use `float(r)`, drop all `info.get(...)` calls, and rename `obs` to `_obs`.
This will be addressed in the following PR.
@enlorenz, understood! Would you like me to create a GitHub issue to track this work for the next PR?