Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: (none yet). The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing a comment.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the relevant setting. Use the following commands to manage reviews:
📝 Walkthrough

Adds Prometheus and VLLM visualizations, validates pull-secret readiness during RHOAI preparation, introduces multi-flavor per-flavor LLM inference test flows and cleanup, redirects cluster preload artifacts into an artifacts subdirectory, and applies defensive/string fixes across plotting and store code.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    rect rgba(200,255,200,0.5)
    actor PrepUser as Test Preparation
    participant PrepFlow as prepare_rhoai()
    participant Secret as K8s Secret
    participant MCP as MachineConfigPool
    participant Worker as Worker Node
    participant Registry as Container Registry
    end
    PrepUser->>PrepFlow: start prepare_rhoai()
    PrepFlow->>Secret: apply pull-secret
    PrepFlow->>Secret: wait_for_pull_secret_ready(registry, test_image)
    Secret->>Secret: verify registry entry & auth JSON
    Secret->>MCP: verify MCP propagation/rollout
    alt test_image provided
        PrepFlow->>Worker: run test container to pull image
        Worker->>Registry: attempt authenticated pull
        Registry-->>Worker: allow/deny
    end
    Secret-->>PrepFlow: checks passed
    PrepFlow->>PrepUser: continue test preparation
```

```mermaid
sequenceDiagram
    rect rgba(200,200,255,0.5)
    actor Tester as User
    participant TestFlow as test_llmd.py
    participant K8sAPI as Kubernetes API
    participant Prom as Prometheus
    participant Parser as parsers.get_llmd_main_metrics
    participant PlotReg as Plotting Registry
    end
    Tester->>TestFlow: start per-flavor tests()
    TestFlow->>K8sAPI: cleanup_llm_inference_resources()
    K8sAPI->>K8sAPI: delete llminferenceservice resources
    loop Poll (retries)
        TestFlow->>K8sAPI: check for remaining resources
    end
    TestFlow->>Prom: reset/collect metrics
    TestFlow->>Parser: get_llmd_main_metrics(register=True)
    Parser->>Prom: query metrics (cluster_role="sutest")
    Prom-->>Parser: metrics data
    Parser->>PlotReg: register plots & reports
    PlotReg->>Prom: render reports (Prometheus/VLLM)
    Prom-->>Tester: deliver HTML visualization
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
/test jump-ci llm-d scale_up
Actionable comments posted: 5
🧹 Nitpick comments (2)
projects/matrix_benchmarking/visualizations/helpers/store/prom.py (1)
315-323: Guard `control_plane` as well to fully avoid attribute crashes.

The new check protects missing `cluster_info`, but both lines still dereference `.control_plane` unguarded. If `cluster_info` is present but incomplete, this still raises `AttributeError`.

Proposed hardening + dedup:

```diff
+def _control_plane_nodes(entry):
+    cluster_info = getattr(entry.results, "cluster_info", None)
+    control_plane = getattr(cluster_info, "control_plane", []) if cluster_info else []
+    return [node.name for node in control_plane]
+
 def only_control_plane_nodes(entry, metrics):
-    control_plane_nodes = [node.name for node in (entry.results.cluster_info.control_plane if getattr(entry.results, "cluster_info", []) else [])]
+    control_plane_nodes = _control_plane_nodes(entry)
@@
 def no_control_plane_nodes(entry, metrics):
-    control_plane_nodes = [node.name for node in (entry.results.cluster_info.control_plane if getattr(entry.results, "cluster_info", []) else [])]
+    control_plane_nodes = _control_plane_nodes(entry)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/matrix_benchmarking/visualizations/helpers/store/prom.py` around lines 315-323: The code assumes entry.results.cluster_info.control_plane exists; change both places (the list comprehensions inside the generator and inside no_control_plane_nodes) to first fetch cluster_info = getattr(entry.results, "cluster_info", None), then control_plane = getattr(cluster_info, "control_plane", []) if cluster_info else [], and build control_plane_nodes = [node.name for node in control_plane]; optionally factor this into a small helper function like get_control_plane_nodes(entry) and call it from both the generator and no_control_plane_nodes to avoid duplication.

projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py (1)
189-199: Dead code behind `if False:`; consider using a TODO or removing.

This API Server Request Latency section is permanently unreachable. If it's intentionally deferred, a `# TODO:` comment explaining why (e.g., metrics not yet available) would improve clarity and make it discoverable for future cleanup.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py` around lines 189-199: The API Server Request Latency block is dead code because it's guarded by if False; either remove this unreachable block entirely or make it intentionally conditional (e.g., driven by a feature flag/config or a constant) and add a clear TODO explaining why it's disabled; locate the block that builds header using html.H3/html.P and calls report.Plot_and_Text (the lines inside the if False) and replace the hard false guard with a named flag or delete the block and add a `# TODO:` comment mentioning why the plots (Prom: Sutest GET/PUT/LIST/PATCH Requests duration) are omitted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@projects/llm-d/testing/prepare_llmd.py`:
- Around line 125-132: The MCP check currently assumes result.stdout is valid;
update the logic in the loop that calls run.run (the variable result) so you
first verify the command succeeded and result.stdout is non-empty (e.g., check
result.exit_code/returncode or result.stdout truthiness) before interpreting
values; if the command failed or stdout is empty, log a warning/error and
sleep/retry instead of setting machine_config_updated, and only set
machine_config_updated = True when the command succeeded and the parsed status
indicates update completion (no "False" and/or contains "True").
- Around line 89-104: The loop currently only fetches secret_content once
(guarded by the locals() check), causing stale data to be re-parsed across
retries; update the block around secret_json_valid/secret_content so that you
always re-run the oc get secret command each iteration before attempting
json.loads (i.e., remove the if not 'secret_content' in locals() guard), parse
into secret_data, set secret_json_valid when parsing succeeds, and keep checking
registry in secret_data["auths"] so each retry uses fresh secret_content and can
observe updates.
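The re-fetch logic described above can be sketched as a small pure helper that the retry loop calls with the fresh `oc get secret` output each iteration; the name `registry_auth_present` is illustrative, not part of the PR:

```python
import json

def registry_auth_present(secret_content: str, registry: str) -> bool:
    """Return True only when the pull-secret JSON parses cleanly and
    contains an auth entry for the given registry.

    The caller is expected to re-run the `oc get secret` command on every
    retry and pass the fresh stdout here, so stale data is never re-parsed.
    """
    try:
        secret_data = json.loads(secret_content)
    except json.JSONDecodeError:
        return False  # malformed secret content: treat as not ready
    return registry in secret_data.get("auths", {})
```

Keeping the parse-and-check step free of subprocess calls makes the "fresh content per retry" requirement easy to enforce in the loop itself.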
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 304-321: The loop currently treats a non-zero result.returncode as
a successful cleanup; change the logic so non-zero returncode is considered a
failed oc call and does not break the wait loop. Specifically, in the block
using run.run(...) (variable result) and the loop that computes
remaining_services/remaining_count, only break when result.returncode == 0 AND
result.stdout.strip() is empty; if result.returncode != 0, log the error
(include result.returncode and result.stderr/stdout) and continue retrying (and
on the last iteration raise a RuntimeError indicating the oc command failed),
and keep the existing behavior for the case where result.returncode == 0 but
resources still exist.
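The decision logic the comment asks for can be factored out of the wait loop so each polling outcome is explicit; the `cleanup_state` name and its string states are hypothetical, not the project's `run.run` helper:

```python
def cleanup_state(returncode: int, stdout: str) -> str:
    """Classify one polling iteration of the cleanup verification.

    "done"    - the oc call succeeded AND no resources remain (safe to break)
    "error"   - the oc call failed; must NOT end the wait loop early
    "pending" - the call succeeded but resources still exist; keep retrying
    """
    if returncode != 0:
        return "error"  # a failed oc call is never a successful cleanup
    if not stdout.strip():
        return "done"
    return "pending"
```

The loop would then break only on `"done"`, log and retry on `"error"` (raising on the last iteration), and keep waiting on `"pending"`.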
In `@projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py`:
- Around line 25-42: The Plot definitions in vllm_metrics.py currently embed
PromQL expressions (e.g., the dict value
"rate(vllm_e2e_latency_seconds_count[5m])" for the "vllm_e2e_request_rate" plot)
which are not used by the parser pipeline and conflict with the canonical
queries in parsers.py; update the plotting_prom.Plot calls (the two Plot
invocations that reference "vllm_e2e_latency_average" and
"vllm_e2e_request_rate") to either remove the dict values and pass plain metric
names only, or replace the dict values with the exact canonical query strings
from parsers.py (e.g., the kserve_vllm:e2e_request_latency_seconds_count rate
form) so they match get_uwm_metrics/parsers.py; ensure no PromQL expression
remains in the Plot dict unless it exactly matches the parser canonical query.
In `@projects/llm-d/visualizations/llmd_inference/store/parsers.py`:
- Around line 294-317: SUTEST_CONTAINER_EXTRA_METRICS_NAMES currently doesn't
match the keys used in SUTEST_CONTAINER_LABELS which _get_network_usage_metrics
uses to filter network metrics; update SUTEST_CONTAINER_EXTRA_METRICS_NAMES to
include the same container label keys (e.g., "LLM Inference Service" and "LLM
Inference Gateway") so that helper_prom_store.get_cluster_metrics called from
get_llmd_main_metrics will not skip network metrics when
_get_network_usage_metrics checks if plot_name is in network_metrics_names.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (14)
- projects/cluster/toolbox/cluster_preload_image/tasks/main.yml
- projects/llm-d/testing/prepare_llmd.py
- projects/llm-d/testing/test_llmd.py
- projects/llm-d/visualizations/llmd_inference/data/plots.yaml
- projects/llm-d/visualizations/llmd_inference/plotting/__init__.py
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py
- projects/llm-d/visualizations/llmd_inference/plotting/report.py
- projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py
- projects/llm-d/visualizations/llmd_inference/store/parsers.py
- projects/matrix_benchmarking/subproject
- projects/matrix_benchmarking/visualizations/helpers/plotting/prom/__init__.py
- projects/matrix_benchmarking/visualizations/helpers/plotting/prom/cpu_memory.py
- projects/matrix_benchmarking/visualizations/helpers/store/prom.py
🟢 Test of 'llm-d test test_ci' succeeded after 00 hours 22 minutes 20 seconds. • Link to the test results. • Link to the reports index.
…e no-metric message more informative
…h if there's no cluster_info
Actionable comments posted: 2
♻️ Duplicate comments (1)
projects/llm-d/testing/prepare_llmd.py (1)
124-131: ⚠️ Potential issue | 🟠 Major

Don't mark MCP updated on empty/ambiguous status output.

Current logic sets `machine_config_updated = True` whenever `"False"` is absent. Empty or malformed output can incorrectly pass this gate.

🔧 Proposed fix
```diff
     result = run.run("oc get mcp -o jsonpath='{.items[*].status.conditions[?(@.type==\"Updated\")].status}'", capture_stdout=True)
-    if "False" in result.stdout:
+    statuses = result.stdout.strip().split()
+    if not statuses:
+        logging.warning("MCP Updated status output is empty, waiting...")
+        time.sleep(15)
+        continue
+    if "False" in statuses:
         logging.info("Machine config pools still updating, waiting...")
         time.sleep(30)
         continue
-    else:
+    elif "True" in statuses:
         logging.info("✓ Machine config pools updated")
         machine_config_updated = True
+    else:
+        logging.warning(f"Unexpected MCP Updated status output: {result.stdout!r}, waiting...")
+        time.sleep(15)
+        continue
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/prepare_llmd.py` around lines 124 - 131, The current check treats any non-"False" output as success which allows empty or malformed output to pass; change the logic that reads result = run.run(...) to validate result.stdout more robustly by stripping and parsing the output tokens (result.stdout.strip()), ensure it's non-empty and contains at least one explicit "True" (or that every token is a known status like "True"/"False" and none are "False") before setting machine_config_updated = True, and otherwise log the ambiguous output and continue sleeping/retrying; keep references to the same run.run call, result.stdout, and the machine_config_updated assignment so the change is localized.
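The status parsing becomes easy to test when isolated in a helper; the `mcp_all_updated` name is hypothetical (the reviewed code inlines this check against `result.stdout`):

```python
def mcp_all_updated(stdout: str) -> bool:
    """Interpret the jsonpath output of `oc get mcp` Updated conditions.

    Only non-empty output made entirely of "True" tokens counts as updated;
    empty, partially false, or malformed output is treated as not yet done,
    so the caller keeps sleeping and retrying instead of passing the gate.
    """
    statuses = stdout.strip().split()
    if not statuses:
        return False  # empty output must never be interpreted as success
    return all(s == "True" for s in statuses)
```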
🧹 Nitpick comments (4)
projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py (2)
266-267: Consider prefixing unused variables with underscore.

The unpacked variables `ordered_vars`, `variables`, and `cfg` are not used in this method. While this follows the interface contract, prefixing with underscores signals intent and silences linter warnings.

♻️ Proposed fix
```diff
-    ordered_vars, settings, setting_lists, variables, cfg = args
+    _ordered_vars, settings, setting_lists, _variables, _cfg = args
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py` around lines 266 - 267, The unpacked variables ordered_vars, variables, and cfg in the function where you call "entries = list(common.Matrix.all_records(settings, setting_lists))" are unused; rename them to _ordered_vars, _variables, and _cfg when unpacking args to indicate intentional unused variables and silence linters (leave settings and setting_lists as-is so the function contract and behavior remain unchanged).
34-42: Dict format in Plot() is inconsistent with other calls.

Based on learnings, `Plot()` expects a list of metric names, and dict values are ignored during metric extraction. Line 36 uses `[{"vllm_e2e_request_rate": "rate(vllm_e2e_latency_seconds_count[5m])"}]` while all other `Plot()` calls in this file use plain string lists like `["vllm_e2e_latency_average"]`.

For consistency and to avoid confusion, use the plain string format:
♻️ Proposed fix
```diff
 # E2E Request Rate - Requests per second
 plotting_prom.Plot(
-    [{"vllm_e2e_request_rate": "rate(vllm_e2e_latency_seconds_count[5m])"}],
+    ["vllm_e2e_request_rate"],
     "Prom: VLLM E2E Request Rate",
     title="VLLM End-to-End Request Rate",
     y_title="Requests/sec",
     get_metrics=get_uwm_metrics,
     as_timestamp=True
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py` around lines 34-42: The Plot call is using a dict instead of the expected list of metric name/expression strings; replace the list [{"vllm_e2e_request_rate": "rate(vllm_e2e_latency_seconds_count[5m])"}] with a plain string list containing the expression, e.g. ["rate(vllm_e2e_latency_seconds_count[5m])"], so the Plot(...) invocation (referenced by Plot, vllm_e2e_request_rate and get_uwm_metrics) matches the other calls and the metric extractor will pick up the expression correctly.

projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py (2)
30-30: Consider prefixing unused variables with underscore.

The unpacked variables `ordered_vars`, `variables`, and `cfg` are unused in all three `do_plot` methods (lines 30, 88, 158). This is the same pattern as in `vllm_metrics.py`. Prefixing with underscores signals intent and silences linter warnings.

♻️ Proposed fix for all three methods
```diff
 # Line 30:
-    ordered_vars, settings, setting_lists, variables, cfg = args
+    _ordered_vars, settings, setting_lists, _variables, _cfg = args
 # Line 88:
-    ordered_vars, settings, setting_lists, variables, cfg = args
+    _ordered_vars, settings, setting_lists, _variables, _cfg = args
 # Line 158:
-    ordered_vars, settings, setting_lists, variables, cfg = args
+    _ordered_vars, settings, setting_lists, _variables, _cfg = args
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py` at line 30, In the three do_plot methods in prometheus_reports.py, the unpacked variables ordered_vars, variables, and cfg are unused; rename them to _ordered_vars, _variables, and _cfg (or prefix with underscores) in the unpacking line (where args are unpacked) to signal they are intentionally unused and silence linters—apply the same change for each do_plot variant referenced (match the pattern used in vllm_metrics.py) so only settings and setting_lists remain as the used names.
189-199: Dead code block with `if False:`.

This section is unreachable and appears to be placeholder code for API Server Request Latency plots. Consider either removing it or converting to a TODO comment if this is planned future work.
♻️ Option 1: Remove dead code
```diff
     header.append(html.Br())
-    if False:
-        # API Server Request Latency Section
-        header.append(html.H3("⏱️ API Server Request Latency"))
-        header.append(html.P("API request duration analysis by operation type"))
-
-        header += report.Plot_and_Text("Prom: Sutest GET Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest PUT Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest LIST Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest PATCH Requests duration", args)
-
-        header.append(html.Br())
-
     # ETCD Performance Section
```

♻️ Option 2: Convert to TODO if planned
```diff
     header.append(html.Br())
-    if False:
-        # API Server Request Latency Section
-        header.append(html.H3("⏱️ API Server Request Latency"))
-        header.append(html.P("API request duration analysis by operation type"))
-
-        header += report.Plot_and_Text("Prom: Sutest GET Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest PUT Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest LIST Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest PATCH Requests duration", args)
-
-        header.append(html.Br())
+    # TODO: Add API Server Request Latency section once duration metrics are available
+    # Plots: GET/PUT/LIST/PATCH Requests duration

     # ETCD Performance Section
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py` around lines 189 - 199, Remove the unreachable dead code guarded by "if False:" that appends the API Server Request Latency section (the block that references header, html.H3, html.P, report.Plot_and_Text and html.Br()); either delete the entire block or replace it with a single TODO comment referencing the intended plots (e.g., "TODO: add API Server Request Latency plots using report.Plot_and_Text for Sutest GET/PUT/LIST/PATCH") so the codebase has no dead branches and the intended future work is documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@projects/llm-d/testing/prepare_llmd.py`:
- Line 63: There's a duplicate shadowing import of time in prepare_llmd.py
(module already imports time at top); remove the redundant "import time" that
triggers Ruff F811 or, if a different module was intended, replace it with the
correct module name or an alias. Locate the second import (the one at line
referenced in the review) and delete it so only the module-level import remains,
or change it to a unique identifier if truly needed.
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 305-314: The cleanup verification uses "oc get llminferenceservice
--no-headers" and checks result.stdout, which can be unreliable; update the
run.run invocation in the test (the call that assigns to result) to use a
resource-stable output format (e.g., add "-o name") and then compute
remaining_services and remaining_count by splitting non-empty lines of
result.stdout so the test accurately counts resources (adjust the variables
result, remaining_services, remaining_count and the logging message
accordingly).
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- projects/llm-d/testing/prepare_llmd.py
- projects/llm-d/testing/test_llmd.py
- projects/llm-d/visualizations/llmd_inference/data/plots.yaml
- projects/llm-d/visualizations/llmd_inference/plotting/__init__.py
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py
- projects/llm-d/visualizations/llmd_inference/plotting/report.py
- projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py
- projects/llm-d/visualizations/llmd_inference/store/parsers.py
- projects/matrix_benchmarking/subproject
- projects/matrix_benchmarking/visualizations/helpers/plotting/prom/__init__.py
- projects/matrix_benchmarking/visualizations/helpers/plotting/prom/cpu_memory.py
- projects/matrix_benchmarking/visualizations/helpers/store/prom.py
🚧 Files skipped from review as they are similar to previous changes (6)
- projects/llm-d/visualizations/llmd_inference/plotting/report.py
- projects/matrix_benchmarking/visualizations/helpers/store/prom.py
- projects/matrix_benchmarking/visualizations/helpers/plotting/prom/cpu_memory.py
- projects/llm-d/visualizations/llmd_inference/data/plots.yaml
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py
- projects/matrix_benchmarking/subproject
/test jump-ci llm-d scale_up
Force-pushed: 5ca3417 to 124907d (Compare)
Actionable comments posted: 3
♻️ Duplicate comments (1)
projects/llm-d/testing/test_llmd.py (1)
423-433: ⚠️ Potential issue | 🟠 Major

`oc get --no-headers` is not a stable emptiness check for cleanup verification.

When no resources exist, output can still be non-empty ("No resources found ..."), which can falsely fail cleanup.
🔧 Proposed fix
```diff
-        result = run.run(f"oc get llminferenceservice -n {namespace} --no-headers",
+        result = run.run(f"oc get llminferenceservice -n {namespace} -o name",
                          capture_stdout=True)
-        if not result.stdout.strip():
+        remaining_services = [s for s in result.stdout.splitlines() if s.strip()]
+        if not remaining_services:
             logging.info("No llminferenceservice resources found - cleanup successful")
             break
         else:
-            remaining_services = result.stdout.strip().split('\n')
-            remaining_count = len([s for s in remaining_services if s.strip()])
+            remaining_count = len(remaining_services)
             logging.info(f"Still found {remaining_count} llminferenceservice resources, waiting...")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 423 - 433, The emptiness check using result.stdout.strip() is unreliable because "oc get ..." can print "No resources found" even when there are none; update the loop that runs the command ("oc get llminferenceservice -n {namespace} --no-headers") to treat that message as empty by checking result.stdout.lower() for "no resources" (or similar phrasing) in addition to an empty string, e.g. replace the if condition with one that succeeds when result.stdout.strip() == "" or "no resources" in result.stdout.lower(), and keep the existing remaining_services/remaining_count logic for the non-empty case.
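The suggested emptiness check could be factored into a helper along these lines; the `remaining_resources` name is illustrative, and it also tolerates the human-readable banner in case `--no-headers` output is kept:

```python
def remaining_resources(stdout: str) -> list[str]:
    """Parse `oc get ... -o name` output into a list of resource names.

    Blank lines are ignored, and the human-readable "No resources found"
    banner (which oc can print on stderr/stdout in some modes) is treated
    as an empty result rather than a leftover resource.
    """
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    if any("no resources" in line.lower() for line in lines):
        return []
    return lines
```

The wait loop then breaks when the returned list is empty and logs `len(...)` otherwise.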
🧹 Nitpick comments (2)
projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py (1)
189-199: Consider removing or documenting the disabled API latency section.

The `if False:` block contains dead code that will never execute. If this functionality is intentionally disabled pending future work, add a comment explaining why. Otherwise, consider removing it to improve code clarity.

♻️ Option 1: Add explanatory comment if keeping for future use
```diff
-    if False:
+    if False:  # TODO: Disabled pending API latency metrics implementation
         # API Server Request Latency Section
```

♻️ Option 2: Remove if not planned for use
```diff
     header.append(html.Br())
-    if False:
-        # API Server Request Latency Section
-        header.append(html.H3("⏱️ API Server Request Latency"))
-        header.append(html.P("API request duration analysis by operation type"))
-
-        header += report.Plot_and_Text("Prom: Sutest GET Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest PUT Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest LIST Requests duration", args)
-        header += report.Plot_and_Text("Prom: Sutest PATCH Requests duration", args)
-
-        header.append(html.Br())
-
     # ETCD Performance Section
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py` around lines 189-199: The block guarded by "if False:" (which builds the API latency UI using header.append(html.H3(...)), header.append(html.P(...)), report.Plot_and_Text(...) and header.append(html.Br())) is dead code; either remove this whole conditional and its contents to eliminate unused code, or keep it but replace "if False:" with an explanatory comment above (e.g., "Disabled: API latency plots pending backend metrics") so future readers know it's intentionally disabled; locate the block around the use of header and report.Plot_and_Text in prometheus_reports.py and apply one of these two changes.

projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml (1)
54-60: Combine yq calls into a single operation for efficiency.

The two `yq` invocations can be merged into one using jq's pipe syntax, reducing file read/write overhead. The suggested combined form uses valid jq syntax supported by the Python yq version used throughout this project:

♻️ Suggested improvement: combine yq calls
```diff
 - name: Set the name and namespace in the LLM inference service YAML
   shell: |
-    yq -Yi '.metadata.name = "{{ llmd_deploy_llm_inference_service_name }}"' \
-      "{{ artifact_extra_logs_dir }}/src/llm_inference_service.yaml"
-    yq -Yi '.metadata.namespace = "{{ llmd_deploy_llm_inference_service_namespace }}"' \
+    yq -Yi '.metadata.name = "{{ llmd_deploy_llm_inference_service_name }}" | .metadata.namespace = "{{ llmd_deploy_llm_inference_service_namespace }}"' \
       "{{ artifact_extra_logs_dir }}/src/llm_inference_service.yaml"
```
Note: The `-Yi` flags are Python yq specific. If environments migrate to Go yq (mikefarah/yq), the flag syntax would need updating (Go yq uses `-i` only, without `-Y`). Consider documenting the yq version requirement if not already specified.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml` around lines 54 - 60, Replace the two separate yq invocations that set .metadata.name and .metadata.namespace with a single yq invocation that applies both updates in one expression (use the pipe/chain syntax to set .metadata.name = "{{ llmd_deploy_llm_inference_service_name }}" and .metadata.namespace = "{{ llmd_deploy_llm_inference_service_namespace }}" in the same call), keep the Python yq flags (-Yi) and target the same YAML file referenced in the task; this reduces file I/O while preserving the existing variables llmd_deploy_llm_inference_service_name and llmd_deploy_llm_inference_service_namespace.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 203-205: Guard the deletion of spec.router to avoid KeyError by
checking for its existence before deleting: replace direct del
isvc_data['spec']['router'] with a conditional check like if 'router' in
isvc_data.get('spec', {}): and then delete (or pop) it; apply the same fix for
the second occurrence (the other deletion at lines 214-216) so both code paths
that handle the 'simple' and 'simple-x2' flavors safely remove spec.router only
when present.
- Around line 71-74: The code raises an undefined exception class
RuntimeException when the simple inference check in the conditional using
config.project.get_config("tests.llmd.inference_service.do_simple_test") fails;
update the raise statement to use the built-in RuntimeError instead (i.e.,
replace RuntimeException with RuntimeError) in the block that calls
test_llm_inference_simple(endpoint_url, llmisvc_name, namespace).
In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py`:
- Around line 3-10: The module-level getattr(projects, "llm-d") causes
import-time AttributeError; move dynamic project resolution into register():
inside register() import importlib and call
importlib.import_module("projects.llm-d") (or the correct package path), then
obtain get_llmd_main_metrics from the imported module (e.g.,
imported_module.visualizations.llmd_inference.store.parsers.get_llmd_main_metrics)
and call it with register=True; add try/except to catch
ImportError/AttributeError and raise or log a clear error message indicating the
hyphenated package cannot be resolved at import time and that register() failed
to locate get_llmd_main_metrics.
---
Duplicate comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 423-433: The emptiness check using result.stdout.strip() is
unreliable because "oc get ..." can print "No resources found" even when there
are none; update the loop that runs the command ("oc get llminferenceservice -n
{namespace} --no-headers") to treat that message as empty by checking
result.stdout.lower() for "no resources" (or similar phrasing) in addition to an
empty string, e.g. replace the if condition with one that succeeds when
result.stdout.strip() == "" or "no resources" in result.stdout.lower(), and keep
the existing remaining_services/remaining_count logic for the non-empty case.
---
Nitpick comments:
In `@projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml`:
- Around line 54-60: Replace the two separate yq invocations that set
.metadata.name and .metadata.namespace with a single yq invocation that applies
both updates in one expression (use the pipe/chain syntax to set .metadata.name
= "{{ llmd_deploy_llm_inference_service_name }}" and .metadata.namespace = "{{
llmd_deploy_llm_inference_service_namespace }}" in the same call), keep the
Python yq flags (-Yi) and target the same YAML file referenced in the task; this
reduces file I/O while preserving the existing variables
llmd_deploy_llm_inference_service_name and
llmd_deploy_llm_inference_service_namespace.
In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py`:
- Around line 189-199: The block guarded by "if False:" (which builds the API
latency UI using header.append(html.H3(...)), header.append(html.P(...)),
report.Plot_and_Text(...) and header.append(html.Br())) is dead code; either
remove this whole conditional and its contents to eliminate unused code, or keep
it but replace "if False:" with an explanatory comment above (e.g., "Disabled:
API latency plots pending backend metrics" ) so future readers know it’s
intentionally disabled; locate the block around the use of header and
report.Plot_and_Text in prometheus_reports.py and apply one of these two
changes.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- docs/toolbox.generated/Llmd.run_guidellm_benchmark.rst
- docs/toolbox.generated/Llmd.run_multiturn_benchmark.rst
- projects/llm-d/testing/config.yaml
- projects/llm-d/testing/llmisvcs/llama-3-1-8b-instruct-fp8.yaml
- projects/llm-d/testing/prepare_llmd.py
- projects/llm-d/testing/test_llmd.py
- projects/llm-d/toolbox/llmd.py
- projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml
- projects/llm-d/visualizations/llmd_inference/data/plots.yaml
- projects/llm-d/visualizations/llmd_inference/plotting/__init__.py
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py
- projects/llm-d/visualizations/llmd_inference/plotting/prometheus_reports.py
- projects/llm-d/visualizations/llmd_inference/plotting/vllm_metrics.py
- projects/llm-d/visualizations/llmd_inference/store/parsers.py
- projects/matrix_benchmarking/subproject
✅ Files skipped from review due to trivial changes (2)
- docs/toolbox.generated/Llmd.run_guidellm_benchmark.rst
- projects/llm-d/testing/llmisvcs/llama-3-1-8b-instruct-fp8.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
- projects/matrix_benchmarking/subproject
- projects/llm-d/visualizations/llmd_inference/plotting/__init__.py
- projects/llm-d/visualizations/llmd_inference/store/parsers.py
```python
    if config.project.get_config("tests.llmd.inference_service.do_simple_test"):
        if not test_llm_inference_simple(endpoint_url, llmisvc_name, namespace):
            raise RuntimeException("Simple inference test failed :/")
```
🧩 Analysis chain
🏁 Script executed:
# Verify RuntimeException usage and definition in Python sources
rg -nP '\bRuntimeException\b' --type=py
rg -nP '^\s*class\s+RuntimeException\b|^\s*def\s+RuntimeException\b' --type=py

Repository: openshift-psap/topsail
Length of output: 190
🏁 Script executed:
# Check imports and custom definitions in test_llmd.py
head -50 projects/llm-d/testing/test_llmd.py | grep -E '^import|^from|class RuntimeException'

Repository: openshift-psap/topsail
Length of output: 341
Use RuntimeError instead of RuntimeException; the latter is undefined in Python.
At line 73, RuntimeException is not a built-in Python exception and is not defined in this codebase. This will raise NameError at runtime when the code path is executed. Use RuntimeError instead.
🔧 Proposed fix

```diff
-        if not test_llm_inference_simple(endpoint_url, llmisvc_name, namespace):
-            raise RuntimeException("Simple inference test failed :/")
+        if not test_llm_inference_simple(endpoint_url, llmisvc_name, namespace):
+            raise RuntimeError("Simple inference test failed :/")
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    if config.project.get_config("tests.llmd.inference_service.do_simple_test"):
        if not test_llm_inference_simple(endpoint_url, llmisvc_name, namespace):
            raise RuntimeError("Simple inference test failed :/")
```
🧰 Tools
🪛 Ruff (0.15.2)
[warning] 73-73: Abstract raise to an inner function
(TRY301)
[warning] 73-73: Avoid specifying long messages outside the exception class
(TRY003)
[error] 73-73: Undefined name RuntimeException
(F821)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/llm-d/testing/test_llmd.py` around lines 71 - 74, The code raises an
undefined exception class RuntimeException when the simple inference check in
the conditional using
config.project.get_config("tests.llmd.inference_service.do_simple_test") fails;
update the raise statement to use the built-in RuntimeError instead (i.e.,
replace RuntimeException with RuntimeError) in the block that calls
test_llm_inference_simple(endpoint_url, llmisvc_name, namespace).
```python
            # Remove spec.router if it exists
            del isvc_data['spec']['router']
            logging.info("Removed spec.router for 'simple' flavor")
```
Guard spec.router removal to avoid KeyError.
Both simple and simple-x2 paths unconditionally delete spec.router, but comments say “if it exists”.
🔧 Proposed fix

```diff
-        del isvc_data['spec']['router']
+        isvc_data.get('spec', {}).pop('router', None)
@@
-        del isvc_data['spec']['router']
+        isvc_data.get('spec', {}).pop('router', None)
```

Also applies to: 214-216
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/llm-d/testing/test_llmd.py` around lines 203 - 205, Guard the
deletion of spec.router to avoid KeyError by checking for its existence before
deleting: replace direct del isvc_data['spec']['router'] with a conditional
check like if 'router' in isvc_data.get('spec', {}): and then delete (or pop)
it; apply the same fix for the second occurrence (the other deletion at lines
214-216) so both code paths that handle the 'simple' and 'simple-x2' flavors
safely remove spec.router only when present.
```python
llmd = getattr(projects, "llm-d")


def register():
    """
    Register Prometheus metrics plots for LLM-D inference visualization
    """
    get_llmd_main_metrics = llmd.visualizations.llmd_inference.store.parsers.get_llmd_main_metrics
    get_llmd_main_metrics(register=True)
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Find all dynamic access sites for "llm-d"
echo "=== Step 1: Dynamic access sites for 'llm-d' ==="
rg -n -C3 'getattr\s*\(\s*projects\s*,\s*"llm-d"\s*\)' || echo "No matches found"
# 2) Inspect how the projects package exposes dynamic attributes
echo -e "\n=== Step 2: Projects package __init__.py files ==="
fd "__init__.py" projects --exec rg -n -C3 '__getattr__|setattr|llm-d' {} + || echo "No matches found"
# 3) Confirm where get_llmd_main_metrics is defined and accessed
echo -e "\n=== Step 3: get_llmd_main_metrics definition and usage ==="
rg -n -C3 '\bget_llmd_main_metrics\b' || echo "No matches found"
# 4) Check the actual file structure
echo -e "\n=== Step 4: File structure check ==="
ls -la projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py 2>/dev/null || echo "File not found or path incorrect"
# 5) Check if projects is a namespace package or regular package
echo -e "\n=== Step 5: Projects package structure ==="
find projects -name "__init__.py" -type f | head -20

Repository: openshift-psap/topsail
Length of output: 7458
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if projects/__init__.py exists and its content
echo "=== Projects __init__.py ==="
if [ -f projects/__init__.py ]; then
wc -l projects/__init__.py
echo "--- Content ---"
cat -n projects/__init__.py
else
echo "No projects/__init__.py found"
fi
# Check for namespace package indicators
echo -e "\n=== Directory structure of projects ==="
ls -la projects/ | head -20
# Check if there's a setup.py or pyproject.toml that might define namespace packages
echo -e "\n=== Setup files ==="
find . -maxdepth 2 -name "setup.py" -o -name "pyproject.toml" | head -10
# Try to understand the import: check if llm-d can be imported directly
echo -e "\n=== Check if llm-d is a valid python identifier in projects ==="
python3 << 'PYEOF'
import sys
import os
sys.path.insert(0, '/tmp/repo')
# Check if the import works
try:
import projects
print("projects imported successfully")
print(f"projects type: {type(projects)}")
print(f"projects.__file__: {getattr(projects, '__file__', 'N/A')}")
print(f"projects.__path__: {getattr(projects, '__path__', 'N/A')}")
# Try to get llm-d
try:
llmd = getattr(projects, "llm-d")
print(f"✓ getattr(projects, 'llm-d') works: {llmd}")
except AttributeError as e:
print(f"✗ getattr(projects, 'llm-d') failed: {e}")
except Exception as e:
print(f"Error: {e}")
PYEOF

Repository: openshift-psap/topsail
Length of output: 1622
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Check if llm-d has __init__.py
echo "=== llm-d structure ==="
ls -la projects/llm-d/
# Check the problematic prometheus.py file content
echo -e "\n=== Full content of prometheus.py ==="
cat -n projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py
# Search for where this module is imported
echo -e "\n=== Where is prometheus.py imported? ==="
rg -n "from.*prometheus import|import.*prometheus" || echo "No imports found"
# Search for usage of register() function from this module
echo -e "\n=== Uses of register() ==="
rg -n "prometheus.*register\(\)" || echo "No calls found"
# Check if importlib is used to load llm-d
echo -e "\n=== Importlib usage for dynamic import ==="
rg -n "importlib.*llm" || echo "No importlib llm imports found"
# Check for any __getattr__ in projects that might make this work
echo -e "\n=== __getattr__ definitions ==="
rg -n "__getattr__" projects/ || echo "No __getattr__ found"

Repository: openshift-psap/topsail
Length of output: 1615
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== projects/llm-d/visualizations/llmd_inference/plotting/__init__.py ==="
cat -n projects/llm-d/visualizations/llmd_inference/plotting/__init__.py
echo -e "\n=== Check if this __init__.py is ever actually executed ==="
rg -n "plotting" projects/ --type py -A2 | head -50

Repository: openshift-psap/topsail
Length of output: 5468
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== Test if importlib can access llm-d ==="
python3 << 'PYEOF'
import sys
sys.path.insert(0, '/tmp/repo')
import importlib
try:
llmd = importlib.import_module("projects.llm-d")
print(f"✓ importlib.import_module('projects.llm-d') works: {llmd}")
except Exception as e:
print(f"✗ importlib.import_module('projects.llm-d') failed: {type(e).__name__}: {e}")
# Check what the proper way to access it would be
try:
import projects
# Check what attributes are available
print(f"\nAvailable attributes in projects: {dir(projects)[:5]}...")
except Exception as e:
print(f"Error: {e}")
PYEOF
echo -e "\n=== Check if directory name should be llm_d instead ==="
ls -la projects/ | grep -E "llm|LLM"

Repository: openshift-psap/topsail
Length of output: 537
Move dynamic project resolution into register() with proper guards.
`getattr(projects, "llm-d")` on Line 3 executes at module import time and will fail with `AttributeError`, because namespace packages don't expose hyphenated subdirectories as attributes; only the import system can resolve them. This breaks the entire visualizations module when `register()` is called. Move the resolution into `register()` using `importlib.import_module()` with clear error messaging instead.
Proposed fix

```diff
 import projects
-llmd = getattr(projects, "llm-d")


 def register():
     """
     Register Prometheus metrics plots for LLM-D inference visualization
     """
+    import importlib
+    try:
+        llmd = importlib.import_module("projects.llm-d")
+    except (ImportError, ModuleNotFoundError) as exc:
+        raise RuntimeError(
+            "Cannot register LLM-D Prometheus plots: project 'llm-d' is unavailable."
+        ) from exc
+
-    get_llmd_main_metrics = llmd.visualizations.llmd_inference.store.parsers.get_llmd_main_metrics
+    try:
+        get_llmd_main_metrics = llmd.visualizations.llmd_inference.store.parsers.get_llmd_main_metrics
+    except AttributeError as exc:
+        raise RuntimeError(
+            "Cannot register LLM-D Prometheus plots: "
+            "missing visualizations.llmd_inference.store.parsers.get_llmd_main_metrics."
+        ) from exc
+
     get_llmd_main_metrics(register=True)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def register():
    """
    Register Prometheus metrics plots for LLM-D inference visualization
    """
    import importlib
    try:
        llmd = importlib.import_module("projects.llm-d")
    except (ImportError, ModuleNotFoundError) as exc:
        raise RuntimeError(
            "Cannot register LLM-D Prometheus plots: project 'llm-d' is unavailable."
        ) from exc
    try:
        get_llmd_main_metrics = llmd.visualizations.llmd_inference.store.parsers.get_llmd_main_metrics
    except AttributeError as exc:
        raise RuntimeError(
            "Cannot register LLM-D Prometheus plots: "
            "missing visualizations.llmd_inference.store.parsers.get_llmd_main_metrics."
        ) from exc
    get_llmd_main_metrics(register=True)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/llm-d/visualizations/llmd_inference/plotting/prometheus.py` around
lines 3 - 10, The module-level getattr(projects, "llm-d") causes import-time
AttributeError; move dynamic project resolution into register(): inside
register() import importlib and call importlib.import_module("projects.llm-d")
(or the correct package path), then obtain get_llmd_main_metrics from the
imported module (e.g.,
imported_module.visualizations.llmd_inference.store.parsers.get_llmd_main_metrics)
and call it with register=True; add try/except to catch
ImportError/AttributeError and raise or log a clear error message indicating the
hyphenated package cannot be resolved at import time and that register() failed
to locate get_llmd_main_metrics.
/test jump-ci llm-d scale_up |
🔴 Test of 'llm-d test test_ci' failed after 01 hours 15 minutes 39 seconds. 🔴 • Link to the test results. • Link to the reports index. Test configuration:
/test jump-ci llm-d scale_up multi-flavor |
🔴 Test of 'llm-d test test_ci' failed after 00 hours 00 minutes 01 seconds. 🔴 • Link to the test results. • No reports index generated... Test configuration: Failure indicator: Empty. (See run.log)
/test jump-ci llm-d scale_up multi-flavor |
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml (1)
56-63: ⚠️ Potential issue | 🟠 Major

Avoid parsing tabular `oc get` output for terminal job state.

Line 59 depends on field positions from a human-readable table, which is fragile. Use structured JSONPath from the job status to emit `Complete`/`Failed` deterministically.

Proposed patch

```diff
 - name: Wait for the multiturn benchmark job to complete or fail
   shell: |
     set -o pipefail;
-    oc get job "{{ llmd_run_multiturn_benchmark_name }}" -n "{{ target_namespace }}" --no-headers | awk '{print $2; print "Job duration:", $4 > "/dev/stderr"}'
+    succeeded=$(oc get job "{{ llmd_run_multiturn_benchmark_name }}" -n "{{ target_namespace }}" -o jsonpath='{.status.succeeded}')
+    failed=$(oc get job "{{ llmd_run_multiturn_benchmark_name }}" -n "{{ target_namespace }}" -o jsonpath='{.status.failed}')
+    if [ "${succeeded:-0}" -gt 0 ]; then
+      echo Complete
+    elif [ "${failed:-0}" -gt 0 ]; then
+      echo Failed
+    else
+      echo Running
+    fi
   register: job_status_result
   until: job_status_result.stdout in ['Complete', 'Failed']
```

Run this in a cluster context to confirm the current parser behavior:

```shell
#!/bin/bash
set -euo pipefail
job="${1:?job name}"
ns="${2:?namespace}"
echo "Raw table output:"
oc get job "$job" -n "$ns" --no-headers
echo "Status conditions:"
oc get job "$job" -n "$ns" -o jsonpath='{range .status.conditions[*]}{.type}{" "}{end}{"\n"}'
echo "Succeeded/Failed counters:"
oc get job "$job" -n "$ns" -o jsonpath='succeeded={.status.succeeded} failed={.status.failed}{"\n"}'
```

🤖 Prompt for AI Agents
#!/bin/bash set -euo pipefail job="${1:?job name}" ns="${2:?namespace}" echo "Raw table output:" oc get job "$job" -n "$ns" --no-headers echo "Status conditions:" oc get job "$job" -n "$ns" -o jsonpath='{range .status.conditions[*]}{.type}{" "}{end}{"\n"}' echo "Succeeded/Failed counters:" oc get job "$job" -n "$ns" -o jsonpath='succeeded={.status.succeeded} failed={.status.failed}{"\n"}'🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml` around lines 56 - 63, The current task reads human-readable oc get job table output (in the shell that registers job_status_result) which is fragile; change the shell invocation that references llmd_run_multiturn_benchmark_name and target_namespace to use structured JSON/jsonpath (e.g. check .status.conditions[*].type and .status.succeeded/.status.failed) and emit a deterministic string such as "Complete" or "Failed" (and keep any duration to stderr if needed), then keep the until condition on job_status_result.stdout checking for 'Complete' or 'Failed' as before; update the task that registers job_status_result so the deterministic JSON/jsonpath output is produced instead of parsing table columns.
♻️ Duplicate comments (2)
projects/llm-d/testing/test_llmd.py (2)
71-74: ⚠️ Potential issue | 🔴 Critical

Use `RuntimeError` instead of `RuntimeException`; the latter is undefined in Python.

`RuntimeException` is not a built-in Python exception and will raise `NameError` at runtime. This was flagged in a previous review but remains unfixed.

🐛 Proposed fix

```diff
     if config.project.get_config("tests.llmd.inference_service.do_simple_test"):
         if not test_llm_inference_simple(endpoint_url, llmisvc_name, namespace):
-            raise RuntimeException("Simple inference test failed :/")
+            raise RuntimeError("Simple inference test failed :/")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 71 - 74, The code raises an undefined RuntimeException in the conditional that calls test_llm_inference_simple; change the exception class to the built-in RuntimeError so the failure path works correctly (replace RuntimeException with RuntimeError in the block that checks config.project.get_config("tests.llmd.inference_service.do_simple_test") and the call to test_llm_inference_simple(endpoint_url, llmisvc_name, namespace)).
200-206: ⚠️ Potential issue | 🟠 Major

Guard `spec.router` deletion to avoid `KeyError`.

Line 204 unconditionally deletes `spec.router`, but the comment says "if it exists". If the YAML doesn't have this key, a `KeyError` will be raised. This was flagged in a previous review but remains unfixed.

🔧 Proposed fix

```diff
     if "simple" in flavor:
         logging.info("Applying 'simple' flavor modifications")
         # Remove spec.router if it exists
-        del isvc_data['spec']['router']
+        isvc_data.get('spec', {}).pop('router', None)
         logging.info("Removed spec.router for 'simple' flavor")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 200 - 206, The code unconditionally deletes isvc_data['spec']['router'] which raises KeyError when absent; update the "simple" flavor branch to check for the existence of the 'spec' and 'router' keys before deleting (e.g., verify 'spec' in isvc_data and 'router' in isvc_data['spec'] or use dict.pop with a default) so that removal is safe—locate the deletion around the conditional handling of the "simple" flavor where isvc_data is modified and replace it with a guarded check or safe removal operation.
🧹 Nitpick comments (3)
projects/llm-d/testing/test_llmd.py (2)
497-499: Dead code: `returncode` check is unreachable.

Similar to the earlier observation, if `run.run()` raises on failure, line 497's check `if result.returncode != 0` will never be true. Consider removing this branch.

Based on learnings: `run.run()` raises an exception on failure, so this check is unnecessary.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 497 - 499, The return-code check after calling run.run() is dead/unreachable because run.run() raises on failure; remove the unreachable branch that tests result.returncode != 0 in test_llmd.py and instead rely on the existing exception handling around run.run() (or catch the exception where appropriate). Locate the block that calls run.run() (the variable/result named result and the if result.returncode != 0 branch) and delete that conditional and its return False, keeping any meaningful logging inside the exception handler if needed.
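A minimal sketch of the exception-based pattern this learning describes, using a stand-in for the project's `run.run()` helper (the real helper's signature is an assumption here):

```python
import subprocess

def run(cmd: str) -> subprocess.CompletedProcess:
    # Stand-in for the project's run.run() helper: check=True makes
    # subprocess raise CalledProcessError on a non-zero exit, so any
    # returned object always has returncode == 0.
    return subprocess.run(cmd, shell=True, check=True,
                          capture_output=True, text=True)

def run_ok(cmd: str) -> bool:
    # Failures surface as exceptions, not via returncode checks.
    try:
        run(cmd)
    except subprocess.CalledProcessError:
        return False
    return True
```

With this convention, an `if result.returncode != 0` branch after a plain `run()` call is unreachable, which is why the review suggests deleting it.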
263-270: Redundant `returncode` check after `run.run()`.

Based on the learning that `run.run()` raises an exception on failure, the check at line 266 (`if url_result.returncode == 0`) is always true when reached. Consider removing this check or using the alternate approach with `check=False` if you need to handle non-zero exit codes gracefully.

♻️ Suggested simplification

```diff
     url_result = run.run(f"oc get llminferenceservice {llmisvc_name} -n {namespace} "
                          f"-ojsonpath='{{.status.url}}'",
                          capture_stdout=True)
-    endpoint_url = url_result.stdout.strip() if url_result.returncode == 0 else ""
+    endpoint_url = url_result.stdout.strip()
```

Based on learnings: `run.run()` raises an exception on failure, so checking `result.returncode != 0` after a `run.run()` call is unnecessary.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 263 - 270, The code redundantly checks url_result.returncode after calling run.run(), which raises on non-zero exits; update the snippet to either (A) remove the returncode check and set endpoint_url = url_result.stdout.strip() and return it if non-empty, or (B) call run.run(..., check=False) if you intend to handle failures manually and then inspect url_result.returncode; adjust the block around run.run, url_result and endpoint_url accordingly so there is no unreachable/always-true returncode check.

projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml (1)
27-30: Fail fast if `endpoint_url` is empty or invalid.

This path now fully depends on user input, so adding an upfront assert will prevent delayed job failures with less actionable errors.

Proposed patch

```diff
+- name: Validate the benchmark endpoint URL
+  assert:
+    that:
+      - llmd_run_guidellm_benchmark_endpoint_url is string
+      - (llmd_run_guidellm_benchmark_endpoint_url | trim | length) > 0
+      - llmd_run_guidellm_benchmark_endpoint_url is match('^https?://')
+    fail_msg: "llmd_run_guidellm_benchmark_endpoint_url must be a non-empty http(s) URL"
+
 - name: Set the benchmark endpoint URL
   set_fact:
     benchmark_endpoint_url: "{{ llmd_run_guidellm_benchmark_endpoint_url }}"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml` around lines 27 - 30, Add a fail-fast validation before the "Set the benchmark endpoint URL" task: verify that the input variable llmd_run_guidellm_benchmark_endpoint_url is present and a non-empty valid URL, and if not fail the play with a clear message; update the task sequence around the set_fact that sets benchmark_endpoint_url to first run an assert/fail check (using Ansible's assert or fail) referencing llmd_run_guidellm_benchmark_endpoint_url and include a descriptive error message so the run aborts immediately when the endpoint is empty or invalid.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 214-216: The incomplete assignment to
isvc_data['spec']['replicas'] causes a SyntaxError; fix by assigning an integer
value (e.g., 2) when the "x2" flavor is detected and keep the logging
call—update the block that checks if "x2" in flavor to set
isvc_data['spec']['replicas'] = 2 and then call logging.info("Setting
spec.replicas = 2") so the test builds correctly.
---
Outside diff comments:
In `@projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml`:
- Around line 56-63: The current task reads human-readable oc get job table
output (in the shell that registers job_status_result) which is fragile; change
the shell invocation that references llmd_run_multiturn_benchmark_name and
target_namespace to use structured JSON/jsonpath (e.g. check
.status.conditions[*].type and .status.succeeded/.status.failed) and emit a
deterministic string such as "Complete" or "Failed" (and keep any duration to
stderr if needed), then keep the until condition on job_status_result.stdout
checking for 'Complete' or 'Failed' as before; update the task that registers
job_status_result so the deterministic JSON/jsonpath output is produced instead
of parsing table columns.
---
Duplicate comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 71-74: The code raises an undefined RuntimeException in the
conditional that calls test_llm_inference_simple; change the exception class to
the built-in RuntimeError so the failure path works correctly (replace
RuntimeException with RuntimeError in the block that checks
config.project.get_config("tests.llmd.inference_service.do_simple_test") and the
call to test_llm_inference_simple(endpoint_url, llmisvc_name, namespace)).
- Around line 200-206: The code unconditionally deletes
isvc_data['spec']['router'] which raises KeyError when absent; update the
"simple" flavor branch to check for the existence of the 'spec' and 'router'
keys before deleting (e.g., verify 'spec' in isvc_data and 'router' in
isvc_data['spec'] or use dict.pop with a default) so that removal is safe—locate
the deletion around the conditional handling of the "simple" flavor where
isvc_data is modified and replace it with a guarded check or safe removal
operation.
---
Nitpick comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 497-499: The return-code check after calling run.run() is
dead/unreachable because run.run() raises on failure; remove the unreachable
branch that tests result.returncode != 0 in test_llmd.py and instead rely on the
existing exception handling around run.run() (or catch the exception where
appropriate). Locate the block that calls run.run() (the variable/result named
result and the if result.returncode != 0 branch) and delete that conditional and
its return False, keeping any meaningful logging inside the exception handler if
needed.
- Around line 263-270: The code redundantly checks url_result.returncode after
calling run.run(), which raises on non-zero exits; update the snippet to either
(A) remove the returncode check and set endpoint_url = url_result.stdout.strip()
and return it if non-empty, or (B) call run.run(..., check=False) if you intend
to handle failures manually and then inspect url_result.returncode; adjust the
block around run.run, url_result and endpoint_url accordingly so there is no
unreachable/always-true returncode check.
In `@projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml`:
- Around line 27-30: Add a fail-fast validation before the "Set the benchmark
endpoint URL" task: verify that the input variable
llmd_run_guidellm_benchmark_endpoint_url is present and a non-empty valid URL,
and if not fail the play with a clear message; update the task sequence around
the set_fact that sets benchmark_endpoint_url to first run an assert/fail check
(using Ansible's assert or fail) referencing
llmd_run_guidellm_benchmark_endpoint_url and include a descriptive error message
so the run aborts immediately when the endpoint is empty or invalid.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- docs/toolbox.generated/Llmd.run_guidellm_benchmark.rst
- docs/toolbox.generated/Llmd.run_multiturn_benchmark.rst
- projects/llm-d/testing/config.yaml
- projects/llm-d/testing/llmisvcs/llama-3-1-8b-instruct-fp8.yaml
- projects/llm-d/testing/test_llmd.py
- projects/llm-d/toolbox/llmd.py
- projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml
🚧 Files skipped from review as they are similar to previous changes (4)
- projects/llm-d/testing/llmisvcs/llama-3-1-8b-instruct-fp8.yaml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd.py
- projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
projects/llm-d/toolbox/llmd.py (1)
51-63: ⚠️ Potential issue | 🟠 Major

Avoid hard-breaking callers on renamed public kwargs.

Changing the public method kwargs from `llmisvc_name` to `endpoint_url` (Line 53, Line 77) is a backward-incompatible API break for existing keyword callers. Please add a temporary compatibility shim (accept the old kwarg and map it) or explicitly ship this as a documented breaking change with migration guidance.

Suggested compatibility pattern

```diff
-    def run_multiturn_benchmark(
-        self,
-        endpoint_url,
+    def run_multiturn_benchmark(
+        self,
+        endpoint_url=None,
+        llmisvc_name=None,
         name="multi-turn-benchmark",
         namespace="",
         image="quay.io/hayesphilip/multi-turn-benchmark",
         version="0.0.1",
         timeout=900,
         parallel=9
     ):
+        endpoint_url = endpoint_url or llmisvc_name
+        if not endpoint_url:
+            raise ValueError("endpoint_url is required")
         return RunAnsibleRole(locals())

-    def run_guidellm_benchmark(
-        self,
-        endpoint_url,
+    def run_guidellm_benchmark(
+        self,
+        endpoint_url=None,
+        llmisvc_name=None,
         name="guidellm-benchmark",
         namespace="",
         image="ghcr.io/vllm-project/guidellm",
         version="v0.5.3",
         timeout=900,
         profile="sweep",
         max_seconds=30,
         processor="RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8-dynamic",
         data="prompt_tokens=256,output_tokens=128"
     ):
+        endpoint_url = endpoint_url or llmisvc_name
+        if not endpoint_url:
+            raise ValueError("endpoint_url is required")
         return RunAnsibleRole(locals())
```

Also applies to: 75-89
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/toolbox/llmd.py` around lines 51 - 63, The public method run_multiturn_benchmark was renamed from llmisvc_name to endpoint_url and this breaks existing keyword callers; add a compatibility shim that accepts the old keyword (llmisvc_name) and maps it to endpoint_url when provided (e.g., add llmisvc_name=None to the signature or capture **kwargs and set endpoint_url = endpoint_url or kwargs.get("llmisvc_name")), ensure that if both are provided endpoint_url takes precedence, and add a brief inline comment referencing run_multiturn_benchmark to remove the shim after a deprecation period.
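The shim the review suggests can be sketched as a standalone function. This is a minimal illustration of the pattern, not the project's actual method: the name mirrors `run_multiturn_benchmark`, but the `RunAnsibleRole(locals())` call is replaced by returning the resolved value so the behavior is visible.

```python
import warnings

def run_multiturn_benchmark(endpoint_url=None, llmisvc_name=None):
    # Compatibility shim: accept the deprecated llmisvc_name kwarg and map it
    # to endpoint_url; remove after a deprecation period.
    if llmisvc_name is not None:
        warnings.warn("llmisvc_name is deprecated; use endpoint_url",
                      DeprecationWarning, stacklevel=2)
    # endpoint_url takes precedence when both are provided
    endpoint_url = endpoint_url or llmisvc_name
    if not endpoint_url:
        raise ValueError("endpoint_url is required")
    return endpoint_url  # the real method would call RunAnsibleRole(locals())
```

Old keyword callers keep working (with a warning), new callers are unaffected, and calling with neither kwarg fails fast.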
♻️ Duplicate comments (2)
projects/llm-d/testing/test_llmd.py (2)
203-205: ⚠️ Potential issue | 🟠 Major

Guard `spec.router` removal to avoid `KeyError`.

The `del` statement will raise `KeyError` if `spec.router` doesn't exist in the YAML. This was flagged in a previous review.

🔧 Proposed fix

```diff
 if "simple" in flavor:
     logging.info("Applying 'simple' flavor modifications")
     # Remove spec.router if it exists
-    del isvc_data['spec']['router']
+    isvc_data.get('spec', {}).pop('router', None)
     logging.info("Removed spec.router for 'simple' flavor")
```
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 203 - 205, The code unconditionally deletes isvc_data['spec']['router'], which can raise KeyError; update the test to safely remove the key by first ensuring 'spec' exists on isvc_data and then using a safe removal (e.g., dict.pop with a default) for 'router' so no exception is raised if it's absent; reference the isvc_data dict and the spec.router key in the test_llmd.py snippet when making this change.
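A minimal sketch (with made-up data, not the real ISVC manifest) of why the guarded removal is safer:

```python
# 'spec' is present but 'router' is absent, mirroring the 'simple' flavor case
isvc_data = {"spec": {"model": {"name": "llama"}}}

# Safe removal: pop with a default returns None when the key is missing
removed = isvc_data.get("spec", {}).pop("router", None)

# Unconditional del raises KeyError on the missing key
try:
    del isvc_data["spec"]["router"]
    outcome = "no error"
except KeyError:
    outcome = "KeyError raised"

print(removed, outcome)  # None KeyError raised
```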
71-74: ⚠️ Potential issue | 🔴 Critical

Use `RuntimeError` instead of `RuntimeException`; the latter is undefined in Python.

`RuntimeException` is not a built-in Python exception and will raise `NameError` at runtime. This was flagged in a previous review and is confirmed by static analysis (F821).

🐛 Proposed fix

```diff
 if config.project.get_config("tests.llmd.inference_service.do_simple_test"):
     if not test_llm_inference_simple(endpoint_url, llmisvc_name, namespace):
-        raise RuntimeException("Simple inference test failed :/")
+        raise RuntimeError("Simple inference test failed")
```
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 71 - 74, Replace the undefined Python exception RuntimeException with the built-in RuntimeError where the simple inference test failure is raised (inside the conditional that calls test_llm_inference_simple with endpoint_url, llmisvc_name, namespace); update the raise statement to use RuntimeError("Simple inference test failed :/") so it doesn't trigger a NameError/F821 at runtime.
🧹 Nitpick comments (3)
projects/llm-d/testing/test_llmd.py (3)
223-224: Chain the exception to preserve traceback.

Using `raise ... from e` preserves the original exception context for debugging.

♻️ Suggested improvement

```diff
 except Exception as e:
-    raise ValueError(f"Failed to write reshaped ISVC file to {output_path}: {e}")
+    raise ValueError(f"Failed to write reshaped ISVC file to {output_path}") from e
```
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 223 - 224, The except block that currently re-raises a ValueError for failed writes to the reshaped ISVC file should chain the original exception to preserve the traceback; update the exception re-raise in the except Exception as e handler to use "raise ValueError(f'Failed to write reshaped ISVC file to {output_path}: {e}') from e" so the original exception context (e) is preserved when raising the new ValueError.
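A quick illustration of the difference `from e` makes, using toy exceptions rather than the project's code:

```python
captured = None
try:
    try:
        raise OSError("disk full")  # stand-in for the original write failure
    except Exception as e:
        # 'from e' records the original exception as __cause__,
        # so tracebacks show both errors, not just the ValueError
        raise ValueError("Failed to write reshaped ISVC file") from e
except ValueError as err:
    captured = err

print(type(captured.__cause__).__name__)  # OSError
```

Without `from e`, the original `OSError` is only kept as the weaker implicit `__context__`, and message-only interpolation (`f"...: {e}"`) loses the traceback entirely.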
336-372: Unused parameter `llmisvc_name` (same as `run_multiturn_benchmark`).

Same issue as above. Consider removing or documenting the retained parameter.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` around lines 336 - 372, The function run_guidellm_benchmark currently accepts an unused parameter llmisvc_name (same situation as run_multiturn_benchmark); either remove llmisvc_name from the function signature and any callers, or if it must be retained for API compatibility, add a brief docstring note and use a no-op reference (e.g., mark as unused with a comment) to avoid linter complaints; update run_guidellm_benchmark's signature and any tests/call sites or add the documentation change accordingly to reflect the chosen approach.
301-301: Unused parameter `llmisvc_name`.

The `llmisvc_name` parameter is not used in this function. Consider removing it or documenting why it's retained (e.g., for API consistency).

♻️ Option: Remove unused parameter

```diff
-def run_multiturn_benchmark(endpoint_url, llmisvc_name, namespace):
+def run_multiturn_benchmark(endpoint_url, namespace):
```

Note: This would require updating the call site at line 77.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@projects/llm-d/testing/test_llmd.py` at line 301, The function run_multiturn_benchmark has an unused parameter llmisvc_name — either remove llmisvc_name from the function signature and all call sites that pass it, or if it must be kept for API consistency, document it in the function docstring and use it (e.g., include it in logging or as a nop-flag) so linters/test coverage stop flagging it; update all callers of run_multiturn_benchmark accordingly to match the new signature or to pass a meaningful value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 490-495: The inline shell command building the curl request in the
run.run call is fragile because payload_json may contain single quotes; change
the invocation so the JSON is passed safely via stdin or a heredoc/base64
instead of embedding it inside single quotes. Locate the run.run(...) that
references payload_json, namespace, deployment_name and endpoint_url and replace
the inline '-d '<payload_json>'' usage with a safe transfer (e.g., base64-encode
payload_json and decode in the container piped to curl -d `@-` or use a quoted
heredoc like <<'EOF' to feed curl), ensuring the curl command reads the body
from stdin or a file to avoid shell-quoting issues.
- Around line 99-101: The code currently overwrites the exception object by
setting overall_failed = True when a flavor fails, so the later
isinstance(overall_failed, Exception) check never passes; change the logic to
preserve the first exception (e.g., set overall_failed = flavor_failed or
introduce first_exception and assign first_exception = flavor_failed when it's
None) and keep a separate boolean if needed (e.g., had_failure = True) so the
existing re-raise path (isinstance(..., Exception)) will receive the original
Exception object and can re-raise it.
- Around line 263-279: The subprocess call handling is wrong because run.run()
defaults to check=True so returncode checks in the url_result handling are
unreachable; update both calls to run.run(...) that populate url_result (the two
calls fetching .status.url and .status.addresses[0].url) to pass check=False so
you can inspect url_result.returncode and url_result.stderr, and keep the
existing logic that sets endpoint_url from url_result.stdout and raises
RuntimeError using url_result.stderr when returncode != 0; alternatively, if you
prefer exceptions, remove the returncode checks and wrap the run.run calls in a
try/except for subprocess.CalledProcessError around the block that uses
llmisvc_name and namespace.
---
Outside diff comments:
In `@projects/llm-d/toolbox/llmd.py`:
- Around line 51-63: The public method run_multiturn_benchmark was renamed from
llmisvc_name to endpoint_url and this breaks existing keyword callers; add a
compatibility shim that accepts the old keyword (llmisvc_name) and maps it to
endpoint_url when provided (e.g., add llmisvc_name=None to the signature or
capture **kwargs and set endpoint_url = endpoint_url or
kwargs.get("llmisvc_name")), ensure that if both are provided endpoint_url takes
precedence, and add a brief inline comment referencing run_multiturn_benchmark
to remove the shim after a deprecation period.
---
Duplicate comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 203-205: The code unconditionally deletes
isvc_data['spec']['router'], which can raise KeyError; update the test to safely
remove the key by first ensuring 'spec' exists on isvc_data and then using a
safe removal (e.g., dict.pop with a default) for 'router' so no exception is
raised if it's absent; reference the isvc_data dict and the spec.router key in
the test_llmd.py snippet when making this change.
- Around line 71-74: Replace the undefined Python exception RuntimeException
with the built-in RuntimeError where the simple inference test failure is raised
(inside the conditional that calls test_llm_inference_simple with endpoint_url,
llmisvc_name, namespace); update the raise statement to use RuntimeError("Simple
inference test failed :/") so it doesn't trigger a NameError/F821 at runtime.
---
Nitpick comments:
In `@projects/llm-d/testing/test_llmd.py`:
- Around line 223-224: The except block that currently re-raises a ValueError
for failed writes to the reshaped ISVC file should chain the original exception
to preserve the traceback; update the exception re-raise in the except Exception
as e handler to use "raise ValueError(f'Failed to write reshaped ISVC file to
{output_path}: {e}') from e" so the original exception context (e) is preserved
when raising the new ValueError.
- Around line 336-372: The function run_guidellm_benchmark currently accepts an
unused parameter llmisvc_name (same situation as run_multiturn_benchmark);
either remove llmisvc_name from the function signature and any callers, or if it
must be retained for API compatibility, add a brief docstring note and use a
no-op reference (e.g., mark as unused with a comment) to avoid linter
complaints; update run_guidellm_benchmark's signature and any tests/call sites
or add the documentation change accordingly to reflect the chosen approach.
- Line 301: The function run_multiturn_benchmark has an unused parameter
llmisvc_name — either remove llmisvc_name from the function signature and all
call sites that pass it, or if it must be kept for API consistency, document it
in the function docstring and use it (e.g., include it in logging or as a
nop-flag) so linters/test coverage stop flagging it; update all callers of
run_multiturn_benchmark accordingly to match the new signature or to pass a
meaningful value.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- docs/toolbox.generated/Llmd.run_guidellm_benchmark.rst
- docs/toolbox.generated/Llmd.run_multiturn_benchmark.rst
- projects/llm-d/testing/config.yaml
- projects/llm-d/testing/llmisvcs/llama-3-1-8b-instruct-fp8.yaml
- projects/llm-d/testing/test_llmd.py
- projects/llm-d/toolbox/llmd.py
- projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/defaults/main/config.yml
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml
🚧 Files skipped from review as they are similar to previous changes (5)
- projects/llm-d/toolbox/llmd_run_multiturn_benchmark/tasks/main.yml
- projects/llm-d/testing/llmisvcs/llama-3-1-8b-instruct-fp8.yaml
- projects/llm-d/toolbox/llmd_run_guidellm_benchmark/tasks/main.yml
- docs/toolbox.generated/Llmd.run_guidellm_benchmark.rst
- projects/llm-d/toolbox/llmd_deploy_llm_inference_service/tasks/main.yml
```python
if flavor_failed:
    overall_failed = True
    logging.error(f"Test failed for flavor: {flavor}")
```
Exception object is lost; re-raise logic at line 120-121 will never trigger.
At line 100, overall_failed is set to True (boolean), but line 87 stores the actual exception in flavor_failed. Later at lines 120-121, the code checks isinstance(overall_failed, Exception), which will always be False since overall_failed is a boolean.
If the intent is to re-raise the first exception encountered, preserve it:
🔧 Proposed fix
```diff
 if flavor_failed:
-    overall_failed = True
+    if not overall_failed:
+        overall_failed = flavor_failed if isinstance(flavor_failed, Exception) else True
     logging.error(f"Test failed for flavor: {flavor}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/llm-d/testing/test_llmd.py` around lines 99 - 101, The code
currently overwrites the exception object by setting overall_failed = True when
a flavor fails, so the later isinstance(overall_failed, Exception) check never
passes; change the logic to preserve the first exception (e.g., set
overall_failed = flavor_failed or introduce first_exception and assign
first_exception = flavor_failed when it's None) and keep a separate boolean if
needed (e.g., had_failure = True) so the existing re-raise path (isinstance(...,
Exception)) will receive the original Exception object and can re-raise it.
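The "preserve the first exception" pattern described above, reduced to a toy loop (the failure values are made up; the real code collects them per flavor):

```python
failures = [None, RuntimeError("router flavor broke"), ValueError("fp8 flavor broke")]

first_exception = None
had_failure = False

for flavor_failed in failures:
    if flavor_failed:
        had_failure = True
        if first_exception is None:
            first_exception = flavor_failed  # keep the first Exception object

# The re-raise path can now receive the original exception
if isinstance(first_exception, Exception):
    print(f"would re-raise: {first_exception}")  # would re-raise: router flavor broke
```

Separating the boolean (`had_failure`) from the stored exception means the `isinstance(..., Exception)` check is no longer defeated by overwriting the object with `True`.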
```python
url_result = run.run(f"oc get llminferenceservice {llmisvc_name} -n {namespace} "
                     f"-ojsonpath='{{.status.url}}'", capture_stdout=True)

endpoint_url = url_result.stdout.strip() if url_result.returncode == 0 else ""

if endpoint_url:
    logging.info(f"LLM inference service URL: {endpoint_url}")
    return endpoint_url

# If that didn't work or is empty, try .status.addresses[0].url
logging.info("Trying alternate URL location at .status.addresses[0].url")
url_result = run.run(f"oc get llminferenceservice {llmisvc_name} -n {namespace} "
                     f"-ojsonpath='{{.status.addresses[0].url}}'", capture_stdout=True)

if url_result.returncode != 0:
    logging.error("Failed to get LLM inference service URL from both locations")
    raise RuntimeError(f"Failed to get LLM inference service URL: {url_result.stderr}")
```
🧩 Analysis chain

🏁 Script executed:

```shell
# Find and examine the run.run() function implementation
find . -name "run.py" -path "*/core/library/*" -type f
```

Repository: openshift-psap/topsail
Length of output: 96

🏁 Script executed:

```shell
# Search for the run.run() function definition
rg -n "def run\(" --type=py | grep -E "(core|library)" | head -20
```

Repository: openshift-psap/topsail
Length of output: 396

🏁 Script executed:

```shell
# Read the run.run() function implementation
sed -n '105,200p' projects/core/library/run.py
```

Repository: openshift-psap/topsail
Length of output: 3292
Returncode checks are unreachable dead code.
The run.run() function uses check=True by default, which causes subprocess.run() to raise a CalledProcessError exception if the command fails. The returncode checks at lines 266 and 277 will never be reached on failure—the exception will be raised first.
Either remove these checks or pass check=False to run.run() if you want to inspect the return code instead.
🧰 Tools
🪛 Ruff (0.15.2)
[warning] 279-279: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@projects/llm-d/testing/test_llmd.py` around lines 263 - 279, The subprocess
call handling is wrong because run.run() defaults to check=True so returncode
checks in the url_result handling are unreachable; update both calls to
run.run(...) that populate url_result (the two calls fetching .status.url and
.status.addresses[0].url) to pass check=False so you can inspect
url_result.returncode and url_result.stderr, and keep the existing logic that
sets endpoint_url from url_result.stdout and raises RuntimeError using
url_result.stderr when returncode != 0; alternatively, if you prefer exceptions,
remove the returncode checks and wrap the run.run calls in a try/except for
subprocess.CalledProcessError around the block that uses llmisvc_name and
namespace.
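The `check=True` vs `check=False` behavior the finding relies on can be seen with plain `subprocess.run` (a POSIX `sh` is assumed; `run.run` is the project's wrapper and is not shown here):

```python
import subprocess

# check=False: failure is reported through returncode, which you can inspect
r = subprocess.run(["sh", "-c", "echo oops >&2; exit 3"],
                   capture_output=True, text=True, check=False)
print(r.returncode, r.stderr.strip())  # 3 oops

# check=True (the wrapper's default, per the review): failure raises instead,
# so any returncode check placed after the call is unreachable on error
try:
    subprocess.run(["sh", "-c", "exit 3"], check=True)
    caught = None
except subprocess.CalledProcessError as e:
    caught = e.returncode
```

This is why the suggested fix is either to pass `check=False` and keep the `returncode` branches, or to drop the branches and catch `CalledProcessError`.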
```diff
 result = run.run(f"""
-curl -s "{url}/v1/completions" \\
+oc rsh -n {namespace} -c main deploy/{deployment_name} \\
+    curl -k -sSf "{endpoint_url}/v1/completions" \\
     -H "Content-Type: application/json" \\
-    -d @{payload_file}
+    -d '{payload_json}'
 """, capture_stdout=True)
```
Shell quoting fragility with inline JSON payload.
The JSON payload is embedded directly in a shell command using single quotes. If the payload ever contains single quotes (e.g., in prompts), the command will break. Current payload is safe but this is fragile for future modifications.
♻️ Safer alternative using heredoc or base64
```python
# Option 1: Use base64 encoding
import base64
payload_b64 = base64.b64encode(payload_json.encode()).decode()
result = run.run(f"""
oc rsh -n {namespace} -c main deploy/{deployment_name} \\
    sh -c 'echo {payload_b64} | base64 -d | curl -k -sSf "{endpoint_url}/v1/completions" \\
    -H "Content-Type: application/json" -d @-'
""", capture_stdout=True)
```
Verify each finding against the current code and only fix it if needed.
In `@projects/llm-d/testing/test_llmd.py` around lines 490 - 495, The inline shell
command building the curl request in the run.run call is fragile because
payload_json may contain single quotes; change the invocation so the JSON is
passed safely via stdin or a heredoc/base64 instead of embedding it inside
single quotes. Locate the run.run(...) that references payload_json, namespace,
deployment_name and endpoint_url and replace the inline '-d '<payload_json>''
usage with a safe transfer (e.g., base64-encode payload_json and decode in the
container piped to curl -d `@-` or use a quoted heredoc like <<'EOF' to feed
curl), ensuring the curl command reads the body from stdin or a file to avoid
shell-quoting issues.
🔴 Test of 'llm-d test test_ci' failed after 01 hours 22 minutes 09 seconds. 🔴 • Link to the test results. • Link to the reports index. Test configuration:
/test ?
1 similar comment
🔴 Test of 'llm-d test generate_plots_from_pr_args' failed after 00 hours 00 minutes 00 seconds. 🔴 • Link to the test results. • No reports index generated... Test configuration: Failure indicator: Empty. (See run.log)
🔴 Test of 'llm-d test generate_plots_from_pr_args' failed after 00 hours 00 minutes 01 seconds. 🔴 • Link to the test results. • No reports index generated... Test configuration: Failure indicator: Empty. (See run.log)
@kpouget: The following test failed, say
Full PR test history. Your PR dashboard.
Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
🟢 Test of 'llm-d test generate_plots_from_pr_args' succeeded after 00 hours 00 minutes 32 seconds. 🟢 • Link to the test results. • Link to the reports index. Test configuration:
Summary by CodeRabbit
New Features
Improvements
Chores