2 changes: 2 additions & 0 deletions pyproject.toml
@@ -119,7 +119,9 @@ markers = [
"storage: Disk related, like I/O or disk space",
"question-answer: Simple question-answer tests where Holmes answers straightforward questions about the system",
"prometheus: Tests involving Prometheus metrics",
"loki: Tests involving Loki logs",
"newrelic: New Relic toolset",
"embeds: Ability of holmes to include embeds like << { promql... } >> in its answers",
"no-cicd: Tests to skip in the GitHub action because we're missing prerequisites in the KIND cluster like a Prometheus instance",
]

@@ -1,3 +1,4 @@

user_prompt: "The payment-api pod in namespace app-100a had issues on August 2, 2025 around 13:45 UTC. What happened?"

expected_output:
@@ -11,6 +12,7 @@ tags:
- logs
- kubernetes
- medium
- loki

setup_timeout: 600 # Increased to handle Datadog init containers

@@ -11,6 +11,7 @@ tags:
- logs
- kubernetes
- medium
- loki

setup_timeout: 600 # Increased to handle Datadog init containers

@@ -15,6 +15,7 @@ tags:
- logs
- kubernetes
- hard
- loki

setup_timeout: 360 # Increased from 300 to handle pod deletion and Loki ingestion variability

@@ -24,6 +24,7 @@ tags:
- logs
- kubernetes
- toolset-limitation
- loki

port_forwards:
- namespace: app-102
@@ -19,3 +19,4 @@ tags:
- logs
- transparency
- medium
- loki
@@ -22,3 +22,11 @@ expected_output: |
random_key may be different than the above example so long as its a random looking key, but all other parameters (type and tool_name) must be as described

Output must NOT tell the user it doesn't have access to metrics or that they should use another tool

skip: true
skip_reason: "missing setup - need to write data to datadog"
tags:
- datadog
- metrics
- medium
- embeds
Comment on lines +28 to +32

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify all tags are declared pytest markers.

The test adds four new tags: datadog, metrics, medium, and embeds. While embeds is confirmed as a new marker in this PR, ensure all other tags (datadog, metrics, medium) are properly declared in pyproject.toml.

Based on learnings: Only use pytest markers that are declared in pyproject.toml.


🏁 Script executed:

#!/bin/bash
# Verify that all tags used in test fixtures are declared as pytest markers in pyproject.toml

# Extract markers from pyproject.toml
echo "=== Declared pytest markers in pyproject.toml ==="
rg -A 50 '^\[tool\.pytest\.ini_options\]' pyproject.toml | rg 'markers\s*=' -A 20

echo ""
echo "=== Tags used in this test case ==="
echo "datadog, metrics, medium, embeds"

Length of output: 1836


Add missing pytest markers datadog and embeds
Add both datadog and embeds to the markers list under [tool.pytest.ini_options] in pyproject.toml so all tags used in tests are declared.

🤖 Prompt for AI Agents
In
tests/llm/fixtures/test_ask_holmes/110_cpu_graph_robusta_runner/test_case.yaml
around lines 28 to 32, the test tags include "datadog" and "embeds" but those
markers are not declared; open pyproject.toml and under
[tool.pytest.ini_options] add "datadog" and "embeds" to the markers list so
every tag used in tests is declared (ensure syntax matches existing marker
entries and save).

@@ -9,3 +9,4 @@ tags:
- traces
- newrelic
- network
- embeds
@@ -1,6 +1,5 @@
description: "Test if holmes pulls logs reliably, expects env var NEW_RELIC_LICENSE_KEY"


user_prompt:
- fetch error logs for app payment-gateway-s1fdsa-231ds from newrelic

@@ -10,6 +9,7 @@ expected_output:
tags:
- newrelic
- logs
- medium

# Custom timeout for setup (needs 5 min sleep + setup time)
setup_timeout: 360 # 6 minutes total
@@ -1,5 +1,4 @@
-description: "Test if holmes pulls logs reliably"
-
+description: "Test if holmes pulls metrics reliably"

user_prompt:
- give me a graph of the custom newrelic metric sales.amount and the currency
@@ -13,6 +12,7 @@ expected_output:
tags:
- newrelic
- metrics
- medium

# Custom timeout for setup (needs 2 min sleep + setup time)
setup_timeout: 240 # 4 minutes total
@@ -9,6 +9,7 @@ expected_output:
tags:
- newrelic
- traces
- medium
Comment on lines 9 to +12

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Verify that "medium" tag is a declared pytest marker.

The "medium" tag is being added. Per the learnings, only pytest markers declared in pyproject.toml should be used.

Based on learnings.

Run the following script to verify this tag is declared:


🏁 Script executed:

#!/bin/bash
# Description: Verify "medium" is a declared pytest marker

rg -A 1 'markers\s*=' pyproject.toml | grep -i medium || echo "⚠️ 'medium' marker not found in pyproject.toml"

Length of output: 46


Declare the “medium” pytest marker or remove its usage
The “medium” tag isn’t listed under [tool.pytest.ini_options] markers in pyproject.toml. Add it there or eliminate the tag from test_case.yaml to comply with pytest marker declarations.

🤖 Prompt for AI Agents
In tests/llm/fixtures/test_ask_holmes/120_new_relic_traces2/test_case.yaml
around lines 9 to 12 the test uses the "medium" tag but that marker is not
declared in [tool.pytest.ini_options] markers in pyproject.toml; either add a
"medium" marker entry to pyproject.toml under markers (e.g. 'medium:
description') or remove the "medium" tag from this test_case.yaml so pytest
marker declarations remain consistent.
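
One caveat about the script above: `rg -A 1` prints only the single line that follows the `markers =` match, so the `grep` sees at most the first entry of the list. A "not found" result therefore does not establish whether `medium` is declared further down. The effect can be sketched in Python, where `lines_after_match` is a hypothetical stand-in for `rg -A` with a literal pattern and `SAMPLE` is an invented list, not the project's file:

```python
def lines_after_match(text: str, needle: str, after: int) -> list[str]:
    """Mimic `rg -A <after>` for a literal needle: each matching line
    plus the `after` lines that follow it."""
    lines = text.splitlines()
    shown: list[str] = []
    for i, line in enumerate(lines):
        if needle in line:
            shown.extend(lines[i : i + 1 + after])
    return shown


# Illustrative markers list (hypothetical ordering and descriptions).
SAMPLE = """\
markers = [
    "easy: Simple tests",
    "medium: Moderately hard tests",
]
"""

window = lines_after_match(SAMPLE, "markers = [", after=1)
# False: the entry sits outside the one-line window.
print(any("medium" in line for line in window))
# True: the marker is declared in the full text.
print(any("medium" in line for line in SAMPLE.splitlines()))
```

Widening the window helps only until the list outgrows it; parsing the TOML and reading the whole `markers` array avoids the problem entirely.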


before_test: |
[ -n "${NEW_RELIC_ACCOUNT_ID:-}" ] && [ -n "${NEW_RELIC_API_KEY:-}" ] && [ -n "${NEW_RELIC_LICENSE_KEY:-}" ] || { for v in NEW_RELIC_ACCOUNT_ID NEW_RELIC_API_KEY NEW_RELIC_LICENSE_KEY; do [ -n "${!v:-}" ] || echo "Missing env var: $v"; done; exit 1; }
@@ -10,6 +10,7 @@ tags:
- hard
- chain-of-causation
- traces
- newrelic

setup_timeout: 600 # 10 minutes for pod scheduling

@@ -10,6 +10,7 @@ tags:
- hard
- chain-of-causation
- traces
- newrelic

setup_timeout: 600 # 10 minutes for pod scheduling

@@ -10,6 +10,7 @@ tags:
- hard
- chain-of-causation
- traces
- newrelic

setup_timeout: 600 # 10 minutes for pod scheduling

@@ -14,6 +14,8 @@ tags:
- context_window
- medium
- prometheus
- embeds
- metrics

# Custom timeout for setup (needs time for metrics + setup)
setup_timeout: 480 # 8 minutes total (120s setup + 300s metrics + buffer)
@@ -9,6 +9,9 @@ expected_output:
tags:
- prometheus
- kubernetes
- embeds
- metrics
- medium

# Port forward configuration for Prometheus
port_forwards:
@@ -12,6 +12,9 @@ expected_output:
tags:
- prometheus
- kubernetes
- embeds
- metrics
- medium

# Port forward configuration for Prometheus
port_forwards:
@@ -14,6 +14,9 @@ expected_output:
tags:
- prometheus
- kubernetes
- embeds
- metrics
- medium

# Port forward configuration for Prometheus
port_forwards:
@@ -1,10 +1,14 @@
user_prompt: "Show me a graph (with prometheus) of the memory usage for the kubernetes cluster over the last 30 minutes"

expected_output:
- 'The answer should include something similar to << { "type": "promql", "tool_name": "execute_prometheus_range_query", "random_key": "AtEJ" } >>. Both tool_name and random_key MUST absolutely be present, but tool_name must be execute_prometheus_range_query while random_key can have any value.'

tags:
- prometheus
-evaluation:
-  correctness: 1
+- embeds
+- metrics
+- medium

port_forwards:
- namespace: default
service: robusta-kube-prometheus-st-prometheus
@@ -1,8 +1,14 @@
user_prompt: "Show me a graph of the CPU usage for the pod robusta-holmes over the last 30 minutes"

expected_output:
- 'The answer should include something similar to << { "type": "promql", "tool_name": "execute_prometheus_range_query", "random_key": "HBGf" } >>. Both tool_name and random_key MUST absolutely be present. random_key can have any value, tool_name must be execute_prometheus_range_query.'

tags:
- prometheus
- embeds
- metrics
- medium

port_forwards:
- namespace: default
service: robusta-kube-prometheus-st-prometheus
@@ -7,6 +7,7 @@ expected_output:
tags:
- prometheus
- easy
- metrics
- no-cicd

include_tool_calls: true
@@ -1,10 +1,13 @@
user_prompt: "Give me a graph of `container_memory_working_set_bytes`"
expected_output:
- 'The answer should include something similar to << { "type": "promql", "tool_name": "execute_prometheus_range_query", "random_key": "<alphanumeric_characters>" } >>. Both tool_name and random_key MUST absolutely be present and the same as mentioned here.'

tags:
- prometheus
-evaluation:
-  correctness: 1
+- embeds
+- metrics
+- medium

Comment on lines 4 to +10

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Verify that all tags are declared pytest markers.

The tags prometheus, embeds, metrics, and medium are being added to this test fixture. According to project learnings, only pytest markers declared in pyproject.toml should be used.

Based on learnings.

Run the following script to verify these markers are declared:


🏁 Script executed:

#!/bin/bash
# Description: Check if the tags used in this test fixture are declared as pytest markers in pyproject.toml

# Search for pytest marker declarations
rg -A 20 '^\[tool\.pytest\.ini_options\]|^markers\s*=' pyproject.toml

Length of output: 1742


🏁 Script executed:

#!/bin/bash
# Show lines around port_forwards in the YAML file
rg -n 'port_forwards' -C3 tests/llm/fixtures/test_ask_holmes/34_memory_graph/test_case.yaml

Length of output: 152


Fix undeclared pytest markers and namespace

  • Remove or declare the prometheus and embeds tags in pyproject.toml—only metrics and medium are valid markers.
  • Change port_forwards.namespace from default to the required dedicated namespace app-34_memory_graph (must use app-<testid>).
🤖 Prompt for AI Agents
In tests/llm/fixtures/test_ask_holmes/34_memory_graph/test_case.yaml around
lines 5 to 10, the YAML uses undeclared pytest markers and the wrong namespace:
remove or declare the prometheus and embeds tags (only metrics and medium are
valid markers per pyproject.toml) — either delete the prometheus and embeds
lines or add them to pyproject.toml under [tool.pytest.ini_options].markers;
also change port_forwards.namespace from default to the required dedicated
namespace app-34_memory_graph (must use app-<testid>) so the test uses the
correct namespace.

port_forwards:
- namespace: default

⚠️ Potential issue | 🟠 Major

Use a dedicated namespace following the required naming convention.

The namespace default violates the coding guideline requiring each LLM test to use a dedicated Kubernetes namespace of the form app-<testid>. For this test, the namespace should be app-34_memory_graph.

As per coding guidelines.

Apply this diff to fix the namespace:

-  - namespace: default
+  - namespace: app-34_memory_graph
     service: robusta-kube-prometheus-st-prometheus
     local_port: 9090
     remote_port: 9090
🤖 Prompt for AI Agents
In tests/llm/fixtures/test_ask_holmes/34_memory_graph/test_case.yaml around line
12, the namespace is set to "default" which violates the guideline requiring a
dedicated namespace per test; update the namespace value to
"app-34_memory_graph" (i.e., replace the existing namespace entry with
namespace: app-34_memory_graph) so the test uses the correct dedicated
Kubernetes namespace following the naming convention.
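
A caution on the suggested name: Kubernetes namespace names must be valid RFC 1123 DNS labels (lowercase alphanumerics and `-`, at most 63 characters, alphanumeric at both ends), so `app-34_memory_graph` with its underscores would likely be rejected as written. A minimal sketch of a name check, where the helper names are hypothetical and the `app-` prefix test is only one reading of the `app-<testid>` guideline quoted above:

```python
import re

# RFC 1123 DNS label: lowercase alphanumerics and '-', alphanumeric at both ends.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_namespace(name: str) -> bool:
    """True if the name is a syntactically valid Kubernetes namespace name."""
    return len(name) <= 63 and _DNS_LABEL.fullmatch(name) is not None


def follows_convention(name: str) -> bool:
    """Valid name that also carries the app-<testid> prefix (assumed convention)."""
    return is_valid_namespace(name) and name.startswith("app-")


for candidate in ("default", "app-34_memory_graph", "app-102", "app-100a"):
    print(candidate, is_valid_namespace(candidate), follows_convention(candidate))
# default True False
# app-34_memory_graph False False
# app-102 True True
# app-100a True True
```

The names `app-102` and `app-100a` already appear in other fixtures in this PR and pass both checks. The check says nothing about whether the forwarded service actually exists in the chosen namespace.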

service: robusta-kube-prometheus-st-prometheus