23 commits
43bd886
feat: ensure all necessary bash commands are supported, bash by defau…
nherment Sep 9, 2025
85c3ab6
feat: ensure all necessary bash commands are supported, bash by defau…
nherment Sep 10, 2025
f601e8a
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 12, 2025
c9ce45f
feat: add support for kubectl cluster-info
nherment Sep 15, 2025
6cbb456
feat: finalize bash toolset enabled by default
nherment Sep 16, 2025
a6150d7
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 16, 2025
2796bc4
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 17, 2025
ac90fd5
chore: linting
nherment Sep 17, 2025
7a0b46f
chore: fix tests
nherment Sep 17, 2025
247b1c1
fix: add toolset docs urls + remove prints
nherment Sep 17, 2025
6984a87
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 17, 2025
d88fb45
chore: address PR comments
nherment Sep 17, 2025
19c2969
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 25, 2025
3a63eb9
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
moshemorad Sep 25, 2025
0446f87
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 26, 2025
389756a
fix: some toolsets should be enabled by default
nherment Sep 26, 2025
fb687be
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Sep 26, 2025
f56a80a
mock files
nherment Sep 26, 2025
0fed363
refresh mock files
nherment Sep 26, 2025
7923c00
mock files
nherment Sep 26, 2025
9463cf0
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Oct 1, 2025
4d48942
Merge branch 'master' into ROB-2085_bash_tool_enabled_by_default
nherment Oct 1, 2025
cababff
rm merge files
nherment Oct 1, 2025
2 changes: 1 addition & 1 deletion conftest.py
@@ -164,7 +164,7 @@ def storage_dal_mock():
def responses():
    with responses_.RequestsMock() as rsps:
        rsps.add_passthru("https://www.braintrust.dev")
        rsps.add_passthru("https://api.braintrust.dev")  # Allow Braintrust API calls
        rsps.add_passthru("https://api.braintrust.dev")
        rsps.add_passthru("http://localhost")

        # Allow all Datadog API calls to pass through (all regions and endpoints)
28 changes: 15 additions & 13 deletions docs/data-sources/builtin-toolsets/bash.md
@@ -1,8 +1,12 @@
# Bash Toolset
# Bash Toolset ✓

!!! info "Enabled by Default"
    This toolset is enabled by default and should typically remain enabled.

The bash toolset provides secure execution of common command-line tools used for troubleshooting and system analysis. It replaces multiple YAML-based toolsets with a single, comprehensive toolset that includes safety validation and command parsing.

**⚠️ Security Note**: This toolset executes commands on the system where Holmes is running. Only validated, safe commands are allowed, and the toolset is disabled by default for security reasons.

**⚠️ Security Note**: This toolset executes commands on the system where Holmes is running. Only validated, safe commands are allowed. The toolset includes built-in safety validation and command parsing.

## Supported Commands

@@ -70,17 +74,6 @@ The bash toolset supports the following categories of commands:
- `tr` - Character translation and deletion
- `base64` - Base64 encoding/decoding

### Special Tools

**kubectl_run_image**

Creates temporary debug pods in Kubernetes clusters for diagnostic commands:

- Runs commands in specified container images
- Automatically cleans up temporary pods
- Supports custom namespaces and timeouts
- Useful for network debugging, DNS resolution, and environment inspection

## Command Validation

All commands undergo security validation before execution:
@@ -89,3 +82,12 @@ All commands undergo security validation before execution:
- Dangerous operations are blocked (file writes, system calls, etc.)
- Commands are parsed and validated for safety
- Pipe operations between supported commands are allowed
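
The validation rules above can be illustrated with a minimal sketch. This is not the toolset's actual parser: the `ALLOWED_COMMANDS` set, the metacharacter list, and the function name `is_command_allowed` are all assumptions chosen for illustration. It only shows the general idea of allowlisting the first word of each pipe segment and rejecting shell operators that could write files or chain arbitrary commands.

```python
import shlex

# Hypothetical allowlist for illustration; the real toolset supports more commands.
ALLOWED_COMMANDS = {"kubectl", "grep", "jq", "head", "tail", "wc", "sort", "tr", "base64"}
# Characters that enable redirection, chaining, or substitution in a shell.
SHELL_METACHARACTERS = set(";&|<>`$()")


def is_command_allowed(command: str) -> bool:
    """Return True if every pipe segment starts with an allowlisted command."""
    try:
        tokens = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return False
    if not tokens:
        return False

    expect_command = True  # True at the start and right after each "|"
    for token in tokens:
        if token == "|":
            if expect_command:  # leading pipe or two pipes in a row
                return False
            expect_command = True
            continue
        # Deliberately conservative: any metacharacter inside a token is rejected,
        # even when it was quoted in the original command.
        if SHELL_METACHARACTERS & set(token):
            return False
        if expect_command:
            if token not in ALLOWED_COMMANDS:
                return False
            expect_command = False
    return not expect_command  # a trailing pipe is invalid
```

With this sketch, `kubectl get pods -n default | grep Error` is accepted, while `kubectl get pods > out.txt` and `rm -rf /` are rejected. A stricter or more permissive policy is equally possible; the point is only that validation happens on parsed tokens rather than on the raw string.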

## Configuration

The bash tool can be configured with the following environment variables:

| Env var | Default value | Description |
|----------------------------|:-------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| BASH_TOOL_UNSAFE_ALLOW_ALL | false | Disables safety checks and allows Holmes to run any bash command immediately and without warning. This is unsafe because Holmes could mutate state, share secrets, or even irreparably delete production environments. |
| ENABLE_CLI_TOOL_APPROVAL | true | Allows Holmes to ask for approval before running potentially unsafe commands. |
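
As a rough illustration of how the two variables and their defaults from the table could be read, here is a small sketch. The helper names `env_flag` and `bash_tool_settings`, and the set of strings treated as true, are assumptions; the PR does not show how Holmes parses these values.

```python
import os
from typing import Dict, Mapping

# Assumption: which spellings count as "true" is not specified in the docs.
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool, environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean flag from the environment, falling back to a default."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def bash_tool_settings(environ: Mapping[str, str] = os.environ) -> Dict[str, bool]:
    """Collect the two documented flags with the defaults from the table."""
    return {
        "unsafe_allow_all": env_flag("BASH_TOOL_UNSAFE_ALLOW_ALL", False, environ),
        "cli_tool_approval": env_flag("ENABLE_CLI_TOOL_APPROVAL", True, environ),
    }
```

Passing the environment as a parameter keeps the sketch easy to test: `bash_tool_settings({})` yields the documented defaults without touching the real process environment.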
2 changes: 2 additions & 0 deletions docs/data-sources/builtin-toolsets/index.md
@@ -11,6 +11,7 @@ HolmesGPT includes pre-built integrations for popular monitoring and observabili
- [:material-aws:{ .lg .middle } **AWS**](aws.md)
- [:material-microsoft-azure:{ .lg .middle } **Azure Kubernetes Service**](aks.md)
- [:material-database:{ .lg .middle } **Azure SQL Database**](azure-sql.md)
- [:material-bash:{ .lg .middle } **Bash**](bash.md)
- [:simple-confluence:{ .lg .middle } **Confluence**](confluence.md)
- [:material-chart-line:{ .lg .middle } **Coralogix logs**](coralogix-logs.md)
- [:simple-datadog:{ .lg .middle } **Datadog**](datadog.md)
@@ -21,6 +22,7 @@ HolmesGPT includes pre-built integrations for popular monitoring and observabili
- [:material-web:{ .lg .middle } **Internet**](internet.md)
- [:simple-apachekafka:{ .lg .middle } **Kafka**](kafka.md)
- [:simple-kubernetes:{ .lg .middle } **Kubernetes**](kubernetes.md)
- [:simple-kubernetes:{ .lg .middle } **Kubectl Run Image**](kubectl-run-image.md)
- [:simple-grafana:{ .lg .middle } **Loki**](grafanaloki.md)
- [:simple-mongodb:{ .lg .middle } **MongoDB Atlas**](mongodb-atlas.md)
- [:simple-newrelic:{ .lg .middle } **New Relic**](newrelic.md)
225 changes: 225 additions & 0 deletions docs/data-sources/builtin-toolsets/kubectl-run-image.md
@@ -0,0 +1,225 @@
# Kubectl Run Image Toolset

The kubectl run image toolset provides secure execution of temporary containers in Kubernetes clusters for diagnostic and troubleshooting purposes. It creates temporary debug pods that are automatically cleaned up after execution.

**⚠️ Security Note**: This toolset can create pods in your Kubernetes cluster. It requires careful configuration with whitelisted images and command patterns to ensure security.

## Overview

This toolset uses `kubectl run` to create temporary containers that:

- Execute diagnostic commands in specified container images
- Automatically clean up pods after execution (using `--rm` flag)
- Support custom namespaces and timeouts
- Provide isolated environments for network debugging, DNS resolution, and environment inspection

## Use Cases

- **Network Debugging**: Test connectivity between services using network utilities
- **DNS Resolution**: Verify DNS configuration and resolution from within the cluster
- **Environment Inspection**: Check environment variables, file systems, and configuration
- **Service Testing**: Test HTTP endpoints, database connections, or API calls
- **Resource Analysis**: Examine cluster resources from a pod's perspective

## Configuration

The toolset requires explicit configuration of allowed images and command patterns for security:

=== "YAML"

    ```yaml
    toolsets:
      kubectl_run_image:
        enabled: true
        config:
          allowed_images:
            - image: "busybox"
              allowed_commands:
                - "nslookup .*"
                - "cat /etc/resolv.conf"
                - "echo .*"
            - image: "curlimages/curl"
              allowed_commands:
                - "curl -s http://.*"
                - "curl -I .*"
            - image: "registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3"
              allowed_commands:
                - "nslookup .*"
                - "dig .*"
                - "host .*"
    ```

=== "Python"

    ```python
    from holmes import Config

    config = Config(
        toolsets={
            "kubectl_run_image": {
                "enabled": True,
                "config": {
                    "allowed_images": [
                        {
                            "image": "busybox",
                            "allowed_commands": [
                                "nslookup .*",
                                "cat /etc/resolv.conf",
                                "echo .*"
                            ]
                        },
                        {
                            "image": "curlimages/curl",
                            "allowed_commands": [
                                "curl -s http://.*",
                                "curl -I .*"
                            ]
                        }
                    ]
                }
            }
        }
    )
    ```

## Configuration Options

### `allowed_images`
List of image configurations that define which container images can be used.

**Required**: Yes

### Image Configuration

Each image entry supports:

- **`image`** (string, required): The container image name
- **`allowed_commands`** (list, required): Regular expression patterns for allowed commands

### Command Pattern Matching

Commands are validated against regex patterns:

- `"echo .*"` - Allows any echo command
- `"curl -s http://.*"` - Allows curl with -s flag to HTTP URLs
- `"nslookup [a-zA-Z0-9.-]+"` - Allows nslookup with domain names
- `"cat /etc/resolv.conf"` - Allows reading the DNS resolver configuration
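
The pattern matching described above can be sketched as follows. This assumes whole-string matching via `re.fullmatch`, which is an assumption about the implementation rather than something the diff shows; the `ALLOWED_IMAGES` mapping and the function name `is_allowed` are illustrative only.

```python
import re
from typing import Dict, List

# Illustrative copy of two entries from the configuration example.
ALLOWED_IMAGES: Dict[str, List[str]] = {
    "busybox": ["nslookup .*", "cat /etc/resolv.conf", "echo .*"],
    "curlimages/curl": ["curl -s http://.*", "curl -I .*"],
}


def is_allowed(image: str, command: str) -> bool:
    """Return True if the image is listed and the whole command matches a pattern."""
    patterns = ALLOWED_IMAGES.get(image, [])
    return any(re.fullmatch(pattern, command) for pattern in patterns)


# A broad pattern such as "nslookup .*" also matches trailing shell syntax,
# while a character-class pattern does not.
broad = re.fullmatch("nslookup .*", "nslookup example.com; cat /etc/passwd")
strict = re.fullmatch("nslookup [a-zA-Z0-9.-]+", "nslookup example.com; cat /etc/passwd")
```

Here `broad` is a match object and `strict` is `None`, which is why restrictive patterns like `"nslookup [a-zA-Z0-9.-]+"` are preferable to `".*"` wherever the argument shape is known.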

## Tool Parameters

### `kubectl_run_image`

Creates and runs a temporary pod with the specified image and command.

**Parameters:**

- **`image`** (string, required): Container image to run (must be in allowed list)
- **`command`** (string, required): Command to execute (must match allowed patterns)
- **`namespace`** (string, optional): Kubernetes namespace (defaults to "default")
- **`timeout`** (integer, optional): Command timeout in seconds (defaults to 60)
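
To make the parameters concrete, the sketch below shows one way they could map onto a `kubectl run` invocation. The function names, the `holmes-debug-` pod name prefix, and the choice to wrap the command in `sh -c` are assumptions for illustration; only the `kubectl run` flags themselves (`--image`, `--namespace`, `--rm`, `--attach`, `--restart`) are standard kubectl options. The timeout is applied to the local subprocess here, which may differ from how the toolset enforces it.

```python
import subprocess
import uuid
from typing import List, Optional


def build_kubectl_run_argv(
    image: str,
    command: str,
    namespace: str = "default",
    pod_name: Optional[str] = None,
) -> List[str]:
    """Build an argv list for a temporary pod that is removed after it exits."""
    # Hypothetical naming scheme; a random suffix avoids pod name collisions.
    name = pod_name or f"holmes-debug-{uuid.uuid4().hex[:8]}"
    return [
        "kubectl", "run", name,
        f"--image={image}",
        f"--namespace={namespace}",
        "--rm",             # delete the pod once the command finishes
        "--attach",         # required for --rm; streams the container output
        "--restart=Never",  # run a bare pod rather than a restarting workload
        "--", "sh", "-c", command,
    ]


def run_image(image: str, command: str, namespace: str = "default", timeout: int = 60) -> str:
    """Run the command and return stdout; raises subprocess.TimeoutExpired on timeout."""
    argv = build_kubectl_run_argv(image, command, namespace)
    result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    return result.stdout
```

Passing an argv list instead of a single shell string avoids a second layer of local shell interpretation; the command string is only interpreted by `sh` inside the temporary container.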

## Example Usage

### Network Connectivity Test

```bash
# Test connectivity to a service
kubectl_run_image(
    image="curlimages/curl",
    command="curl -s http://my-service:8080/health",
    namespace="production"
)
```

### DNS Resolution Check

```bash
# Check DNS resolution
kubectl_run_image(
    image="busybox",
    command="nslookup my-service.production.svc.cluster.local",
    namespace="production"
)
```

### Environment Inspection

```bash
# Check environment variables
kubectl_run_image(
    image="busybox",
    command="echo $KUBERNETES_SERVICE_HOST",
    namespace="default"
)
```

## Security Considerations

### Image Whitelisting

- Only pre-approved container images can be used
- Images should be from trusted registries
- Consider using specific image tags rather than `latest`

### Command Validation

- All commands are validated against regex patterns
- Dangerous commands (file writes, network changes) should not be allowed
- Use restrictive patterns that only allow necessary operations

### Namespace Restrictions

- The toolset validates namespace names for safety
- Namespaces must match safe naming patterns
- Consider restricting to specific namespaces in production
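
The docs do not say which naming pattern the toolset enforces. As a sketch under that caveat, the check below uses the Kubernetes rule that namespace names are RFC 1123 DNS labels (lowercase alphanumerics and `-`, starting and ending with an alphanumeric, at most 63 characters) and adds an optional allowlist. The function name and the `allowed` parameter are hypothetical.

```python
import re
from typing import Collection, Optional

# RFC 1123 DNS label, the format Kubernetes requires for namespace names.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_safe_namespace(name: str, allowed: Optional[Collection[str]] = None) -> bool:
    """Return True if the name is a valid DNS label and, if given, is in the allowlist."""
    if len(name) > 63 or not _DNS_LABEL.fullmatch(name):
        return False
    return allowed is None or name in allowed
```

Validating the name before it reaches the command line means values such as `default; rm -rf /` are rejected outright, and the optional allowlist covers the suggestion to restrict production use to specific namespaces.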

### Resource Management

- Pods are automatically cleaned up using `--rm` flag
- Set appropriate timeouts to prevent hanging pods
- Monitor resource usage and set limits if needed

## Common Image Recommendations

### Network Debugging
- `busybox` - Basic utilities including nslookup, ping, telnet
- `curlimages/curl` - HTTP testing and API calls
- `registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3` - DNS utilities

### Database Testing
- `postgres:alpine` - PostgreSQL client tools
- `mysql:8.0` - MySQL client tools
- `redis:alpine` - Redis client tools

### Security Scanning
- `aquasec/trivy` - Vulnerability scanning
- `clair-scanner` - Container security scanning

## Troubleshooting

### Permission Issues

Ensure the Holmes service account has the necessary RBAC permissions:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: holmes-kubectl-run
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "get", "list", "delete"]
```

### Image Pull Errors

- Verify images exist and are accessible from the cluster
- Check image registry authentication if using private images
- Ensure image names include full registry paths when needed

### Command Validation Failures

- Check that commands match the configured regex patterns exactly
- Test regex patterns separately to ensure they work as expected
- Remember that patterns are matched against the entire command string
5 changes: 1 addition & 4 deletions docs/data-sources/builtin-toolsets/kubernetes.md
@@ -1,9 +1,6 @@
# Kubernetes Toolsets

## Core ✓

!!! info "Enabled by Default"
    This toolset is enabled by default and should typically remain enabled.
## Core

By enabling this toolset, HolmesGPT will be able to describe and find Kubernetes resources like nodes, deployments, pods, etc.

12 changes: 1 addition & 11 deletions helm/holmes/values.yaml
@@ -42,17 +42,7 @@ serviceAccount:
  imagePullSecrets: []
  annotations: {}

toolsets:
  kubernetes/core:
    enabled: true
  kubernetes/logs:
    enabled: true
  robusta:
    enabled: true
  internet:
    enabled: true
  prometheus/metrics:
    enabled: true
toolsets: {}
mcp_servers: {}

resources:
13 changes: 13 additions & 0 deletions holmes/core/toolset_manager.py
@@ -107,10 +107,18 @@ def _list_all_toolsets(
            toolsets_from_config,
            toolsets_by_name,
        )
        print("****************** toolset_manager._list_all_toolsets 1.2")
        for _, toolset in toolsets_by_name.items():
            print(f"toolset={toolset.name} enabled={toolset.enabled}")

        # custom toolset should not override built-in toolsets
        # to test the new change of built-in toolset, we should make code change and re-compile the program
        custom_toolsets = self.load_custom_toolsets(builtin_toolsets_names)
        print(
            "****************** toolset_manager._list_all_toolsets custom_toolsets"
        )
        for toolset in custom_toolsets:
            print(f"toolset={toolset.name} enabled={toolset.enabled}")
        self.add_or_merge_onto_toolsets(
            custom_toolsets,
            toolsets_by_name,
@@ -355,6 +363,10 @@ def _load_toolsets_from_paths(
            logging.debug("No toolsets configured, skipping loading toolsets")
            return []

        print(
            f"** ** _load_toolsets_from_paths \n\ttoolset_paths={toolset_paths}\n\t{builtin_toolsets_names}"
        )

        loaded_custom_toolsets: List[Toolset] = []
        for toolset_path in toolset_paths:
            if not os.path.isfile(toolset_path):
@@ -369,6 +381,7 @@ def _load_toolsets_from_paths(
            toolsets_config: dict[str, dict[str, Any]] = parsed_yaml.get("toolsets", {})
            mcp_config: dict[str, dict[str, Any]] = parsed_yaml.get("mcp_servers", {})

            print(f"** ** toolsets_config={toolsets_config}")
            for server_config in mcp_config.values():
                server_config["type"] = ToolsetType.MCP.value

1 change: 0 additions & 1 deletion holmes/plugins/toolsets/__init__.py
@@ -130,7 +130,6 @@ def load_builtin_toolsets(dal: Optional[SupabaseDal] = None) -> List[Toolset]:
        toolset.type = ToolsetType.BUILTIN
        # dont' expose build-in toolsets path
        toolset.path = None

    return all_toolsets  # type: ignore


1 change: 1 addition & 0 deletions holmes/plugins/toolsets/aks-node-health.yaml
@@ -1,6 +1,7 @@
toolsets:
  aks/node-health:
    description: "Set of tools to troubleshoot AKS node health issues"
    enabled: False
    tags:
      - cli
    prerequisites:
1 change: 1 addition & 0 deletions holmes/plugins/toolsets/aks.yaml
@@ -3,6 +3,7 @@ toolsets:
    description: "Set of tools to read Azure Kubernetes Service resources"
    tags:
      - cli
    enabled: False
    prerequisites:
      - command: "az account show"
      - command: "az aks --help"
1 change: 1 addition & 0 deletions holmes/plugins/toolsets/argocd.yaml
@@ -3,6 +3,7 @@ toolsets:
    description: "Set of tools to get argocd metadata like list of apps, repositories, projects, etc."
    docs_url: "https://holmesgpt.dev/data-sources/builtin-toolsets/argocd/"
    icon_url: "https://argo-cd.readthedocs.io/en/stable/assets/logo.png"
    enabled: False
    llm_instructions: |
      You have access to a set of ArgoCD tools for debugging Kubernetes application deployments.
      If an application's name does not exist in kubernetes, it may exist in argocd: call the tool `argocd_app_list` to find it.