Consider including hello-world example eval #112

@MattFisher

When getting started with k8s and Inspect Evals, it can be difficult to debug the basics.

I was trying to get inspect_evals/swebench working with k8s, but it brought a lot of extra complexity, so I wrote a small test task. It might be a useful inclusion for others going through the same thing.

```python
# k8s_task.py

from inspect_ai import Task, task
from inspect_ai.dataset import MemoryDataset, Sample
from inspect_ai.scorer import includes
from inspect_ai.solver import generate, use_tools
from inspect_ai.tool import bash


@task
def hello_k8s() -> Task:
    samples = [
        Sample(
            input="Get the OS version codename using `cat /etc/os-release`.",
            target="bookworm",
        ),
        Sample(
            input="Get info on the web server version running at my-web-server.com.",
            target="nginx/1.27.0",
        ),
    ]
    return Task(
        dataset=MemoryDataset(samples=samples),
        solver=[
            use_tools([bash()]),
            generate(),
        ],
        sandbox=("k8s", "values.yaml"),
        scorer=includes(),
    )
```
```yaml
# values.yaml

services:
  default:
    image: python:3.12-bookworm
    command: ["tail", "-f", "/dev/null"]
  server:
    image: nginx:1.27.0
    dnsRecord: true
    additionalDnsRecords:
      - "my-web-server.com"
    readinessProbe:
      tcpSocket:
        port: 80
```
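
For anyone debugging a failing sample, it may help to see why the two targets should match. The sketch below is a simplified stand-in for substring scoring, not the actual `includes()` implementation in inspect_ai. The helper name `includes_like`, the case-insensitive default, and the sample answers are all assumptions made up for illustration.

```python
# Simplified stand-in for substring scoring. This is NOT inspect_ai's
# implementation; `includes_like` is a hypothetical helper.
def includes_like(output: str, target: str, ignore_case: bool = True) -> bool:
    """Return True if `target` appears anywhere in `output`."""
    if ignore_case:  # assumed default, check the library docs
        return target.lower() in output.lower()
    return target in output


# Hypothetical model answers, invented for illustration only.
os_answer = "The codename is bookworm (VERSION_CODENAME=bookworm)."
server_answer = "The response header says `Server: nginx/1.27.0`."
wrong_answer = "The server reports nginx/1.25.3."

os_ok = includes_like(os_answer, "bookworm")
server_ok = includes_like(server_answer, "nginx/1.27.0")
wrong_ok = includes_like(wrong_answer, "nginx/1.27.0")
print(os_ok, server_ok, wrong_ok)
```

Under these assumptions, a sample passes only when the model's final answer repeats the exact codename or version string, so a failure can point either at the sandbox setup or simply at the model's wording.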
