Feat/add testing structure to v2#36

Draft
victorstevansuse wants to merge 26 commits into stackpack-v2 from feat/add-testing-structure-to-v2

@victorstevansuse victorstevansuse commented Mar 20, 2026

Add comprehensive test suite, developer tooling, and local test infrastructure

Summary

  • Static validation tests that parse and validate all STY templates, Groovy scripts, monitors, metric bindings, and views.
  • Snapshot tests that detect any unintended change to the extension's 199 topology nodes.
  • Fuzz tests that verify parser robustness.
  • bats tests for init.sh install/uninstall logic with mocked sts CLI
  • Groovy linting integrated via npm-groovy-lint
  • Local test environment via K3d with SUSE Observability, production-mirror OTel Collector, and pluggable AI components (QDrant, Ollama, Milvus, OpenSearch, vLLM)
  • Demo apps integration (suse-ai-demo-apps) for manual exploration of the full telemetry pipeline
  • task check: a single command that validates everything before pushing. Effectively a very primitive form of static analysis.
  • task deploy: safe deployment that validates before uploading

Still WIP

  • Integration tests: currently just stubs of what I want to do.
  • The infra runs OK in K3d, but I'm still tweaking the collector and demo app config.

Bugs found and fixed

  • products.sty referenced ...request-sucess-rate (typo) instead of ...success-rate — the Milvus success rate metric was silently missing from the UI
  • suse-ai-product-id-extractor.groovy had an unused typeName variable flagged by the linter
  • init.sh used xargs, which is not available in the container base image

What's included

Test infrastructure (tests/)

tests/
├── static/                     # 34 tests — no infra needed
│   ├── certains_test.go        # CERTAINS.md + knowledge/ fact enforcement
│   ├── config_test.go          # stackpack.conf file reference validation
│   ├── packaging_test.go       # Include resolution, circular deps
│   ├── sty_test.go             # Unique IDs and identifiers
│   ├── monitors_test.go        # Remediation hint references 
│   ├── groovy_test.go          # Product catalog, type mappings, toString safety
│   ├── crossref_test.go        # Metric binding cross-references
│   ├── views_test.go           # View → ViewType → Menu cross-references
│   └── snapshot_test.go        # Golden file regression detection
├── init/
│   └── init_test.bats          # init.sh tests with mocked sts CLI 
├── integration/                # Stubs for deploy-and-verify tests 
├── internal/
│   ├── parser/                 # STY, HOCON, Groovy parsers + fuzz tests
│   ├── snapshot/               # Golden file comparison utility
│   ├── stackstate/             # REST API client
│   ├── otel/                   # OTel fixture builder
│   └── testutil/               # Path resolution helpers
├── testdata/snapshots/         # 6 golden files
└── infra/
    ├── setup.sh                # K3d lifecycle manager
    ├── otel-values.yaml        # Production-mirror OTel Collector Helm values
    ├── otel-config.yaml        # Standalone OTel config for docker-compose
    ├── docker-compose.yaml     # OTel Collector for use with external StackState
    ├── components/             # K8s manifests for AI components
    │   ├── qdrant.yaml         # Vector database (default)
    │   ├── ollama.yaml         # Inference engine, CPU mode (default)
    │   ├── milvus.yaml         # Vector database (opt-in)
    │   ├── opensearch.yaml     # Search engine (opt-in)
    │   └── vllm.yaml           # Inference engine, CPU mode (opt-in)
    └── demo-apps/
        └── values.yaml         # Helm values for suse-ai-demo-apps

Taskfile targets

Command                 Purpose
task check              Lint + static tests + init tests
task check SILENT=1     Same, with minimal output
task deploy             check → version-up → upload
task lint               Groovy linting
task lint-fix           Auto-fix Groovy lint issues
task test-static        34 Go static validation tests
task test-init          10 bats tests for init.sh
task test-integration   Integration tests (needs running StackState)
task infra-up           Provision K3d cluster with full stack
task infra-down         Tear down K3d cluster
task infra-status       Show pod status across both namespaces

Local test environment

task infra-up deploys into a K3d cluster:

  • SUSE Observability (suse-observability namespace) — full trial deployment via Helm
  • OTel Collector (suse-private-ai namespace) — mirrors production config with all GenAI inference pipelines, tail sampling, spanmetrics, and component-specific transforms
  • QDrant + Ollama (default) — vector database and inference engine
  • Demo apps (default) — RAG pipeline generating realistic GenAI telemetry for manual exploration
  • Milvus, OpenSearch, vLLM (opt-in via DEPLOY_MILVUS=true, etc.)

The OTel collector image defaults to otel/opentelemetry-collector-contrib:0.147.0 and can be switched to the custom build (e.g. otelcol-suse-ai) via OTEL_COLLECTOR_IMAGE.

Note:
I don't think this will ever be suitable for robust tests given the GPU constraints of an AI stack, but it still helps with some tests.

What the static tests enforce

From knowledge/CERTAINS.md:

  • Icon base64 has valid prefix, encoding, and is single-line
  • Sync nodes have componentActions field
  • Include paths don't have double provisioning/ prefix
  • QueryViews have queryVersion field
  • ComponentType highlights have about section
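
To illustrate the icon rule above, here is a minimal sketch of what such a check could look like. The function name `validateIcon` and the assumed `data:image/<type>;base64,<payload>` shape are illustrative guesses, not taken from the actual test code.

```go
package main

import (
	"encoding/base64"
	"errors"
	"strings"
)

// validateIcon checks an icon value of the assumed form
// "data:image/<type>;base64,<payload>". The exact prefix accepted by
// the real tests is an assumption made for this sketch.
func validateIcon(icon string) error {
	// Single-line rule: reject any embedded line break.
	if strings.ContainsAny(icon, "\r\n") {
		return errors.New("icon must be a single line")
	}
	// Prefix rule (assumed prefix).
	if !strings.HasPrefix(icon, "data:image/") {
		return errors.New("icon must start with data:image/")
	}
	_, payload, found := strings.Cut(icon, ";base64,")
	if !found {
		return errors.New("icon is missing the ;base64, marker")
	}
	if payload == "" {
		return errors.New("icon payload is empty")
	}
	// Encoding rule: the payload must decode as standard base64.
	if _, err := base64.StdEncoding.DecodeString(payload); err != nil {
		return errors.New("icon payload is not valid base64: " + err.Error())
	}
	return nil
}
```

A static test would call this once per icon value extracted from the STY files and fail with the returned error.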

From knowledge/RECOVERY_PROTOCOL.md and knowledge/MONITOR_CREATION_GUIDE.md:

  • ComponentType highlights have events, externalComponent, relatedResources
  • Monitors have description, status, intervalSeconds, arguments.metric
  • Child STY files don't have nodes: root key
  • Metric binding URN references resolve to existing bindings
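
The URN cross-reference rule is the kind of check that catches a misspelled metric binding reference. A rough sketch of the idea, assuming the defined and referenced URNs have already been extracted into slices (the name `unresolvedRefs` and the URNs in the usage note are hypothetical):

```go
package main

import "sort"

// unresolvedRefs returns every referenced URN that has no matching
// definition. Duplicates are reported once and the result is sorted so
// test output stays stable.
func unresolvedRefs(defined, referenced []string) []string {
	known := make(map[string]struct{}, len(defined))
	for _, urn := range defined {
		known[urn] = struct{}{}
	}

	reported := make(map[string]bool)
	var missing []string
	for _, urn := range referenced {
		if _, ok := known[urn]; ok || reported[urn] {
			continue
		}
		reported[urn] = true
		missing = append(missing, urn)
	}
	sort.Strings(missing)
	return missing
}
```

For example, with a definition `urn:example:metric-binding:success-rate` and a reference to `urn:example:metric-binding:sucess-rate`, the function returns the misspelled URN, and the test fails instead of the metric silently disappearing from the UI.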

Structural validation:

  • All include paths resolve to real files (recursive), no circular includes
  • All node IDs and identifiers are unique
  • Groovy scripts handle all known products, types, and use .toString() on externalId
  • QueryViews reference existing ViewTypes, MainMenuGroup items reference existing QueryViews
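
The "no circular includes" check can be sketched as a depth-first walk over the include graph. This is an illustration only: it assumes the graph is already available as a map from file to included files, and `findIncludeCycle` is a hypothetical name, not the function used in packaging_test.go.

```go
package main

import "sort"

// findIncludeCycle walks the include graph depth-first and returns the
// first cycle found as a path such as [a b c a], or nil if there is none.
func findIncludeCycle(includes map[string][]string) []string {
	const (
		unvisited = iota
		inProgress
		done
	)
	state := make(map[string]int)
	var path []string // files on the current DFS branch

	var visit func(file string) []string
	visit = func(file string) []string {
		switch state[file] {
		case done:
			return nil
		case inProgress:
			// Back edge: the file is already on the current branch.
			start := 0
			for i, f := range path {
				if f == file {
					start = i
					break
				}
			}
			cycle := append([]string{}, path[start:]...)
			return append(cycle, file)
		}
		state[file] = inProgress
		path = append(path, file)
		for _, child := range includes[file] {
			if cycle := visit(child); cycle != nil {
				return cycle
			}
		}
		path = path[:len(path)-1]
		state[file] = done
		return nil
	}

	// Visit roots in sorted order so the reported cycle is deterministic.
	roots := make([]string, 0, len(includes))
	for file := range includes {
		roots = append(roots, file)
	}
	sort.Strings(roots)
	for _, root := range roots {
		if cycle := visit(root); cycle != nil {
			return cycle
		}
	}
	return nil
}
```

Returning the offending path rather than a boolean makes the test failure message point directly at the files that need fixing.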

Snapshot regression detection (golden files):

  • Component types (27), metric bindings (110), monitors (15), views (23), include graph (38), Groovy switch cases (3 scripts)

Test plan

  • task check passes (lint + 34 static tests + 10 init tests)
  • task check SILENT=1 runs with minimal output
  • task lint-fix auto-fixes Groovy style issues
  • Fuzz tests run 10s each without crashes
  • Integration test stubs compile
  • Golden files are reviewed and committed
  • task infra-up deploys successfully and task infra-status shows all pods running
  • Run task deploy against a test instance to verify the Milvus metric fix
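
As a reference for the fuzz item in the test plan, a Go fuzz target generally has the shape sketched below. `parseKeyValue` is a tiny stand-in written for this example, not one of the parsers in tests/internal/parser/, and in a real project the fuzz function would live in a `_test.go` file.

```go
package main

import (
	"strings"
	"testing"
)

// parseKeyValue is a hypothetical stand-in parser: it splits "key: value"
// at the first colon and reports whether a non-empty key was found.
func parseKeyValue(line string) (key, value string, ok bool) {
	k, v, found := strings.Cut(line, ":")
	k = strings.TrimSpace(k)
	if !found || k == "" {
		return "", "", false
	}
	return k, strings.TrimSpace(v), true
}

// FuzzParseKeyValue feeds arbitrary strings to the parser. The target
// fails if the parser panics or breaks its own invariants.
func FuzzParseKeyValue(f *testing.F) {
	// Seed corpus: a few representative inputs.
	f.Add("name: value")
	f.Add(":")
	f.Add("")
	f.Fuzz(func(t *testing.T, input string) {
		key, _, ok := parseKeyValue(input)
		if ok && key == "" {
			t.Errorf("ok result with empty key for input %q", input)
		}
		if ok && strings.Contains(key, ":") {
			t.Errorf("key %q contains a colon for input %q", key, input)
		}
	})
}
```

A 10-second run of a target like this would be `go test -fuzz=FuzzParseKeyValue -fuzztime=10s` from the package directory.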
