
Commit 471de5b

michaeldwan, Matt Dwan, and markphelps authored
Add Go-based integration test framework using testscript (#2622)
* Add Go-based integration test framework using testscript

Set up foundation for new integration tests:
- integration-tests/go.mod with testscript dependency
- harness package for cog binary resolution and custom commands
- suite_test.go with TestIntegration that discovers fixtures
- string-echo fixture with basic.txtar test case

The framework copies fixture files into isolated temp directories, provides a custom 'cog' command for testscript, and supports parallel test execution across fixtures.

Run tests with: cd integration-tests && go test -v

Closes cog-iws.1

* refactor(integration-tests): use embedded fixtures in txtar files

- Embed cog.yaml and predict.py directly in txtar test files
- Remove fixtures/ directory and file copying logic
- Simplify harness to just configure env vars (HOME, TEST_IMAGE)
- Simplify suite_test.go to point testscript at tests/ dir
- Group multiple assertions per fixture (build once, test many)
- Add unique image names and automatic cleanup per test

* feat(integration-tests): add on-demand binary building and more tests

- Add automatic cog binary building when COG_BINARY not set
- Builds wheels if needed, caches binary in .bin/cog
- Finds repo root by looking for go.mod with correct module
- Add 4 new test cases: async_predictor, env_vars, file_input, int_predictor
- Add .gitignore for cached binary directory

* feat(integration-tests): add Makefile target and 9 more test cases

- Add test-integration-go Makefile target with -parallel 4 throttling
- Add tests: optional_input, path_output, path_list_output, path_list_input, string_list_input, subdirectory_predictor, union_type, many_inputs, train_basic

Total: 14 integration tests now passing

* Port 13 additional integration tests to Go testscript framework

New tests:
- path_input_output: Path input/output with setup method
- path_input: Path input type reading file content
- file_list_input: list[File] input type
- complex_output: Pydantic BaseModel output
- function_predictor: Function-based predictor (no class)
- python313: Python 3.13 support
- pydantic2: Explicit Pydantic 2 dependency
- optional_path_input: Optional Path with None default
- future_annotations: __future__ annotations support
- int_none_output: Int return type returning None
- string_none_output: Str return type returning None
- invalid_int_validation: Schema validation for invalid defaults
- no_predictor: Error case for missing predictor

Total tests now: 27 (up from 14)

* Add Go integration tests to CI with runtime matrix

Runs the new testscript-based integration tests with:
- cog runtime (blocking)
- coglet-alpha runtime (non-blocking, may have error message differences)

* Port 22 additional integration tests to Go testscript framework

Phase 1 (8 tests): Simple build tests
- apt_packages, ffmpeg_package, zsh_package, local_whl_install
- install_requires_packaging, bad_dockerignore, pydantic2_output, python37_deprecated

Phase 2 (7 tests): Advanced features
- secrets, overrides, training_setup, fast_build
- migration, migration_no_python_changes, migration_gpu

Phase 3 (7 tests): Edge cases
- cog_runtime_float, cog_runtime_int, glb_project, granite_project
- async_sleep, complex_types, complex_types_list

Total ported: 49 tests (up from 27)

* Remove Python integration tests that are now ported to Go

Removed 39 Python tests and 39 fixture directories that have been fully ported to the Go testscript framework:
- Deleted test_migrate.py (3 tests → migration*.txtar)
- Deleted test_train.py (3 tests → train_basic, pydantic2_output, training_setup)
- Removed 12 tests from test_build.py
- Removed 21 tests from test_predict.py

Remaining Python tests: 46 (build: 18, config: 1, predict: 20, run: 7)

These cover functionality not yet ported: base images, labels, torch/tensorflow, subprocess handling, JSON I/O, cog run, pipeline tests.

* Fix COG_BINARY resolution to use repo root for relative paths

* Fix local_whl_install test to include proper WHEEL file in package

* Set BUILDKIT_PROGRESS=plain to reduce Docker output noise in tests

* Port framework/GPU tests to Go testscript with [slow] skip condition

- Add [slow] condition to harness (skip with COG_TEST_FAST=1)
- Add 7 new txtar tests for torch/tensorflow builds
- Remove ported Python tests and fixtures
- Update .gitignore for local planning files

* Increase test parallelism for Go integration tests

- Add TEST_PARALLEL env var to control concurrency (default 4)
- Set TEST_PARALLEL=8 in CI for 16-core runners
- Remove continue-on-error for coglet-alpha tests

* Add HTTP server testing support and port subprocess tests

Extends the Go testscript harness with HTTP server testing capabilities and completes migration of subprocess handling tests from Python.

## Harness Enhancements
- Add 'serve' command: starts cog serve in background with automatic port allocation, health checking, and cleanup
- Add 'curl' command: makes HTTP requests to running server for testing predictions via HTTP API
- Improve condition system: clarify [slow] condition to skip tests when COG_TEST_FAST=1

## New Tests (4 subprocess tests)
- setup_subprocess_simple.txtar: subprocess with SIGUSR1 signals
- setup_subprocess_double_fork.txtar: double fork daemonization
- setup_subprocess_double_fork_http.txtar: double fork + HTTP server
- setup_subprocess_multiprocessing.txtar: Python multiprocessing (currently skipped - needs debugging)

## Python Test Cleanup
Removed obsolete Python tests now covered by Go tests:
- Deleted test_predict_with_subprocess_in_setup (4 parameterized tests)
- Removed 4 subprocess fixture directories
- Reduced Python test count from 46 to 37

## Test Coverage
Total Go integration tests: 60 (up from 56)
Remaining Python tests: 37 (focus on CLI flags, cog run, JSON I/O)

The test suites are now complementary:
- Go tests: core predictor functionality, builds, types, server behavior
- Python tests: CLI flags (--json, -o), commands (cog run/init/install)

* Update CONTRIBUTING.md with new Go integration test workflow

- Document Go integration tests as primary test suite (60 tests)
- Add instructions for running Go tests with testscript
- Explain COG_TEST_FAST for skipping slow tests
- Show how to write new integration tests with .txtar format
- Add examples for basic predictor and server testing
- Update project structure to reflect integration-tests/ directory
- Clarify Python tests are supplementary (CLI flags & tooling)

* Make cog subcommand syntax consistent across tests

Use 'cog serve' instead of 'serve' for consistency with other cog subcommands (build, predict, etc.). This makes the test syntax clearer and sets up for adding more subcommands in the future.

Changes:
- Refactor cmdCog to use switch statement for subcommand routing
- Add comment showing where to add future subcommands (cog run, etc.)
- Update all subprocess tests to use 'cog serve'
- Update CONTRIBUTING.md examples
- Update PR description examples

The switch statement makes it easy to add special handling for other subcommands like 'cog run' in the future.

* chore: update gitignore

Signed-off-by: Mark Phelps <[email protected]>

* Restore async-sleep-project fixture for test_concurrent_predictions

The test_concurrent_predictions Python test requires the async-sleep-project fixture to test concurrent async predictions with server shutdown. This test is unique and not covered by the Go async_sleep.txtar test.

Recreated fixture from Go test for Python test compatibility.

* Skip flaky setup_subprocess_double_fork test in CI

This test consistently fails in CI with connection refused errors, suggesting the double forked process doesn't start reliably in the CI environment. The test passes locally but fails in CI for both cog and coglet integration tests.

Root cause needs investigation - likely related to timing/environment differences between local and CI, or how the double fork interacts with Docker in CI.

Skipping for now to unblock CI while we investigate.

* Fix flaky subprocess integration tests with wait-for and retry-curl commands

Replace hard-coded sleep delays with proper synchronization mechanisms:
- Add wait-for command: Poll for file/http/content conditions with timeout
- Add retry-curl command: HTTP requests with automatic retry logic
- Update subprocess tests to signal readiness via files
- Remove skip markers from previously flaky tests

Subprocess test improvements:
- setup_subprocess_simple: Wait for .ready file, use retry-curl
- setup_subprocess_double_fork: Wait for .forked-ready file, 60s timeout
- setup_subprocess_double_fork_http: Wait for HTTP endpoint availability
- setup_subprocess_multiprocessing: Wait for .ponger-ready file

Benefits:
- Eliminates race conditions from fixed sleep delays
- CI-friendly with 60s timeouts for slower environments
- Self-documenting readiness requirements
- Resilient to timing variations between local and CI

Updated CONTRIBUTING.md with documentation and examples for new commands.

* Suppress Docker build output in integration tests using BUILDKIT_PROGRESS=quiet

Fix cog CLI to respect BUILDKIT_PROGRESS environment variable for the --progress flag default. Previously the CLI always defaulted to 'auto', ignoring the env var. Now the env var takes precedence.

Changes:
- pkg/cli/build.go: Check BUILDKIT_PROGRESS env var before defaulting
- integration-tests/harness/harness.go: Set BUILDKIT_PROGRESS=quiet

This makes integration test output much cleaner by hiding the verbose Docker build step-by-step progress (#1 [internal] load..., etc.) while still showing build status messages and any errors.

* Fix subprocess tests: remove wait-for file (doesn't work with Docker)

The wait-for file command checks for files on the host, but subprocess tests create files inside Docker containers started by cog serve. This caused all subprocess tests to timeout waiting for files that would never appear on the host.

Fix: Remove wait-for file usage and rely on retry-curl with generous retries (30 attempts, 1s delay) to handle subprocess initialization. The cog server's health check ensures the server is ready, and retries handle any additional subprocess startup time.

Changes:
- Remove wait-for file from all subprocess tests
- Remove file-based readiness signaling from Python scripts
- Increase retry-curl attempts to 30 for first prediction
- Update CONTRIBUTING.md to remove wait-for examples

* Add README for integration tests

Comprehensive documentation covering:
- Quick start commands
- Directory structure
- Writing tests (txtar format, embedded fixtures)
- Environment variables
- Custom commands (cog, curl, retry-curl, wait-for)
- Conditions ([slow])
- Built-in testscript commands
- Common test patterns
- Debugging tips
- Common issues and solutions

* Add editor support section to integration tests README

Document syntax highlighting options for .txtar files:
- VS Code: twpayne.vscode-testscript and brody715/vscode-txtar
- Zed: FollowTheProcess/zed-txtar
- Vim/Neovim: basic suggestions

* Fix health check to wait for READY status before returning

The waitForServer function was only checking for HTTP 200 status, but the cog server returns 200 even during STARTING state while setup() is still running. This caused race conditions where tests would start making predictions before setup completed.

Changes:
- Update waitForServer to parse the JSON response and wait for status=READY (meaning setup completed successfully)
- Return early if status is SETUP_FAILED or DEFUNCT
- Increase HTTP client timeout to 5s for more reliable health checks
- Capture server stdout/stderr for better debugging on failures

Also fix setup_subprocess_multiprocessing.txtar:
- Skip directories when cleaning up *.tmp files (was failing on .tmp dir)
- Update assertions to check logs instead of output format (Path returns base64-encoded content, not the file path)

* Fix race condition and cleanup in test harness

- Add mutex to protect serverProcs map from concurrent access
- Key serverProcs by work directory instead of TestScript pointer
- Fix cleanup to only stop current test's server, not all servers
- Use errors.Is(err, io.EOF) instead of string comparison

* Fix flaky double_fork_http test: wait for HTTP server to be ready

The test spawns a background HTTP server during setup() but wasn't waiting for it to be ready before returning. This caused predict() to fail with connection refused when trying to connect to the server.

Added a retry loop in setup() to wait up to 15 seconds for the background HTTP server to accept connections.

---------

Signed-off-by: Mark Phelps <[email protected]>
Co-authored-by: Matt Dwan <[email protected]>
Co-authored-by: Mark Phelps <[email protected]>
Co-authored-by: Mark Phelps <[email protected]>
1 parent 67ad421 commit 471de5b
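
The health-check fix described in the commit message (wait for READY, stop early on SETUP_FAILED or DEFUNCT, keep polling otherwise) can be illustrated with a small decision function. This is a sketch only, not the harness's actual `waitForServer`: the `classifyHealth` name, the `healthState` type, and the assumption that the response body carries the state in a JSON field called `status` are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// healthState tells a polling loop what to do after one health-check response.
type healthState int

const (
	healthWait   healthState = iota // keep polling (e.g. STARTING, or an unparseable body)
	healthReady                     // setup() finished; predictions can be sent
	healthFailed                    // terminal state; stop polling and fail the test
)

// classifyHealth inspects the JSON body of a health-check response.
// The status strings come from the commit message; the "status" field
// name is an assumption made for this sketch.
func classifyHealth(body []byte) healthState {
	var resp struct {
		Status string `json:"status"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return healthWait
	}
	switch resp.Status {
	case "READY":
		return healthReady
	case "SETUP_FAILED", "DEFUNCT":
		return healthFailed
	default:
		return healthWait
	}
}

func main() {
	for _, body := range []string{
		`{"status":"STARTING"}`,
		`{"status":"READY"}`,
		`{"status":"SETUP_FAILED"}`,
	} {
		fmt.Println(body, "->", classifyHealth([]byte(body)))
	}
}
```

Checking the parsed state rather than the HTTP status code is the point of the fix: a 200 response during STARTING falls into the keep-polling branch instead of being treated as ready.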

191 files changed: +3967 -2515 lines changed

.github/workflows/ci.yaml

Lines changed: 38 additions & 0 deletions
@@ -62,6 +62,7 @@ jobs:
       - test-go
       - test-python
       - test-integration
+      - test-integration-go
       - test-coglet-go
       - test-coglet-python
     runs-on: ubuntu-latest
@@ -155,6 +156,43 @@ jobs:
       - name: Test coglet Python
         run: uv run --project coglet pytest coglet/python/tests -v
 
+  # Go-based integration tests using testscript framework
+  test-integration-go:
+    name: "Test integration Go (${{ matrix.runtime }})"
+    needs: build-python
+    runs-on: ubuntu-latest-16-cores
+    timeout-minutes: 30
+    strategy:
+      fail-fast: false
+      matrix:
+        runtime: [cog, coglet-alpha]
+    steps:
+      - uses: actions/checkout@v6
+        with:
+          fetch-depth: 0
+      - name: Login to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          registry: index.docker.io
+          username: ${{ secrets.DOCKERHUB_USERNAME }}
+          password: ${{ secrets.DOCKERHUB_TOKEN }}
+      - name: Download built wheels
+        uses: actions/download-artifact@v6
+        with:
+          path: dist
+          merge-multiple: true
+      - uses: actions/setup-go@v5
+        with:
+          go-version-file: go.mod
+      - name: Build cog binary
+        run: make cog
+      - name: Run Go integration tests
+        env:
+          COG_WHEEL: ${{ matrix.runtime }}
+          COG_BINARY: ./cog
+          TEST_PARALLEL: 8
+        run: make test-integration-go
+
   # TODO[md]: This is a gross hack, remove once this is sorted out: https://github.com/replicate/cog/pull/2353
   # cannot run this on mac due to licensing issues: https://github.com/actions/virtual-environments/issues/2150
   test-integration:
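
The CI job sets `COG_BINARY: ./cog` at the repository root, while the `test-integration-go` Makefile target changes into `integration-tests/` before running `go test`. That is why the commit message mentions resolving relative `COG_BINARY` values against the repo root. A minimal sketch of that resolution follows; `resolveCogBinary` and its parameters are hypothetical names for illustration and are not taken from the harness source.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveCogBinary is a hypothetical helper: it turns a COG_BINARY value into
// an absolute path, treating relative values as relative to repoRoot rather
// than to the directory `go test` happens to run in.
func resolveCogBinary(repoRoot, value string) string {
	if value == "" {
		return "" // unset: the caller would fall back to building the binary on demand
	}
	if filepath.IsAbs(value) {
		return filepath.Clean(value)
	}
	return filepath.Join(repoRoot, value)
}

func main() {
	// Example paths are made up; they assume a Unix-style filesystem.
	fmt.Println(resolveCogBinary("/work/cog", "./cog"))
	fmt.Println(resolveCogBinary("/work/cog", "/usr/local/bin/cog"))
}
```

Without this step, `./cog` would be looked up inside `integration-tests/`, where no binary exists.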

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -29,3 +29,6 @@ coglet/python/coglet/_version.py
 
 # Built coglet-server binaries
 coglet/python/cog/bin/
+
+# Local planning files
+docs/plans/**

ARCHITECTURE.md

Lines changed: 600 additions & 0 deletions
Large diffs are not rendered by default.

CONTRIBUTING.md

Lines changed: 96 additions & 19 deletions
@@ -144,7 +144,8 @@ As much as possible, this is attempting to follow the [Standard Go Project Layou
 - `pkg/predict/` - Runs predictions on models.
 - `pkg/util/` - Various packages that aren't part of Cog. They could reasonably be separate re-usable projects.
 - `python/` - The Cog Python library.
-- `test-integration/` - High-level integration tests for Cog.
+- `integration-tests/` - Go-based integration tests using testscript (primary test suite).
+- `test-integration/` - Legacy Python integration tests (supplementary - CLI flags and tooling).
 - `tools/compatgen/` - Tool for generating CUDA/PyTorch/TensorFlow compatibility matrices.
 
 ## Updating compatibility matrices
@@ -188,7 +189,7 @@ There are a few concepts used throughout Cog that might be helpful to understand
 script/test # see also: make test
 ```
 
-**To run just the Golang tests:**
+**To run just the Go unit tests:**
 
 ```sh
 script/test-go # see also: make test-go
@@ -203,38 +204,114 @@ script/test-python # see also: make test-python
 > [!INFO]
 > This runs the Python test suite using the default Python version. To run a more comprehensive test across multiple Python versions, use `make test-python`.
 
-**To run just the integration tests:**
+### Integration Tests
+
+Cog has two integration test suites that are complementary:
+
+**Go integration tests (primary - 60 tests):**
+
+Tests core predictor functionality using [testscript](https://pkg.go.dev/github.com/rogpeppe/go-internal/testscript). Each test is a self-contained `.txtar` file in `integration-tests/tests/`.
 
 ```sh
-make test-integration
+# Run all Go integration tests
+make test-integration-go
+
+# Run fast tests only (skip slow GPU/framework tests)
+COG_TEST_FAST=1 make test-integration-go
+
+# Run a specific test
+cd integration-tests && go test -v -run TestIntegration/string_predictor
+
+# Run with a custom cog binary
+COG_BINARY=/path/to/cog make test-integration-go
 ```
 
-**To run a specific Python test:**
+**Python integration tests (supplementary - 37 tests):**
+
+Tests CLI flags, `cog run`, and other tooling features using pytest.
 
 ```sh
-script/test-python python/tests/server/test_http.py::test_openapi_specification_with_yield
+# Run all Python integration tests
+make test-integration
+
+# Run a specific Python integration test
+cd test-integration && uv run tox -e integration -- test_integration/test_build.py::test_build_gpu_model_on_cpu
 ```
 
-**To run a specific Python test under a specific environment**
+**Integration test coverage:**
+- **Go tests**: Core predictors, types, builds, training, subprocess behavior, HTTP server testing
+- **Python tests**: CLI flags (`--json`, `-o`), commands (`cog run`, `cog init`), edge cases
 
-```sh
-uv run tox -e py312-pydantic2-tests -- python/tests/server/test_http.py::test_openapi_specification_with_yield
+### Writing Integration Tests
+
+When adding new functionality, prefer adding Go integration tests in `integration-tests/tests/`. They are:
+- Self-contained (embedded fixtures in `.txtar` files)
+- Faster to run (parallel execution with automatic cleanup)
+- Easier to read and write (simple command script format)
+
+Example test structure:
+
+```txtar
+# Test string predictor
+cog build -t $TEST_IMAGE
+cog predict $TEST_IMAGE -i s=world
+stdout 'hello world'
+
+-- cog.yaml --
+build:
+  python_version: "3.12"
+predict: "predict.py:Predictor"
+
+-- predict.py --
+from cog import BasePredictor
+
+class Predictor(BasePredictor):
+    def predict(self, s: str) -> str:
+        return "hello " + s
 ```
 
-_You can see all the available test environments under `env_list` in the tox.ini file_
+For testing `cog serve`, use `cog serve` and the `curl` command:
 
-**To stand up a server for one of the integration tests:**
+```txtar
+cog build -t $TEST_IMAGE
+cog serve
+curl POST /predictions '{"input":{"s":"test"}}'
+stdout '"output":"hello test"'
+```
 
-```sh
-make install
-pip install -r requirements-dev.txt
-make test
-cd test-integration/test_integration/fixtures/file-project
-cog build
-docker run -p 5001:5000 --init --platform=linux/amd64 cog-file-project
+#### Advanced Test Commands
+
+For tests that require subprocess initialization or async operations, use `retry-curl`:
+
+**`retry-curl` - HTTP request with automatic retries:**
+
+```txtar
+# Make HTTP request with retry logic (useful for subprocess initialization delays)
+# retry-curl [method] [path] [body] [max-attempts] [retry-delay]
+retry-curl POST /predictions '{"input":{"s":"test"}}' 30 1s
+stdout '"output":"hello test"'
+```
+
+**Example: Testing predictor with subprocess in setup**
+
+```txtar
+cog build -t $TEST_IMAGE
+cog serve
+
+# Use generous retries since setup spawns a background process
+retry-curl POST /predictions '{"input":{"s":"test"}}' 30 1s
+stdout '"output":"hello test"'
+
+-- predict.py --
+class Predictor(BasePredictor):
+    def setup(self):
+        self.process = subprocess.Popen(["./background.sh"])
+
+    def predict(self, s: str) -> str:
+        return "hello " + s
 ```
 
-Then visit [localhost:5001](http://localhost:5001) in your browser.
+See existing tests in `integration-tests/tests/`, especially `setup_subprocess_*.txtar`, for more examples.
 
 ## Running the docs server
 
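
The `retry-curl` command documented in the CONTRIBUTING.md diff takes a maximum attempt count and a delay between attempts (for example `30 1s`). The retry behavior behind such a command might look roughly like the loop below. This is a generic sketch under stated assumptions, not the harness implementation: `retry` and `errNotReady` are hypothetical names, and the real command issues HTTP requests rather than calling an arbitrary function.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady simulates the "connection refused" failure seen while a
// background process is still starting.
var errNotReady = errors.New("connection refused")

// retry calls fn up to maxAttempts times, sleeping for delay between failed
// attempts, and returns nil as soon as fn succeeds.
func retry(maxAttempts int, delay time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		return errors.New("maxAttempts must be at least 1")
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = fn(); lastErr == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay) // no sleep after the final failed attempt
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	calls := 0
	err := retry(30, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady // the first two attempts fail
		}
		return nil
	})
	fmt.Println("attempts:", calls, "error:", err)
}
```

Compared with a fixed sleep before a single request, a bounded retry loop returns as soon as the server answers and only waits the full budget in the failure case.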

Makefile

Lines changed: 8 additions & 0 deletions
@@ -55,6 +55,14 @@ test-integration: $(COG_BINARIES)
 	$(GO) test ./pkg/docker/...
 	PATH="$(PWD):$(PATH)" $(TOX) -e integration
 
+# Run Go-based integration tests (testscript)
+# Use TEST_PARALLEL to control concurrency (default 4 to avoid Docker overload)
+# CI with more cores can set TEST_PARALLEL=8 or higher
+TEST_PARALLEL ?= 4
+.PHONY: test-integration-go
+test-integration-go:
+	cd integration-tests && $(GO) test -v -parallel $(TEST_PARALLEL) -timeout 30m $(ARGS) .
+
 .PHONY: test-python
 test-python: generate
 	$(TOX) run --installpkg $$(ls dist/cog-*.whl) -f tests

integration-tests/.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+.bin/
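
The ignored `.bin/` directory corresponds to the on-demand build described in the commit message: when `COG_BINARY` is not set, the harness finds the repo root by looking for a go.mod with the correct module and caches the built binary in `.bin/cog`. The upward search can be sketched as follows. All names here are hypothetical, the go.mod check is replaced by a caller-supplied predicate, and placing the cache under `integration-tests/.bin/` is an inference from the location of this `.gitignore`, not something the commit states.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// findRepoRoot walks upward from start and returns the first directory for
// which isRoot reports true. isRoot stands in (as an assumption of this
// sketch) for "this directory has a go.mod declaring the expected module".
func findRepoRoot(start string, isRoot func(dir string) bool) (string, bool) {
	dir := filepath.Clean(start)
	for {
		if isRoot(dir) {
			return dir, true
		}
		parent := filepath.Dir(dir)
		if parent == dir { // reached the filesystem root without a match
			return "", false
		}
		dir = parent
	}
}

// cachedBinaryPath returns where an on-demand build might be cached.
// The integration-tests/.bin location is assumed, not confirmed.
func cachedBinaryPath(repoRoot string) string {
	return filepath.Join(repoRoot, "integration-tests", ".bin", "cog")
}

func main() {
	// Pretend the module's go.mod lives in /work/cog (a made-up path).
	isRoot := func(dir string) bool { return dir == "/work/cog" }
	root, ok := findRepoRoot("/work/cog/integration-tests/harness", isRoot)
	fmt.Println(root, ok, cachedBinaryPath(root))
}
```

Taking the root check as a function keeps the walk independent of the filesystem, so the same loop works whether the predicate reads go.mod from disk or is stubbed out.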
