Merged
35 commits
915dc91
Add start mariadb
sitaowang1998 Jan 7, 2026
6300ca0
Merge branch 'main' into storage-test
sitaowang1998 Jan 7, 2026
5e5f7a0
Format file
sitaowang1998 Jan 7, 2026
32e4462
Add stop
sitaowang1998 Jan 7, 2026
cd34070
Add get free port
sitaowang1998 Jan 7, 2026
50d0288
Add description
sitaowang1998 Jan 7, 2026
a622008
Refactor init_db into python script
sitaowang1998 Jan 7, 2026
aa7e551
Add db wait and bug fix
sitaowang1998 Jan 8, 2026
37e68cb
Use env for spider-py unit tests
sitaowang1998 Jan 8, 2026
dee2ecb
Add storage in tasks
sitaowang1998 Jan 8, 2026
6a28b85
Cpp integration tests use mariadb
sitaowang1998 Jan 8, 2026
38dee77
Make start mariadb internal
sitaowang1998 Jan 8, 2026
0705901
Use env for storage url in C++ unit tests
sitaowang1998 Jan 8, 2026
c115833
Fix clang-tidy
sitaowang1998 Jan 8, 2026
62d9add
Use general unit tests and add integration tests
sitaowang1998 Jan 8, 2026
82f882e
Update spider-py readme
sitaowang1998 Jan 8, 2026
0f72824
Update test doc
sitaowang1998 Jan 8, 2026
5e871bc
Fix timestamp
sitaowang1998 Jan 8, 2026
0173937
Fix typo
sitaowang1998 Jan 8, 2026
909de43
Fix lint
sitaowang1998 Jan 8, 2026
ef02ec8
Bug fix
sitaowang1998 Jan 8, 2026
ecac810
Fix import
sitaowang1998 Jan 8, 2026
55f7300
Fix GH workflow name
sitaowang1998 Jan 8, 2026
89c8b6b
Fix doc
sitaowang1998 Jan 8, 2026
9b3fa91
Bug fix
sitaowang1998 Jan 8, 2026
f03981e
Fix multiline string
sitaowang1998 Jan 8, 2026
4ca400c
Fix SQL format
sitaowang1998 Jan 9, 2026
b21e361
Restruct file
sitaowang1998 Jan 9, 2026
d0b32cc
Use variable
sitaowang1998 Jan 9, 2026
0678ea1
Stop container if wait fail
sitaowang1998 Jan 12, 2026
26b8cc6
Polishing test tasks.
LinZhihao-723 Jan 12, 2026
e25e09b
Polish docstring.
LinZhihao-723 Jan 12, 2026
068f454
->
LinZhihao-723 Jan 12, 2026
7d21f3c
CSC100: Test your changes before pushing.
LinZhihao-723 Jan 12, 2026
1ae488e
Apply suggestions from code review
sitaowang1998 Jan 12, 2026
@@ -1,4 +1,4 @@
name: "unit-tests"
name: "tests"

on:
pull_request:
@@ -17,7 +17,7 @@ concurrency:
cancel-in-progress: true

jobs:
non-storage-unit-tests:
tests:
strategy:
matrix:
os: ["ubuntu-22.04", "ubuntu-24.04"]
@@ -49,12 +49,14 @@ jobs:
task --version
uv --version

- name: "Install project dependencies "
- name: "Install project dependencies"
Review comment (Member):
Remove space.

timeout-minutes: 10
env:
SPIDER_DEPS_MAX_PARALLELISM_PER_TASK: "1"
run: "task deps:lib_install"

- run: "task test:cpp-non-storage-unit-tests"
- run: "task test:cpp-unit-tests"

- run: "task test:spider-py-non-storage-unit-tests"
- run: "task test:spider-py-unit-tests"

- run: "task test:cpp-integration"
Review comment (Member):
Group unit tests before integration tests.

49 changes: 6 additions & 43 deletions docs/src/dev-docs/testing.md
@@ -1,56 +1,14 @@
# Testing

## Set up storage backend

Spider relies on a fault-tolerant storage to store metadata and data. Spider's unit tests also
require this storage backend.

### Set up MySQL as storage backend

1. Start a MySQL database running in background.
2. Create an empty database.
```sql
CREATE DATABASE <db_name>;
```
3. Set the password for `root` or create another user with password and grant access to database
created in step 2.
```sql
ALTER USER 'root'@'localhost' IDENTIFIED BY '<pwd>';
--- OR create a new user
CREATE USER '<usr>'@'localhost' IDENTIFIED BY '<pwd>';
GRANT ALL PRIVILEGES ON <db_name>.* TO '<usr>'@'localhost';
```
4. Set the `cStorageUrl` in `tests/storage/StorageTestHelper.hpp` to
`jdbc:mariadb://localhost:3306/<db_name>?user=<usr>&password=<pwd>`.

5. Set the `storage_url` in `tests/integration/client.py` to
`jdbc:mariadb://localhost:3306/<db_name>?user=<usr>&password=<pwd>`.

## Running unit tests

You can use the following tasks to run the set of unit tests that's appropriate.

| Task | Description |
|-----------------------------------------|-----------------------------------------------------------------------------|
| `test:cpp-unit-tests` | Runs all C++ unit tests. |
| `test:cpp-non-storage-unit-tests` | Runs all C++ unit tests which don't require a storage backend to run. |
| `test:cpp-storage-unit-tests` | Runs all C++ unit tests which require a storage backend to run. |
| `test:spider-py-unit-tests` | Runs all spider-py unit tests. |
| `test:spider-py-non-storage-unit-tests` | Runs all spider-py unit tests which don't require a storage backend to run. |
| `test:spider-py-storage-unit-tests` | Runs all spider-py unit tests which require a storage backend to run. |

If any tests show error messages for the connection function below, revisit the
[setup section](#set-up-mysql-as-storage-backend) and verify that `cStorageUrl` was set correctly.

```c++
REQUIRE( storage->connect(spider::test::cStorageUrl).success() )
```

## GitHub unit test workflow

The [unit_tests.yaml][gh-workflow-unit-tests] GitHub workflow runs the unit tests on push,
pull requests, and daily. Currently, it only runs unit tests that don't require a storage backend.

## Running integration tests

You can use the following tasks to run integration tests.
@@ -59,5 +17,10 @@ You can use the following tasks to run integration tests.
|------------------------|---------------------------------|
| `test:cpp-integration` | Runs all C++ integration tests. |

## GitHub test workflow

The [tests.yaml][gh-workflow-tests] GitHub workflow runs all unit tests and integration tests on
push, pull requests, and daily.


[gh-workflow-unit-tests]: https://github.com/y-scope/spider/blob/main/.github/workflows/unit-tests.yaml
[gh-workflow-tests]: https://github.com/y-scope/spider/blob/main/.github/workflows/tests.yaml
50 changes: 1 addition & 49 deletions python/spider-py/README.md
@@ -26,60 +26,12 @@ directory at the Spider project root.

## Testing

Unit tests are divided into two categories: storage and non-storage tests. Non-storage tests do not
require any external services, while storage tests require a MariaDB instance to be available.

### Non-Storage Unit Tests

To run all non-storage unit tests:

```shell
task test:spider-py-non-storage-unit-tests
```

### Setup MariaDB for Storage Unit Tests

To run storage unit tests, we need to create a MariaDB instance first.

```shell
docker run \
--detach \
--rm \
--name spider-storage \
--env MARIADB_USER=spider \
--env MARIADB_PASSWORD=password \
--env MARIADB_DATABASE=spider-storage \
--env MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=true \
--publish 3306:3306 mariadb:latest
```

After the docker container starts, set up the database table manually by using the SQL script
`tools/scripts/storage/init_db.sql` from the project root.

```shell
mysql -h 127.0.0.1 -u spider -ppassword spider-storage < tools/scripts/storage/init_db.sql
```

### Storage Unit Tests

To run all storage unit tests:

```shell
task test:spider-py-storage-unit-tests
```

This requires a running MariaDB instance as described above.

### All Unit Tests

To run all unit tests (both storage and non-storage):
To run all unit tests:

```shell
task test:spider-py-unit-tests
```

This requires a running MariaDB instance as described above.

## Linting

To run all linting checks:
4 changes: 3 additions & 1 deletion python/spider-py/tests/client/test_driver.py
@@ -1,5 +1,6 @@
"""Tests for the driver module."""

import os
from dataclasses import dataclass

import pytest
@@ -12,7 +13,8 @@
@pytest.fixture(scope="session")
def driver() -> Driver:
"""Fixture for the driver."""
return Driver(MariaDBTestUrl)
url = os.getenv("SPIDER_STORAGE_URL", MariaDBTestUrl)
return Driver(url)


def double(_: TaskContext, x: Int8) -> Int8:
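
The fixture above reads `SPIDER_STORAGE_URL` from the environment and falls back to the hard-coded test URL when it is unset. A minimal sketch of that lookup follows. `resolve_storage_url` and `DEFAULT_TEST_URL` are hypothetical names used only for illustration; the default is a placeholder standing in for the repo's `MariaDBTestUrl`, whose actual value is not shown in this diff.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Hypothetical placeholder for the repo's `MariaDBTestUrl`; the real value is not shown here.
DEFAULT_TEST_URL = (
    "jdbc:mariadb://127.0.0.1:3306/spider-db?user=spider-user&password=spider-password"
)


def resolve_storage_url(
    environ: Mapping[str, str] | None = None,
    default: str = DEFAULT_TEST_URL,
) -> str:
    """Return `SPIDER_STORAGE_URL` from `environ`, or `default` if the variable is unset."""
    env = os.environ if environ is None else environ
    # Same semantics as `os.getenv(name, default)`: the default applies only when the
    # variable is absent, so an empty-but-set value is returned as an empty string.
    return env.get("SPIDER_STORAGE_URL", default)
```

One consequence worth noting: exporting `SPIDER_STORAGE_URL=""` does not trigger the fallback, so a wrapper that always sets the variable should make sure it is non-empty.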
4 changes: 3 additions & 1 deletion python/spider-py/tests/storage/test_mariadb.py
@@ -1,5 +1,6 @@
"""Tests for the MariaDB storage backend."""

import os
from uuid import uuid4

import msgpack
@@ -15,7 +16,8 @@
@pytest.fixture(scope="session")
def mariadb_storage() -> MariaDBStorage:
"""Fixture to create a MariaDB storage instance."""
params = parse_jdbc_url(MariaDBTestUrl)
url = os.getenv("SPIDER_STORAGE_URL", MariaDBTestUrl)
params = parse_jdbc_url(url)
return MariaDBStorage(params)
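
For readers unfamiliar with the URL format passed through `SPIDER_STORAGE_URL`, the sketch below shows one way a `jdbc:mariadb://host:port/db?user=...&password=...` string could be broken into connection parameters. It is not spider-py's `parse_jdbc_url`, whose implementation and return type are not shown in this diff; `parse_jdbc_url_sketch` and `JdbcParams` are hypothetical names, and falling back to port 3306 (MariaDB's default) is an assumption of the sketch.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class JdbcParams:
    """Hypothetical container for the pieces of a MariaDB JDBC URL."""

    host: str
    port: int
    database: str
    user: str | None
    password: str | None


def parse_jdbc_url_sketch(url: str) -> JdbcParams:
    """Split a `jdbc:mariadb://...` URL into host, port, database, and credentials."""
    prefix = "jdbc:"
    if not url.startswith(prefix):
        raise ValueError(f"not a JDBC URL: {url!r}")
    # Drop the `jdbc:` prefix so the remainder parses like an ordinary URL.
    parts = urlsplit(url[len(prefix):])
    if parts.scheme != "mariadb" or not parts.hostname:
        raise ValueError(f"not a MariaDB JDBC URL: {url!r}")
    query = parse_qs(parts.query)
    return JdbcParams(
        host=parts.hostname,
        port=parts.port or 3306,  # Assumption: use MariaDB's default port if omitted.
        database=parts.path.lstrip("/"),
        user=query.get("user", [None])[0],
        password=query.get("password", [None])[0],
    )
```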


1 change: 1 addition & 0 deletions taskfiles/lint.yaml
@@ -138,6 +138,7 @@ tasks:
cmds:
- for:
- "tests/integration"
- "tools/scripts/mariadb"
cmd: |-
. "{{.G_LINT_VENV_DIR}}/bin/activate"
mypy "{{.ITEM}}"
145 changes: 101 additions & 44 deletions taskfiles/test.yaml
@@ -5,24 +5,29 @@ vars:
G_TEST_VENV_DIR: "{{.G_BUILD_DIR}}/test-venv"
G_TEST_VENV_CHECKSUM_FILE: "{{.G_BUILD_DIR}}/test#venv.md5"

# MariaDB testing config
G_MARIADB_DATABASE: "spider-db"
G_MARIADB_USERNAME: "spider-user"
G_MARIADB_PASSWORD: "spider-password"

tasks:
cpp-non-storage-unit-tests:
deps:
- "build-unit-test"
Comment on lines -9 to -11
Review comment (Member):
Remove xxx-storage-unit-tests and xxx-non-storage-unit-tests.

cpp-unit-tests:
cmds:
- "{{.G_UNIT_TEST_BINARY}} \"~[storage]\""
- task: "mariadb-storage-task-executor"
vars:
STORAGE_TASK: "spider-wolf-unit-tests-executor"

cpp-storage-unit-tests:
deps:
- "build-unit-test"
cpp-integration:
cmds:
- "{{.G_UNIT_TEST_BINARY}} \"[storage]\""
- task: "mariadb-storage-task-executor"
vars:
STORAGE_TASK: "spider-wolf-integration-tests-executor"

cpp-unit-tests:
deps:
- "build-unit-test"
spider-py-unit-tests:
cmds:
- "{{.G_UNIT_TEST_BINARY}}"
- task: "mariadb-storage-task-executor"
vars:
STORAGE_TASK: "spider-py-unit-tests-executor"

build-unit-test:
internal: true
@@ -31,23 +36,6 @@ tasks:
vars:
TARGETS: ["spider_task_executor", "unitTest", "worker_test"]

cpp-integration:
dir: "{{.G_BUILD_SPIDER_DIR}}"
deps:
- "venv"
- task: ":build:cpp-target"
vars:
TARGETS: [
"spider_task_executor",
"worker_test",
"client_test",
"spider_worker",
"spider_scheduler",
"integrationTest"]
cmd: |-
. ../test-venv/bin/activate
../test-venv/bin/pytest tests/integration

venv:
internal: true
vars:
@@ -77,26 +65,95 @@ tasks:
CHECKSUM_FILE: "{{.CHECKSUM_FILE}}"
INCLUDE_PATTERNS: ["{{.OUTPUT_DIR}}"]

spider-py-unit-tests:
dir: "{{.G_SRC_PYTHON_DIR}}"
env:
# Don't create __pycache__ directories in the source tree.
PYTHONDONTWRITEBYTECODE: "1"
# A generic wrapper that runs the given task with a MariaDB storage backend.
#
# @param {string} STORAGE_TASK The task to execute. The task must accept no parameters other than
# `SPIDER_STORAGE_URL`, which is set to the MariaDB instance URL.
Comment on lines +68 to +71
Review comment (Member):
Don't forget to add docstrings for unit tests that contain variables.

mariadb-storage-task-executor:
Review comment (Member):
Add a wrapper for storage task execution so we don't need to repeat the same setup everywhere.
The storage URL is passed as a variable.
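
Part of the setup this wrapper centralizes is choosing a unique container name (a lowercased UUID with a `spider-mariadb-` prefix) and a free host port. The sketch below shows how both could be done in Python. It is an illustration only: the contents of `tools/scripts/get_free_port.py` are not shown in this diff, so `get_free_port` here is an assumed approach (binding to port 0 and letting the OS pick), and `make_container_name` is a hypothetical helper mirroring the `uuidgen | tr | sed` pipeline.

```python
import socket
import uuid


def make_container_name(prefix: str = "spider-mariadb-") -> str:
    """Return a unique container name; `str(uuid.uuid4())` is already lowercase."""
    return f"{prefix}{uuid.uuid4()}"


def get_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        # The socket is closed on exit, so another process could in principle claim
        # the port before the container binds it; acceptable for local test runs.
        return sock.getsockname()[1]
```

Generating the name in Python sidesteps the macOS/Linux casing difference that the shell pipeline has to normalize.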

internal: true
vars:
MARIADB_CONTAINER_NAME:
# Normalize UUID casing: macOS generates uppercase while Linux generates lowercase.
sh: "uuidgen | tr '[:upper:]' '[:lower:]' | sed 's/^/spider-mariadb-/'"
MARIADB_PORT:
sh: "tools/scripts/get_free_port.py"
requires:
vars: ["STORAGE_TASK"]
dir: "{{.ROOT_DIR}}"
cmds:
- "uv run pytest"
- |-
tools/scripts/mariadb/start.py \
--name "{{.MARIADB_CONTAINER_NAME}}" \
--port "{{.MARIADB_PORT}}" \
--database "{{.G_MARIADB_DATABASE}}" \
--username "{{.G_MARIADB_USERNAME}}" \
--password "{{.G_MARIADB_PASSWORD}}"
- defer: |-
{{.ROOT_DIR}}/tools/scripts/mariadb/stop.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- |-
tools/scripts/mariadb/wolf/init_db.py \
--port "{{.MARIADB_PORT}}" \
--database "{{.G_MARIADB_DATABASE}}" \
--username "{{.G_MARIADB_USERNAME}}" \
--password "{{.G_MARIADB_PASSWORD}}"
- task: "{{.STORAGE_TASK}}"
vars:
SPIDER_STORAGE_URL:
"jdbc:mariadb://127.0.0.1:{{.MARIADB_PORT}}/{{.G_MARIADB_DATABASE}}?\
user={{.G_MARIADB_USERNAME}}&password={{.G_MARIADB_PASSWORD}}"
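
The control flow of the wrapper task above (start the container, register a deferred stop, initialize the database, then run the wrapped task with the storage URL) can be sketched in Python as a `try`/`finally`. All function names here are hypothetical and the callables are placeholders for the `start.py`, `stop.py`, and `init_db.py` scripts; only the URL layout is taken from the taskfile.

```python
from __future__ import annotations

from collections.abc import Callable


def build_storage_url(port: int, database: str, username: str, password: str) -> str:
    """Assemble the JDBC URL in the same layout the taskfile uses."""
    return (
        f"jdbc:mariadb://127.0.0.1:{port}/{database}"
        f"?user={username}&password={password}"
    )


def run_with_storage(
    start: Callable[[], None],
    stop: Callable[[], None],
    init_db: Callable[[], None],
    storage_task: Callable[[str], None],
    storage_url: str,
) -> None:
    """Run `storage_task` against a freshly started and initialized database."""
    # As in the taskfile, cleanup is registered only after a successful start.
    start()
    try:
        init_db()
        storage_task(storage_url)
    finally:
        # Plays the role of the `defer:` command: runs even if a later step fails.
        stop()
```

Because the stop step is registered after `start` returns, a failure inside `start` itself has to clean up on its own, which is presumably what the "Stop container if wait fail" commit addresses.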

spider-py-non-storage-unit-tests:
dir: "{{.G_SRC_PYTHON_DIR}}"
# Internal task that runs all spider-py's unit tests.
#
# @param {string} SPIDER_STORAGE_URL An URL pointing to the MariaDB instance.
spider-py-unit-tests-executor:
internal: true
env:
# Don't create __pycache__ directories in the source tree.
PYTHONDONTWRITEBYTECODE: "1"
cmds:
- "uv run pytest -m \"not storage\""

spider-py-storage-unit-tests:
SPIDER_STORAGE_URL: "{{.SPIDER_STORAGE_URL}}"
requires:
vars: ["SPIDER_STORAGE_URL"]
dir: "{{.G_SRC_PYTHON_DIR}}"
cmd: "uv run pytest"

# Internal task that runs all Spider Wolf's unit tests.
#
# @param {string} SPIDER_STORAGE_URL An URL pointing to the MariaDB instance.
spider-wolf-unit-tests-executor:
Review comment (Member):
We should document the versioning. But since this is an internal task, I'm gonna refer to wolf directly.

internal: true
env:
# Don't create __pycache__ directories in the source tree.
PYTHONDONTWRITEBYTECODE: "1"
SPIDER_STORAGE_URL: "{{.SPIDER_STORAGE_URL}}"
requires:
vars: ["SPIDER_STORAGE_URL"]
dir: "{{.ROOT_DIR}}"
deps:
- "build-unit-test"
cmd: "{{.G_UNIT_TEST_BINARY}}"

# Internal task that runs all Spider Wolf's integration tests.
#
# @param {string} SPIDER_STORAGE_URL An URL pointing to the MariaDB instance.
spider-wolf-integration-tests-executor:
internal: true
env:
SPIDER_STORAGE_URL: "{{.SPIDER_STORAGE_URL}}"
requires:
vars: ["SPIDER_STORAGE_URL"]
dir: "{{.G_BUILD_SPIDER_DIR}}"
deps:
- "venv"
- task: ":build:cpp-target"
vars:
TARGETS: [
"spider_task_executor",
"worker_test",
"client_test",
"spider_worker",
"spider_scheduler",
"integrationTest"
]
cmds:
- "uv run pytest -m \"storage\""
- |-
. {{.G_TEST_VENV_DIR}}/bin/activate
{{.G_TEST_VENV_DIR}}/bin/pytest tests/integration