Merged
35 commits
915dc91
Add start mariadb
sitaowang1998 Jan 7, 2026
6300ca0
Merge branch 'main' into storage-test
sitaowang1998 Jan 7, 2026
5e5f7a0
Format file
sitaowang1998 Jan 7, 2026
32e4462
Add stop
sitaowang1998 Jan 7, 2026
cd34070
Add get free port
sitaowang1998 Jan 7, 2026
50d0288
Add description
sitaowang1998 Jan 7, 2026
a622008
Refactor init_db into python script
sitaowang1998 Jan 7, 2026
aa7e551
Add db wait and bug fix
sitaowang1998 Jan 8, 2026
37e68cb
Use env for spider-py unit tests
sitaowang1998 Jan 8, 2026
dee2ecb
Add storage in tasks
sitaowang1998 Jan 8, 2026
6a28b85
Cpp integration tests use mariadb
sitaowang1998 Jan 8, 2026
38dee77
Make start mariadb internal
sitaowang1998 Jan 8, 2026
0705901
Use env for storage url in C++ unit tests
sitaowang1998 Jan 8, 2026
c115833
Fix clang-tidy
sitaowang1998 Jan 8, 2026
62d9add
Use general unit tests and add integration tests
sitaowang1998 Jan 8, 2026
82f882e
Update spider-py readme
sitaowang1998 Jan 8, 2026
0f72824
Update test doc
sitaowang1998 Jan 8, 2026
5e871bc
Fix timestamp
sitaowang1998 Jan 8, 2026
0173937
Fix typo
sitaowang1998 Jan 8, 2026
909de43
Fix lint
sitaowang1998 Jan 8, 2026
ef02ec8
Bug fix
sitaowang1998 Jan 8, 2026
ecac810
Fix import
sitaowang1998 Jan 8, 2026
55f7300
Fix GH workflow name
sitaowang1998 Jan 8, 2026
89c8b6b
Fix doc
sitaowang1998 Jan 8, 2026
9b3fa91
Bug fix
sitaowang1998 Jan 8, 2026
f03981e
Fix multiline string
sitaowang1998 Jan 8, 2026
4ca400c
Fix SQL format
sitaowang1998 Jan 9, 2026
b21e361
Restruct file
sitaowang1998 Jan 9, 2026
d0b32cc
Use variable
sitaowang1998 Jan 9, 2026
0678ea1
Stop container if wait fail
sitaowang1998 Jan 12, 2026
26b8cc6
Polishing test tasks.
LinZhihao-723 Jan 12, 2026
e25e09b
Polish docstring.
LinZhihao-723 Jan 12, 2026
068f454
->
LinZhihao-723 Jan 12, 2026
7d21f3c
CSC100: Test your changes before pushing.
LinZhihao-723 Jan 12, 2026
1ae488e
Apply suggestions from code review
sitaowang1998 Jan 12, 2026
@@ -1,4 +1,4 @@
name: "unit-tests"
name: "tests"

on:
pull_request:
@@ -17,7 +17,7 @@ concurrency:
cancel-in-progress: true

jobs:
non-storage-unit-tests:
tests:
strategy:
matrix:
os: ["ubuntu-22.04", "ubuntu-24.04"]
@@ -55,6 +55,8 @@ jobs:
SPIDER_DEPS_MAX_PARALLELISM_PER_TASK: "1"
run: "task deps:lib_install"

- run: "task test:cpp-non-storage-unit-tests"
- run: "task test:cpp-unit-tests"

- run: "task test:spider-py-non-storage-unit-tests"
- run: "task test:cpp-integration"
Review comment (Member): Group unit tests before integration tests.

- run: "task test:spider-py-unit-tests"
45 changes: 6 additions & 39 deletions docs/src/dev-docs/testing.md
@@ -1,31 +1,5 @@
# Testing

## Set up storage backend

Spider relies on a fault-tolerant storage to store metadata and data. Spider's unit tests also
require this storage backend.

### Set up MySQL as storage backend

1. Start a MySQL database running in background.
2. Create an empty database.
```sql
CREATE DATABASE <db_name>;
```
3. Set the password for `root` or create another user with password and grant access to database
created in step 2.
```sql
ALTER USER 'root'@'localhost' IDENTIFIED BY '<pwd>';
--- OR create a new user
CREATE USER '<usr>'@'localhost' IDENTIFIED BY '<pwd>';
GRANT ALL PRIVILEGES ON <db_name>.* TO '<usr>'@'localhost';
```
4. Set the `cStorageUrl` in `tests/storage/StorageTestHelper.hpp` to
`jdbc:mariadb://localhost:3306/<db_name>?user=<usr>&password=<pwd>`.

5. Set the `storage_url` in `tests/integration/client.py` to
`jdbc:mariadb://localhost:3306/<db_name>?user=<usr>&password=<pwd>`.

## Running unit tests

You can use the following tasks to run the set of unit tests that's appropriate.
@@ -39,18 +13,6 @@ You can use the following tasks to run the set of unit tests that's appropriate.
| `test:spider-py-non-storage-unit-tests` | Runs all spider-py unit tests which don't require a storage backend to run. |
| `test:spider-py-storage-unit-tests` | Runs all spider-py unit tests which require a storage backend to run. |

If any tests show error messages for the connection function below, revisit the
[setup section](#set-up-mysql-as-storage-backend) and verify that `cStorageUrl` was set correctly.

```c++
REQUIRE( storage->connect(spider::test::cStorageUrl).success() )
```

## GitHub unit test workflow

The [unit_tests.yaml][gh-workflow-unit-tests] GitHub workflow runs the unit tests on push,
pull requests, and daily. Currently, it only runs unit tests that don't require a storage backend.

## Running integration tests

You can use the following tasks to run integration tests.
@@ -59,5 +21,10 @@ You can use the following tasks to run integration tests.
|------------------------|---------------------------------|
| `test:cpp-integration` | Runs all C++ integration tests. |

## GitHub test workflow

The [tests.yaml][gh-workflow-tests] GitHub workflow runs all unit tests and integration tests on
push, pull requests, and daily.


[gh-workflow-unit-tests]: https://github.com/y-scope/spider/blob/main/.github/workflows/unit-tests.yaml
[gh-workflow-tests]: https://github.com/y-scope/spider/blob/main/.github/workflows/tests.yaml
50 changes: 1 addition & 49 deletions python/spider-py/README.md
@@ -26,60 +26,12 @@ directory at the Spider project root.

## Testing

Unit tests are divided into two categories: storage and non-storage tests. Non-storage tests do not
require any external services, while storage tests require a MariaDB instance to be available.

### Non-Storage Unit Tests

To run all non-storage unit tests:

```shell
task test:spider-py-non-storage-unit-tests
```

### Setup MariaDB for Storage Unit Tests

To run storage unit tests, we need to create a MariaDB instance first.

```shell
docker run \
--detach \
--rm \
--name spider-storage \
--env MARIADB_USER=spider \
--env MARIADB_PASSWORD=password \
--env MARIADB_DATABASE=spider-storage \
--env MARIADB_ALLOW_EMPTY_ROOT_PASSWORD=true \
--publish 3306:3306 mariadb:latest
```

After the docker container starts, set up the database table manually by using the SQL script
`tools/scripts/storage/init_db.sql` from the project root.

```shell
mysql -h 127.0.0.1 -u spider -ppassword spider-storage < tools/scripts/storage/init_db.sql
```

### Storage Unit Tests

To run all storage unit tests:

```shell
task test:spider-py-storage-unit-tests
```

This requires a running MariaDB instance as described above.

### All Unit Tests

To run all unit tests (both storage and non-storage):
To run all unit tests:

```shell
task test:spider-py-unit-tests
```

This requires a running MariaDB instance as described above.

## Linting

To run all linting checks:
4 changes: 3 additions & 1 deletion python/spider-py/tests/client/test_driver.py
@@ -1,5 +1,6 @@
"""Tests for the driver module."""

import os
from dataclasses import dataclass

import pytest
@@ -12,7 +13,8 @@
@pytest.fixture(scope="session")
def driver() -> Driver:
"""Fixture for the driver."""
return Driver(MariaDBTestUrl)
url = os.getenv("SPIDER_STORAGE_URL", MariaDBTestUrl)
return Driver(url)


def double(_: TaskContext, x: Int8) -> Int8:
4 changes: 3 additions & 1 deletion python/spider-py/tests/storage/test_mariadb.py
@@ -1,5 +1,6 @@
"""Tests for the MariaDB storage backend."""

import os
from uuid import uuid4

import msgpack
@@ -15,7 +16,8 @@
@pytest.fixture(scope="session")
def mariadb_storage() -> MariaDBStorage:
"""Fixture to create a MariaDB storage instance."""
params = parse_jdbc_url(MariaDBTestUrl)
url = os.getenv("SPIDER_STORAGE_URL", MariaDBTestUrl)
params = parse_jdbc_url(url)
return MariaDBStorage(params)
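
The fixture above reads `SPIDER_STORAGE_URL` and hands the result to `parse_jdbc_url`, whose implementation is not part of this diff. As a rough illustration of what splitting such a URL involves, here is a sketch that uses only the standard library. The names `parse_jdbc_url_sketch` and `JdbcParams`, and the fallback port of 3306, are assumptions made for this example and are not spider-py's API.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs, urlsplit


@dataclass(frozen=True)
class JdbcParams:
    """Hypothetical container for the pieces of a JDBC-style URL."""

    host: str
    port: int
    database: str
    user: str
    password: str


def parse_jdbc_url_sketch(url: str) -> JdbcParams:
    """Split a `jdbc:mariadb://host:port/db?user=..&password=..` URL into its parts."""
    prefix = "jdbc:"
    if not url.startswith(prefix):
        raise ValueError(f"Not a JDBC URL: {url!r}")

    # Drop the `jdbc:` prefix so the remainder parses like an ordinary URL.
    parts = urlsplit(url[len(prefix):])
    query = parse_qs(parts.query)

    return JdbcParams(
        host=parts.hostname or "",
        # Assumption: fall back to MariaDB's conventional port when none is given.
        port=parts.port or 3306,
        database=parts.path.lstrip("/"),
        user=query.get("user", [""])[0],
        password=query.get("password", [""])[0],
    )
```

With the URL shape used in `taskfiles/test.yaml`, this would yield the host, the randomly chosen port, the `spider-db` database, and the test credentials as separate fields.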


1 change: 1 addition & 0 deletions taskfiles/lint.yaml
@@ -138,6 +138,7 @@ tasks:
cmds:
- for:
- "tests/integration"
- "tools/scripts/storage"
cmd: |-
. "{{.G_LINT_VENV_DIR}}/bin/activate"
mypy "{{.ITEM}}"
124 changes: 121 additions & 3 deletions taskfiles/test.yaml
@@ -4,6 +4,9 @@
G_UNIT_TEST_BINARY: "{{.G_BUILD_SPIDER_DIR}}/tests/unitTest"
G_TEST_VENV_DIR: "{{.G_BUILD_DIR}}/test-venv"
G_TEST_VENV_CHECKSUM_FILE: "{{.G_BUILD_DIR}}/test#venv.md5"
G_MARIADB_DATABASE: "spider-db"
G_MARIADB_USERNAME: "spider-user"
G_MARIADB_PASSWORD: "spider-password"

tasks:
cpp-non-storage-unit-tests:
@@ -15,13 +18,49 @@
cpp-storage-unit-tests:
deps:
- "build-unit-test"
env:
SPIDER_STORAGE_URL: "jdbc:mariadb://127.0.0.1:{{.MARIADB_PORT}}/{{.G_MARIADB_DATABASE}}?user={{.G_MARIADB_USERNAME}}&password={{.G_MARIADB_PASSWORD}}"

Check failure on line 22 in taskfiles/test.yaml (GitHub Actions / lint): 22:101 [line-length] line too long (156 > 100 characters)
vars:
MARIADB_CONTAINER_NAME:
# Normalize UUID casing: macOS generates uppercase while Linux generates lowercase.
sh: "uuidgen | tr '[:upper:]' '[:lower:]' | sed 's/^/spider-mariadb-/'"
MARIADB_PORT:
sh: "tools/scripts/storage/get_free_port.py"
cmds:
- task: "start-storage"
vars:
MARIADB_CONTAINER_NAME: "{{.MARIADB_CONTAINER_NAME}}"
MARIADB_DATABASE: "{{.G_MARIADB_DATABASE}}"
MARIADB_USERNAME: "{{.G_MARIADB_USERNAME}}"
MARIADB_PASSWORD: "{{.G_MARIADB_PASSWORD}}"
MARIADB_PORT: "{{.MARIADB_PORT}}"
- defer: |-
{{.ROOT_DIR}}/tools/scripts/storage/stop.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- "{{.G_UNIT_TEST_BINARY}} \"[storage]\""

cpp-unit-tests:
deps:
- "build-unit-test"
env:
SPIDER_STORAGE_URL: "jdbc:mariadb://127.0.0.1:{{.MARIADB_PORT}}/{{.G_MARIADB_DATABASE}}?user={{.G_MARIADB_USERNAME}}&password={{.G_MARIADB_PASSWORD}}"

Check failure on line 46 in taskfiles/test.yaml (GitHub Actions / lint): 46:101 [line-length] line too long (156 > 100 characters)
vars:
MARIADB_CONTAINER_NAME:
# Normalize UUID casing: macOS generates uppercase while Linux generates lowercase.
sh: "uuidgen | tr '[:upper:]' '[:lower:]' | sed 's/^/spider-mariadb-/'"
MARIADB_PORT:
sh: "tools/scripts/storage/get_free_port.py"
cmds:
- task: "start-storage"
vars:
MARIADB_CONTAINER_NAME: "{{.MARIADB_CONTAINER_NAME}}"
MARIADB_DATABASE: "{{.G_MARIADB_DATABASE}}"
MARIADB_USERNAME: "{{.G_MARIADB_USERNAME}}"
MARIADB_PASSWORD: "{{.G_MARIADB_PASSWORD}}"
MARIADB_PORT: "{{.MARIADB_PORT}}"
- defer: |-
{{.ROOT_DIR}}/tools/scripts/storage/stop.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- "{{.G_UNIT_TEST_BINARY}}"

build-unit-test:
@@ -44,9 +83,28 @@
"spider_worker",
"spider_scheduler",
"integrationTest"]
cmd: |-
. ../test-venv/bin/activate
../test-venv/bin/pytest tests/integration
env:
SPIDER_STORAGE_URL: "jdbc:mariadb://127.0.0.1:{{.MARIADB_PORT}}/{{.G_MARIADB_DATABASE}}?user={{.G_MARIADB_USERNAME}}&password={{.G_MARIADB_PASSWORD}}"

Check failure on line 87 in taskfiles/test.yaml (GitHub Actions / lint): 87:101 [line-length] line too long (156 > 100 characters)
vars:
MARIADB_CONTAINER_NAME:
# Normalize UUID casing: macOS generates uppercase while Linux generates lowercase.
sh: "uuidgen | tr '[:upper:]' '[:lower:]' | sed 's/^/spider-mariadb-/'"
MARIADB_PORT:
sh: "tools/scripts/storage/get_free_port.py"
cmds:
- task: "start-storage"
vars:
MARIADB_CONTAINER_NAME: "{{.MARIADB_CONTAINER_NAME}}"
MARIADB_DATABASE: "{{.G_MARIADB_DATABASE}}"
MARIADB_USERNAME: "{{.G_MARIADB_USERNAME}}"
MARIADB_PASSWORD: "{{.G_MARIADB_PASSWORD}}"
MARIADB_PORT: "{{.MARIADB_PORT}}"
- defer: |-
{{.ROOT_DIR}}/tools/scripts/storage/stop.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- |-
. ../test-venv/bin/activate
../test-venv/bin/pytest tests/integration
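
In the tasks above, `MARIADB_PORT` is filled in by `tools/scripts/storage/get_free_port.py`, which this diff does not show. One common way to pick a free port, sketched here under that caveat, is to bind a socket to port 0 and read back the port the OS assigned. The function name and the `host` parameter are illustrative assumptions, not the script's actual interface.

```python
import socket


def get_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Port 0 tells the OS to choose any free port.
        sock.bind((host, 0))
        # getsockname() returns (host, port) for AF_INET sockets.
        return sock.getsockname()[1]
```

This approach is best effort: the socket is closed before the port is handed to the container, so another process could claim the port in between. For short-lived test runs that window is usually small enough to accept.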

venv:
internal: true
@@ -82,7 +140,24 @@
env:
# Don't create __pycache__ directories in the source tree.
PYTHONDONTWRITEBYTECODE: "1"
SPIDER_STORAGE_URL: "jdbc:mariadb://127.0.0.1:{{.MARIADB_PORT}}/{{.G_MARIADB_DATABASE}}?user={{.G_MARIADB_USERNAME}}&password={{.G_MARIADB_PASSWORD}}"

Check failure on line 143 in taskfiles/test.yaml (GitHub Actions / lint): 143:101 [line-length] line too long (156 > 100 characters)
vars:
MARIADB_CONTAINER_NAME:
# Normalize UUID casing: macOS generates uppercase while Linux generates lowercase.
sh: "uuidgen | tr '[:upper:]' '[:lower:]' | sed 's/^/spider-mariadb-/'"
MARIADB_PORT:
sh: "tools/scripts/storage/get_free_port.py"
cmds:
- task: "start-storage"
vars:
MARIADB_CONTAINER_NAME: "{{.MARIADB_CONTAINER_NAME}}"
MARIADB_DATABASE: "{{.G_MARIADB_DATABASE}}"
MARIADB_USERNAME: "{{.G_MARIADB_USERNAME}}"
MARIADB_PASSWORD: "{{.G_MARIADB_PASSWORD}}"
MARIADB_PORT: "{{.MARIADB_PORT}}"
- defer: |-
{{.ROOT_DIR}}/tools/scripts/storage/stop.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- "uv run pytest"

spider-py-non-storage-unit-tests:
@@ -98,5 +173,48 @@
env:
# Don't create __pycache__ directories in the source tree.
PYTHONDONTWRITEBYTECODE: "1"
SPIDER_STORAGE_URL: "jdbc:mariadb://127.0.0.1:{{.MARIADB_PORT}}/{{.G_MARIADB_DATABASE}}?user={{.G_MARIADB_USERNAME}}&password={{.G_MARIADB_PASSWORD}}"

Check failure on line 176 in taskfiles/test.yaml (GitHub Actions / lint): 176:101 [line-length] line too long (156 > 100 characters)
vars:
MARIADB_CONTAINER_NAME:
# Normalize UUID casing: macOS generates uppercase while Linux generates lowercase.
sh: "uuidgen | tr '[:upper:]' '[:lower:]' | sed 's/^/spider-mariadb-/'"
MARIADB_PORT:
sh: "tools/scripts/storage/get_free_port.py"
cmds:
- task: "start-storage"
vars:
MARIADB_CONTAINER_NAME: "{{.MARIADB_CONTAINER_NAME}}"
MARIADB_DATABASE: "{{.G_MARIADB_DATABASE}}"
MARIADB_USERNAME: "{{.G_MARIADB_USERNAME}}"
MARIADB_PASSWORD: "{{.G_MARIADB_PASSWORD}}"
MARIADB_PORT: "{{.MARIADB_PORT}}"
- defer: |-
{{.ROOT_DIR}}/tools/scripts/storage/stop.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- "uv run pytest -m \"storage\""

start-storage:
internal: true
vars:
MARIADB_CONTAINER_NAME: "{{.MARIADB_CONTAINER_NAME}}"
MARIADB_DATABASE: "{{.MARIADB_DATABASE}}"
MARIADB_USERNAME: "{{.MARIADB_USERNAME}}"
MARIADB_PASSWORD: "{{.MARIADB_PASSWORD}}"
MARIADB_PORT: "{{.MARIADB_PORT | default 3306}}"
cmds:
- |-
tools/scripts/storage/start.py \
--name "{{.MARIADB_CONTAINER_NAME}}" \
--port "{{.MARIADB_PORT}}" \
--database "{{.MARIADB_DATABASE}}" \
--username "{{.MARIADB_USERNAME}}" \
--password "{{.MARIADB_PASSWORD}}"
- |-
tools/scripts/storage/wait_for_db.py \
--name "{{.MARIADB_CONTAINER_NAME}}"
- |-
tools/scripts/storage/init_db.py \
--port "{{.MARIADB_PORT}}" \
--database "{{.MARIADB_DATABASE}}" \
--username "{{.MARIADB_USERNAME}}" \
--password "{{.MARIADB_PASSWORD}}"
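
`start-storage` runs `wait_for_db.py` between starting the container and initializing the schema; that script's source is not shown in this diff. The general pattern is a poll-until-ready loop with a deadline, sketched below. `wait_until` and `port_is_open` are hypothetical helpers, and the TCP probe is only one possible readiness check: the real script takes a container name, so it may check readiness differently.

```python
import socket
import time
from collections.abc import Callable


def wait_until(check: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Call `check` repeatedly until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            # Give up; the caller decides how to clean up (e.g. stop the container).
            return False
        time.sleep(interval)


def port_is_open(host: str, port: int) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=1.0):
            return True
    except OSError:
        return False
```

A caller could use `wait_until(lambda: port_is_open("127.0.0.1", port))` and stop the container when it returns `False`. Note that an open TCP port does not by itself guarantee the database is ready to serve queries, so a stricter check may be needed in practice.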
13 changes: 11 additions & 2 deletions tests/integration/client.py
@@ -1,5 +1,6 @@
"""Simple Spider client for testing purposes."""

import os
import re
import uuid
from collections.abc import Generator
@@ -111,7 +112,15 @@ def is_head_task(task_id: uuid.UUID, dependencies: list[tuple[uuid.UUID, uuid.UU
return not any(dependency[1] == task_id for dependency in dependencies)


g_storage_url = "jdbc:mariadb://localhost:3306/spider_test?user=root&password=password"
G_STORAGE_URL = "jdbc:mariadb://localhost:3306/spider_test?user=root&password=password"


def get_storage_url() -> str:
"""
Gets the storage URL from the environment variable or uses the default.
:return: The storage URL.
"""
return os.getenv("SPIDER_STORAGE_URL", G_STORAGE_URL)


@pytest.fixture(scope="session")
@@ -122,7 +131,7 @@ def storage() -> Generator[SQLConnection, None, None]:
after the test session is complete.
:return: A generator yielding a MySQL connection object.
"""
conn = create_connection(g_storage_url)
conn = create_connection(get_storage_url())
yield conn
conn.close()
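
The change above lets the task runner inject the storage URL through `SPIDER_STORAGE_URL` while keeping a hard-coded default for manual runs. The sketch below shows both halves of that contract in isolation: composing the URL in the shape `taskfiles/test.yaml` uses, and resolving it with a fallback. `build_storage_url` and `resolve_storage_url` are hypothetical names for this example, and the `env` parameter exists only so the lookup can be exercised without touching the real environment.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Fallback used when SPIDER_STORAGE_URL is unset (value copied from tests/integration/client.py).
DEFAULT_STORAGE_URL = "jdbc:mariadb://localhost:3306/spider_test?user=root&password=password"


def build_storage_url(port: int, database: str, username: str, password: str) -> str:
    """Compose a URL in the same shape the taskfile exports as SPIDER_STORAGE_URL."""
    return f"jdbc:mariadb://127.0.0.1:{port}/{database}?user={username}&password={password}"


def resolve_storage_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return SPIDER_STORAGE_URL from `env` (default: os.environ), else the fallback."""
    env = os.environ if env is None else env
    return env.get("SPIDER_STORAGE_URL", DEFAULT_STORAGE_URL)
```

One behavior worth knowing: like `os.getenv`, this lookup falls back only when the variable is absent. If `SPIDER_STORAGE_URL` is set to an empty string, the empty string is returned and the connection attempt will fail rather than use the default.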
