
Commit d3ea943

[CI] Add pytest markers to current tests and update the doc. (#577)
Signed-off-by: Alicia <115451386+congw729@users.noreply.github.com> Co-authored-by: Hongsheng Liu <liuhongsheng4@huawei.com>
1 parent cdde401 commit d3ea943

37 files changed: +166 additions, -95 deletions

.buildkite/pipeline.yml

Lines changed: 2 additions & 14 deletions

@@ -20,19 +20,7 @@ steps:
   - label: "Simple Unit Test"
     depends_on: image-build
     commands:
-      - |
-        pytest -v -s \
-          tests/entrypoints/ \
-          tests/diffusion/cache/ \
-          tests/diffusion/lora/ \
-          tests/model_executor/models/qwen2_5_omni/test_audio_length.py \
-          tests/worker/ \
-          tests/distributed/omni_connectors/test_kv_flow.py \
-          --cov=vllm_omni \
-          --cov-branch \
-          --cov-report=term-missing \
-          --cov-report=html \
-          --cov-report=xml
+      - "pytest -v -s -m 'core_model and cpu' --cov=vllm_omni --cov-branch --cov-report=term-missing --cov-report=html --cov-report=xml"
     agents:
       queue: "gpu_1_queue"
     plugins:
@@ -118,7 +106,7 @@ steps:
     timeout_in_minutes: 15
     depends_on: image-build
     commands:
-      - pytest -s -v tests/e2e/offline_inference/test_cache_dit.py tests/e2e/offline_inference/test_teacache.py
+      - pytest -s -v -m 'core_model and cache and diffusion and not distributed_cuda and L4'
     agents:
       queue: "gpu_1_queue" # g6.4xlarge instance on AWS, has 1 L4 GPU
     plugins:
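The pipeline above swaps hard-coded test paths for pytest `-m` marker expressions. As a rough illustration of how such an expression filters a test suite — this is a small self-contained sketch, not pytest internals, and the test names and marker sets below are hypothetical:

```python
import re

def matches(expr: str, markers: set[str]) -> bool:
    """Evaluate a pytest-style '-m' expression against one test's marker names."""
    # Map each marker identifier in the expression to True/False by membership,
    # then evaluate the remaining boolean expression (only and/or/not survive).
    names = {t for t in re.findall(r"\w+", expr) if t not in ("and", "or", "not")}
    env = {name: (name in markers) for name in names}
    return bool(eval(expr, {"__builtins__": {}}, env))

# Hypothetical tests tagged the way this commit tags them.
tests = {
    "test_cache_backends": {"core_model", "cpu", "cache"},
    "test_bench_serve_chat": {"core_model", "benchmark", "cuda", "H100", "distributed_cuda"},
}

# The 'Simple Unit Test' filter keeps only CPU-runnable core tests.
selected = [name for name, m in tests.items() if matches("core_model and cpu", m)]
```

With these hypothetical tags, `selected` contains only `test_cache_backends`, which is exactly why the GPU benchmark no longer needs to be excluded by path.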

.buildkite/test-amd.yaml

Lines changed: 1 addition & 1 deletion

@@ -44,7 +44,7 @@ steps:
       - export GPU_ARCHS=gfx942
       - export VLLM_LOGGING_LEVEL=DEBUG
       - export VLLM_WORKER_MULTIPROC_METHOD=spawn
-      - pytest -s -v tests/e2e/offline_inference/test_cache_dit.py tests/e2e/offline_inference/test_teacache.py
+      - pytest -s -v -m 'core_model and cache and diffusion and not distributed_rocm and MI325'

   - label: "Diffusion Sequence Parallelism Test"
     timeout_in_minutes: 20

docs/contributing/ci/tests_markers.md

Lines changed: 28 additions & 32 deletions

@@ -5,33 +5,33 @@ By adding markers before test functions, tests can later be executed uniformly b
 ## Current Markers
 Defined in `pyproject.toml`:

-| Marker             | Description                                             |
-| ------------------ | ------------------------------------------------------- |
-| `core_model`       | Core model tests (run in each PR)                       |
-| `diffusion`        | Diffusion model tests                                   |
-| `omni`             | Omni model tests                                        |
-| `cache`            | Cache backend tests                                     |
-| `parallel`         | Parallelism/distributed tests                           |
-| `cpu`              | Tests that run on CPU                                   |
-| `gpu`              | Tests that run on GPU (auto-added)                      |
-| `cuda`             | Tests that run on CUDA (auto-added)                     |
-| `rocm`             | Tests that run on AMD/ROCm (auto-added)                 |
-| `npu`              | Tests that run on NPU/Ascend (auto-added)               |
-| `H100`             | Tests that require H100 GPU                             |
-| `L4`               | Tests that require L4 GPU                               |
-| `MI325`            | Tests that require MI325 GPU (AMD/ROCm)                 |
-| `A2`               | Tests that require A2 NPU                               |
-| `A3`               | Tests that require A3 NPU                               |
-| `distributed_cuda` | Tests that require multi cards on CUDA platform         |
-| `distributed_rocm` | Tests that require multi cards on ROCm platform         |
-| `distributed_npu`  | Tests that require multi cards on NPU platform          |
-| `skipif_cuda`      | Skip if the num of CUDA cards is less than the required |
-| `skipif_rocm`      | Skip if the num of ROCm cards is less than the required |
-| `skipif_npu`       | Skip if the num of NPU cards is less than the required  |
-| `slow`             | Slow tests (may skip in quick CI)                       |
-| `benchmark`        | Benchmark tests                                         |
-
-For those markers shown as auto-added, they will be added by the `@hardware_test` decorator.
+| Marker             | Description                                               |
+| ------------------ | --------------------------------------------------------- |
+| `core_model`       | Core model tests (run in each PR)                         |
+| `diffusion`        | Diffusion model tests                                     |
+| `omni`             | Omni model tests                                          |
+| `cache`            | Cache backend tests                                       |
+| `parallel`         | Parallelism/distributed tests                             |
+| `cpu`              | Tests that run on CPU                                     |
+| `gpu`              | Tests that run on GPU *                                   |
+| `cuda`             | Tests that run on CUDA *                                  |
+| `rocm`             | Tests that run on AMD/ROCm *                              |
+| `npu`              | Tests that run on NPU/Ascend *                            |
+| `H100`             | Tests that require H100 GPU *                             |
+| `L4`               | Tests that require L4 GPU *                               |
+| `MI325`            | Tests that require MI325 GPU (AMD/ROCm) *                 |
+| `A2`               | Tests that require A2 NPU *                               |
+| `A3`               | Tests that require A3 NPU *                               |
+| `distributed_cuda` | Tests that require multi cards on CUDA platform *         |
+| `distributed_rocm` | Tests that require multi cards on ROCm platform *         |
+| `distributed_npu`  | Tests that require multi cards on NPU platform *          |
+| `skipif_cuda`      | Skip if the num of CUDA cards is less than the required * |
+| `skipif_rocm`      | Skip if the num of ROCm cards is less than the required * |
+| `skipif_npu`       | Skip if the num of NPU cards is less than the required *  |
+| `slow`             | Slow tests (may skip in quick CI)                         |
+| `benchmark`        | Benchmark tests                                           |
+
+\* Means those markers are auto-added, and they will be added by the `@hardware_test` decorator.

 ### Example usage for markers

@@ -71,10 +71,7 @@ This decorator is intended to make hardware-aware, cross-platform test authoring
 Support for `skipif_rocm` and `skipif_npu` will be implemented later.

-5. **Runs each test in a new process**
-   Automatically wraps the distributed test with a decorator (`@create_new_process_for_each_test`) to ensure isolation and compatibility with multi-process hardware backends.
-
-6. **Works with pytest filtering**
+5. **Works with pytest filtering**
    Allows tests to be filtered and selected at runtime using standard pytest marker expressions (e.g., `-m "distributed_cuda and L4"`).

 #### Example usage for decorator

@@ -94,7 +91,6 @@ This decorator is intended to make hardware-aware, cross-platform test authoring
 ```
 - `res` must be a dict; supported resources: CUDA (L4/H100), ROCm (MI325), NPU (A2/A3)
 - `num_cards` can be int (all platforms) or dict (per platform); defaults to 1 when missing
-- `hardware_test` automatically applies `@create_new_process_for_each_test` for distributed tests.
 - Distributed markers (`distributed_cuda`, `distributed_rocm`, `distributed_npu`) are auto-added for multi-card cases
 - Filtering examples:
   - CUDA only: `pytest -m "distributed_cuda and L4"`
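The documented auto-marking behavior of `@hardware_test` — adding platform and card markers from `res`, and a `distributed_*` marker when `num_cards` exceeds 1 — can be sketched as below. This is an illustrative guess at the mechanics, not the actual implementation in `tests/utils`; it relies only on the standard `pytest.mark` API:

```python
import pytest

def hardware_test(res=None, num_cards=1):
    """Sketch of a marker-adding decorator (assumed behavior, not the real one)."""
    def wrap(fn):
        for platform, card in (res or {}).items():
            # Auto-add the platform marker (e.g. `cuda`) and the card marker (e.g. `L4`).
            fn = getattr(pytest.mark, platform)(fn)
            fn = getattr(pytest.mark, card)(fn)
            # num_cards may be an int (all platforms) or a per-platform dict,
            # defaulting to 1 when a platform is missing from the dict.
            cards = num_cards if isinstance(num_cards, int) else num_cards.get(platform, 1)
            if cards > 1:
                fn = getattr(pytest.mark, f"distributed_{platform}")(fn)
        return fn
    return wrap

@hardware_test(res={"cuda": "L4"}, num_cards=2)
def test_example():
    pass

# Marks applied to a function land on its `pytestmark` attribute.
marker_names = {m.name for m in test_example.pytestmark}
```

With this sketch, `marker_names` ends up as `{"cuda", "L4", "distributed_cuda"}`, which matches the filtering example `pytest -m "distributed_cuda and L4"` given above.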

pyproject.toml

Lines changed: 4 additions & 0 deletions

@@ -175,6 +175,10 @@ markers = [
     "slow: Slow tests (may skip in quick CI)",
     "benchmark: Benchmark tests",
 ]
+filterwarnings = [
+    "ignore:.*does not have '__test__' attribute.*:UserWarning",
+    "ignore:.*does not have '__bases__' attribute.*:UserWarning",
+]

 [tool.typos.default]
 extend-ignore-identifiers-re = [
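The two new `filterwarnings` entries use pytest's `action:message:category` syntax, where the message part is a regex. Their effect corresponds roughly to the stdlib calls below; the warning text is a made-up example, only the regex comes from the config:

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Equivalent of "ignore:.*does not have '__test__' attribute.*:UserWarning"
    warnings.filterwarnings(
        "ignore",
        message=r".*does not have '__test__' attribute.*",
        category=UserWarning,
    )
    warnings.warn("Foo does not have '__test__' attribute, skipping", UserWarning)  # suppressed
    warnings.warn("unrelated warning", UserWarning)                                 # recorded

remaining = [str(w.message) for w in caught]
```

Only the unrelated warning survives, which is the point of the config change: collection-time `UserWarning` noise is silenced without hiding other warnings.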

tests/benchmarks/test_serve_cli.py

Lines changed: 4 additions & 0 deletions

@@ -4,6 +4,7 @@
 import pytest

 from tests.conftest import OmniServer
+from tests.utils import hardware_test

 models = ["Qwen/Qwen3-Omni-30B-A3B-Instruct"]
 stage_configs = [str(Path(__file__).parent.parent / "e2e" / "stage_configs" / "qwen3_omni_ci.yaml")]
@@ -29,6 +30,9 @@ def omni_server(request):
     print("OmniServer stopped")


+@pytest.mark.core_model
+@pytest.mark.benchmark
+@hardware_test(res={"cuda": "H100"}, num_cards=2)
 @pytest.mark.parametrize("omni_server", test_params, indirect=True)
 def test_bench_serve_chat(omni_server):
     command = [

tests/diffusion/cache/test_cache_backends.py

Lines changed: 2 additions & 0 deletions

@@ -22,6 +22,8 @@
 from vllm_omni.diffusion.cache.teacache.backend import TeaCacheBackend
 from vllm_omni.diffusion.data import DiffusionCacheConfig

+pytestmark = [pytest.mark.core_model, pytest.mark.cpu]
+

 class TestCacheDiTBackend:
     """Test CacheDiTBackend implementation."""
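This file, and several others in the commit, use a module-level `pytestmark` list, which pytest applies to every test collected from the module — equivalent to decorating each test individually. A minimal self-contained sketch of the pattern (the test body here is illustrative, not from the repo):

```python
import pytest

# Applied by pytest to every test function and class in this module,
# exactly like the pytestmark lists added in this commit.
pytestmark = [pytest.mark.core_model, pytest.mark.cpu]

def test_something():
    # Any test in this file is now selected by `-m "core_model and cpu"`.
    assert 1 + 1 == 2
```

This is why the commit can delete repeated `@pytest.mark.cache` / `@hardware_test` decorators in files like `test_kv_flow.py`: one module-level list covers all tests in the file.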

tests/diffusion/lora/test_base_linear.py

Lines changed: 3 additions & 0 deletions

@@ -5,10 +5,13 @@

 from dataclasses import dataclass

+import pytest
 import torch

 from vllm_omni.diffusion.lora.layers.base_linear import DiffusionBaseLinearLayerWithLoRA

+pytestmark = [pytest.mark.core_model, pytest.mark.cpu]
+

 @dataclass
 class _DummyLoRAConfig:

tests/diffusion/lora/test_lora_manager.py

Lines changed: 3 additions & 0 deletions

@@ -3,6 +3,7 @@

 from __future__ import annotations

+import pytest
 import torch
 from vllm.lora.lora_weights import LoRALayerWeights
 from vllm.lora.utils import get_supported_lora_modules
@@ -11,6 +12,8 @@
 from vllm_omni.diffusion.lora.manager import DiffusionLoRAManager
 from vllm_omni.lora.request import LoRARequest

+pytestmark = [pytest.mark.core_model, pytest.mark.cpu]
+

 class _DummyLoRALayer:
     def __init__(self, n_slices: int, output_slices: tuple[int, ...]):

tests/diffusion/test_diffusion_worker.py

Lines changed: 2 additions & 0 deletions

@@ -17,6 +17,8 @@

 from vllm_omni.diffusion.worker.diffusion_worker import DiffusionWorker

+pytestmark = [pytest.mark.core_model, pytest.mark.diffusion, pytest.mark.cpu]
+

 @pytest.fixture
 def mock_od_config():

tests/distributed/omni_connectors/test_kv_flow.py

Lines changed: 4 additions & 22 deletions

@@ -1,14 +1,15 @@
 import pytest
 import torch

-from tests.utils import hardware_test
 from vllm_omni.diffusion.request import OmniDiffusionRequest
 from vllm_omni.distributed.omni_connectors.kv_transfer_manager import (
     OmniKVCacheConfig,
     OmniKVTransferManager,
 )
 from vllm_omni.inputs.data import OmniDiffusionSamplingParams

+pytestmark = [pytest.mark.core_model, pytest.mark.cpu, pytest.mark.cache]
+

 class MockConnector:
     def __init__(self):
@@ -58,11 +59,6 @@ def common_constants():
     }


-@pytest.mark.cache
-@hardware_test(
-    res={"cuda": "L4"},
-    num_cards=2,
-)
 def test_manager_extraction(kv_config, mock_connector, common_constants):
     """Test extraction and sending logic in OmniKVTransferManager."""
     num_layers = common_constants["num_layers"]
@@ -109,11 +105,6 @@ def test_manager_extraction(kv_config, mock_connector, common_constants):
     assert data["layer_blocks"]["key_cache"][0].shape == expected_shape


-@pytest.mark.cache
-@hardware_test(
-    res={"cuda": "L4"},
-    num_cards=2,
-)
 def test_manager_reception(kv_config, mock_connector, common_constants):
     """Test reception and injection logic in OmniKVTransferManager."""
     num_layers = common_constants["num_layers"]
@@ -171,11 +162,6 @@ def test_manager_reception(kv_config, mock_connector, common_constants):
     assert req.kv_metadata["seq_len"] == seq_len


-@pytest.mark.cache
-@hardware_test(
-    res={"cuda": "L4"},
-    num_cards=2,
-)
 def test_integration_flow(common_constants):
     """Simulate extraction -> connector -> reception."""
     num_layers = common_constants["num_layers"]
@@ -211,7 +197,8 @@ def test_integration_flow(common_constants):
         recv_timeout=1.0,
     )
     receiver_manager = OmniKVTransferManager(receiver_config)
-    receiver_manager._connector = connector  # Share the same mock connector instance
+    # Share the same mock connector instance
+    receiver_manager._connector = connector

     req = OmniDiffusionRequest(
         prompts=["test_integ"],
@@ -228,11 +215,6 @@ def test_integration_flow(common_constants):
     assert req.kv_metadata["seq_len"] == 10


-@pytest.mark.cache
-@hardware_test(
-    res={"cuda": "L4"},
-    num_cards=2,
-)
 def test_manager_extraction_no_connector(kv_config, common_constants):
     """Test extraction when connector is unavailable (should still return IDs)."""
     block_size = common_constants["block_size"]
