
Commit c4d8baf

checkpoint
1 parent 158224f commit c4d8baf


44 files changed: +6957 -3 lines changed

.github/workflows/e2e_tests.yaml

Lines changed: 27 additions & 3 deletions
@@ -115,31 +115,55 @@ jobs:
           kubectl create clusterrolebinding sdk-user-service-reader --clusterrole=service-reader --user=sdk-user
           kubectl create clusterrole port-forward-pods --verb=create --resource=pods/portforward
           kubectl create clusterrolebinding sdk-user-port-forward-pods-binding --clusterrole=port-forward-pods --user=sdk-user
+          kubectl create clusterrole pod-creator-full --verb=get,list,watch,create,delete --resource=pods
+          kubectl create clusterrolebinding sdk-user-pod-creator-full --clusterrole=pod-creator-full --user=sdk-user
+          kubectl create clusterrole pod-logs --verb=get --resource=pods/log
+          kubectl create clusterrolebinding sdk-user-pod-logs --clusterrole=pod-logs --user=sdk-user
+          kubectl create clusterrole configmap-creator --verb=get,list,create,delete --resource=configmaps
+          kubectl create clusterrolebinding sdk-user-configmap-creator --clusterrole=configmap-creator --user=sdk-user
           kubectl config use-context sdk-user
 
-      - name: Run e2e tests
+      - name: Run e2e tests (legacy)
         run: |
           export CODEFLARE_TEST_OUTPUT_DIR=${{ env.TEMP_DIR }}
           echo "CODEFLARE_TEST_OUTPUT_DIR=${CODEFLARE_TEST_OUTPUT_DIR}" >> $GITHUB_ENV
 
           set -euo pipefail
           pip install poetry
           poetry install --with test,docs
-          echo "Running e2e tests..."
+          echo "Running legacy e2e tests..."
           poetry run pytest -v -s ./tests/e2e/ -m 'kind and nvidia_gpu' > ${CODEFLARE_TEST_OUTPUT_DIR}/pytest_output.log 2>&1
         env:
           GRPC_DNS_RESOLVER: "native"
 
+      - name: Run e2e_v2 tests
+        run: |
+          export CODEFLARE_TEST_OUTPUT_DIR=${{ env.TEMP_DIR }}
+          set -euo pipefail
+          echo "Running e2e_v2 tests..."
+          # Run tier1 tests for Kind platform
+          poetry run pytest -v -s ./tests/e2e_v2/ -m 'kind and tier1' > ${CODEFLARE_TEST_OUTPUT_DIR}/pytest_e2e_v2_output.log 2>&1 || true
+          echo "e2e_v2 tests completed"
+        env:
+          GRPC_DNS_RESOLVER: "native"
+        continue-on-error: true  # Allow legacy tests to run even if v2 fails during migration
+
       - name: Switch to kind-cluster context to print logs
         if: always() && steps.deploy.outcome == 'success'
         run: kubectl config use-context kind-cluster
 
       - name: Print Pytest output log
         if: always() && steps.deploy.outcome == 'success'
         run: |
-          echo "Printing Pytest output logs"
+          echo "Printing Pytest output logs (legacy e2e)"
           cat ${CODEFLARE_TEST_OUTPUT_DIR}/pytest_output.log
 
+      - name: Print e2e_v2 Pytest output log
+        if: always() && steps.deploy.outcome == 'success'
+        run: |
+          echo "Printing e2e_v2 Pytest output logs"
+          cat ${CODEFLARE_TEST_OUTPUT_DIR}/pytest_e2e_v2_output.log || echo "No e2e_v2 output found"
+
       - name: Print CodeFlare operator logs
         if: always() && steps.deploy.outcome == 'success'
         run: |
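
Note on the RBAC additions in this hunk: the three new ClusterRole/ClusterRoleBinding pairs follow one pattern (create a role with verbs on a resource, then bind it to `sdk-user`). The sketch below is not part of the commit; it only illustrates that pattern by generating the same `kubectl` command strings from a small table. The `RoleGrant` class and `kubectl_commands` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleGrant:
    """One ClusterRole plus the binding that grants it to a user (hypothetical helper type)."""

    role: str
    verbs: tuple
    resource: str
    binding: str


def kubectl_commands(grant: RoleGrant, user: str = "sdk-user") -> list:
    """Build the two kubectl command strings for a role and its binding."""
    verbs = ",".join(grant.verbs)
    return [
        f"kubectl create clusterrole {grant.role} --verb={verbs} --resource={grant.resource}",
        f"kubectl create clusterrolebinding {grant.binding} --clusterrole={grant.role} --user={user}",
    ]


# The three grants added by this hunk, copied from the workflow lines above.
GRANTS = [
    RoleGrant("pod-creator-full", ("get", "list", "watch", "create", "delete"), "pods", "sdk-user-pod-creator-full"),
    RoleGrant("pod-logs", ("get",), "pods/log", "sdk-user-pod-logs"),
    RoleGrant("configmap-creator", ("get", "list", "create", "delete"), "configmaps", "sdk-user-configmap-creator"),
]

if __name__ == "__main__":
    for grant in GRANTS:
        for command in kubectl_commands(grant):
            print(command)
```

Keeping the grants in a table like this could make it easier to review which verbs the test user receives, though the workflow itself issues the commands inline.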

tests/e2e_v2/README.md

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
+# E2E v2 Test Suite
+
+This directory contains the restructured End-to-End (E2E) test suite for the CodeFlare SDK.
+
+## Directory Structure
+
+Tests are organized by **SDK feature/component**:
+
+```
+tests/e2e_v2/
+├── ray_init/  # Tests for ray.init() functionality
+│   ├── test_remote.py  # Remote connection tests
+│   ├── test_in_cluster.py  # In-cluster connection tests
+│   └── test_remote_functions.py  # Remote function execution tests
+
+├── rayjob/  # Tests for RayJob submission
+│   ├── test_client.py  # RayJobClient submission (remote + in-cluster)
+│   ├── test_cr_submission.py  # RayJob CR submission
+│   ├── test_lifecycled.py  # Lifecycled cluster tests
+│   └── test_lifecycled_queueing.py  # Lifecycled cluster with Kueue
+
+├── kueue/  # Kueue integration tests
+│   ├── test_admission.py
+│   ├── test_queueing.py
+│   └── test_resource_flavors.py
+
+├── security/  # Security and network tests
+│   ├── test_oauth.py
+│   ├── test_mtls_remote.py
+│   ├── test_mtls_on_cluster.py
+│   └── test_network_policies.py
+
+├── cluster_config/  # Cluster configuration tests
+│   └── test_heterogeneous.py
+
+└── utils/  # Shared utilities
+    ├── base_test.py  # Base test classes
+    ├── helpers.py  # Helper functions
+    ├── kueue.py  # Kueue utilities
+    ├── pod_execution.py  # In-cluster pod execution
+    ├── support.py  # Support functions
+    └── scripts/  # Test scripts
+```
+
+## Organization Principles
+
+1. **Feature-first**: Tests are grouped by SDK feature (ray.init, rayjob, kueue, etc.)
+2. **Execution context**: Remote vs in-cluster tests are clearly separated by filename or test method
+3. **Shared utilities**: Common functionality is in `utils/`
+
+## Running Tests
+
+```bash
+# Run all e2e_v2 tests
+pytest tests/e2e_v2/
+
+# Run tests for a specific feature
+pytest tests/e2e_v2/ray_init/
+pytest tests/e2e_v2/rayjob/
+
+# Run tests by marker
+pytest tests/e2e_v2/ -m "kind and tier1"
+pytest tests/e2e_v2/ -m "openshift and nvidia_gpu"
+```
+
+## Test Markers
+
+- `@pytest.mark.kind` - Run on Kind clusters
+- `@pytest.mark.openshift` - Run on OpenShift clusters
+- `@pytest.mark.nvidia_gpu` - Requires GPU
+- `@pytest.mark.tier1` - Standard test suite
+- `@pytest.mark.smoke` - Quick validation tests
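
Note on the marker expressions in this README: `-m "kind and tier1"` selects only tests that carry both markers. The sketch below is a simplified model of that selection, not pytest's own implementation; it handles only marker names, `and`, `or`, `not`, and parentheses. The test names in `TESTS` and the functions `matches` and `select` are hypothetical and exist only for illustration.

```python
import ast


def matches(expr: str, markers: set) -> bool:
    """Evaluate a marker expression such as "kind and tier1" against a set of marker names."""

    def evaluate(node) -> bool:
        if isinstance(node, ast.BoolOp):
            values = [evaluate(value) for value in node.values]
            return all(values) if isinstance(node.op, ast.And) else any(values)
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not evaluate(node.operand)
        if isinstance(node, ast.Name):
            # A bare name is true when the test carries that marker.
            return node.id in markers
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return evaluate(ast.parse(expr, mode="eval").body)


# Hypothetical tests and the markers applied to them.
TESTS = {
    "test_remote_connection": {"kind", "tier1"},
    "test_gpu_training": {"openshift", "nvidia_gpu"},
    "test_quick_check": {"kind", "smoke"},
}


def select(expr: str) -> list:
    """Return the names of the hypothetical tests that the expression would select."""
    return [name for name, markers in TESTS.items() if matches(expr, markers)]


if __name__ == "__main__":
    print(select("kind and tier1"))
    print(select("openshift and nvidia_gpu"))
```

Under this model a test marked only `kind` and `smoke` is skipped by the CI expression `kind and tier1`, which is why a test that should run in the Kind job needs both markers.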
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
+# E2E v2 Directory Reorganization Proposal
+
+## Current Structure Issues
+
+The current structure mixes different organizational principles:
+- **Cluster lifecycle types**: `workspace_clusters/`, `temporary_clusters/`, `lifecycled_clusters/`
+- **Features**: `kueue_integration/`, `secure_trusted_network/`
+- **Miscellaneous**: `other/` (vague category)
+
+This makes it hard to find tests and understand the test suite organization.
+
+## Proposed Structure
+
+Organize by **feature/functionality** as the primary principle, with execution context (remote/in-cluster) clearly separated:
+
+```
+tests/e2e_v2/
+├── conftest.py  # Pytest configuration and fixtures
+├── __init__.py
+
+├── cluster_management/  # Cluster lifecycle and management
+│   ├── workspace/  # Long-running clusters
+│   │   ├── test_ray_init.py  # Combines remote + in-cluster
+│   │   └── test_remote_functions.py
+│   ├── temporary/  # Short-lived clusters
+│   │   └── test_rayjob_submission.py
+│   └── lifecycled/  # RayJob-managed clusters
+│       └── test_lifecycled_clusters.py
+
+├── kueue/  # Kueue integration tests
+│   ├── test_admission.py
+│   ├── test_queueing.py
+│   └── test_resource_flavors.py
+
+├── security/  # Security and network tests
+│   ├── test_oauth.py
+│   ├── test_mtls.py  # Combines remote + in-cluster
+│   └── test_network_policies.py
+
+├── cluster_config/  # Cluster configuration tests
+│   └── test_heterogeneous.py
+
+└── utils/  # Shared utilities
+    ├── base_test.py
+    ├── helpers.py
+    ├── kueue.py
+    ├── pod_execution.py
+    ├── support.py
+    └── scripts/
+```
+
+## Alternative: Keep Execution Context Separate
+
+If we want to emphasize the dual execution context pattern:
+
+```
+tests/e2e_v2/
+├── conftest.py
+├── __init__.py
+
+├── ray_init/  # Feature: ray.init()
+│   ├── test_remote.py  # Remote execution context
+│   └── test_in_cluster.py  # In-cluster execution context
+
+├── rayjob/  # Feature: RayJob submission
+│   ├── test_client_remote.py
+│   ├── test_client_in_cluster.py
+│   └── test_cr_submission.py
+
+├── kueue/  # Feature: Kueue integration
+│   ├── test_admission.py
+│   ├── test_queueing.py
+│   └── test_resource_flavors.py
+
+├── security/  # Feature: Security
+│   ├── test_oauth.py
+│   ├── test_mtls_remote.py
+│   ├── test_mtls_in_cluster.py
+│   └── test_network_policies.py
+
+├── cluster_config/  # Feature: Cluster configuration
+│   └── test_heterogeneous.py
+
+└── utils/
+```
+
+## Recommendation
+
+**Option 1 (Feature-first)** is recommended because:
+1. ✅ Tests are grouped by what they test (feature/functionality)
+2. ✅ Easier to find tests for a specific feature
+3. ✅ Execution context (remote/in-cluster) can be in the same file or clearly named
+4. ✅ Cluster lifecycle is less important than what functionality is being tested
+5. ✅ Eliminates vague "other/" category
+
+## Migration Plan
+
+1. Create new directory structure
+2. Move files to new locations
+3. Update imports in test files
+4. Update any documentation references
+5. Verify tests still run correctly
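
Note on the migration plan in this proposal: steps 1 and 2 (create the new directories, move the files) can be rehearsed with a dry run before anything is changed. The sketch below is one possible approach and is not part of the commit. The old directory names come from the proposal, but the old file names in `EXAMPLE_MAPPING` are assumptions, as are the function names `plan_moves` and `apply_moves`.

```python
import shutil
from pathlib import Path

# Hypothetical old-to-new mapping, relative to tests/e2e_v2/.
# The old file names are assumed for illustration; adjust them to the real tree.
EXAMPLE_MAPPING = {
    "kueue_integration/test_admission.py": "kueue/test_admission.py",
    "other/test_heterogeneous.py": "cluster_config/test_heterogeneous.py",
}


def plan_moves(root: Path, mapping: dict) -> list:
    """Return (source, destination) pairs for every mapped file that exists under root."""
    moves = []
    for old_rel, new_rel in mapping.items():
        source = root / old_rel
        if source.is_file():
            moves.append((source, root / new_rel))
    return moves


def apply_moves(moves: list, dry_run: bool = True) -> None:
    """Print the planned moves, or perform them when dry_run is False."""
    for source, destination in moves:
        if dry_run:
            print(f"would move {source} -> {destination}")
            continue
        # Step 1 of the plan: create the new directory before moving into it.
        destination.parent.mkdir(parents=True, exist_ok=True)
        # Step 2 of the plan: move the file to its new location.
        shutil.move(str(source), str(destination))


if __name__ == "__main__":
    apply_moves(plan_moves(Path("tests/e2e_v2"), EXAMPLE_MAPPING), dry_run=True)
```

The sketch does not cover step 3 (updating imports) or the `__init__.py` files in the new packages; those still need to be handled separately before step 5.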

tests/e2e_v2/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+# E2E Test Suite v2
+# Restructured pytest-based E2E tests for CodeFlare SDK
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+"""
+Cluster configuration tests.
+"""
