Commit e7dc708

docs: Update KEP with DynamicAllocation API changes
1 parent 7672086 commit e7dc708

1 file changed: +46 -9 lines changed

docs/proposals/107-spark-client/README.md

Lines changed: 46 additions & 9 deletions
@@ -145,7 +145,7 @@ if status.state != ApplicationState.COMPLETED:
145145
**So that** I can efficiently prepare data for model training.
146146

147147
```python
148-
from kubeflow.spark import BatchSparkClient, OperatorBackendConfig
148+
from kubeflow.spark import BatchSparkClient, OperatorBackendConfig, DynamicAllocation
149149

150150
config = OperatorBackendConfig(
151151
    namespace="ml-jobs",
@@ -158,9 +158,10 @@ response = client.submit_application(
158158
    main_application_file="s3a://ml/features/extract.py",
159159

160160
    # Dynamic allocation for cost optimization
161-
    enable_dynamic_allocation=True,
162-
    min_executors=5,
163-
    max_executors=50,
161+
    dynamic_allocation=DynamicAllocation(
162+
        min_executors=5,
163+
        max_executors=50,
164+
    ),
164165

165166
    # Resource configuration
166167
    driver_cores=4,
@@ -345,7 +346,7 @@ response = client.submit_application(
345346
#### Advanced Features: Dynamic Allocation, Volumes, GPU
346347

347348
```python
348-
from kubeflow.spark import BatchSparkClient, OperatorBackendConfig
349+
from kubeflow.spark import BatchSparkClient, OperatorBackendConfig, DynamicAllocation
349350

350351
config = OperatorBackendConfig(namespace="default")
351352
client = BatchSparkClient(backend_config=config)
@@ -360,10 +361,11 @@ response = client.submit_application(
360361
    executor_memory="8g",
361362

362363
    # Enable dynamic allocation (auto-scaling)
363-
    enable_dynamic_allocation=True,
364-
    initial_executors=2,
365-
    min_executors=1,
366-
    max_executors=10,
364+
    dynamic_allocation=DynamicAllocation(
365+
        initial_executors=2,
366+
        min_executors=1,
367+
        max_executors=10,
368+
    ),
367369

368370
    # Configure volumes for data access
369371
    volumes=[{
@@ -771,6 +773,41 @@ class ApplicationStatus:
771773
    message: Optional[str]
772774
```
773775

776+
### API Changes: Dynamic Allocation
777+
778+
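**Breaking Change in v0.1.0**: Dynamic allocation is now configured through a single `DynamicAllocation` object instead of the separate `enable_dynamic_allocation`, `initial_executors`, `min_executors`, and `max_executors` parameters.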
779+
780+
**Old Pattern (Deprecated):**
781+
```python
782+
client.submit_application(
783+
    enable_dynamic_allocation=True,
784+
    initial_executors=2,
785+
    min_executors=1,
786+
    max_executors=10,
787+
)
788+
```
789+
790+
**New Pattern:**
791+
```python
792+
from kubeflow.spark import DynamicAllocation
793+
794+
client.submit_application(
795+
    dynamic_allocation=DynamicAllocation(
796+
        initial_executors=2,
797+
        min_executors=1,
798+
        max_executors=10,
799+
    ),
800+
)
801+
```
802+
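For callers that still build the old keyword set, the translation between the two patterns can be sketched as a small helper. This is an illustration only, not part of the proposed SDK: `migrate_dynamic_allocation_kwargs` and its `factory` argument are hypothetical names. In real code `factory` would be `DynamicAllocation`; the default `dict` only keeps the sketch runnable without `kubeflow.spark` installed.

```python
from typing import Any, Callable, Dict

# Legacy keyword names, taken from the "Old Pattern" example.
_LEGACY_FIELDS = ("initial_executors", "min_executors", "max_executors")


def migrate_dynamic_allocation_kwargs(
    kwargs: Dict[str, Any],
    factory: Callable[..., Any] = dict,  # assumption: pass DynamicAllocation in real code
) -> Dict[str, Any]:
    """Return a copy of kwargs with the legacy parameters folded into `dynamic_allocation`."""
    migrated = dict(kwargs)  # leave the caller's dict untouched
    enabled = migrated.pop("enable_dynamic_allocation", False)
    fields = {
        name: migrated.pop(name)
        for name in _LEGACY_FIELDS
        if name in migrated
    }
    if enabled:
        # Object presence means "enabled" in the new pattern.
        migrated["dynamic_allocation"] = factory(**fields)
    # When disabled, the parameter is simply omitted.
    return migrated
```

Calling `client.submit_application(**migrate_dynamic_allocation_kwargs(old_kwargs, DynamicAllocation))` would then follow the new pattern, assuming the remaining keyword arguments are unchanged.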
803+
**Key Changes:**
804+
- Import `DynamicAllocation` from `kubeflow.spark`
805+
- Pass it as the single `dynamic_allocation` parameter
806+
- Passing `None` or omitting the parameter disables dynamic allocation
807+
- Passing a `DynamicAllocation` object enables it (there is no `enabled` field)
808+
- Validation: `min_executors ≤ initial_executors ≤ max_executors`
809+
- Default values: `initial_executors` defaults to `num_executors`, `min_executors` to `1`, and `max_executors` to `num_executors * 2`
810+
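The validation and default rules listed above can be sketched as a small dataclass. This is a minimal illustration under stated assumptions, not the actual `kubeflow.spark.DynamicAllocation`: the class name `DynamicAllocationSketch`, the `resolve` method, and the choice to raise `ValueError` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DynamicAllocationSketch:
    """Illustrative stand-in; not the real kubeflow.spark.DynamicAllocation."""

    initial_executors: Optional[int] = None
    min_executors: Optional[int] = None
    max_executors: Optional[int] = None

    def resolve(self, num_executors: int) -> "DynamicAllocationSketch":
        """Fill unset fields from num_executors, then check the ordering rule."""
        initial = num_executors if self.initial_executors is None else self.initial_executors
        minimum = 1 if self.min_executors is None else self.min_executors
        maximum = num_executors * 2 if self.max_executors is None else self.max_executors
        if not (minimum <= initial <= maximum):
            raise ValueError(
                "expected min_executors <= initial_executors <= max_executors, "
                f"got {minimum}, {initial}, {maximum}"
            )
        return DynamicAllocationSketch(initial, minimum, maximum)
```

Under these assumptions, a configuration such as `min_executors=5, max_executors=50` with no explicit `initial_executors` would fail validation whenever `num_executors` is below 5, because the defaulted initial value would fall under the minimum.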
774811
### Impact Analysis
775812

776813
#### Impact on Data Engineers
