Services
RPS window for autoscaling
Services now support a window property in the scaling spec that defines the time window used to calculate RPS. Allowed values are 30s, 1m, and 5m (default is 1m). Previously, the RPS was always calculated using a 1m window.
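As a rough illustration of the other end of the allowed range, the sketch below uses the shortest window. A shorter window generally makes the computed RPS follow traffic bursts more closely, while a longer one smooths them out. The replica range and target here are hypothetical values chosen only for this example.

```yaml
type: service
image: nginx
port: 80
# Hypothetical replica range, for illustration only
replicas: 1..4
scaling:
  metric: rps
  # Hypothetical target: 10 requests per second, calculated over a 30-second window
  target: 10
  window: 30s
```

By contrast, the following configuration averages RPS over five minutes: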
type: service
image: nginx
port: 80
replicas: 0..1
scaling:
  metric: rps
  # 1 request per second, calculated over a 5-minute window
  target: 1
  window: 5m
Kubernetes
registry_auth
The kubernetes backend now supports the registry_auth property for pulling Docker images from private registries:
type: service
image: nvcr.io/nim/deepseek-ai/deepseek-r1-distill-llama-8b
registry_auth:
  username: $oauthtoken
  password: ${{ secrets.ngc_api_key }}
dstack automatically creates and sets up imagePullSecrets for the pods. This requires new permissions for the Kubernetes role:
rules:
  - resources: ["secrets"]
    verbs: ["create", "delete"]
Read-only volumes
Kubernetes volume configurations now support a new read_only property. When set to true, it enforces readOnly: true in the pod's volumeMounts.
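For orientation, the resulting pod spec might look roughly like the fragment below. This is a sketch under assumptions, not output captured from dstack: the container name, mount path, and the use of a PersistentVolumeClaim are hypothetical, and only the readOnly: true entry under volumeMounts comes from the description above.

```yaml
# Hypothetical pod spec fragment; names and paths are illustrative only
spec:
  containers:
    - name: example-container      # hypothetical container name
      volumeMounts:
        - name: my-volume
          mountPath: /data         # hypothetical mount path
          readOnly: true           # enforced when read_only: true is set
  volumes:
    - name: my-volume
      persistentVolumeClaim:       # assumed backing resource
        claimName: my-volume       # hypothetical claim name
```

The dstack volume configuration that requests this behavior is: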
type: volume
backend: kubernetes
name: my-volume
size: 100GB
read_only: true
Server
Faster processing
The server has been optimized to reduce processing latencies. As a result, many operations now take less time: run provisioning is up to 14s faster and run termination is up to 7s faster.
Examples
Documentation and examples have been refreshed, including new Qwen3.6-27B and DeepSeek V4 examples. A new prefill-decode blog post shows how to run SGLang PD disaggregation via Shepherd Model Gateway.
Breaking changes
Python 3.9 support dropped
Running dstack on Python 3.9 is no longer supported, as Python 3.9 reached end-of-life on 2025-10-31. Please upgrade to Python 3.10 or later.
What's Changed
- Refresh quickstart and service docs with Qwen3.6-27B by @peterschmidt85 in #3819
- Disallow running dstack on Python 3.9 by @jvstme in #3817
- Create placeholder instance models by @r4victor in #3821
- Add DeepSeek V4 model docs by @peterschmidt85 in #3823
- Reduce pipelines processing latencies by @r4victor in #3828
- [Docs]: Update scale_up/down_delay descriptions by @jvstme in #3831
- Clean up exports on project and fleet deletion by @jvstme in #3827
- [shim,runner] Improve logging options by @un-def in #3822
- Allow configuring RPS window for service scaling by @jvstme in #3830
- Replace sglang_router with smg in PD examples by @Bihan in #3836
- Interpolate JobSpec secrets for Compute.run_job() by @un-def in #3834
- Kubernetes: configure imagePullSecrets by @un-def in #3835
- Kubernetes: add read_only volume property by @un-def in #3838
Full Changelog: 0.20.18...0.20.19