Commit e384f30

Various deployment fixes and improvements (#4097)
* Various deployment fixes and improvements
* fixed the GCP deployer service name max size
* implemented reload setting for local deployments
* reworked the app runner to correctly use sub-processes for multiple workers
* fixed concurrency by instantiating a deployment orchestrator for each request
* configured REST zen store connection pool size to match ASGI app thread count
* implemented client-side healthcheck for deployments
* Updated docs
* Fix unit tests
1 parent 6d19205 commit e384f30

10 files changed: +227 -103 lines changed

docs/book/component-guide/deployers/local.md

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@ For additional configuration of the Local deployer, you can pass the following `
 * `port_range`: The range of ports to search for a free port. Defaults to `(8000, 65535)`.
 * `address`: The address that the deployment server will listen on. Defaults to `127.0.0.1`.
 * `blocking`: Whether to run the deployment in the current process instead of running it as a daemon process. Defaults to False. Use this if you want to debug issues with the deployment ASGI application itself.
+* `auto_reload`: Whether to enable auto-reload for the uvicorn server. This is useful to speed up local development by automatically restarting the server when code changes are detected without requiring a re-provisioning of the entire deployment. Defaults to False. NOTE: the `auto_reload` setting has no effect on changes in the pipeline configuration, step configuration or stack configuration.
 
 Check out [this docs page](https://docs.zenml.io/concepts/steps_and_pipelines/configuration) for more information on how to specify settings.
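
Reviewer note: the `port_range` and `allocate_port_if_busy` settings describe a search for a free port. The diff does not show how the Local deployer performs that search, so the following is only a minimal standard-library sketch of one plausible reading. The helper names `port_is_free` and `find_free_port` are hypothetical and are not part of ZenML, and the fallback behaviour when the preferred port is busy is an assumption.

```python
import socket
from typing import Optional, Tuple


def port_is_free(address: str, port: int) -> bool:
    """Return True if a TCP socket can be bound to (address, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            return False
        return True


def find_free_port(
    address: str = "127.0.0.1",
    preferred: Optional[int] = None,
    allocate_port_if_busy: bool = True,
    port_range: Tuple[int, int] = (8000, 65535),
) -> Optional[int]:
    """Hypothetical helper: try the preferred port, then scan the range."""
    if preferred is not None:
        if port_is_free(address, preferred):
            return preferred
        # Assumption: a busy preferred port is only replaced when allowed.
        if not allocate_port_if_busy:
            return None
    start, end = port_range
    for port in range(start, end + 1):
        if port_is_free(address, port):
            return port
    return None
```

Note that a port found this way can still be taken by another process before the server binds it, so a real implementation would need to handle a bind failure at startup as well.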

docs/book/how-to/deployment/deployment_settings.md

Lines changed: 32 additions & 0 deletions
@@ -241,6 +241,27 @@ A rudimentary playground dashboard is included with the ZenML python package tha
 When supplying your own custom dashboard, you may also need to [customize the security headers](./deployment_settings#secure-headers) to allow the dashboard to access various resources. For example, you may want to tweak the `Content-Security-Policy` header to allow the dashboard to access external javascript libraries, images, etc.
 {% endhint %}
 
+#### Jinja2 templates
+
+You can use a Jinja2 template to dynamically generate the `index.html` file that hosts the single-page application. This is useful if you want to dynamically generate the dashboard files based on the pipeline configuration, step configuration or stack configuration. A `service_info` variable is passed to the template that contains the service information, such as the service name, version, and description. This variable has the same structure as the `zenml.deployers.server.models.ServiceInfo` model.
+
+Example:
+
+```jinja2
+<html>
+  <head>
+    <title>Pipeline: {{ service_info.pipeline.pipeline_name }}</title>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <link rel="stylesheet" href="https://unpkg.com/mvp.css">
+  </head>
+  <body>
+    <h1>Pipeline: {{ service_info.pipeline.pipeline_name }}</h1>
+    <p>Deployment: {{ service_info.deployment.name }}</p>
+  </body>
+</html>
+```
+
 ### CORS
 
 Fine-tune cross-origin access:
@@ -363,6 +384,17 @@ settings:
 
 Tune server runtime parameters for performance and topology:
 
+The following settings are available for tuning the uvicorn server:
+* `thread_pool_size`: the size of the thread pool for CPU-bound work offload.
+* `uvicorn_host`: the host to bind the uvicorn server to.
+* `uvicorn_port`: the port to bind the uvicorn server to.
+* `uvicorn_workers`: the number of workers to use for the uvicorn server.
+* `log_level`: the log level to use for the uvicorn server.
+* `uvicorn_reload`: whether to enable auto-reload for the uvicorn server. This is useful when using [the local Deployer stack component](https://docs.zenml.io/stacks/stack-components/deployers/docker) to speed up local development by automatically restarting the server when code changes are detected. NOTE: the `uvicorn_reload` setting has no effect on changes in the pipeline configuration, step configuration or stack configuration.
+* `uvicorn_kwargs`: a dictionary of keyword arguments to pass to the uvicorn server.
+
+The following settings are available:
+
 ```python
 from zenml.config import DeploymentSettings
 from zenml.enums import LoggingLevels

src/zenml/config/deployment_settings.py

Lines changed: 4 additions & 1 deletion
@@ -37,7 +37,7 @@
 
 logger = get_logger(__name__)
 
-DEFAULT_DEPLOYMENT_APP_THREAD_POOL_SIZE = 20
+DEFAULT_DEPLOYMENT_APP_THREAD_POOL_SIZE = 40
 
 DEFAULT_DEPLOYMENT_APP_SECURE_HEADERS_HSTS = (
     "max-age=63072000; includeSubdomains"
@@ -633,6 +633,8 @@ class DeploymentSettings(BaseSettings):
         uvicorn_host: Host of the uvicorn server.
         uvicorn_port: Port of the uvicorn server.
         uvicorn_workers: Number of workers for the uvicorn server.
+        uvicorn_reload: Whether to automatically reload the deployment when the
+            code changes.
         log_level: Log level for the deployment application.
         uvicorn_kwargs: Keyword arguments for the uvicorn server.
 
@@ -694,6 +696,7 @@ class DeploymentSettings(BaseSettings):
     uvicorn_host: str = "0.0.0.0"  # nosec
     uvicorn_port: int = 8000
     uvicorn_workers: int = 1
+    uvicorn_reload: bool = False
     log_level: LoggingLevels = LoggingLevels.INFO
 
     uvicorn_kwargs: Dict[str, Any] = {}
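
Reviewer note: the default thread pool size doubles from 20 to 40 here, and the docs describe `thread_pool_size` as the pool used to offload work from the ASGI app. The wiring inside ZenML is not part of this diff, so the sketch below only illustrates the general pattern with the standard library: an asyncio event loop handing blocking calls to a bounded `ThreadPoolExecutor`. The names `blocking_work` and `handle_requests` are hypothetical, and this is an assumption about the technique, not ZenML's implementation.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Mirrors the new default from the diff; only used as an illustrative value.
THREAD_POOL_SIZE = 40


def blocking_work(n: int) -> int:
    """Stand-in for a blocking call that should not run on the event loop."""
    return sum(i * i for i in range(n))


async def handle_requests(
    inputs: List[int], pool_size: int = THREAD_POOL_SIZE
) -> List[int]:
    """Run each blocking call in a worker thread and collect the results."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [
            loop.run_in_executor(pool, blocking_work, n) for n in inputs
        ]
        # Results come back in the same order as the inputs.
        return list(await asyncio.gather(*futures))
```

The pool size caps how many offloaded calls run at once, which is presumably why the commit message mentions matching the REST zen store connection pool size to the app thread count.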

src/zenml/deployers/base_deployer.py

Lines changed: 61 additions & 0 deletions
@@ -30,9 +30,12 @@
 )
 from uuid import UUID
 
+import requests
+
 from zenml.analytics.enums import AnalyticsEvent
 from zenml.analytics.utils import track_handler
 from zenml.client import Client
+from zenml.config import DeploymentDefaultEndpoints
 from zenml.config.base_settings import BaseSettings
 from zenml.constants import (
     ENV_ZENML_ACTIVE_PROJECT_ID,
@@ -300,6 +303,56 @@ def _generate_auth_key(self, key_length: int = 32) -> str:
         alphabet = string.ascii_letters + string.digits
         return "".join(secrets.choice(alphabet) for _ in range(key_length))
 
+    def _check_deployment_health(
+        self,
+        deployment: DeploymentResponse,
+    ) -> bool:
+        """Check if the deployment is healthy by calling its health check endpoint.
+
+        Args:
+            deployment: The deployment to check.
+
+        Returns:
+            True if the deployment is healthy, False otherwise.
+        """
+        assert deployment.snapshot, "Deployment snapshot not found"
+
+        settings = (
+            deployment.snapshot.pipeline_configuration.deployment_settings
+        )
+
+        # If the health check endpoint is disabled, we consider the deployment healthy.
+        if (
+            DeploymentDefaultEndpoints.HEALTH
+            not in settings.include_default_endpoints
+        ):
+            return True
+
+        if not deployment.url:
+            return False
+
+        health_check_path = f"{settings.root_url_path}{settings.api_url_path}{settings.health_url_path}"
+        health_check_url = f"{deployment.url}{health_check_path}"
+
+        # Attempt to connect to the deployment and check if it is healthy
+        try:
+            response = requests.get(health_check_url, timeout=3)
+            if response.status_code == 200:
+                return True
+            else:
+                logger.debug(
+                    f"Health check endpoint for deployment '{deployment.name}' "
+                    f"at '{health_check_url}' returned status code "
+                    f"{response.status_code}"
+                )
+                return False
+        except Exception as e:
+            logger.debug(
+                f"Health check endpoint for deployment '{deployment.name}' "
+                f"at '{health_check_url}' is not reachable: {e}"
+            )
+            return False
+
     def _poll_deployment(
         self,
         deployment: DeploymentResponse,
@@ -335,6 +388,11 @@ def _poll_deployment(
             )
             try:
                 deployment_state = self.do_get_deployment_state(deployment)
+
+                if deployment_state.status == DeploymentStatus.RUNNING:
+                    if not self._check_deployment_health(deployment):
+                        deployment_state.status = DeploymentStatus.PENDING
+
             except DeploymentNotFoundError:
                 deployment_state = DeploymentOperationalState(
                     status=DeploymentStatus.ABSENT
@@ -675,6 +733,9 @@ def refresh_deployment(
         )
         try:
             deployment_state = self.do_get_deployment_state(deployment)
+            if deployment_state.status == DeploymentStatus.RUNNING:
+                if not self._check_deployment_health(deployment):
+                    deployment_state.status = DeploymentStatus.PENDING
         except DeploymentNotFoundError:
             deployment_state.status = DeploymentStatus.ABSENT
         except DeployerError as e:
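
Reviewer note: the hunks above move the health check into the base deployer and downgrade a `RUNNING` deployment to `PENDING` when the check fails. The standalone sketch below restates that logic with the standard library only, so it can be tried without ZenML. It uses `urllib` instead of `requests`, and plain strings instead of `DeploymentStatus`. The function names are hypothetical, and the example URL path values are assumptions because the defaults are not shown in this diff.

```python
import urllib.request


def build_health_url(
    base_url: str,
    root_url_path: str,
    api_url_path: str,
    health_url_path: str,
) -> str:
    """Concatenate the URL parts in the same order as the diff does."""
    return f"{base_url}{root_url_path}{api_url_path}{health_url_path}"


def is_healthy(health_url: str, timeout: float = 3.0) -> bool:
    """Return True only if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as response:
            return response.status == 200
    except Exception:
        # Unreachable hosts, timeouts and non-2xx responses all count as
        # "not healthy yet" rather than as hard errors.
        return False


def effective_status(reported_status: str, healthy: bool) -> str:
    """Downgrade a reported 'running' state to 'pending' when unhealthy."""
    if reported_status == "running" and not healthy:
        return "pending"
    return reported_status
```

Treating a failed check as pending rather than as an error keeps the polling loop waiting while the server finishes starting, which matches the comment removed from `local_deployer.py` in this commit.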

src/zenml/deployers/local/local_deployer.py

Lines changed: 17 additions & 40 deletions
@@ -32,7 +32,6 @@
 from uuid import UUID
 
 import psutil
-import requests
 from pydantic import BaseModel
 
 from zenml.config.base_settings import BaseSettings
@@ -112,13 +111,16 @@ class LocalDeployerSettings(BaseDeployerSettings):
         address: Address to bind the server to.
         blocking: Whether to run the deployment in the current process instead
             of running it as a daemon process.
+        auto_reload: Whether to automatically reload the deployment when the
+            code changes.
     """
 
     port: Optional[int] = None
     allocate_port_if_busy: bool = True
     port_range: Tuple[int, int] = (8000, 65535)
     address: str = "127.0.0.1"
     blocking: bool = False
+    auto_reload: bool = False
 
 
 class LocalDeployerConfig(BaseDeployerConfig, LocalDeployerSettings):
@@ -230,6 +232,15 @@ def do_provision_deployment(
 
         existing_meta = LocalDeploymentMetadata.from_deployment(deployment)
 
+        if existing_meta.pid:
+            try:
+                stop_process(existing_meta.pid)
+            except Exception as e:
+                logger.warning(
+                    f"Failed to stop existing daemon process for deployment "
+                    f"'{deployment.name}' with PID {existing_meta.pid}: {e}"
+                )
+
         preferred_ports: List[int] = []
         if settings.port:
             preferred_ports.append(settings.port)
@@ -265,15 +276,6 @@ def do_provision_deployment(
         if not os.path.exists(runtime_dir):
             os.makedirs(runtime_dir, exist_ok=True)
 
-        if existing_meta.pid:
-            try:
-                stop_process(existing_meta.pid)
-            except Exception as e:
-                logger.warning(
-                    f"Failed to stop existing daemon process for deployment "
-                    f"'{deployment.name}' with PID {existing_meta.pid}: {e}"
-                )
-
         if settings.blocking:
             self._update_deployment(
                 deployment,
@@ -291,6 +293,7 @@ def do_provision_deployment(
                 deployment_id=deployment.id,
                 host=settings.address,
                 port=port,
+                reload=settings.auto_reload,
             )
             self._update_deployment(
                 deployment,
@@ -320,6 +323,9 @@ def do_provision_deployment(
             str(port),
         ]
 
+        if settings.auto_reload:
+            cmd.append("--reload")
+
         try:
             os.makedirs(os.path.dirname(log_file), exist_ok=True)
             proc = subprocess.Popen(
@@ -382,41 +388,12 @@ def do_get_deployment_state(
             return state
 
         # Use pending until we can confirm the daemon is reachable
-        state.status = DeploymentStatus.PENDING
+        state.status = DeploymentStatus.RUNNING
         address = meta.address
         if address == "0.0.0.0":  # nosec
             address = "localhost"
         state.url = f"http://{address}:{meta.port}"
 
-        settings = (
-            deployment.snapshot.pipeline_configuration.deployment_settings
-        )
-        health_check_path = f"{settings.root_url_path}{settings.api_url_path}{settings.health_url_path}"
-        health_check_url = f"{state.url}{health_check_path}"
-
-        # Attempt to connect to the daemon and set the status to RUNNING
-        # if successful
-        try:
-            response = requests.get(health_check_url, timeout=3)
-            if response.status_code == 200:
-                state.status = DeploymentStatus.RUNNING
-            else:
-                logger.debug(
-                    f"Daemon for deployment '{deployment.name}' returned "
-                    f"status code {response.status_code} for health check "
-                    f"at '{health_check_url}'"
-                )
-        except Exception as e:
-            logger.debug(
-                f"Daemon for deployment '{deployment.name}' is not "
-                f"reachable at '{health_check_url}': {e}"
-            )
-            # It can take a long time after the deployment is started until
-            # the deployment is ready to serve requests, but this isn't an
-            # error condition. We return PENDING instead of ERROR here to
-            # signal to the polling in the base deployer class to keep trying.
-            state.status = DeploymentStatus.PENDING
-
         state.metadata = meta.model_dump(exclude_none=True)
 
         return state
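
Reviewer note: in the daemon path, `auto_reload` simply appends `--reload` to the server command before it is handed to `subprocess.Popen`. Most of that command is outside the hunk, so the sketch below is an assumption about its general shape, not ZenML's actual invocation. `build_server_command` and the `my_app:app` target are hypothetical, and only the `--host`, `--port` and `--reload` uvicorn flags are relied on.

```python
from typing import List


def build_server_command(
    app_target: str,
    host: str,
    port: int,
    auto_reload: bool = False,
) -> List[str]:
    """Assemble a uvicorn command line, optionally with auto-reload."""
    cmd = ["uvicorn", app_target, "--host", host, "--port", str(port)]
    if auto_reload:
        # Same conditional append as in the diff above.
        cmd.append("--reload")
    return cmd


if __name__ == "__main__":
    # Hypothetical app target, printed instead of executed.
    print(build_server_command("my_app:app", "127.0.0.1", 8000, auto_reload=True))
```

Keeping the flag as an optional trailing argument means the default command is unchanged for deployments that do not opt in.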
