You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Release new version for Health Monitoring Agent (1.0.448.0_1.0.115.0) with minor improvements and bug fixes.
* Add support for TRN2 and G6 on Health Monitoring Agent
* Fix DetectionLatency metrics publish.
This HMA was tested on all new instance types supported (G6, Gr6, G6e,
TRN2 families)
- Training job auto resume is expected to work with Kubeflow training operator release v1.7.0, v1.8.0, v1.8.1 https://github.com/kubeflow/training-operator/releases
109
109
- If you intend to use the Health Monitoring Agent container image from another region, please see below list to find relevant region's URI.
110
110
```
111
-
IAD 767398015722.dkr.ecr.us-east-1.amazonaws.com/hyperpod-health-monitoring-agent:1.0.408.0_1.0.105.0
0 commit comments