monarch-kubernetes provides a Kubernetes Custom Resource Definition (CRD) and operator for MonarchMesh, simplifying the deployment and management of Monarch workloads on Kubernetes. The operator reconciles MonarchMesh resources and provisions Monarch workers compatible with KubernetesJob.
⚠️ Early Development Warning: monarch-kubernetes is currently experimental. Expect bugs, incomplete features, and APIs that may change in future versions. Bugfixes are welcome, but please discuss any significant change before starting work so that efforts stay coordinated. Signal your intention to contribute in the issue tracker, either by filing a new issue or by claiming an existing one.
| Directory | Description |
|---|---|
| `operator/` | Operator source code for reconciling the CRD |
| `docs/` | Helm Chart package and documentation index |
Prerequisites:

- Monarch v0.3.0 or higher
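The minimum-version requirement above can be checked in a script before installing. The following is a minimal sketch, not part of this project: `version_ge` is a hypothetical helper name, and it assumes a `sort` that supports `-V` (version sort, available in GNU coreutils). How you obtain the installed Monarch version depends on your environment, so the function simply takes it as an argument.

```shell
# version_ge INSTALLED REQUIRED   (hypothetical helper, not shipped with this project)
# Succeeds (exit 0) when INSTALLED >= REQUIRED, comparing dotted version strings.
# A leading "v" (as in "v0.3.0") is stripped before comparing.
version_ge() {
    installed="${1#v}"
    required="${2#v}"
    # sort -V orders version strings; if REQUIRED sorts first (or ties),
    # then INSTALLED is at least as new as REQUIRED.
    lowest="$(printf '%s\n%s\n' "$installed" "$required" | sort -V | head -n 1)"
    [ "$lowest" = "$required" ]
}
```

For example, `version_ge "$INSTALLED_MONARCH_VERSION" 0.3.0 || echo "Monarch is too old" >&2`, where `INSTALLED_MONARCH_VERSION` is a variable you would set yourself.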
Install the MonarchMesh CRD and operator using Helm:
```shell
# Add the Helm repository
helm repo add monarch-operator https://meta-pytorch.github.io/monarch-kubernetes

# Update repository cache
helm repo update

# Install MonarchMesh CRD and operator
helm install monarch-operator monarch-operator/monarch-operator \
  --namespace monarch-system \
  --create-namespace
```

To uninstall:

```shell
helm uninstall monarch-operator --namespace monarch-system
```

To build the operator from source:

```shell
cd operator

# Generate code and manifests
make generate
make manifests

# Build the container image (default: IMG=controller:latest)
make docker-build CONTAINER_TOOL=podman
```

To run the operator:

```shell
cd operator

# Option 1: Run the controller locally
make run

# Option 2: Deploy to the cluster (default: IMG=controller:latest)
make deploy
```

For a complete example demonstrating how to use the `KubernetesJob` class with Monarch, see the `hello_kubernetes_job` example.
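After installing with Helm or deploying with `make deploy`, it can help to confirm that the operator is actually up before submitting work. The sketch below wraps three standard commands (`helm status`, `kubectl get pods`, `kubectl get crds`) in a hypothetical `verify_install` function. The release name and namespace default to the ones used in the Helm instructions; they may differ for a `make deploy` installation, in which case `helm status` does not apply. The `grep -i monarchmesh` filter is an assumption about how the CRD is named, so adjust it if your cluster reports a different name.

```shell
# verify_install [RELEASE] [NAMESPACE]   (hypothetical helper, not shipped with this project)
# Defaults match the Helm install command in this README.
verify_install() {
    release="${1:-monarch-operator}"
    namespace="${2:-monarch-system}"

    # Show the Helm release status; fails if the release is not installed.
    helm status "$release" --namespace "$namespace" || return 1

    # List the pods in the operator namespace so their state can be inspected.
    kubectl get pods --namespace "$namespace" || return 1

    # List CRDs and filter for MonarchMesh.
    # Assumption: the CRD name contains "monarchmesh" (case-insensitive).
    kubectl get crds | grep -i monarchmesh
}
```

Calling `verify_install` with no arguments checks the default release; the function returns a non-zero status if any of the three checks fails.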
To run the tests:

```shell
cd operator

# Run unit tests
make test

# Run end-to-end tests (sets up a local cluster)
make test-e2e
```

Version Mismatch: Ensure the Monarch version installed on workers matches the controller version. Monarch does not provide forward or backward compatibility for the controller/worker protocol.
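Because the protocol offers no forward or backward compatibility, the versions must match exactly rather than satisfy a minimum. A minimal sketch of such a guard is shown here; `versions_match` is a hypothetical helper, and how the controller and worker versions are obtained is left to your deployment, so both are passed in as arguments.

```shell
# versions_match CONTROLLER_VERSION WORKER_VERSION   (hypothetical helper)
# Succeeds only when both version strings are identical.
# A leading "v" is ignored, so "v0.3.0" and "0.3.0" are treated as equal.
versions_match() {
    controller="${1#v}"
    worker="${2#v}"
    if [ "$controller" != "$worker" ]; then
        # Report the mismatch on stderr and signal failure to the caller.
        echo "Monarch version mismatch: controller=$controller worker=$worker" >&2
        return 1
    fi
}
```

A launch script could call `versions_match "$CONTROLLER_VERSION" "$WORKER_VERSION" || exit 1` before starting work, with both variables supplied by you.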
This project is licensed under the BSD-3-Clause License. See the LICENSE file for details.