Merge pull request #207 from Yelp/mpiano/node_migration_docs

piax93 · web-flow · commit ee8dec9a85ea · 2022-10-04T13:32:21.000+01:00
diff --git a/docs/source/configuration.rst b/docs/source/configuration.rst
@@ -67,6 +67,14 @@ The following is an example configuration file for the core Clusterman service a
             # How frequently the batch should run to collect metrics.
             run_interval_seconds: 60
 
+        node_migration:
+            # Maximum number of worker processes the batch can spawn
+            # (every worker can handle a single migration for a pool)
+            max_worker_processes: 6
+
+            # How frequently the batch should check for migration triggers.
+            run_interval_seconds: 60
+
     clusters:
         cluster-name:
             aws_region: us-west-2
@@ -153,6 +161,18 @@ The following is an example configuration file for a particular Clusterman pool:
             - paramA: 'typeA'
             - paramB: 10
 
+    node_migration:
+        trigger:
+            max_uptime: 90d
+            event: true
+        strategy:
+            rate: 5
+            prescaling: '2%'
+            precedence: highest_uptime
+            bootstrap_wait: 5m
+            bootstrap_timeout: 15m
+        disable_autoscaling: false
+        expected_duration: 2h
 
 The ``resource-groups`` section provides information for loading resource groups in the pool manager.
 
@@ -167,6 +187,11 @@ not present, then the ``autoscale_signal`` from the service configuration will b
 For required metrics, there can be any number of sections, each defining one desired metric.  The metric type must be
 one of :ref:`metric_types`.
 
+The ``node_migration`` section contains settings controlling how Clusterman recycles nodes
+inside the pool. Enabling this configuration is useful for keeping the average uptime of your pool low and/or
+for performing ad hoc migrations of the nodes according to some conditional parameter.
+See :ref:`node_migration_configuration` for all details.
+
 Reloading
 ---------
 The Clusterman batches will automatically reload on changes to the clusterman service config file and the AWS
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -34,6 +34,7 @@ and simulate how changes to autoscaling logic will impact the cost and performan
    manage
    simulator
    tools
+   node_migration
 
 
 .. toctree::
diff --git a/docs/source/node_migration.rst b/docs/source/node_migration.rst
@@ -0,0 +1,143 @@
+Node Migration
+==============
+
+*Node Migration* is a feature that allows Clusterman to recycle the nodes of a pool
+according to various criteria, in order to reduce the amount of manual work necessary
+when performing infrastructure migrations.
+
+**NOTE**: this is only compatible with Kubernetes clusters.
+
+
+Node Migration Batch
+--------------------
+
+The *Node Migration batch* is the entrypoint of the migration logic. It takes care of fetching migration trigger
+events, spawning the worker processes that perform the node recycling procedures, and monitoring their health.
+
+Batch-specific configuration values are described as part of the main service configuration in :ref:`service_configuration`.
+
+The batch code can be invoked from the ``clusterman.batch.node_migration`` Python module.
+
+
+.. _node_migration_configuration:
+
+Pool Configuration
+------------------
+
+The behaviour of the migration logic for a pool is controlled by the ``node_migration`` section of the pool configuration.
+The allowed values for the migration settings are as follows:
+
+* ``trigger``:
+
+  * ``max_uptime``: if set, monitor nodes' uptime to ensure it stays lower than the provided value; human-readable time string (e.g. 30d).
+  * ``event``: if set to ``true``, accept asynchronous migration triggers for this pool; event triggers are described below in :ref:`node_migration_trigger`.
+
+* ``strategy``:
+
+  * ``rate``: rate at which nodes are selected for termination; percentage or absolute value (required).
+  * ``prescaling``: if set, pool size is increased by this amount before performing node recycling; percentage or absolute value (0 by default).
+  * ``precedence``: precedence with which nodes are selected for termination; ``highest_uptime`` or ``lowest_task_count`` (``highest_uptime`` by default).
+  * ``bootstrap_wait``: indicative time necessary for a node to be ready to run workloads after boot; human-readable time string (3 minutes by default).
+  * ``bootstrap_timeout``: maximum wait for nodes to be ready after boot; human-readable time string (10 minutes by default).
+
+* ``disable_autoscaling``: turn off the autoscaler while recycling instances (false by default).
+
+* ``expected_duration``: estimated duration for migration of the whole pool; human-readable time string (1 day by default).
+
+See :ref:`pool_configuration` for an example of what a configuration block looks like.
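+
+Since ``rate`` is the only setting marked as required, a minimal block that recycles nodes
+based on uptime alone might look like the following sketch. The values are illustrative, and it
+assumes that every omitted setting falls back to the default listed above:
+
+.. code-block:: yaml
+
+    node_migration:
+        trigger:
+            # Illustrative value: recycle nodes before they reach 30 days of uptime.
+            max_uptime: 30d
+        strategy:
+            # Only required setting; all others are assumed to use their defaults.
+            rate: 5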
+
+
+.. _node_migration_trigger:
+
+Migration Event Trigger
+-----------------------
+
+Migration trigger events are submitted as Kubernetes custom resources of type ``nodemigration``.
+They can be easily generated and submitted by using the ``clusterman migrate`` CLI sub-command and its related options.
+The manifest for the custom resource definition is as follows:
+
+
+.. code-block:: yaml
+
+    ---
+    apiVersion: apiextensions.k8s.io/v1
+    kind: CustomResourceDefinition
+    metadata:
+      name: nodemigrations.clusterman.yelp.com
+    spec:
+      scope: Cluster
+      group: clusterman.yelp.com
+      names:
+        plural: nodemigrations
+        singular: nodemigration
+        kind: NodeMigration
+      versions:
+        - name: v1
+          served: true
+          storage: true
+          schema:
+            openAPIV3Schema:
+              type: object
+              required:
+                - spec
+              properties:
+                spec:
+                  type: object
+                  required:
+                    - cluster
+                    - pool
+                    - condition
+                  properties:
+                    cluster:
+                      type: string
+                    pool:
+                      type: string
+                    label_selectors:
+                      type: array
+                      items:
+                        type: string
+                    condition:
+                      type: object
+                      properties:
+                        trait:
+                          type: string
+                          enum: [kernel, lsbrelease, instance_type, uptime]
+                        target:
+                          type: string
+                        operator:
+                          type: string
+                          enum: [gt, ge, eq, ne, lt, le, in, notin]
+
+
+In more readable terms, an example resource manifest would look like:
+
+.. code-block:: yaml
+
+    ---
+    apiVersion: "clusterman.yelp.com/v1"
+    kind: NodeMigration
+    metadata:
+      name: my-test-migration-220912
+      labels:
+        clusterman.yelp.com/migration_status: pending
+    spec:
+      cluster: kubestage
+      pool: default
+      condition:
+        trait: uptime
+        operator: lt
+        target: 90d
+
+
+The fields in each migration event control which nodes are affected by the event
+and what the desired final condition for them is. More specifically:
+
+* ``cluster``: name of the cluster to be targeted.
+* ``pool``: name of the pool to be targeted.
+* ``label_selectors``: list of additional Kubernetes label selectors to filter affected nodes.
+* ``condition``: the desired final state for the nodes, e.g. all nodes must have a kernel version higher than X.
+
+  * ``trait``: metadata to be compared; currently supports ``kernel``, ``lsbrelease``, ``instance_type``, or ``uptime``.
+  * ``operator``: comparison operator; supports ``gt``, ``ge``, ``eq``, ``ne``, ``lt``, ``le``, ``in``, ``notin``.
+  * ``target``: right side of the comparison expression, e.g. a kernel version or an instance type;
+    may be a single string or a comma-separated list when using the ``in`` / ``notin`` operators.
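+
+As a further illustration, an event combining ``label_selectors`` with the ``in`` operator
+might look like the following sketch. The resource name, label selector and instance types are
+hypothetical, and the selector syntax shown (``key=value``) is an assumption:
+
+.. code-block:: yaml
+
+    ---
+    apiVersion: "clusterman.yelp.com/v1"
+    kind: NodeMigration
+    metadata:
+      # Hypothetical resource name.
+      name: my-instance-type-migration
+      labels:
+        clusterman.yelp.com/migration_status: pending
+    spec:
+      cluster: kubestage
+      pool: default
+      # Hypothetical selector: only consider nodes in a single availability zone.
+      label_selectors:
+        - "topology.kubernetes.io/zone=us-west-2a"
+      condition:
+        trait: instance_type
+        operator: in
+        # Comma-separated list, as used with the in / notin operators.
+        target: "m5.4xlarge,r5.4xlarge"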
diff --git a/tox.ini b/tox.ini
@@ -48,7 +48,7 @@ commands =
 [testenv:docs]
 envdir = .tox/docs
 deps =
-    -rrequirements-doc.txt
+    -rrequirements-docs.txt
 changedir = docs
 commands =
     sphinx-build -b html -d build/doctrees source build/html