Commit 5fcc81f

Merge pull request #232719 from JnHs/jh-arck8-mc329
move topic and update wording
2 parents e4bff57 + e2a747d commit 5fcc81f

21 files changed: 59 additions, 68 deletions

.openpublishing.redirection.json

Lines changed: 5 additions & 0 deletions
@@ -22376,6 +22376,11 @@
  "redirect_url": "/azure/azure-arc/kubernetes/overview",
  "redirect_document_id": "false"
  },
+ {
+ "source_path_from_root": "/articles/azure-arc/kubernetes/tutorial-workload-management.md",
+ "redirect_url": "/azure/azure-arc/kubernetes/workload-management",
+ "redirect_document_id": "true"
+ },
  {
  "source_path": "articles/azure-cache-for-redis/redis-cache-insights-overview.md",
  "redirect_url": "/azure/azure-cache-for-redis/cache-insights-overview",

articles/azure-arc/kubernetes/conceptual-workload-management.md

Lines changed: 14 additions & 14 deletions
@@ -1,25 +1,25 @@
  ---
  title: "Workload management in a multi-cluster environment with GitOps"
  description: "This article provides a conceptual overview of the workload management in a multi-cluster environment with GitOps."
- ms.date: 03/13/2023
+ ms.date: 03/29/2023
  ms.topic: conceptual
  author: eedorenko
  ms.author: iefedore
  ---

  # Workload management in a multi-cluster environment with GitOps

- Developing modern cloud-native applications often includes building, deploying, configuring, and promoting workloads across a fleet of Kubernetes clusters. With the increasing diversity of Kubernetes clusters in the fleet, and the variety of applications and services, the process can become complex and unscalable. Enterprise organizations can be more successful in these efforts by having a well defined structure that organizes people and their activities, and by using automated tools.
+ Developing modern cloud-native applications often includes building, deploying, configuring, and promoting workloads across a group of Kubernetes clusters. With the increasing diversity of Kubernetes cluster types, and the variety of applications and services, the process can become complex and unscalable. Enterprise organizations can be more successful in these efforts by having a well defined structure that organizes people and their activities, and by using automated tools.

  This article walks you through a typical business scenario, outlining the involved personas and major challenges that organizations often face while managing cloud-native workloads in a multi-cluster environment. It also suggests an architectural pattern that can make this complex process simpler, observable, and more scalable.

  ## Scenario overview

- This article describes an organization that develops cloud-native applications. Any application needs a compute resource to work on. In the cloud-native world, this compute resource is a Kubernetes cluster. An organization may have a single cluster or, more commonly, multiple clusters. So the organization must decide which applications should work on which clusters. In other words, they must schedule the applications across clusters. The result of this decision, or scheduling, is a model of the desired state of their cluster fleet. Having that in place, they need somehow to deliver applications to the assigned clusters so that they can turn the desired state into the reality, or, in other words, reconcile it.
+ This article describes an organization that develops cloud-native applications. Any application needs a compute resource to work on. In the cloud-native world, this compute resource is a Kubernetes cluster. An organization may have a single cluster or, more commonly, multiple clusters. So the organization must decide which applications should work on which clusters. In other words, they must schedule the applications across clusters. The result of this decision, or scheduling, is a model of the desired state of the clusters in their environment. Having that in place, they need somehow to deliver applications to the assigned clusters so that they can turn the desired state into the reality, or, in other words, reconcile it.

  Every application goes through a software development lifecycle that promotes it to the production environment. For example, an application is built, deployed to Dev environment, tested and promoted to Stage environment, tested, and finally delivered to production. For a cloud-native application, the application requires and targets different Kubernetes cluster resources throughout its lifecycle. In addition, applications normally require clusters to provide some platform services, such as Prometheus and Fluentbit, and infrastructure configurations, such as networking policy.

- Depending on the application, there may be a great diversity of cluster types to which the application is deployed. The same application with different configurations could be hosted on a managed cluster in the cloud, on a connected cluster in an on-premises environment, on a fleet of clusters on semi-connected edge devices on factory lines or military drones, and on an air-gapped cluster on a starship. Another complexity is that clusters in early lifecycle stages such as Dev and QA are normally managed by the developer, while reconciliation to actual production clusters may be managed by the organization's customers. In the latter case, the developer may be responsible only for promoting and scheduling the application across different rings.
+ Depending on the application, there may be a great diversity of cluster types to which the application is deployed. The same application with different configurations could be hosted on a managed cluster in the cloud, on a connected cluster in an on-premises environment, on a group of clusters on semi-connected edge devices on factory lines or military drones, and on an air-gapped cluster on a starship. Another complexity is that clusters in early lifecycle stages such as Dev and QA are normally managed by the developer, while reconciliation to actual production clusters may be managed by the organization's customers. In the latter case, the developer may be responsible only for promoting and scheduling the application across different rings.

  ## Challenges at scale

@@ -28,7 +28,7 @@ In a small organization with a single application and only a few operations, mos
  The following capabilities are required to perform this type of workload management at scale in a multi-cluster environment:

  - Separation of concerns on scheduling and reconciling
- - Promotion of the fleet state through a chain of environments
+ - Promotion of the multi-cluster state through a chain of environments
  - Sophisticated, extensible and replaceable scheduler
  - Flexibility to use different reconcilers for different cluster types depending on their nature and connectivity

@@ -38,22 +38,22 @@ Before we describe the scenario, let's clarify which personas are involved, what

  ### Platform team

- The platform team is responsible for managing the fleet of clusters that hosts applications produced by application teams.
+ The platform team is responsible for managing the clusters that host applications produced by application teams.

  Key responsibilities of the platform team are:

  * Define staging environments (Dev, QA, UAT, Prod).
- * Define cluster types in the fleet and their distribution across environments.
+ * Define cluster types and their distribution across environments.
  * Provision new clusters.
- * Manage infrastructure configurations across the fleet.
+ * Manage infrastructure configurations across the clusters.
  * Maintain platform services used by applications.
  * Schedule applications and platform services on the clusters.

  ### Application team

  The application team is responsible for the software development lifecycle (SDLC) of their applications. They provide Kubernetes manifests that describe how to deploy the application to different targets. They're responsible for owning CI/CD pipelines that create container images and Kubernetes manifests and promote deployment artifacts across environment stages.

- Typically, the application team has no knowledge of the clusters that they are deploying to. They aren't aware of the structure of the fleet, global configurations, or tasks performed by other teams. The application team primarily understands the success of their application rollout as defined by the success of the pipeline stages.
+ Typically, the application team has no knowledge of the clusters that they are deploying to. They aren't aware of the structure of the multi-cluster environment, global configurations, or tasks performed by other teams. The application team primarily understands the success of their application rollout as defined by the success of the pipeline stages.

  Key responsibilities of the application team are:

@@ -88,7 +88,7 @@ Let's have a look at the high level solution architecture and understand its pri

  ### Control plane

- The platform team models the fleet in the control plane. It's designed to be human-oriented and easy to understand, update, and review. The control plane operates with abstractions such as Cluster Types, Environments, Workloads, Scheduling Policies, Configs and Templates. These abstractions are handled by an automated process that assigns deployment targets and configuration values to the cluster types, then saves the result to the platform GitOps repository. Although the entire fleet may consist of thousands of physical clusters, the platform repository operates at a higher level, grouping the clusters into cluster types.
+ The platform team models the multi-cluster environment in the control plane. It's designed to be human-oriented and easy to understand, update, and review. The control plane operates with abstractions such as Cluster Types, Environments, Workloads, Scheduling Policies, Configs and Templates. These abstractions are handled by an automated process that assigns deployment targets and configuration values to the cluster types, then saves the result to the platform GitOps repository. Although there may be thousands of physical clusters, the platform repository operates at a higher level, grouping the clusters into cluster types.

  The main requirement for the control plane storage is to provide a reliable and secure transaction processing functionality, rather than being hit with complex queries against a large amount of data. Various technologies may be used to store the control plane data.

@@ -129,13 +129,13 @@ Every cluster type can use a different reconciler (such as Flux, ArgoCD, Zarf, R

  ### Platform services

- Platform services are workloads (such as Prometheus, NGINX, Fluentbit, and so on) maintained by the platform team. Just like any workloads, they have their source repositories and manifests storage. The source repositories may contain pointers to external Helm charts. CI/CD pipelines pull the charts with containers and perform necessary security scans before submitting them to the manifests storage, from where they're reconciled to the clusters in the fleet.
+ Platform services are workloads (such as Prometheus, NGINX, Fluentbit, and so on) maintained by the platform team. Just like any workloads, they have their source repositories and manifests storage. The source repositories may contain pointers to external Helm charts. CI/CD pipelines pull the charts with containers and perform necessary security scans before submitting them to the manifests storage, from where they're reconciled to the clusters.

  ### Deployment Observability Hub

- Deployment Observability Hub is a central storage that is easy to query with complex queries against a large amount of data. It contains deployment data with historical information on workload versions and their deployment state across clusters in the fleet. Clusters register themselves in the storage and update their compliance status with the GitOps repositories. Clusters operate at the level of Git commits only. High-level information, such as application versions, environments, and cluster type data, is transferred to the central storage from the GitOps repositories. This high-level information gets correlated in the central storage with the commit compliance data sent from the clusters.
+ Deployment Observability Hub is a central storage that is easy to query with complex queries against a large amount of data. It contains deployment data with historical information on workload versions and their deployment state across clusters. Clusters register themselves in the storage and update their compliance status with the GitOps repositories. Clusters operate at the level of Git commits only. High-level information, such as application versions, environments, and cluster type data, is transferred to the central storage from the GitOps repositories. This high-level information gets correlated in the central storage with the commit compliance data sent from the clusters.

  ## Next steps

- * Explore a [sample implementation of workload management in a multi-cluster environment with GitOps](https://github.com/microsoft/kalypso).
- * Try our [Tutorial: Workload Management in Multi-cluster environment with GitOps](tutorial-workload-management.md) to walk through the implementation.
+ * Walk through a sample implementation to explore [workload management in a multi-cluster environment with GitOps](workload-management.md).
+ * Explore a [multi-cluster workload management sample repository](https://github.com/microsoft/kalypso).
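
Illustrative note on the "Control plane" paragraph touched by this change: the article says an automated process takes abstractions such as Cluster Types, Environments, Workloads, and Scheduling Policies and assigns deployment targets to cluster types. The following is a minimal Python sketch of that idea only. The class names, fields, label-based selector, and sample data are all assumptions made for illustration; they are not taken from the article or from the kalypso repository.

```python
from dataclasses import dataclass


@dataclass
class ClusterType:
    # Hypothetical model of a cluster type; the field names are assumptions.
    name: str
    environment: str   # e.g. "dev" or "prod"
    labels: dict       # free-form labels that policies can select on


@dataclass
class SchedulingPolicy:
    # Hypothetical policy: place a workload on every matching cluster type.
    workload: str
    environment: str
    selector: dict     # labels a cluster type must carry to match


def schedule(cluster_types, policies):
    """Return the desired state as {cluster type name: sorted workload names}."""
    assignments = {ct.name: [] for ct in cluster_types}
    for policy in policies:
        for ct in cluster_types:
            same_env = ct.environment == policy.environment
            labels_match = all(
                ct.labels.get(key) == value
                for key, value in policy.selector.items()
            )
            if same_env and labels_match:
                assignments[ct.name].append(policy.workload)
    return {name: sorted(workloads) for name, workloads in assignments.items()}


# Sample data, invented for this sketch.
cluster_types = [
    ClusterType("cloud-small", "dev", {"location": "cloud"}),
    ClusterType("edge-large", "dev", {"location": "edge"}),
    ClusterType("edge-large-prod", "prod", {"location": "edge"}),
]
policies = [
    SchedulingPolicy("app-a", "dev", {}),                    # every dev cluster type
    SchedulingPolicy("app-b", "dev", {"location": "edge"}),  # dev edge cluster types only
    SchedulingPolicy("app-a", "prod", {"location": "edge"}),
]
desired_state = schedule(cluster_types, policies)
```

The sketch works at the level of cluster types rather than individual clusters, which mirrors the article's point that the platform repository groups physical clusters into cluster types. In the described architecture the resulting mapping would be saved to the platform GitOps repository; that step is omitted here.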

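Illustrative note on the "Deployment Observability Hub" paragraph touched by this change: the article says clusters report compliance at the level of Git commits only, and that the hub correlates those reports with high-level information transferred from the GitOps repositories. A minimal Python sketch of such a correlation might look like the following. The record shapes, key names, and sample values are assumptions for illustration, not the hub's actual schema.

```python
def correlate(commit_metadata, cluster_reports):
    """Join per-cluster commit compliance reports with high-level commit metadata.

    commit_metadata: {commit id: {"workload": ..., "version": ..., "environment": ...}}
    cluster_reports: list of {"cluster": ..., "commit": ..., "compliant": bool}
    Both shapes are hypothetical.
    """
    rows = []
    for report in cluster_reports:
        metadata = commit_metadata.get(report["commit"])
        if metadata is None:
            # The hub has no high-level data for this commit yet; skip the report.
            continue
        rows.append(
            {
                "cluster": report["cluster"],
                "compliant": report["compliant"],
                **metadata,
            }
        )
    return rows


# Sample data, invented for this sketch.
commit_metadata = {
    "abc123": {"workload": "app-a", "version": "1.4.0", "environment": "dev"},
}
cluster_reports = [
    {"cluster": "edge-01", "commit": "abc123", "compliant": True},
    {"cluster": "edge-02", "commit": "abc123", "compliant": False},
    {"cluster": "edge-03", "commit": "zzz999", "compliant": True},
]
deployment_view = correlate(commit_metadata, cluster_reports)
```

With the sample data, `deployment_view` contains one row each for `edge-01` and `edge-02`, enriched with the workload, version, and environment, while the report for the unknown commit is left out.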