Commit 8803356

committed
Migrate 'gitops in appstudio 2021 arch overview'
1 parent 89a84ab commit 8803356

1 file changed

+192
-0
lines changed

@@ -0,0 +1,192 @@
1+
# GitOps Service in AppStudio Overview (2021)
2+
3+
### Written by
4+
- Jonathan West (@jgwest)
5+
- Originally written in Q4 2021/Q1 2022.
6+
7+
8+
# Introduction
9+
10+
The goal is to deliver a GitOps experience, integrated into AppStudio, consistent with the AppStudio journeys and ongoing discussions of the technical architecture (KCP et al).
11+
12+
**Users will be able to use a GitOps process to manage deployments to their environments (backed by the GitOps service)**
13+
14+
* The deployment itself will be handled by an Argo CD instance that we manage on behalf of the user.
15+
16+
**Other AppStudio services (eg Hybrid Application Console, HAC, and/or Hybrid Application Service, HAS) will create a GitOps repository for the user at the same time that their AppStudio application is created. The GitOps service will use this repository to perform deployments.**
17+
18+
* See the journey workflow for how this is created, and other documents for the specific implementation.
19+
20+
**The GitOps service will ensure that Argo CD instances are configured to deploy from the user’s GitOps repository, to the user’s target environment(s) (based on environment data from other services/KCP).**
21+
22+
**In order to scale to a large number of users, the GitOps service must manage a fleet of Argo CD instances (controller pool), and ensure each of those instances is correctly configured for the particular user’s credentials.**
23+
24+
* This item is where most of the service complexity is introduced.
25+
26+
# Terminology
27+
28+
**Managed application environment**: The user’s cluster (for example, a logical cluster via KCP workspace) containing resources (deployments, services, etc) that are being managed by Argo CD. The Argo CD control plane stack (eg the Argo CD application controller et al) will not run here; rather, Argo CD will live on Red Hat-managed infrastructure, and Argo CD’s push model will be used to manage customer/user clusters.
29+
30+
**GitOps Control Plane clusters**: The OCP cluster/environment that hosts the product frontend/backend components, and any infrastructure (databases such as PostgreSQL, monitoring such as Prometheus/Grafana). Does not host any Argo CD workloads. Not customer accessible (no Routes are publicly exposed), fully behind the Red Hat ‘firewall’.
31+
32+
* As of this writing (Sept 2021), this is being handled by the AppStudio ‘staging’ cluster, which will contain all teams’ AppStudio deployments.
33+
34+
**GitOps Engine clusters**: Clusters/environments that host the Argo CD instances that manage user workloads (with those user workloads running on their own managed environment). We say ‘GitOps Engine’ rather than ‘Argo CD’, so that we have the option of using a different GitOps-enabling technology in the future.
35+
36+
* These cluster(s) would be owned/operated by Red Hat, and not customer accessible (e.g. no public Routes)
37+
38+
**GitOps Product Backend**: Responsible for handling requests from the HAC frontend (by watching the KCP namespace or, if that is not available, via a REST API), and communicating user intent (via those requests) to the appropriate Argo CD instance environment.
39+
40+
**GitOps Cluster Agent**: A Kubernetes controller responsible for ensuring that the Argo CD instances on the GitOps Engine clusters conform to user intent (as communicated via the product UI), and likewise ensuring that the state of those resources is accurately reflected in the shared RDBMS database.
41+
42+
# Connection with Other Services
43+
44+
This section speaks broadly (details TBD), and is accurate as of this writing (Sept 29, 2021):
45+
46+
### Hybrid Application Console (HAC)
47+
48+
* HAC (or another mediating component, such as HAS) will be responsible for providing details to the GitOps service (either directly, or indirectly via KCP) including:
49+
* The user's application definition
50+
* Set of user’s environments
51+
* The user's GitOps repository URL
52+
* Deployment environment information (eg target KCP Workspace, OR Kubernetes API Proxy, depending on whether KCP is ready)
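The inputs listed above can be pictured as a small data structure. This is only a sketch: the class and field names (`ApplicationRegistration`, `DeploymentTarget`, `kcp_workspace`, `api_proxy_url`) are hypothetical and not taken from any AppStudio API, and it assumes each environment is reached either through a KCP workspace or through a Kubernetes API proxy, never both.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DeploymentTarget:
    """Where one environment is deployed (hypothetical shape)."""
    environment: str
    kcp_workspace: Optional[str] = None   # used if KCP is ready
    api_proxy_url: Optional[str] = None   # fallback: Kubernetes API proxy

    def validate(self) -> None:
        # Assumption: exactly one of the two target kinds is set.
        if (self.kcp_workspace is None) == (self.api_proxy_url is None):
            raise ValueError(
                f"{self.environment}: set exactly one of "
                "kcp_workspace or api_proxy_url"
            )


@dataclass
class ApplicationRegistration:
    """What HAC/HAS would hand to the GitOps service (hypothetical shape)."""
    application_name: str
    gitops_repo_url: str
    targets: List[DeploymentTarget] = field(default_factory=list)

    def validate(self) -> None:
        if not self.application_name or not self.gitops_repo_url:
            raise ValueError("application_name and gitops_repo_url are required")
        for target in self.targets:
            target.validate()
```

Whether this arrives as a CR in a KCP workspace or as a REST payload, the same fields would need to be present.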
53+
54+
### Build Service
55+
56+
* When a new container image is available, the build service will need to notify the appropriate AppStudio component, and that component will need to update the GitOps repository to point to the new image.
57+
* If the GitOps service is the component that is responsible for updating the GitOps repository, we will need to handle that.
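As a rough illustration of "update the GitOps repository to point to the new image", the sketch below rewrites the `image:` line of a manifest held as text. It is deliberately simplistic: a real implementation would more likely parse the YAML or rely on a tool such as kustomize, and would also need to commit and push the change. The function name `set_image` and the example image names are hypothetical.

```python
import re


def set_image(manifest: str, repository: str, new_ref: str) -> str:
    """Replace every `image:` line that refers to `repository`.

    `repository` is the image name without tag or digest; `new_ref` is
    the full reference to write in its place.
    """
    pattern = re.compile(
        r"^([ \t]*(?:-[ \t]+)?image:[ \t]*)"   # indentation and key
        + re.escape(repository)
        + r"(?:[:@]\S+)?[ \t]*$",              # optional :tag or @digest
        re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new_ref, manifest)


if __name__ == "__main__":
    before = (
        "containers:\n"
        "  - name: app\n"
        "    image: quay.io/example/app:1.0\n"
        "  - name: sidecar\n"
        "    image: quay.io/example/sidecar:2.3\n"
    )
    print(set_image(before, "quay.io/example/app", "quay.io/example/app:1.1"))
```

Only the container whose image matches the given repository is touched; other containers in the same manifest are left unchanged.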
58+
59+
### Hybrid Application Service (HAS)
60+
61+
* Will be responsible for generating the GitOps repository used by the GitOps service.
62+
* Will be responsible for modifying the GitOps repository, in order to update:
63+
* K8s resources, such as Deployments (increasing memory/CPU limits, updating environment variables, etc)
64+
* Each environment targeted
65+
66+
### Service Provider Integration (SPI)
67+
68+
* Git credentials (and cluster credentials?) will be provided by this service, and the GitOps service will need to wrangle these in a form that can be used by Argo CD.
69+
70+
### KCP (presuming it is available in time for AppStudio MVP)
71+
72+
* Argo CD to target KCP workspaces (KCP workspaces will be added as remote clusters in Argo CD)
73+
* GitOps backend component to watch the KCP workspace for configuration CRs (for example, AppStudio Application CRs, if/when they exist)
74+
75+
# API {#api}
76+
77+
## KCP scenario (or K8s-cluster-proxy scenario):
78+
79+
(*If KCP is fully available at the time of AppStudio MVP*)
80+
81+
82+
## Non-KCP/non-cluster-proxy scenario: REST backend Endpoints:
83+
84+
(*If KCP is not available at the time of AppStudio MVP, nor is our replacement for it, we fall back to REST endpoints.*)
85+
86+
In the non-KCP, non-cluster-proxy scenario, it will be the responsibility of the other services within AppStudio to inform the GitOps service of Applications and Managed Environments via a REST API.
87+
88+
Endpoints are essentially just a light wrapper over the RDBMS. The details listed here are non-exhaustive, but should provide a good guideline for the shape of the API.
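To make "a light wrapper over the RDBMS" concrete, here is a minimal sketch of two handlers for a Managed Environment resource. Everything in it is an assumption for illustration: the table and column names, the handler names, and the status codes are hypothetical, the handlers are plain functions rather than a real HTTP server, and an in-memory SQLite database stands in for PostgreSQL.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database (stand-in for PostgreSQL)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE managed_environment ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL UNIQUE,"
        " api_url TEXT NOT NULL)"
    )
    return db


def post_managed_environment(db: sqlite3.Connection, body: str):
    """Handle a create request: validate input, then insert one row."""
    data = json.loads(body)
    if not isinstance(data, dict):
        return 400, {"error": "JSON object expected"}
    name, api_url = data.get("name"), data.get("api_url")
    if not name or not api_url:
        return 400, {"error": "name and api_url are required"}
    try:
        cur = db.execute(
            "INSERT INTO managed_environment (name, api_url) VALUES (?, ?)",
            (name, api_url),  # parameterized, never string-formatted
        )
    except sqlite3.IntegrityError:
        return 409, {"error": "environment already exists"}
    db.commit()
    return 201, {"id": cur.lastrowid, "name": name, "api_url": api_url}


def get_managed_environment(db: sqlite3.Connection, env_id: int):
    """Handle a read request: select one row by primary key."""
    row = db.execute(
        "SELECT id, name, api_url FROM managed_environment WHERE id = ?",
        (env_id,),
    ).fetchone()
    if row is None:
        return 404, {"error": "not found"}
    return 200, {"id": row[0], "name": row[1], "api_url": row[2]}
```

Each handler does little more than validate its input and translate it to a single SQL statement, which is the sense in which the endpoints are a thin layer.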
89+
90+
# Requirements
91+
92+
The goal here is to define a set of requirements that may be realistically delivered, while avoiding areas that would disproportionately drive up the complexity of delivering this iteration. This includes:
93+
94+
- No private clusters (must be publicly internet accessible)
95+
- We will only support the AppStudio/RH-created GitOps repository, which is created on behalf of the user during the application creation process
96+
97+
**Requirement: User/customer Argo CD instances to be hosted on a single cluster, with each Argo CD instance getting its own namespace.**
98+
99+
* For example, multiple namespaces with each namespace containing a single Argo CD instance
100+
* *Forward looking*: Expand to a “fleet” of clusters.
101+
102+
**Requirement: Argo CD is the tool responsible for managing user clusters/environments**
103+
104+
* We're not rolling our own GitOps tool (for now)
105+
* We're not exploring other GitOps tools at this time.
106+
* But: the Argo CD Web UI/CLI/API will not be exposed to the user.
107+
108+
**Requirement: GitOpsDeployment CR as a single source of truth for the Argo CD Application Resource**
109+
110+
* The Argo CD Application CRs will be based on Application definitions coming from other AppStudio services. Thus we only need to handle one-way synchronization between AppStudio Applications (and GitOpsDeployment CR), and Argo CD Applications.
111+
* Or, said another way, the Argo CD Application CR will always be based on the AppStudio App definition (and NOT vice versa)
112+
* See [Architecture page](./gitops-service-internal-architecture-appstudio) re: resource.
113+
* *Corollary*: PostgreSQL database will be used as an eventually consistent mirror for what Applications exist (in Argo CD), and what their sync status and health are.
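The one-way synchronization described above can be sketched as a pure function from a GitOpsDeployment to an Argo CD Application, plus a comparison that only ever overwrites the Argo CD side. The `GitOpsDeployment` fields shown here are hypothetical (the actual CR is defined on the Architecture page); the Application fields (`spec.source.repoURL`, `path`, `targetRevision`, `spec.destination.server`, `namespace`, `spec.project`) follow the Argo CD `argoproj.io/v1alpha1` Application resource.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GitOpsDeployment:
    """Hypothetical, simplified view of the GitOpsDeployment CR."""
    name: str
    repo_url: str
    path: str
    target_revision: str
    destination_server: str
    destination_namespace: str


def to_argocd_application(dep: GitOpsDeployment, argocd_namespace: str) -> dict:
    """Derive the desired Argo CD Application from the GitOpsDeployment."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": dep.name, "namespace": argocd_namespace},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": dep.repo_url,
                "path": dep.path,
                "targetRevision": dep.target_revision,
            },
            "destination": {
                "server": dep.destination_server,
                "namespace": dep.destination_namespace,
            },
        },
    }


def reconcile_action(desired: dict, live: Optional[dict]) -> str:
    """Decide what to do with the live Application.

    The live spec is never copied back into the GitOpsDeployment; any
    drift is resolved by overwriting it with the desired spec.
    """
    if live is None:
        return "create"
    if live.get("spec") != desired["spec"]:
        return "update"
    return "noop"
```

Because the function never reads state back from Argo CD into the AppStudio definition, manual edits to the Application would simply be overwritten on the next reconcile.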
114+
115+
**Requirement: Product backend (via the RDBMS) is the single source of truth for Managed Application Environment, GitOps Engine instance, Operation resources**
116+
117+
* See [Architecture page](./gitops-service-internal-architecture-appstudio) re: resources, operations
118+
* Managed Application Environment resource holds the credentials for a user’s cluster that they wish to manage with Argo CD
119+
* GitOps Engine instance (Argo CD Instance) resource is a cluster/namespace where Argo CD is installed
120+
* Operation is a synchronization and task tracking mechanism; see the Architecture page.
121+
* *Corollary*: The product frontend/backend will update the RDBMS representation of these resources, then inform the cluster agent of the change via an Operation.
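A minimal sketch of that corollary follows: the backend updates the database row and records an Operation in the same transaction, and the cluster agent later picks up waiting Operations and acts on the current database state. The table names, column names, and state values (`Waiting`, `Completed`) are hypothetical, an in-memory SQLite database stands in for the shared RDBMS, and the `apply` callback stands in for whatever the agent does against Argo CD.

```python
import sqlite3

SCHEMA = """
CREATE TABLE managed_environment (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, api_url TEXT NOT NULL);
CREATE TABLE operation (
    id INTEGER PRIMARY KEY, resource_type TEXT NOT NULL,
    resource_id INTEGER NOT NULL, state TEXT NOT NULL);
"""


def open_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def backend_update_environment(db, env_id: int, api_url: str) -> int:
    """Backend side: change the row, then record an Operation for it."""
    with db:  # both statements commit together, or not at all
        db.execute(
            "UPDATE managed_environment SET api_url = ? WHERE id = ?",
            (api_url, env_id),
        )
        cur = db.execute(
            "INSERT INTO operation (resource_type, resource_id, state) "
            "VALUES ('ManagedEnvironment', ?, 'Waiting')",
            (env_id,),
        )
    return cur.lastrowid


def agent_process_operations(db, apply) -> list:
    """Agent side: act on each waiting Operation using the database state."""
    waiting = db.execute(
        "SELECT id, resource_id FROM operation "
        "WHERE state = 'Waiting' ORDER BY id"
    ).fetchall()
    completed = []
    for op_id, resource_id in waiting:
        env = db.execute(
            "SELECT name, api_url FROM managed_environment WHERE id = ?",
            (resource_id,),
        ).fetchone()
        apply(env)  # e.g. reconfigure the Argo CD cluster secret
        with db:
            db.execute(
                "UPDATE operation SET state = 'Completed' WHERE id = ?",
                (op_id,),
            )
        completed.append(op_id)
    return completed
```

Note that the Operation carries only a reference to the resource; the agent always reads the resource itself from the database, which keeps the RDBMS as the single source of truth.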
122+
123+
**Requirement: Put security guidelines in place, and enforce them via the PR review process**
124+
125+
* Security (sanitizing inputs, restricted networks/containers, static linting, unit tests, etc) is not an afterthought; it should be baked in from the beginning.
126+
127+
**Non-requirements (for this iteration):**
128+
129+
* No support for private or disconnected clusters: all clusters must be accessible by the GitOps product backend (for example, they are on the public internet or are RH internal)
130+
* No support for Argo CD CLI/UI
131+
* No Helm repository support (users may still create Argo CD Applications that reference charts, but we won’t track these within the product UI)
132+
* No API-level ApplicationSet support
133+
134+
135+
136+
## Infrastructure/Test/Docs Requirements
137+
138+
Implement, and gather feedback from, functional requirements. This will help us decide if we are heading in the right direction, and allow us to course correct.
139+
140+
Focus on non-functional/process requirements.
141+
142+
**Requirement: Build E2E test infrastructure, and add E2E tests**
143+
144+
* Add tests that simulate managed workloads
145+
146+
**Requirement: Onboard onto ROMS process, and/or the delivery process agreed to for AppStudio.**
147+
148+
**Requirement: Process/tools for partial rollout, staging, for all components (especially Argo CD)**
149+
150+
**Requirement: Process/tools to deliver security updates for managed Argo CD clusters**
151+
152+
**Requirement: Prometheus/Grafana monitoring of all components**
153+
154+
* I expect this will be driven at the AppStudio level, rather than by us.
155+
156+
**Requirement: Backups/Disaster Recovery process/tools**
157+
158+
* I expect this will be driven at the AppStudio level, rather than by us.
159+
160+
**Requirement: Process/tools to handle database version migration, with eg flyway/liquibase**
161+
162+
**Requirement: Initial user-focused product documentation**
163+
164+
## Future Requirements
165+
166+
This list is a grab bag for any subsequent iterations:
167+
168+
**Argo CD instance rebalancing: What happens if one user overwhelms the capacity of a single Argo CD instance? We would need to create a new Argo CD instance and rebalance (shard) the user’s Applications/Clusters across the instances.**
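As a very rough sketch of what sharding could look like, the function below splits one user's Applications across as many instances as a per-instance limit requires. The limit, the function name, and the `argocd-N` instance names are hypothetical; nothing here reflects a decided design.

```python
from typing import Dict, Iterable, List


def shard_applications(
    apps: Iterable[str], max_per_instance: int
) -> Dict[str, List[str]]:
    """Assign applications to instances, at most max_per_instance each."""
    if max_per_instance < 1:
        raise ValueError("max_per_instance must be at least 1")
    ordered = sorted(apps)  # deterministic assignment
    return {
        f"argocd-{start // max_per_instance}":
            ordered[start:start + max_per_instance]
        for start in range(0, len(ordered), max_per_instance)
    }
```

This naive chunking reshuffles many Applications whenever the set changes; a real rebalancer would likely aim to move as few Applications as possible, for example by only relocating them off an overloaded instance.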
169+
170+
**Kafka message queue for message passing between GitOps backend service and GitOps cluster agent, rather than direct K8s API connection (should be more secure/efficient versus other mechanisms)**
171+
172+
**Redis to store Application status/health (should be more efficient versus using RDBMS for this)**
173+
174+
# Technologies
175+
176+
The technologies I’ve chosen here are modern, mature, and likely uncontroversial due to their existing prevalence within Red Hat/OpenShift.
177+
178+
**Product backend:**
179+
180+
* Simple REST-based Go backend to serve HTTP requests from the frontend, and handle other required cluster/environment management tasks
181+
* Persists to and queries PostgreSQL database (via eg bun, go-pg, many other options)
182+
183+
**Cluster Backend \- Kubernetes controller (Go operator):**
184+
185+
* A Kubernetes controller (likely implemented with the operator framework)
186+
* Persists to and queries PostgreSQL database (via eg bun, go-pg, many other options)
187+
188+
**Database Infrastructure:**
189+
190+
* PostgreSQL, ideally a managed service (eg Amazon RDS), which is consistent with what other Red Hat teams have done when they need a managed database (as discussed during ROMS Process presentation during F2F)
191+
* *Future considerations*: To improve scaling \- Kafka for passing messages between components, and Redis for caching resource state for the UI and/or Application health/status (instead of using PostgreSQL for this purpose)
192+
