diff --git a/.gitbook.yaml b/.gitbook.yaml index c25b71624..61ee45d0f 100644 --- a/.gitbook.yaml +++ b/.gitbook.yaml @@ -5,7 +5,6 @@ redirects: usage/progressive-delivery: tutorials/istio-progressive-delivery.md usage/ab-testing: tutorials/istio-ab-testing.md usage/blue-green: tutorials/kubernetes-blue-green.md - usage/appmesh-progressive-delivery: tutorials/appmesh-progressive-delivery.md usage/linkerd-progressive-delivery: tutorials/linkerd-progressive-delivery.md usage/contour-progressive-delivery: tutorials/contour-progressive-delivery.md usage/gloo-progressive-delivery: tutorials/gloo-progressive-delivery.md @@ -13,7 +12,6 @@ redirects: usage/skipper-progressive-delivery: tutorials/skipper-progressive-delivery.md usage/crossover-progressive-delivery: tutorials/crossover-progressive-delivery.md usage/traefik-progressive-delivery: tutorials/traefik-progressive-delivery.md - usage/osm-progressive-delivery: tutorials/osm-progressive-delivery.md usage/kuma-progressive-delivery: tutorials/kuma-progressive-delivery.md usage/gatewayapi-progressive-delivery: tutorials/gatewayapi-progressive-delivery.md usage/apisix-progressive-delivery: tutorials/apisix-progressive-delivery.md diff --git a/README.md b/README.md index f7bf1812a..0cdaf8911 100644 --- a/README.md +++ b/README.md @@ -1,4 +1,4 @@ - # flaggerreadme + # Flagger [![release](https://img.shields.io/github/release/fluxcd/flagger/all.svg)](https://github.com/fluxcd/flagger/releases) [![CII Best Practices](https://bestpractices.coreinfrastructure.org/projects/4783/badge)](https://bestpractices.coreinfrastructure.org/projects/4783) @@ -16,39 +16,26 @@ by gradually shifting traffic to the new version while measuring metrics and run Flagger implements several deployment strategies (Canary releases, A/B testing, Blue/Green mirroring) and integrates with various Kubernetes ingress controllers, service mesh, and monitoring solutions. 
-Flagger is a [Cloud Native Computing Foundation](https://cncf.io/) project +Flagger is a [Cloud Native Computing Foundation](https://cncf.io/) graduated project and part of the [Flux](https://fluxcd.io) family of GitOps tools. ### Documentation -Flagger documentation can be found at [fluxcd.io/flagger](https://fluxcd.io/flagger/). +The Flagger documentation can be found at [docs.flagger.app](https://docs.flagger.app/main). * Install - * [Flagger install on Kubernetes](https://fluxcd.io/flagger/install/flagger-install-on-kubernetes) + * [Flagger Install with Flux](https://docs.flagger.app/main/install/flagger-install-with-flux) * Usage - * [How it works](https://fluxcd.io/flagger/usage/how-it-works) - * [Deployment strategies](https://fluxcd.io/flagger/usage/deployment-strategies) - * [Metrics analysis](https://fluxcd.io/flagger/usage/metrics) - * [Webhooks](https://fluxcd.io/flagger/usage/webhooks) - * [Alerting](https://fluxcd.io/flagger/usage/alerting) - * [Monitoring](https://fluxcd.io/flagger/usage/monitoring) -* Tutorials - * [App Mesh](https://fluxcd.io/flagger/tutorials/appmesh-progressive-delivery) - * [Istio](https://fluxcd.io/flagger/tutorials/istio-progressive-delivery) - * [Linkerd](https://fluxcd.io/flagger/tutorials/linkerd-progressive-delivery) - * [Open Service Mesh (OSM)](https://dfluxcd.io/flagger/tutorials/osm-progressive-delivery) - * [Kuma Service Mesh](https://fluxcd.io/flagger/tutorials/kuma-progressive-delivery) - * [Contour](https://fluxcd.io/flagger/tutorials/contour-progressive-delivery) - * [Gloo](https://fluxcd.io/flagger/tutorials/gloo-progressive-delivery) - * [NGINX Ingress](https://fluxcd.io/flagger/tutorials/nginx-progressive-delivery) - * [Skipper](https://fluxcd.io/flagger/tutorials/skipper-progressive-delivery) - * [Traefik](https://fluxcd.io/flagger/tutorials/traefik-progressive-delivery) - * [Gateway API](https://fluxcd.io/flagger/tutorials/gatewayapi-progressive-delivery/) - * [Kubernetes 
Blue/Green](https://fluxcd.io/flagger/tutorials/kubernetes-blue-green) + * [How it works](https://docs.flagger.app/main/usage/how-it-works) + * [Deployment strategies](https://docs.flagger.app/main/usage/deployment-strategies) + * [Metrics analysis](https://docs.flagger.app/main/usage/metrics) + * [Webhooks](https://docs.flagger.app/main/usage/webhooks) + * [Alerting](https://docs.flagger.app/main/usage/alerting) + * [Monitoring](https://docs.flagger.app/main/usage/monitoring) ### Adopters -**Our list of production users has moved to **. +The list of production users can be found at [fluxcd.io/adopters/#flagger](https://fluxcd.io/adopters/#flagger). If you are using Flagger, please [submit a PR to add your organization](https://github.com/fluxcd/website/blob/main/data/adopters/2-flagger.yaml) to the list! @@ -72,8 +59,8 @@ metadata: namespace: test spec: # service mesh provider (optional) - # can be: kubernetes, istio, linkerd, appmesh, nginx, skipper, contour, gloo, supergloo, traefik, osm - # for SMI TrafficSplit can be: smi:v1alpha1, smi:v1alpha2, smi:v1alpha3 + # can be: kubernetes, istio, linkerd, kuma, knative, nginx, contour, gloo, traefik, skipper + # for Gateway API implementations: gatewayapi:v1 and gatewayapi:v1beta1 provider: istio # deployment reference targetRef: @@ -178,23 +165,23 @@ spec: name: on-call-msteams ``` -For more details on how the canary analysis and promotion works please [read the docs](https://fluxcd.io/flagger/usage/how-it-works). +For more details on how the canary analysis and promotion works please [read the docs](https://docs.flagger.app/usage/how-it-works). 
### Features **Service Mesh** -| Feature | App Mesh | Istio | Linkerd | Kuma | OSM | Knative | Kubernetes CNI | -|--------------------------------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------| -| Canary deployments (weighted traffic) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | -| A/B testing (headers and cookies routing) | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | -| Blue/Green deployments (traffic switch) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | -| Blue/Green deployments (traffic mirroring) | :heavy_minus_sign: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | -| Webhooks (acceptance/load testing) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | -| Manual gating (approve/pause/resume) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | -| Request success rate check (L7 metric) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | -| Request duration check (L7 metric) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | -| Custom metric checks | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | 
:heavy_check_mark: | +| Feature | Istio | Linkerd | Kuma | Knative | Kubernetes CNI | +|--------------------------------------------|--------------------|--------------------|--------------------|--------------------|--------------------| +| Canary deployments (weighted traffic) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | +| A/B testing (headers and cookies routing) | :heavy_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | +| Blue/Green deployments (traffic switch) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | +| Blue/Green deployments (traffic mirroring) | :heavy_check_mark: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | :heavy_minus_sign: | +| Webhooks (acceptance/load testing) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | +| Manual gating (approve/pause/resume) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | +| Request success rate check (L7 metric) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | +| Request duration check (L7 metric) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_minus_sign: | :heavy_check_mark: | +| Custom metric checks | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | **Ingress** @@ -211,29 +198,26 @@ For more details on how the canary analysis and promotion works please [read the **Networking Interface** -| Feature | Gateway API | SMI | -|-----------------------------------------------|--------------------|--------------------| -| Canary deployments (weighted traffic) | :heavy_check_mark: | :heavy_check_mark: | -| A/B testing (headers and cookies routing) | :heavy_check_mark: | :heavy_minus_sign: | 
-| Blue/Green deployments (traffic switch) | :heavy_check_mark: | :heavy_check_mark: | -| Blue/Green deployments (traffic mirrroring) | :heavy_minus_sign: | :heavy_minus_sign: | -| Webhooks (acceptance/load testing) | :heavy_check_mark: | :heavy_check_mark: | -| Manual gating (approve/pause/resume) | :heavy_check_mark: | :heavy_check_mark: | -| Request success rate check (L7 metric) | :heavy_minus_sign: | :heavy_minus_sign: | -| Request duration check (L7 metric) | :heavy_minus_sign: | :heavy_minus_sign: | -| Custom metric checks | :heavy_check_mark: | :heavy_check_mark: | - -For all [Gateway API](https://gateway-api.sigs.k8s.io/) implementations like -[Contour](https://projectcontour.io/guides/gateway-api/) or -[Istio](https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/) -and [SMI](https://smi-spec.io) compatible service mesh solutions like -[Nginx Service Mesh](https://docs.nginx.com/nginx-service-mesh/), -[Prometheus MetricTemplates](https://docs.flagger.app/usage/metrics#prometheus) +| Feature | Gateway API | SMI | +|--------------------------------------------|--------------------|--------------------| +| Canary deployments (weighted traffic) | :heavy_check_mark: | :heavy_check_mark: | +| Canary deployments with session affinity | :heavy_check_mark: | :heavy_minus_sign: | +| A/B testing (headers and cookies routing) | :heavy_check_mark: | :heavy_minus_sign: | +| Blue/Green deployments (traffic switch) | :heavy_check_mark: | :heavy_check_mark: | +| Blue/Green deployments (traffic mirroring) | :heavy_minus_sign: | :heavy_minus_sign: | +| Webhooks (acceptance/load testing) | :heavy_check_mark: | :heavy_check_mark: | +| Manual gating (approve/pause/resume) | :heavy_check_mark: | :heavy_check_mark: | +| Request success rate check (L7 metric) | :heavy_minus_sign: | :heavy_minus_sign: | +| Request duration check (L7 metric) | :heavy_minus_sign: | :heavy_minus_sign: | +| Custom metric checks | :heavy_check_mark: | :heavy_check_mark: | + +For all 
the [Gateway API](https://gateway-api.sigs.k8s.io/) compatible ingress controllers and service meshes, +the [Prometheus MetricTemplates](https://docs.flagger.app/usage/metrics#prometheus) can be used to implement the request success rate and request duration checks. ### Roadmap -#### [GitOps Toolkit](https://github.com/fluxcd/flux2) compatibility +#### [GitOps Toolkit](https://fluxcd.io/flux/components/) compatibility - Migrate Flagger to Kubernetes controller-runtime and [kubebuilder](https://github.com/kubernetes-sigs/kubebuilder) - Make the Canary status compatible with [kstatus](https://github.com/kubernetes-sigs/cli-utils) @@ -242,7 +226,7 @@ can be used to implement the request success rate and request duration checks. #### Integrations -- Add support for ingress controllers like HAProxy, ALB, and Apache APISIX +- Migrate Linkerd, Kuma and other service mesh integrations to Gateway API ### Contributing diff --git a/artifacts/examples/appmesh-abtest.yaml b/artifacts/examples/appmesh-abtest.yaml deleted file mode 100644 index 726fcab64..000000000 --- a/artifacts/examples/appmesh-abtest.yaml +++ /dev/null @@ -1,62 +0,0 @@ -apiVersion: flagger.app/v1beta1 -kind: Canary -metadata: - name: podinfo - namespace: test -spec: - provider: appmesh - progressDeadlineSeconds: 600 - targetRef: - apiVersion: apps/v1 - kind: Deployment - name: podinfo - autoscalerRef: - apiVersion: autoscaling/v2 - kind: HorizontalPodAutoscaler - name: podinfo - service: - port: 80 - targetPort: 9898 - meshName: global - retries: - attempts: 3 - perTryTimeout: 5s - retryOn: "gateway-error,client-error,stream-error" - timeout: 35s - match: - - uri: - prefix: / - rewrite: - uri: / - analysis: - interval: 15s - threshold: 10 - iterations: 10 - match: - - headers: - x-canary: - exact: "insider" - metrics: - - name: request-success-rate - thresholdRange: - min: 99 - interval: 1m - - name: request-duration - thresholdRange: - max: 500 - interval: 30s - webhooks: - - name: conformance-test - type: 
pre-rollout - url: http://flagger-loadtester.test/ - timeout: 15s - metadata: - type: "bash" - cmd: "curl -sd 'test' http://podinfo-canary.test/token | grep token" - - name: load-test - type: rollout - url: http://flagger-loadtester.test/ - timeout: 5s - metadata: - type: cmd - cmd: "hey -z 1m -q 10 -c 2 -H 'X-Canary: insider' http://podinfo-canary.test/" diff --git a/artifacts/examples/appmesh-canary.yaml b/artifacts/examples/appmesh-canary.yaml deleted file mode 100644 index 16741963d..000000000 --- a/artifacts/examples/appmesh-canary.yaml +++ /dev/null @@ -1,59 +0,0 @@ -apiVersion: flagger.app/v1beta1 -kind: Canary -metadata: - name: podinfo - namespace: test -spec: - provider: appmesh - progressDeadlineSeconds: 600 - targetRef: - apiVersion: apps/v1 - kind: Deployment - name: podinfo - autoscalerRef: - apiVersion: autoscaling/v2 - kind: HorizontalPodAutoscaler - name: podinfo - service: - port: 80 - targetPort: http - meshName: global - retries: - attempts: 3 - perTryTimeout: 5s - retryOn: "gateway-error,client-error,stream-error" - timeout: 35s - match: - - uri: - prefix: / - rewrite: - uri: / - analysis: - interval: 15s - threshold: 10 - maxWeight: 50 - stepWeight: 5 - metrics: - - name: request-success-rate - thresholdRange: - min: 99 - interval: 1m - - name: request-duration - thresholdRange: - max: 500 - interval: 30s - webhooks: - - name: conformance-test - type: pre-rollout - url: http://flagger-loadtester.test/ - timeout: 15s - metadata: - type: "bash" - cmd: "curl -sd 'test' http://podinfo-canary.test/token | grep token" - - name: load-test - type: rollout - url: http://flagger-loadtester.test/ - timeout: 5s - metadata: - type: cmd - cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test/" diff --git a/artifacts/examples/osm-canary-steps.yaml b/artifacts/examples/osm-canary-steps.yaml deleted file mode 100644 index 91bccc554..000000000 --- a/artifacts/examples/osm-canary-steps.yaml +++ /dev/null @@ -1,42 +0,0 @@ -apiVersion: flagger.app/v1beta1 -kind: 
Canary -metadata: - name: podinfo - namespace: test -spec: - provider: osm - targetRef: - apiVersion: apps/v1 - kind: Deployment - name: podinfo - progressDeadlineSeconds: 600 - service: - port: 9898 - targetPort: 9898 - analysis: - interval: 15s - threshold: 10 - stepWeights: [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55] - metrics: - - name: request-success-rate - thresholdRange: - min: 99 - interval: 1m - - name: request-duration - thresholdRange: - max: 500 - interval: 30s - webhooks: - - name: acceptance-test - type: pre-rollout - url: http://flagger-loadtester.test/ - timeout: 15s - metadata: - type: bash - cmd: "curl -sd 'test' http://podinfo-canary.test:9898/token | grep token" - - name: load-test - type: rollout - url: http://flagger-loadtester.test/ - timeout: 5s - metadata: - cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/" diff --git a/artifacts/examples/osm-canary.yaml b/artifacts/examples/osm-canary.yaml deleted file mode 100644 index 78208d8ec..000000000 --- a/artifacts/examples/osm-canary.yaml +++ /dev/null @@ -1,43 +0,0 @@ -apiVersion: flagger.app/v1beta1 -kind: Canary -metadata: - name: podinfo - namespace: test -spec: - provider: osm - targetRef: - apiVersion: apps/v1 - kind: Deployment - name: podinfo - progressDeadlineSeconds: 600 - service: - port: 9898 - targetPort: 9898 - analysis: - interval: 15s - threshold: 10 - maxWeight: 50 - stepWeight: 5 - metrics: - - name: request-success-rate - thresholdRange: - min: 99 - interval: 1m - - name: request-duration - thresholdRange: - max: 500 - interval: 30s - webhooks: - - name: acceptance-test - type: pre-rollout - url: http://flagger-loadtester.test/ - timeout: 15s - metadata: - type: bash - cmd: "curl -sd 'test' http://podinfo-canary.test:9898/token | grep token" - - name: load-test - type: rollout - url: http://flagger-loadtester.test/ - timeout: 5s - metadata: - cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/" diff --git a/charts/flagger/Chart.yaml 
b/charts/flagger/Chart.yaml index fbc8198ea..44c28791c 100644 --- a/charts/flagger/Chart.yaml +++ b/charts/flagger/Chart.yaml @@ -19,7 +19,6 @@ keywords: - appmesh - linkerd - kuma - - osm - smi - gloo - contour diff --git a/charts/flagger/README.md b/charts/flagger/README.md index c10fac924..349fa7f77 100644 --- a/charts/flagger/README.md +++ b/charts/flagger/README.md @@ -59,15 +59,6 @@ $ helm upgrade -i flagger flagger/flagger \ ``` -To install Flagger for **Open Service Mesh** (requires OSM to have been installed with Prometheus): - -```console -$ helm upgrade -i flagger flagger/flagger \ - --namespace=osm-system \ - --set meshProvider=osm \ - --set metricsServer=http://osm-prometheus.osm-system.svc:7070 -``` - To install Flagger for **Kuma Service Mesh** (requires Kuma to have been installed with Prometheus): ```console diff --git a/charts/flagger/values.yaml b/charts/flagger/values.yaml index 5cd966a0d..39847b7a8 100644 --- a/charts/flagger/values.yaml +++ b/charts/flagger/values.yaml @@ -32,7 +32,7 @@ serviceMonitor: # Set labels for the ServiceMonitor, use this to define your scrape label for Prometheus Operator # labels: -# accepted values are kubernetes, istio, linkerd, appmesh, contour, nginx, gloo, skipper, traefik, apisix, osm +# accepted values are kubernetes, istio, linkerd, appmesh, contour, nginx, gloo, skipper, traefik, apisix meshProvider: "" # single namespace restriction diff --git a/charts/loadtester/Chart.yaml b/charts/loadtester/Chart.yaml index 67e215b33..eb8ce094d 100644 --- a/charts/loadtester/Chart.yaml +++ b/charts/loadtester/Chart.yaml @@ -19,7 +19,6 @@ keywords: - appmesh - linkerd - gloo - - osm - smi - gitops - load testing diff --git a/docs/gitbook/README.md b/docs/gitbook/README.md index 1a4b7b09b..988a92380 100644 --- a/docs/gitbook/README.md +++ b/docs/gitbook/README.md @@ -10,8 +10,7 @@ version in production by gradually shifting traffic to the new version while mea and running conformance tests. 
Flagger implements several deployment strategies (Canary releases, A/B testing, Blue/Green mirroring) -using a service mesh (App Mesh, Istio, Linkerd, Kuma, Open Service Mesh) -or an ingress controller (Contour, Gloo, NGINX, Skipper, Traefik, APISIX) for traffic routing. +using a service mesh or an ingress controller for traffic routing. For release analysis, Flagger can query Prometheus, InfluxDB, Datadog, New Relic, CloudWatch, Stackdriver or Graphite and for alerting it uses Slack, MS Teams, Discord and Rocket. @@ -19,26 +18,23 @@ or Graphite and for alerting it uses Slack, MS Teams, Discord and Rocket. Flagger can be configured with Kubernetes custom resources and is compatible with any CI/CD solutions made for Kubernetes. Since Flagger is declarative and reacts to Kubernetes events, -it can be used in **GitOps** pipelines together with tools like [Flux](install/flagger-install-with-flux.md), -JenkinsX, Carvel, Argo, etc. +it can be used in **GitOps** pipelines together with tools like [Flux CD](install/flagger-install-with-flux.md). -Flagger is a [Cloud Native Computing Foundation](https://cncf.io/) project +Flagger is a [Cloud Native Computing Foundation](https://cncf.io/) graduated project and part of [Flux](https://fluxcd.io) family of GitOps tools. ## Getting started To get started with Flagger, choose one of the supported routing providers and -[install](install/flagger-install-on-kubernetes.md) Flagger with Helm or Kustomize. +[install](install/flagger-install-with-flux.md) Flagger with Flux CD. 
After installing Flagger, you can follow one of these tutorials to get started: **Service mesh tutorials** +* [Gateway API](tutorials/gatewayapi-progressive-delivery.md) * [Istio](tutorials/istio-progressive-delivery.md) * [Linkerd](tutorials/linkerd-progressive-delivery.md) -* [AWS App Mesh](tutorials/appmesh-progressive-delivery.md) -* [AWS App Mesh: Canary Deployment Using Flagger](https://www.eksworkshop.com/advanced/340_appmesh_flagger/) -* [Open Service Mesh](tutorials/osm-progressive-delivery.md) * [Kuma](tutorials/kuma-progressive-delivery.md) **Ingress controller tutorials** @@ -50,11 +46,5 @@ After installing Flagger, you can follow one of these tutorials to get started: * [Traefik](tutorials/traefik-progressive-delivery.md) * [Apache APISIX](tutorials/apisix-progressive-delivery.md) -**Hands-on GitOps workshops** - -* [Istio](https://github.com/stefanprodan/gitops-istio) -* [Linkerd](https://helm.workshop.flagger.dev) -* [AWS App Mesh](https://eks.handson.flagger.dev) - The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our [Trademark Usage page](https://www.linuxfoundation.org/legal/trademark-usage). 
diff --git a/docs/gitbook/SUMMARY.md b/docs/gitbook/SUMMARY.md index 666f824c2..2bb43b6d7 100644 --- a/docs/gitbook/SUMMARY.md +++ b/docs/gitbook/SUMMARY.md @@ -7,9 +7,6 @@ * [Flagger Install on Kubernetes](install/flagger-install-on-kubernetes.md) * [Flagger Install with Flux](install/flagger-install-with-flux.md) -* [Flagger Install on GKE Istio](install/flagger-install-on-google-cloud.md) -* [Flagger Install on EKS App Mesh](install/flagger-install-on-eks-appmesh.md) -* [Flagger Install on Alibaba ServiceMesh](install/flagger-install-on-alibaba-servicemesh.md) ## Usage @@ -22,20 +19,18 @@ ## Tutorials +* [Gateway API Canary Deployments](tutorials/gatewayapi-progressive-delivery.md) * [Istio Canary Deployments](tutorials/istio-progressive-delivery.md) * [Istio A/B Testing](tutorials/istio-ab-testing.md) * [Linkerd Canary Deployments](tutorials/linkerd-progressive-delivery.md) -* [App Mesh Canary Deployments](tutorials/appmesh-progressive-delivery.md) +* [Kuma Canary Deployments](tutorials/kuma-progressive-delivery.md) +* [Knative Canary Deployments](tutorials/knative-progressive-delivery.md) * [Contour Canary Deployments](tutorials/contour-progressive-delivery.md) * [Gloo Canary Deployments](tutorials/gloo-progressive-delivery.md) * [NGINX Canary Deployments](tutorials/nginx-progressive-delivery.md) * [Skipper Canary Deployments](tutorials/skipper-progressive-delivery.md) * [Traefik Canary Deployments](tutorials/traefik-progressive-delivery.md) * [Apache APISIX Canary Deployments](tutorials/apisix-progressive-delivery.md) -* [Open Service Mesh Deployments](tutorials/osm-progressive-delivery.md) -* [Kuma Canary Deployments](tutorials/kuma-progressive-delivery.md) -* [Gateway API Canary Deployments](tutorials/gatewayapi-progressive-delivery.md) -* [Knative Canary Deployments](tutorials/knative-progressive-delivery.md) * [Blue/Green Deployments](tutorials/kubernetes-blue-green.md) * [Canary analysis with Prometheus Operator](tutorials/prometheus-operator.md) * 
[Canary analysis with KEDA ScaledObjects](tutorials/keda-scaledobject.md) diff --git a/docs/gitbook/dev/dev-guide.md b/docs/gitbook/dev/dev-guide.md index c8310b489..1ea5fb2a1 100644 --- a/docs/gitbook/dev/dev-guide.md +++ b/docs/gitbook/dev/dev-guide.md @@ -8,17 +8,13 @@ Flagger is written in Go and uses Go modules for dependency management. On your dev machine install the following tools: -* go >= 1.19 -* git >= 2.20 -* bash >= 5.0 -* make >= 3.81 -* kubectl >= 1.22 -* kustomize >= 4.4 +* go >= 1.25 +* kubectl >= 1.30 +* kustomize >= 5.0 * helm >= 3.0 -* docker >= 19.03 You'll also need a Kubernetes cluster for testing Flagger. -You can use Minikube, Kind, Docker desktop or any remote cluster (AKS/EKS/GKE/etc) Kubernetes version 1.22 or newer. +You can use Minikube, Kind, Docker desktop or any remote cluster (AKS/EKS/GKE/etc). To start contributing to Flagger, fork the [repository](https://github.com/fluxcd/flagger) on GitHub. @@ -195,7 +191,6 @@ docker build -t test/flagger:latest . kind load docker-image test/flagger:latest ``` - Run the Istio e2e tests: ```bash diff --git a/docs/gitbook/install/flagger-install-on-alibaba-servicemesh.md b/docs/gitbook/install/flagger-install-on-alibaba-servicemesh.md deleted file mode 100644 index c6aef1a04..000000000 --- a/docs/gitbook/install/flagger-install-on-alibaba-servicemesh.md +++ /dev/null @@ -1,57 +0,0 @@ -# Flagger Install on Alibaba ServiceMesh - -This guide walks you through setting up Flagger on Alibaba ServiceMesh. - -## Prerequisites -- Created an ACK([Alibabacloud Container Service for Kubernetes](https://cs.console.aliyun.com)) cluster instance. -- Create an ASM([Alibaba ServiceMesh](https://servicemesh.console.aliyun.com)) enterprise instance and add ACK cluster. - -### Variables declaration -- `$ACK_CONFIG`: the kubeconfig file path of ACK, which be treated as`$HOME/.kube/config` in the rest of guide. -- `$MESH_CONFIG`: the kubeconfig file path of ASM. 
- -### Enable Data-plane KubeAPI access in ASM - -In the Alibaba Cloud Service Mesh (ASM) console, on the basic information page, make sure Data-plane KubeAPI access is enabled. When enabled, the Istio resources of the control plane can be managed through the Kubeconfig of the data plane cluster. - -## Enable Prometheus - -In the Alibaba Cloud Service Mesh (ASM) console, click Settings to enable the collection of Prometheus monitoring metrics. You can use the self-built Prometheus monitoring, or you can use the Alibaba Cloud ARMS Prometheus monitoring plug-in that has joined the ACK cluster, and use ARMS Prometheus to collect monitoring indicators. - -## Install Flagger - -Add Flagger Helm repository: - -```bash -helm repo add flagger https://flagger.app -``` - -Install Flagger's Canary CRD: - -```bash -kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/v1.21.0/artifacts/flagger/crd.yaml -``` -## Deploy Flagger for Istio - -### Add data plane cluster to Alibaba Cloud Service Mesh (ASM) - -In the Alibaba Cloud Service Mesh (ASM) console, click Cluster & Workload Management, select the Kubernetes cluster, select the target ACK cluster, and add it to ASM. - -### Prometheus address - -If you are using Alibaba Cloud Container Service for Kubernetes (ACK) ARMS Prometheus monitoring, replace {Region-ID} in the link below with your region ID, such as cn-hangzhou. {ACKID} is the ACK ID of the data plane cluster that you added to Alibaba Cloud Service Mesh (ASM). Visit the following links to query the public and intranet addresses monitored by ACK's ARMS Prometheus: -[https://arms.console.aliyun.com/#/promDetail/{Region-ID}/{ACK-ID}/setting](https://arms.console.aliyun.com/) - -An example of an intranet address is as follows: -[http://{Region-ID}-intranet.arms.aliyuncs.com:9090/api/v1/prometheus/{Prometheus-ID}/{u-id}/{ACK-ID}/{Region-ID}](https://arms.console.aliyun.com/) - -## Deploy Flagger -Replace the value of metricsServer with your Prometheus address. 
- -```bash -helm upgrade -i flagger flagger/flagger \ ---namespace=istio-system \ ---set crd.create=false \ ---set meshProvider=istio \ ---set metricsServer=http://prometheus:9090 -``` \ No newline at end of file diff --git a/docs/gitbook/install/flagger-install-on-eks-appmesh.md b/docs/gitbook/install/flagger-install-on-eks-appmesh.md deleted file mode 100644 index 70d8a9c93..000000000 --- a/docs/gitbook/install/flagger-install-on-eks-appmesh.md +++ /dev/null @@ -1,151 +0,0 @@ -# Flagger Install on EKS App Mesh - -This guide walks you through setting up Flagger and AWS App Mesh on EKS. - -## App Mesh - -The App Mesh integration with EKS is made out of the following components: - -* Kubernetes custom resources - * `mesh.appmesh.k8s.aws` defines a logical boundary for network traffic between the services - * `virtualnode.appmesh.k8s.aws` defines a logical pointer to a Kubernetes workload - * `virtualservice.appmesh.k8s.aws` defines the routing rules for a workload inside the mesh -* CRD controller - keeps the custom resources in sync with the App Mesh control plane -* Admission controller - injects the Envoy sidecar and assigns Kubernetes pods to App Mesh virtual nodes -* Telemetry service - Prometheus instance that collects and stores Envoy's metrics - -## Create a Kubernetes cluster - -In order to create an EKS cluster you can use [eksctl](https://eksctl.io). -Eksctl is an open source command-line utility made by Weaveworks in collaboration with Amazon. - -On MacOS you can install eksctl with Homebrew: - -```bash -brew tap weaveworks/tap -brew install weaveworks/tap/eksctl -``` - -Create an EKS cluster with: - -```bash -eksctl create cluster --name=appmesh \ ---region=us-west-2 \ ---nodes 3 \ ---node-volume-size=120 \ ---appmesh-access -``` - -The above command will create a two nodes cluster with -App Mesh [IAM policy](https://docs.aws.amazon.com/app-mesh/latest/userguide/MESH_IAM_user_policies.html) -attached to the EKS node instance role. 
- -Verify the install with: - -```bash -kubectl get nodes -``` - -## Install Helm - -Install the [Helm](https://docs.helm.sh/using_helm/#installing-helm) v3 command-line tool: - -```text -brew install helm -``` - -Add the EKS repository to Helm: - -```bash -helm repo add eks https://aws.github.io/eks-charts -``` - -## Enable horizontal pod auto-scaling - -Install the Horizontal Pod Autoscaler (HPA) metrics provider: - -```bash -helm upgrade -i metrics-server stable/metrics-server \ ---namespace kube-system \ ---set args[0]=--kubelet-preferred-address-types=InternalIP -``` - -After a minute, the metrics API should report CPU and memory usage for pods. You can very the metrics API with: - -```bash -kubectl -n kube-system top pods -``` - -## Install the App Mesh components - -Install the App Mesh CRDs: - -```bash -kubectl apply -k github.com/aws/eks-charts/stable/appmesh-controller//crds?ref=master -``` - -Create the `appmesh-system` namespace: - -```bash -kubectl create ns appmesh-system -``` - -Install the App Mesh controller: - -```bash -helm upgrade -i appmesh-controller eks/appmesh-controller \ ---wait --namespace appmesh-system -``` - -In order to collect the App Mesh metrics that Flagger needs to run the canary analysis, -you'll need to setup a Prometheus instance to scrape the Envoy sidecars. 
- -Install the App Mesh Prometheus: - -```bash -helm upgrade -i appmesh-prometheus eks/appmesh-prometheus \ ---wait --namespace appmesh-system -``` - -## Install Flagger - -Add Flagger Helm repository: - -```bash -helm repo add flagger https://flagger.app -``` - -Install Flagger's Canary CRD: - -```yaml -kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml -``` - -Deploy Flagger in the _**appmesh-system**_ namespace: - -```bash -helm upgrade -i flagger flagger/flagger \ ---namespace=appmesh-system \ ---set crd.create=false \ ---set meshProvider=appmesh:v1beta2 \ ---set metricsServer=http://appmesh-prometheus:9090 -``` - -## Install Grafana - -Deploy App Mesh Grafana that comes with a dashboard for monitoring Flagger's canary releases: - -```bash -helm upgrade -i appmesh-grafana eks/appmesh-grafana \ ---namespace appmesh-system -``` - -You can access Grafana using port forwarding: - -```bash -kubectl -n appmesh-system port-forward svc/appmesh-grafana 3000:3000 -``` - -Now that you have Flagger running, you can try the -[App Mesh canary deployments tutorial](https://docs.flagger.app/usage/appmesh-progressive-delivery). - diff --git a/docs/gitbook/install/flagger-install-on-google-cloud.md b/docs/gitbook/install/flagger-install-on-google-cloud.md deleted file mode 100644 index e341276fc..000000000 --- a/docs/gitbook/install/flagger-install-on-google-cloud.md +++ /dev/null @@ -1,400 +0,0 @@ -# Flagger Install on GKE Istio - -This guide walks you through setting up Flagger and Istio on Google Kubernetes Engine. - -![GKE Cluster Overview](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/diagrams/flagger-gke-istio.png) - -## Prerequisites - -You will be creating a cluster on Google’s Kubernetes Engine \(GKE\), if you don’t have an account you can sign up [here](https://cloud.google.com/free/) for free credits. - -Login into Google Cloud, create a project and enable billing for it. 
- -Install the [gcloud](https://cloud.google.com/sdk/) command line utility and configure your project with `gcloud init`. - -Set the default project \(replace `PROJECT_ID` with your own project\): - -```text -gcloud config set project PROJECT_ID -``` - -Set the default compute region and zone: - -```text -gcloud config set compute/region us-central1 -gcloud config set compute/zone us-central1-a -``` - -Enable the Kubernetes and Cloud DNS services for your project: - -```text -gcloud services enable container.googleapis.com -gcloud services enable dns.googleapis.com -``` - -Install the kubectl command-line tool: - -```text -gcloud components install kubectl -``` - -## GKE cluster setup - -Create a cluster with the Istio add-on: - -```bash -K8S_VERSION=$(gcloud container get-server-config --format=json \ -| jq -r '.validMasterVersions[0]') - -gcloud beta container clusters create istio \ ---cluster-version=${K8S_VERSION} \ ---zone=us-central1-a \ ---num-nodes=2 \ ---machine-type=n1-highcpu-4 \ ---preemptible \ ---no-enable-cloud-logging \ ---no-enable-cloud-monitoring \ ---disk-size=30 \ ---enable-autorepair \ ---addons=HorizontalPodAutoscaling,Istio \ ---istio-config=auth=MTLS_PERMISSIVE -``` - -The above command will create a default node pool consisting of two `n1-highcpu-4` \(vCPU: 4, RAM 3.60GB, DISK: 30GB\) preemptible VMs. Preemptible VMs are up to 80% cheaper than regular instances and are terminated and replaced after a maximum of 24 hours. - -Set up credentials for `kubectl`: - -```bash -gcloud container clusters get-credentials istio -``` - -Create a cluster admin role binding: - -```bash -kubectl create clusterrolebinding "cluster-admin-$(whoami)" \ ---clusterrole=cluster-admin \ ---user="$(gcloud config get-value core/account)" -``` - -Validate your setup with: - -```bash -kubectl -n istio-system get svc -``` - -In a couple of seconds GCP should allocate an external IP to the `istio-ingressgateway` service. 
- -## Cloud DNS setup - -You will need an internet domain and access to the registrar to change the name servers to Google Cloud DNS. - -Create a managed zone named `istio` in Cloud DNS \(replace `example.com` with your domain\): - -```bash -gcloud dns managed-zones create \ ---dns-name="example.com." \ ---description="Istio zone" "istio" -``` - -Look up your zone's name servers: - -```bash -gcloud dns managed-zones describe istio -``` - -Update your registrar's name server records with the records returned by the above command. - -Wait for the name servers to change \(replace `example.com` with your domain\): - -```bash -watch dig +short NS example.com -``` - -Create a static IP address named `istio-gateway` using the Istio ingress IP: - -```bash -export GATEWAY_IP=$(kubectl -n istio-system get svc/istio-ingressgateway -ojson \ -| jq -r .status.loadBalancer.ingress[0].ip) - -gcloud compute addresses create istio-gateway --addresses ${GATEWAY_IP} --region us-central1 -``` - -Create the following DNS records \(replace `example.com` with your domain\): - -```bash -DOMAIN="example.com" - -gcloud dns record-sets transaction start --zone=istio - -gcloud dns record-sets transaction add --zone=istio \ ---name="${DOMAIN}" --ttl=300 --type=A ${GATEWAY_IP} - -gcloud dns record-sets transaction add --zone=istio \ ---name="www.${DOMAIN}" --ttl=300 --type=A ${GATEWAY_IP} - -gcloud dns record-sets transaction add --zone=istio \ ---name="*.${DOMAIN}" --ttl=300 --type=A ${GATEWAY_IP} - -gcloud dns record-sets transaction execute --zone istio -``` - -Verify that the wildcard DNS is working \(replace `example.com` with your domain\): - -```bash -watch host test.example.com -``` - -## Install Helm - -Install the [Helm](https://docs.helm.sh/using_helm/#installing-helm) command-line tool: - -```text -brew install kubernetes-helm -``` - -Create a service account and a cluster role binding for Tiller: - -```bash -kubectl -n kube-system create sa tiller - -kubectl create 
clusterrolebinding tiller-cluster-rule \ ---clusterrole=cluster-admin \ ---serviceaccount=kube-system:tiller -``` - -Deploy Tiller in the `kube-system` namespace: - -```bash -helm init --service-account tiller -``` - -You should consider using SSL between Helm and Tiller, for more information on securing your Helm installation see [docs.helm.sh](https://docs.helm.sh/using_helm/#securing-your-helm-installation). - -## Install cert-manager - -Jetstack's [cert-manager](https://github.com/jetstack/cert-manager) is a Kubernetes operator that automatically creates and manages TLS certs issued by Let’s Encrypt. - -You'll be using cert-manager to provision a wildcard certificate for the Istio ingress gateway. - -Install cert-manager's CRDs: - -```bash -CERT_REPO=https://raw.githubusercontent.com/jetstack/cert-manager - -kubectl apply -f ${CERT_REPO}/release-0.10/deploy/manifests/00-crds.yaml -``` - -Create the cert-manager namespace and disable resource validation: - -```bash -kubectl create namespace cert-manager - -kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true -``` - -Install cert-manager with Helm: - -```bash -helm repo add jetstack https://charts.jetstack.io && \ -helm repo update && \ -helm upgrade -i cert-manager \ ---namespace cert-manager \ ---version v0.10.0 \ -jetstack/cert-manager -``` - -## Istio Gateway TLS setup - -![Istio Let's Encrypt](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/diagrams/istio-cert-manager-gke.png) - -Create a generic Istio Gateway to expose services outside the mesh on HTTPS: - -```bash -REPO=https://raw.githubusercontent.com/fluxcd/flagger/main - -kubectl apply -f ${REPO}/artifacts/gke/istio-gateway.yaml -``` - -Create a service account with Cloud DNS admin role \(replace `my-gcp-project` with your project ID\): - -```bash -GCP_PROJECT=my-gcp-project - -gcloud iam service-accounts create dns-admin \ ---display-name=dns-admin \ ---project=${GCP_PROJECT} - -gcloud iam service-accounts keys 
create ./gcp-dns-admin.json \ ---iam-account=dns-admin@${GCP_PROJECT}.iam.gserviceaccount.com \ ---project=${GCP_PROJECT} - -gcloud projects add-iam-policy-binding ${GCP_PROJECT} \ ---member=serviceAccount:dns-admin@${GCP_PROJECT}.iam.gserviceaccount.com \ ---role=roles/dns.admin -``` - -Create a Kubernetes secret with the GCP Cloud DNS admin key: - -```bash -kubectl create secret generic cert-manager-credentials \ ---from-file=./gcp-dns-admin.json \ ---namespace=istio-system -``` - -Create a letsencrypt issuer for CloudDNS \(replace `email@example.com` with a valid email address and `my-gcp-project`with your project ID\): - -```yaml -apiVersion: certmanager.k8s.io/v1alpha1 -kind: Issuer -metadata: - name: letsencrypt-prod - namespace: istio-system -spec: - acme: - server: https://acme-v02.api.letsencrypt.org/directory - email: email@example.com - privateKeySecretRef: - name: letsencrypt-prod - dns01: - providers: - - name: cloud-dns - clouddns: - serviceAccountSecretRef: - name: cert-manager-credentials - key: gcp-dns-admin.json - project: my-gcp-project -``` - -Save the above resource as letsencrypt-issuer.yaml and then apply it: - -```text -kubectl apply -f ./letsencrypt-issuer.yaml -``` - -Create a wildcard certificate \(replace `example.com` with your domain\): - -```yaml -apiVersion: certmanager.k8s.io/v1alpha1 -kind: Certificate -metadata: - name: istio-gateway - namespace: istio-system -spec: - secretName: istio-ingressgateway-certs - issuerRef: - name: letsencrypt-prod - commonName: "*.example.com" - acme: - config: - - dns01: - provider: cloud-dns - domains: - - "*.example.com" - - "example.com" -``` - -Save the above resource as istio-gateway-cert.yaml and then apply it: - -```text -kubectl apply -f ./istio-gateway-cert.yaml -``` - -In a couple of seconds cert-manager should fetch a wildcard certificate from letsencrypt.org: - -```text -kubectl -n istio-system describe certificate istio-gateway - -Events: - Type Reason Age From Message - ---- ------ ---- 
---- ------- - Normal CertIssued 1m52s cert-manager Certificate issued successfully -``` - -Recreate Istio ingress gateway pods: - -```bash -kubectl -n istio-system get pods -l istio=ingressgateway -``` - -Note that Istio gateway doesn't reload the certificates from the TLS secret on cert-manager renewal. Since the GKE cluster is made out of preemptible VMs the gateway pods will be replaced once every 24h, if your not using preemptible nodes then you need to manually delete the gateway pods every two months before the certificate expires. - -## Install Prometheus - -The GKE Istio add-on does not include a Prometheus instance that scrapes the Istio telemetry service. Because Flagger uses the Istio HTTP metrics to run the canary analysis you have to deploy the following Prometheus configuration that's similar to the one that comes with the official Istio Helm chart. - -Find the GKE Istio version with: - -```bash -kubectl -n istio-system get deploy istio-pilot -oyaml | grep image: -``` - -Install Prometheus in istio-system namespace: - -```bash -kubectl -n istio-system apply -f \ -https://storage.googleapis.com/gke-release/istio/release/1.0.6-gke.3/patches/install-prometheus.yaml -``` - -## Install Flagger and Grafana - -Add Flagger Helm repository: - -```bash -helm repo add flagger https://flagger.app -``` - -Install Flagger's Canary CRD: - -```yaml -kubectl apply -f https://raw.githubusercontent.com/fluxcd/flagger/main/artifacts/flagger/crd.yaml -``` - -Deploy Flagger in the `istio-system` namespace with Slack notifications enabled: - -```bash -helm upgrade -i flagger flagger/flagger \ ---namespace=istio-system \ ---set crd.create=false \ ---set metricsServer=http://prometheus.istio-system:9090 \ ---set slack.url=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK \ ---set slack.channel=general \ ---set slack.user=flagger -``` - -Deploy Grafana in the `istio-system` namespace: - -```bash -helm upgrade -i flagger-grafana flagger/grafana \ ---namespace=istio-system \ 
---set url=http://prometheus.istio-system:9090 \ ---set user=admin \ ---set password=replace-me -``` - -Expose Grafana through the public gateway by creating a virtual service \(replace `example.com` with your domain\): - -```yaml -apiVersion: networking.istio.io/v1beta1 -kind: VirtualService -metadata: - name: grafana - namespace: istio-system -spec: - hosts: - - "grafana.example.com" - gateways: - - istio-system/public-gateway - http: - - route: - - destination: - host: flagger-grafana -``` - -Save the above resource as grafana-virtual-service.yaml and then apply it: - -```bash -kubectl apply -f ./grafana-virtual-service.yaml -``` - -Navigate to `http://grafana.example.com` in your browser and you should be redirected to the HTTPS version. - diff --git a/docs/gitbook/install/flagger-install-on-kubernetes.md b/docs/gitbook/install/flagger-install-on-kubernetes.md index 03cafe68b..47ebecf30 100644 --- a/docs/gitbook/install/flagger-install-on-kubernetes.md +++ b/docs/gitbook/install/flagger-install-on-kubernetes.md @@ -1,10 +1,8 @@ # Flagger Install on Kubernetes -This guide walks you through setting up Flagger on a Kubernetes cluster with Helm v3 or Kustomize. +This guide walks you through setting up Flagger on a Kubernetes cluster with Helm or Kubectl. -## Prerequisites - -Flagger requires a Kubernetes cluster **v1.16** or newer. +See the [Flux install guide](flagger-install-with-flux.md) for installing Flagger and keeping it up to date the GitOps way. 
## Install Flagger with Helm @@ -61,26 +59,6 @@ helm upgrade -i flagger flagger/flagger \ --set metricsServer=http://linkerd-prometheus:9090 ``` -Deploy Flagger for App Mesh: - -```bash -helm upgrade -i flagger flagger/flagger \ ---namespace=appmesh-system \ ---set crd.create=false \ ---set meshProvider=appmesh \ ---set metricsServer=http://appmesh-prometheus:9090 -``` - -Deploy Flagger for **Open Service Mesh (OSM)** (requires OSM to have been installed with Prometheus): - -```console -$ helm upgrade -i flagger flagger/flagger \ ---namespace=osm-system \ ---set crd.create=false \ ---set meshProvider=osm \ ---set metricsServer=http://osm-prometheus.osm-system.svc:7070 -``` - If you need to add labels to the flagger deployment or pods, you can pass the labels as parameters as shown below. ```console @@ -92,7 +70,7 @@ helm upgrade -i flagger flagger/flagger \ You can install Flagger in any namespace as long as it can talk to the Prometheus service on port 9090. -For ingress controllers, the install instructions are: +For ingress controllers, the installation instructions are: * [Contour](https://docs.flagger.app/tutorials/contour-progressive-delivery) * [Gloo](https://docs.flagger.app/tutorials/gloo-progressive-delivery) @@ -101,20 +79,6 @@ For ingress controllers, the install instructions are: * [Traefik](https://docs.flagger.app/tutorials/traefik-progressive-delivery) * [APISIX](https://docs.flagger.app/tutorials/apisix-progressive-delivery) -You can use the helm template command and apply the generated yaml with kubectl: - -```bash -# generate -helm fetch --untar --untardir . 
flagger/flagger && -helm template flagger ./flagger \ ---namespace=istio-system \ ---set metricsServer=http://prometheus.istio-system:9090 \ -> flagger.yaml - -# apply -kubectl apply -f flagger.yaml -``` - To uninstall the Flagger release with Helm run: ```text @@ -126,7 +90,7 @@ The command removes all the Kubernetes components associated with the chart and > **Note** that on uninstall the Canary CRD will not be removed. Deleting the CRD will make Kubernetes > remove all the objects owned by Flagger like Istio virtual services, Kubernetes deployments and ClusterIP services. -If you want to remove all the objects created by Flagger you have delete the Canary CRD with kubectl: +If you want to remove all the objects created by Flagger you have to delete the Canary CRD with kubectl: ```text kubectl delete crd canaries.flagger.app @@ -146,73 +110,18 @@ helm upgrade -i flagger-grafana flagger/grafana \ --set password=change-me ``` -Or use helm template command and apply the generated yaml with kubectl: - -```bash -# generate -helm fetch --untar --untardir . flagger/grafana && -helm template flagger-grafana ./grafana \ ---namespace=istio-system \ -> flagger-grafana.yaml - -# apply -kubectl apply -f flagger-grafana.yaml -``` - You can access Grafana using port forwarding: ```bash kubectl -n istio-system port-forward svc/flagger-grafana 3000:80 ``` -## Install Flagger with Kustomize - -As an alternative to Helm, Flagger can be installed with Kustomize **3.5.0** or newer. - -**Service mesh specific installers** - -Install Flagger for Istio: - -```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/istio?ref=main | kubectl apply -f - -``` - -Install Flagger for AWS App Mesh: - -```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/appmesh?ref=main | kubectl apply -f - -``` - -This deploys Flagger and sets the metrics server URL to App Mesh's Prometheus instance. 
+## Install Flagger with Kubectl -Install Flagger for Linkerd: +Install Flagger and Prometheus using the Kustomize overlay from the GitHub repository: ```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/linkerd?ref=main | kubectl apply -f - -``` - -This deploys Flagger in the `linkerd` namespace and sets the metrics server URL to Linkerd's Prometheus instance. - -Install Flagger for Open Service Mesh: - -```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/osm?ref=main | kubectl apply -f - -``` - -This deploys Flagger in the `osm-system` namespace and sets the metrics server URL to OSM's Prometheus instance. - -If you want to install a specific Flagger release, add the version number to the URL: - -```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/linkerd?ref=v1.0.0 | kubectl apply -f - -``` - -**Generic installer** - -Install Flagger and Prometheus for Contour, Gloo, NGINX, Skipper, APISIX or Traefik ingress: - -```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/kubernetes?ref=main | kubectl apply -f - +kubectl apply -k https://github.com/fluxcd/flagger/kustomize/kubernetes?ref=main ``` This deploys Flagger and Prometheus in the `flagger-system` namespace, @@ -221,20 +130,6 @@ sets the metrics server URL to `http://flagger-prometheus.flagger-system:9090` a The Prometheus instance has a two hours data retention and is configured to scrape all pods in your cluster that have the `prometheus.io/scrape: "true"` annotation. 
-To target a different provider you can specify it in the canary custom resource: - -```yaml -apiVersion: flagger.app/v1beta1 -kind: Canary -metadata: - name: app - namespace: test -spec: - # can be: kubernetes, istio, linkerd, appmesh, nginx, skipper, gloo, traefik, osm, apisix - # use the kubernetes provider for Blue/Green style deployments - provider: nginx -``` - **Customized installer** Create a kustomization file using Flagger as base and patch the container args: diff --git a/docs/gitbook/install/flagger-install-with-flux.md b/docs/gitbook/install/flagger-install-with-flux.md index 1c752fd38..4b85216fa 100644 --- a/docs/gitbook/install/flagger-install-with-flux.md +++ b/docs/gitbook/install/flagger-install-with-flux.md @@ -35,46 +35,43 @@ metadata: toolkit.fluxcd.io/tenant: sre-team ``` -Define a Flux `HelmRepository` that points to where the Flagger Helm charts are stored: +Define a Flux `OCIRepository` that points to where the Flagger Helm charts are stored: ```yaml -apiVersion: source.toolkit.fluxcd.io/v1beta2 -kind: HelmRepository +apiVersion: source.toolkit.fluxcd.io/v1 +kind: OCIRepository metadata: name: flagger namespace: flagger-system spec: interval: 1h - url: oci://ghcr.io/fluxcd/charts - type: oci + url: oci://ghcr.io/fluxcd/charts/flagger + layerSelector: + mediaType: "application/vnd.cncf.helm.chart.content.v1.tar+gzip" + operation: copy + ref: + semver: "1.x" # update to the latest version ``` Define a Flux `HelmRelease` that verifies and installs Flagger's latest version on the cluster: ```yaml --- -apiVersion: helm.toolkit.fluxcd.io/v2beta1 +apiVersion: helm.toolkit.fluxcd.io/v2 kind: HelmRelease metadata: name: flagger namespace: flagger-system spec: - interval: 1h + interval: 12h releaseName: flagger install: # override existing Flagger CRDs crds: CreateReplace upgrade: # update Flagger CRDs crds: CreateReplace - chart: - spec: - chart: flagger - version: 1.x # update Flagger to the latest minor version - interval: 6h # scan for new 
versions every six hours - sourceRef: - kind: HelmRepository - name: flagger - verify: # verify the chart signature with Cosign keyless - provider: cosign + chartRef: + kind: OCIRepository + name: flagger values: nodeSelector: kubernetes.io/os: linux @@ -88,7 +85,7 @@ After Flux reconciles the changes on your cluster, you can check if Flagger got ```console $ helm list -n flagger-system NAME NAMESPACE REVISION STATUS CHART APP VERSION -flagger flagger-system 1 deployed flagger-1.23.0 1.23.0 +flagger flagger-system 1 deployed flagger-1.42.0 1.42.0 ``` To uninstall Flagger, delete the `flagger.yaml` from your repository, then Flux will uninstall @@ -108,7 +105,7 @@ Define a Flux `OCIRepository` that points to where the Flagger Kustomize overlay ```yaml --- -apiVersion: source.toolkit.fluxcd.io/v1beta2 +apiVersion: source.toolkit.fluxcd.io/v1 kind: OCIRepository metadata: name: flagger-loadtester @@ -117,21 +114,20 @@ spec: interval: 6h # scan for new versions every six hours url: oci://ghcr.io/fluxcd/flagger-manifests ref: - semver: 1.x # update to the latest version - verify: # verify the artifact signature with Cosign keyless - provider: cosign + semver: "*" # update to the latest version ``` Define a Flux `Kustomization` that deploys the Flagger load tester to the `apps` namespace: ```yaml --- -apiVersion: kustomize.toolkit.fluxcd.io/v1beta2 +apiVersion: kustomize.toolkit.fluxcd.io/v1 kind: Kustomization metadata: name: flagger-loadtester namespace: apps spec: + targetNamespace: apps interval: 6h wait: true timeout: 5m @@ -140,7 +136,6 @@ spec: kind: OCIRepository name: flagger-loadtester path: ./tester - targetNamespace: apps ``` Copy the above manifests into a file called `flagger-loadtester.yaml`, place the YAML file diff --git a/docs/gitbook/tutorials/appmesh-progressive-delivery.md b/docs/gitbook/tutorials/appmesh-progressive-delivery.md deleted file mode 100644 index a7013d899..000000000 --- a/docs/gitbook/tutorials/appmesh-progressive-delivery.md +++ 
/dev/null @@ -1,434 +0,0 @@ -# App Mesh Canary Deployments - -This guide shows you how to use App Mesh and Flagger to automate canary deployments. -You'll need an EKS cluster (Kubernetes >= 1.16) configured with App Mesh, -you can find the installation guide [here](https://docs.flagger.app/install/flagger-install-on-eks-appmesh). - -## Bootstrap - -Flagger takes a Kubernetes deployment and optionally a horizontal pod autoscaler (HPA), -then creates a series of objects (Kubernetes deployments, ClusterIP services, -App Mesh virtual nodes and services). -These objects expose the application on the mesh and drive the canary analysis and promotion. -The only App Mesh object you need to create by yourself is the mesh resource. - -Create a mesh called `global`: - -```bash -cat << EOF | kubectl apply -f - -apiVersion: appmesh.k8s.aws/v1beta2 -kind: Mesh -metadata: - name: global -spec: - namespaceSelector: - matchLabels: - appmesh.k8s.aws/sidecarInjectorWebhook: enabled -EOF -``` - -Create a test namespace with App Mesh sidecar injection enabled: - -```bash -cat << EOF | kubectl apply -f - -apiVersion: v1 -kind: Namespace -metadata: - name: test - labels: - appmesh.k8s.aws/sidecarInjectorWebhook: enabled -EOF -``` - -Create a deployment and a horizontal pod autoscaler: - -```bash -kubectl apply -k https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main -``` - -Deploy the load testing service to generate traffic during the canary analysis: - -```bash -helm upgrade -i flagger-loadtester flagger/loadtester \ ---namespace=test \ ---set appmesh.enabled=true \ ---set "appmesh.backends[0]=podinfo" \ ---set "appmesh.backends[1]=podinfo-canary" -``` - -Create a canary definition: - -```yaml -apiVersion: flagger.app/v1beta1 -kind: Canary -metadata: - annotations: - # Enable Envoy access logging to stdout. 
- appmesh.flagger.app/accesslog: enabled - name: podinfo - namespace: test -spec: - # App Mesh API reference - provider: appmesh:v1beta2 - # deployment reference - targetRef: - apiVersion: apps/v1 - kind: Deployment - name: podinfo - # the maximum time in seconds for the canary deployment - # to make progress before it is rollback (default 600s) - progressDeadlineSeconds: 60 - # HPA reference (optional) - autoscalerRef: - apiVersion: autoscaling/v2 - kind: HorizontalPodAutoscaler - name: podinfo - service: - # container port - port: 9898 - # App Mesh ingress timeout (optional) - timeout: 15s - # App Mesh retry policy (optional) - retries: - attempts: 3 - perTryTimeout: 5s - retryOn: "gateway-error,client-error,stream-error" - # App Mesh URI settings - match: - - uri: - prefix: / - rewrite: - uri: / - # define the canary analysis timing and KPIs - analysis: - # schedule interval (default 60s) - interval: 1m - # max number of failed metric checks before rollback - threshold: 5 - # max traffic percentage routed to canary - # percentage (0-100) - maxWeight: 50 - # canary increment step - # percentage (0-100) - stepWeight: 5 - # App Mesh Prometheus checks - metrics: - - name: request-success-rate - # minimum req success rate (non 5xx responses) - # percentage (0-100) - thresholdRange: - min: 99 - interval: 1m - - name: request-duration - # maximum req duration P99 - # milliseconds - thresholdRange: - max: 500 - interval: 30s - # testing (optional) - webhooks: - - name: acceptance-test - type: pre-rollout - url: http://flagger-loadtester.test/ - timeout: 30s - metadata: - type: bash - cmd: "curl -sd 'test' http://podinfo-canary.test:9898/token | grep token" - - name: load-test - url: http://flagger-loadtester.test/ - timeout: 5s - metadata: - cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/" -``` - -Save the above resource as podinfo-canary.yaml and then apply it: - -```bash -kubectl apply -f ./podinfo-canary.yaml -``` - -After a couple of seconds Flagger will 
create the canary objects: - -```bash -# applied -deployment.apps/podinfo -horizontalpodautoscaler.autoscaling/podinfo -canary.flagger.app/podinfo - -# generated Kubernetes objects -deployment.apps/podinfo-primary -horizontalpodautoscaler.autoscaling/podinfo-primary -service/podinfo -service/podinfo-canary -service/podinfo-primary - -# generated App Mesh objects -virtualnode.appmesh.k8s.aws/podinfo-canary -virtualnode.appmesh.k8s.aws/podinfo-primary -virtualrouter.appmesh.k8s.aws/podinfo -virtualrouter.appmesh.k8s.aws/podinfo-canary -virtualservice.appmesh.k8s.aws/podinfo -virtualservice.appmesh.k8s.aws/podinfo-canary -``` - -After the bootstrap, the podinfo deployment will be scaled to zero and the traffic to `podinfo.test` -will be routed to the primary pods. -During the canary analysis, the `podinfo-canary.test` address can be used to target directly the canary pods. - -App Mesh blocks all egress traffic by default. -If your application needs to call another service, you have to create an App Mesh virtual service for it -and add the virtual service name to the backend list. - -```yaml - service: - port: 9898 - backends: - - backend1 - - arn:aws:appmesh:eu-west-1:12345678910:mesh/my-mesh/virtualService/backend2 -``` - -## Setup App Mesh Gateway (optional) - -In order to expose the podinfo app outside the mesh you can use the App Mesh Gateway. 
- -Deploy the App Mesh Gateway behind an AWS NLB: - -```bash -helm upgrade -i appmesh-gateway eks/appmesh-gateway \ ---namespace test -``` - -Find the gateway public address: - -```bash -export URL="http://$(kubectl -n test get svc/appmesh-gateway -ojson | jq -r ".status.loadBalancer.ingress[].hostname")" -echo $URL -``` - -Wait for the NLB to become active: - -```bash - watch curl -sS $URL -``` - -Create a gateway route that points to the podinfo virtual service: - -```yaml -cat << EOF | kubectl apply -f - -apiVersion: appmesh.k8s.aws/v1beta2 -kind: GatewayRoute -metadata: - name: podinfo - namespace: test -spec: - httpRoute: - match: - prefix: "/" - action: - target: - virtualService: - virtualServiceRef: - name: podinfo -EOF -``` - -Open your browser and navigate to the ingress address to access podinfo UI. - -## Automated canary promotion - -A canary deployment is triggered by changes in any of the following objects: - -* Deployment PodSpec (container image, command, ports, env, resources, etc) -* ConfigMaps and Secrets mounted as volumes or mapped to environment variables - -Trigger a canary deployment by updating the container image: - -```bash -kubectl -n test set image deployment/podinfo \ -podinfod=ghcr.io/stefanprodan/podinfo:6.0.1 -``` - -Flagger detects that the deployment revision changed and starts a new rollout: - -```text -kubectl -n test describe canary/podinfo - -Status: - Canary Weight: 0 - Failed Checks: 0 - Phase: Succeeded -Events: - New revision detected! 
Scaling up podinfo.test - Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available - Pre-rollout check acceptance-test passed - Advance podinfo.test canary weight 5 - Advance podinfo.test canary weight 10 - Advance podinfo.test canary weight 15 - Advance podinfo.test canary weight 20 - Advance podinfo.test canary weight 25 - Advance podinfo.test canary weight 30 - Advance podinfo.test canary weight 35 - Advance podinfo.test canary weight 40 - Advance podinfo.test canary weight 45 - Advance podinfo.test canary weight 50 - Copying podinfo.test template spec to podinfo-primary.test - Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available - Routing all traffic to primary - Promotion completed! Scaling down podinfo.test -``` - -When the canary analysis starts, Flagger will call the pre-rollout webhooks before routing traffic to the canary. - -**Note** that if you apply new changes to the deployment during the canary analysis, Flagger will restart the analysis. - -During the analysis the canary’s progress can be monitored with Grafana. -The App Mesh dashboard URL is -[http://localhost:3000/d/flagger-appmesh/appmesh-canary?refresh=10s&orgId=1&var-namespace=test&var-primary=podinfo-primary&var-canary=podinfo](http://localhost:3000/d/flagger-appmesh/appmesh-canary?refresh=10s&orgId=1&var-namespace=test&var-primary=podinfo-primary&var-canary=podinfo). 
- -![App Mesh Canary Dashboard](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/flagger-grafana-appmesh.png) - -You can monitor all canaries with: - -```bash -watch kubectl get canaries --all-namespaces - -NAMESPACE NAME STATUS WEIGHT -test podinfo Progressing 15 -prod frontend Succeeded 0 -prod backend Failed 0 -``` - -If you’ve enabled the Slack notifications, you should receive the following messages: - -![Flagger Slack Notifications](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/slack-canary-notifications.png) - -## Automated rollback - -During the canary analysis you can generate HTTP 500 errors or high latency to test if Flagger pauses the rollout. - -Trigger a canary deployment: - -```bash -kubectl -n test set image deployment/podinfo \ -podinfod=ghcr.io/stefanprodan/podinfo:6.0.2 -``` - -Exec into the load tester pod with: - -```bash -kubectl -n test exec -it deploy/flagger-loadtester bash -``` - -Generate HTTP 500 errors: - -```bash -hey -z 1m -c 5 -q 5 http://podinfo-canary.test:9898/status/500 -``` - -Generate latency: - -```bash -watch -n 1 curl http://podinfo-canary.test:9898/delay/1 -``` - -When the number of failed checks reaches the canary analysis threshold, the traffic is routed back to the primary, -the canary is scaled to zero and the rollout is marked as failed. - -```text -kubectl -n appmesh-system logs deploy/flagger -f | jq .msg - -New revision detected! 
progressing canary analysis for podinfo.test -Pre-rollout check acceptance-test passed -Advance podinfo.test canary weight 5 -Advance podinfo.test canary weight 10 -Advance podinfo.test canary weight 15 -Halt podinfo.test advancement success rate 69.17% < 99% -Halt podinfo.test advancement success rate 61.39% < 99% -Halt podinfo.test advancement success rate 55.06% < 99% -Halt podinfo.test advancement request duration 1.20s > 0.5s -Halt podinfo.test advancement request duration 1.45s > 0.5s -Rolling back podinfo.test failed checks threshold reached 5 -Canary failed! Scaling down podinfo.test -``` - -If you’ve enabled the Slack notifications, you’ll receive a message if the progress deadline is exceeded, -or if the analysis reached the maximum number of failed checks: - -![Flagger Slack Notifications](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/slack-canary-failed.png) - -## A/B Testing - -Besides weighted routing, Flagger can be configured to route traffic to the canary based on HTTP match conditions. -In an A/B testing scenario, you'll be using HTTP headers or cookies to target a certain segment of your users. -This is particularly useful for frontend applications that require session affinity. - -![Flagger A/B Testing Stages](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/diagrams/flagger-abtest-steps.png) - -Edit the canary analysis, remove the max/step weight and add the match conditions and iterations: - -```yaml - analysis: - interval: 1m - threshold: 5 - iterations: 10 - match: - - headers: - x-canary: - exact: "insider" - webhooks: - - name: load-test - url: http://flagger-loadtester.test/ - metadata: - cmd: "hey -z 1m -q 10 -c 2 -H 'X-Canary: insider' http://podinfo.test:9898/" -``` - -The above configuration will run an analysis for ten minutes targeting users that have a `X-Canary: insider` header. 
- -You can also use a HTTP cookie, to target all users with a `canary` cookie set to `insider` the match condition should be: - -```yaml -match: -- headers: - cookie: - regex: "^(.*?;)?(canary=insider)(;.*)?$" -webhooks: -- name: load-test - url: http://flagger-loadtester.test/ - metadata: - cmd: "hey -z 1m -q 10 -c 2 -H 'Cookie: canary=insider' http://podinfo.test:9898/" -``` - -Trigger a canary deployment by updating the container image: - -```bash -kubectl -n test set image deployment/podinfo \ -podinfod=ghcr.io/stefanprodan/podinfo:6.0.3 -``` - -Flagger detects that the deployment revision changed and starts the A/B test: - -```text -kubectl -n appmesh-system logs deploy/flagger -f | jq .msg - -New revision detected! progressing canary analysis for podinfo.test -Advance podinfo.test canary iteration 1/10 -Advance podinfo.test canary iteration 2/10 -Advance podinfo.test canary iteration 3/10 -Advance podinfo.test canary iteration 4/10 -Advance podinfo.test canary iteration 5/10 -Advance podinfo.test canary iteration 6/10 -Advance podinfo.test canary iteration 7/10 -Advance podinfo.test canary iteration 8/10 -Advance podinfo.test canary iteration 9/10 -Advance podinfo.test canary iteration 10/10 -Copying podinfo.test template spec to podinfo-primary.test -Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available -Routing all traffic to primary -Promotion completed! Scaling down podinfo.test -``` - -The above procedure can be extended with -[custom metrics](../usage/metrics.md) checks, -[webhooks](../usage/webhooks.md), -[manual promotion](../usage/webhooks.md#manual-gating) approval and -[Slack or MS Teams](../usage/alerting.md) notifications. 
diff --git a/docs/gitbook/tutorials/canary-helm-gitops.md b/docs/gitbook/tutorials/canary-helm-gitops.md deleted file mode 100644 index 979955c95..000000000 --- a/docs/gitbook/tutorials/canary-helm-gitops.md +++ /dev/null @@ -1,347 +0,0 @@ -# Canaries with Helm charts and GitOps - -This guide shows you how to package a web app into a Helm chart, trigger canary deployments on Helm upgrade and automate the chart release process with Weave Flux. - -## Packaging - -You'll be using the [podinfo](https://github.com/stefanprodan/k8s-podinfo) chart. This chart packages a web app made with Go, it's configuration, a horizontal pod autoscaler \(HPA\) and the canary configuration file. - -```text -├── Chart.yaml -├── README.md -├── templates -│ ├── NOTES.txt -│ ├── _helpers.tpl -│ ├── canary.yaml -│ ├── configmap.yaml -│ ├── deployment.yaml -│ ├── hpa.yaml -│ ├── service.yaml -│ └── tests -│ ├── test-config.yaml -│ └── test-pod.yaml -└── values.yaml -``` - -You can find the chart source [here](https://github.com/stefanprodan/flagger/tree/master/charts/podinfo). 
- -## Install - -Create a test namespace with Istio sidecar injection enabled: - -```bash -export REPO=https://raw.githubusercontent.com/fluxcd/flagger/main - -kubectl apply -f ${REPO}/artifacts/namespaces/test.yaml -``` - -Add Flagger Helm repository: - -```bash -helm repo add flagger https://flagger.app -``` - -Install podinfo with the release name `frontend` \(replace `example.com` with your own domain\): - -```bash -helm upgrade -i frontend flagger/podinfo \ ---namespace test \ ---set nameOverride=frontend \ ---set backend=http://backend.test:9898/echo \ ---set canary.enabled=true \ ---set canary.istioIngress.enabled=true \ ---set canary.istioIngress.gateway=istio-system/public-gateway \ ---set canary.istioIngress.host=frontend.istio.example.com -``` - -Flagger takes a Kubernetes deployment and a horizontal pod autoscaler \(HPA\), then creates a series of objects \(Kubernetes deployments, ClusterIP services and Istio virtual services\). These objects expose the application on the mesh and drive the canary analysis and promotion. - -```bash -# generated by Helm -configmap/frontend -deployment.apps/frontend -horizontalpodautoscaler.autoscaling/frontend -canary.flagger.app/frontend - -# generated by Flagger -configmap/frontend-primary -deployment.apps/frontend-primary -horizontalpodautoscaler.autoscaling/frontend-primary -service/frontend -service/frontend-canary -service/frontend-primary -virtualservice.networking.istio.io/frontend -``` - -When the `frontend-primary` deployment comes online, Flagger will route all traffic to the primary pods and scale to zero the `frontend` deployment. 
- -Open your browser and navigate to the frontend URL: - -![Podinfo Frontend](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/demo-frontend.png) - -Now let's install the `backend` release without exposing it outside the mesh: - -```bash -helm upgrade -i backend flagger/podinfo \ ---namespace test \ ---set nameOverride=backend \ ---set canary.enabled=true \ ---set canary.istioIngress.enabled=false -``` - -Check if Flagger has successfully deployed the canaries: - -```text -kubectl -n test get canaries - -NAME STATUS WEIGHT LASTTRANSITIONTIME -backend Initialized 0 2019-02-12T18:53:18Z -frontend Initialized 0 2019-02-12T17:50:50Z -``` - -Click on the ping button in the `frontend` UI to trigger a HTTP POST request that will reach the `backend` app: - -![Jaeger Tracing](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/demo-frontend-jaeger.png) - -We'll use the `/echo` endpoint \(same as the one the ping button calls\) to generate load on both apps during a canary deployment. - -## Upgrade - -First let's install a load testing service that will generate traffic during analysis: - -```bash -helm upgrade -i flagger-loadtester flagger/loadtester \ ---namespace=test -``` - -Install Flagger's helm test runner in the `kube-system` using `tiller` service account: - -```bash -helm upgrade -i flagger-helmtester flagger/loadtester \ ---namespace=kube-system \ ---set serviceAccountName=tiller -``` - -Enable the load and helm tester and deploy a new `frontend` version: - -```bash -helm upgrade -i frontend flagger/podinfo/ \ ---namespace test \ ---reuse-values \ ---set canary.loadtest.enabled=true \ ---set canary.helmtest.enabled=true \ ---set image.tag=3.1.1 -``` - -Flagger detects that the deployment revision changed and starts the canary analysis: - -```text -kubectl -n istio-system logs deployment/flagger -f | jq .msg - -New revision detected! 
Scaling up frontend.test -Halt advancement frontend.test waiting for rollout to finish: 0 of 2 updated replicas are available -Starting canary analysis for frontend.test -Pre-rollout check helm test passed -Advance frontend.test canary weight 5 -Advance frontend.test canary weight 10 -Advance frontend.test canary weight 15 -Advance frontend.test canary weight 20 -Advance frontend.test canary weight 25 -Advance frontend.test canary weight 30 -Advance frontend.test canary weight 35 -Advance frontend.test canary weight 40 -Advance frontend.test canary weight 45 -Advance frontend.test canary weight 50 -Copying frontend.test template spec to frontend-primary.test -Halt advancement frontend-primary.test waiting for rollout to finish: 1 old replicas are pending termination -Promotion completed! Scaling down frontend.test -``` - -You can monitor the canary deployment with Grafana. Open the Flagger dashboard, select `test` from the namespace dropdown, `frontend-primary` from the primary dropdown and `frontend` from the canary dropdown. - -![Flagger Grafana Dashboard](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/demo-frontend-dashboard.png) - -Now trigger a canary deployment for the `backend` app, but this time you'll change a value in the configmap: - -```bash -helm upgrade -i backend flagger/podinfo/ \ ---namespace test \ ---reuse-values \ ---set canary.loadtest.enabled=true \ ---set canary.helmtest.enabled=true \ ---set httpServer.timeout=25s -``` - -Generate HTTP 500 errors: - -```bash -kubectl -n test exec -it flagger-loadtester-xxx-yyy sh - -watch curl http://backend-canary:9898/status/500 -``` - -Generate latency: - -```bash -kubectl -n test exec -it flagger-loadtester-xxx-yyy sh - -watch curl http://backend-canary:9898/delay/1 -``` - -Flagger detects the config map change and starts a canary analysis. 
Flagger will pause the advancement when the HTTP success rate drops under 99% or when the average request duration in the last minute is over 500ms: - -```text -kubectl -n test describe canary backend - -Events: - -ConfigMap backend has changed -New revision detected! Scaling up backend.test -Starting canary analysis for backend.test -Advance backend.test canary weight 5 -Advance backend.test canary weight 10 -Advance backend.test canary weight 15 -Advance backend.test canary weight 20 -Advance backend.test canary weight 25 -Advance backend.test canary weight 30 -Advance backend.test canary weight 35 -Halt backend.test advancement success rate 62.50% < 99% -Halt backend.test advancement success rate 88.24% < 99% -Advance backend.test canary weight 40 -Advance backend.test canary weight 45 -Halt backend.test advancement request duration 2.415s > 500ms -Halt backend.test advancement request duration 2.42s > 500ms -Advance backend.test canary weight 50 -ConfigMap backend-primary synced -Copying backend.test template spec to backend-primary.test -Promotion completed! Scaling down backend.test -``` - -![Flagger Grafana Dashboard](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/screens/demo-backend-dashboard.png) - -If the number of failed checks reaches the canary analysis threshold, the traffic is routed back to the primary, the canary is scaled to zero and the rollout is marked as failed. - -```bash -kubectl -n test get canary - -NAME STATUS WEIGHT LASTTRANSITIONTIME -backend Succeeded 0 2019-02-12T19:33:11Z -frontend Failed 0 2019-02-12T19:47:20Z -``` - -If you've enabled the Slack notifications, you'll receive an alert with the reason why the `backend` promotion failed. - -## GitOps automation - -Instead of using Helm CLI from a CI tool to perform the install and upgrade, you could use a Git based approach. GitOps is a way to do Continuous Delivery, it works by using Git as a source of truth for declarative infrastructure and workloads. 
In the [GitOps model](https://www.weave.works/technologies/gitops/), any change to production must be committed in source control prior to being applied on the cluster. This way rollback and audit logs are provided by Git. - -![Helm GitOps Canary Deployment](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/diagrams/flagger-flux-gitops.png) - -In order to apply the GitOps pipeline model to Flagger canary deployments you'll need a Git repository with your workloads definitions in YAML format, a container registry where your CI system pushes immutable images and an operator that synchronizes the Git repo with the cluster state. - -Create a git repository with the following content: - -```text -├── namespaces -│ └── test.yaml -└── releases - └── test - ├── backend.yaml - ├── frontend.yaml - ├── loadtester.yaml - └── helmtester.yaml -``` - -Define the `frontend` release using Flux `HelmRelease` custom resource: - -```yaml -apiVersion: flux.weave.works/v1beta1 -kind: HelmRelease -metadata: - name: frontend - namespace: test - annotations: - fluxcd.io/automated: "true" - filter.fluxcd.io/chart-image: semver:~3.1 -spec: - releaseName: frontend - chart: - git: https://github.com/weaveowrks/flagger - ref: master - path: charts/podinfo - values: - image: - repository: stefanprodan/podinfo - tag: 3.1.0 - backend: http://backend-podinfo:9898/echo - canary: - enabled: true - istioIngress: - enabled: true - gateway: istio-system/public-gateway - host: frontend.istio.example.com - loadtest: - enabled: true - helmtest: - enabled: true -``` - -In the `chart` section I've defined the release source by specifying the Helm repository \(hosted on GitHub Pages\), chart name and version. In the `values` section I've overwritten the defaults set in values.yaml. - -With the `fluxcd.io` annotations I instruct Flux to automate this release. 
When an image tag in the sem ver range of `3.1.0 - 3.1.99` is pushed to Docker Hub, Flux will upgrade the Helm release and from there Flagger will pick up the change and start a canary deployment. - -Install [Flux](https://github.com/fluxcd/flux) and its [Helm Operator](https://github.com/fluxcd/helm-operator) by specifying your Git repo URL: - -```bash -helm repo add fluxcd https://charts.fluxcd.io - -helm install --name flux \ ---set git.url=git@github.com:/ \ ---namespace fluxcd \ -fluxcd/flux - -helm upgrade -i helm-operator fluxcd/helm-operator \ ---namespace fluxcd \ ---set git.ssh.secretName=flux-git-deploy -``` - -At startup Flux generates a SSH key and logs the public key. Find the SSH public key with: - -```bash -kubectl -n fluxcd logs deployment/flux | grep identity.pub | cut -d '"' -f2 -``` - -In order to sync your cluster state with Git you need to copy the public key and create a deploy key with write access on your GitHub repository. - -Open GitHub, navigate to your fork, go to _Setting > Deploy keys_ click on _Add deploy key_, check _Allow write access_, paste the Flux public key and click _Add key_. - -After a couple of seconds Flux will apply the Kubernetes resources from Git and Flagger will launch the `frontend` and `backend` apps. 
- -A CI/CD pipeline for the `frontend` release could look like this: - -* cut a release from the master branch of the podinfo code repo with the git tag `3.1.1` -* CI builds the image and pushes the `podinfo:6.0.1` image to the container registry -* Flux scans the registry and updates the Helm release `image.tag` to `3.1.1` -* Flux commits and push the change to the cluster repo -* Flux applies the updated Helm release on the cluster -* Flux Helm Operator picks up the change and calls Tiller to upgrade the release -* Flagger detects a revision change and scales up the `frontend` deployment -* Flagger runs the helm test before routing traffic to the canary service -* Flagger starts the load test and runs the canary analysis -* Based on the analysis result the canary deployment is promoted to production or rolled back -* Flagger sends a Slack or MS Teams notification with the canary result - -If the canary fails, fix the bug, do another patch release eg `3.1.2` and the whole process will run again. - -A canary deployment can fail due to any of the following reasons: - -* the container image can't be downloaded -* the deployment replica set is stuck for more then ten minutes \(eg. due to a container crash loop\) -* the webhooks \(acceptance tests, helm tests, load tests, etc\) are returning a non 2xx response -* the HTTP success rate \(non 5xx responses\) metric drops under the threshold -* the HTTP average duration metric goes over the threshold -* the Istio telemetry service is unable to collect traffic metrics -* the metrics server \(Prometheus\) can't be reached - -If you want to find out more about managing Helm releases with Flux here are two in-depth guides: [flux2-kustomize-helm-example](https://github.com/fluxcd/flux2-kustomize-helm-example) and [gitops-istio](https://github.com/stefanprodan/gitops-istio). 
- diff --git a/docs/gitbook/tutorials/gatewayapi-progressive-delivery.md b/docs/gitbook/tutorials/gatewayapi-progressive-delivery.md index 39a4a762f..5fed772cd 100644 --- a/docs/gitbook/tutorials/gatewayapi-progressive-delivery.md +++ b/docs/gitbook/tutorials/gatewayapi-progressive-delivery.md @@ -6,12 +6,11 @@ This guide shows you how to use [Gateway API](https://gateway-api.sigs.k8s.io/) ## Prerequisites -Flagger requires a Kubernetes cluster **v1.29** or newer and any mesh/ingress that -implements the `v1` version of Gateway API. +Flagger requires an ingress controller or service mesh that implements the Gateway API **HTTPRoute** (`v1` or `v1beta1`). We'll be using Istio for the sake of this tutorial, but you can use any other implementation. -Install the Gateway API CRDs +Install the Gateway API CRDs: ```bash # Suggestion: Change v1.4.0 in to the latest Gateway API version @@ -156,7 +155,7 @@ Save the above resource as metric-templates.yaml and then apply it: kubectl apply -f metric-templates.yaml ``` -Create a canary custom resource \(replace "www.example.com" with your own domain\): +Create a Canary custom resource \(replace "www.example.com" with your own domain\): ```yaml apiVersion: flagger.app/v1beta1 @@ -277,10 +276,7 @@ point a domain e.g. `www.example.com` to the LB address. Now you can access the podinfo UI using your domain address. Note that you should be using HTTPS when exposing production workloads on internet. -You can obtain free TLS certs from Let's Encrypt, read this -[guide](https://github.com/stefanprodan/istio-gke) on how to configure cert-manager to secure Istio with TLS certificates. 
- -If you're using a local cluster via kind/k3s you can port forward the Envoy LoadBalancer service: +If you're using a local cluster you can port forward to the Envoy LoadBalancer service: ```bash kubectl port-forward -n istio-ingress svc/gateway-istio 8080:80 @@ -714,7 +710,97 @@ Metrics are collected on both requests so that the deployment will only proceed The above procedures can be extended with [custom metrics](../usage/metrics.md) checks, [webhooks](../usage/webhooks.md), [manual promotion](../usage/webhooks.md#manual-gating) approval and [Slack or MS Teams](../usage/alerting.md) notifications. -## CORS Support +## Customising the HTTPRoute + +Besides the `hosts` and `gatewayRefs` fields, you can customize the generated HTTPRoute with various options +exposed under the `spec.service` field of the Canary. + +### Header Manipulation + +You can configure request and response header manipulation using the `spec.service.headers` field of the Canary. + +> **Note:** Header manipulation requires a Gateway API implementation that supports +> the [`RequestHeaderModifier`](https://gateway-api.sigs.k8s.io/guides/http-header-modifier/) and [`ResponseHeaderModifier`](https://gateway-api.sigs.k8s.io/guides/http-header-modifier/) filters. + +Example configuration: + +```yaml +apiVersion: flagger.app/v1beta1 +kind: Canary +metadata: + name: podinfo + namespace: test +spec: + service: + headers: + request: + add: + x-custom-header: "custom-value" + set: + x-api-version: "v1" + remove: + - x-debug-header + response: + add: + x-frame-options: "DENY" + x-content-type-options: "nosniff" + set: + cache-control: "no-cache" + remove: + - x-powered-by +``` + +### URL Rewriting + +You can configure URL rewriting using the `spec.service.rewrite` field of the Canary to modify the path or hostname of requests. 
+ +> **Note:** URL rewriting requires a Gateway API implementation that supports +> the [`URLRewrite`](https://gateway-api.sigs.k8s.io/guides/http-redirect-rewrite/?h=urlrewrite#rewrites) filter. + +Example configuration: + +```yaml +apiVersion: flagger.app/v1beta1 +kind: Canary +metadata: + name: podinfo + namespace: test +spec: + service: + rewrite: + # Rewrite the URI path + uri: "/v2/api" + # Optionally specify the rewrite type: "ReplaceFullPath" or "ReplacePrefixMatch" + # Defaults to "ReplaceFullPath" if not specified + type: "ReplaceFullPath" + # Rewrite the hostname/authority header + authority: "api.example.com" +``` + +The `type` field determines how the URI rewriting is performed: + +- **ReplaceFullPath**: Replaces the entire request path with the specified `uri` value +- **ReplacePrefixMatch**: Replaces only the prefix portion of the path that was matched + +Example with prefix replacement: + +```yaml +apiVersion: flagger.app/v1beta1 +kind: Canary +metadata: + name: podinfo + namespace: test +spec: + service: + rewrite: + uri: "/api/v2" + type: "ReplacePrefixMatch" +``` + +When using `ReplacePrefixMatch`, if a request comes to `/old/path`, and the HTTPRoute matches the prefix `/old`, +the request will be rewritten to `/api/v2/path`. + +### CORS Policy The cross-origin resource sharing policy can be configured using the `spec.service.corsPolicy` field of the Canary.
diff --git a/docs/gitbook/tutorials/knative-progressive-delivery.md b/docs/gitbook/tutorials/knative-progressive-delivery.md index 5ba3444f5..6e32f5510 100644 --- a/docs/gitbook/tutorials/knative-progressive-delivery.md +++ b/docs/gitbook/tutorials/knative-progressive-delivery.md @@ -140,8 +140,8 @@ After a couple of seconds Flagger will make the following changes the Knative Se Trigger a canary deployment by updating the container image: ```bash -kubectl -n test set image deployment/podinfo \ -podinfod=stefanprodan/podinfo:6.0.1 +kubectl -n test patch services.serving podinfo --type=json \ +-p '[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "ghcr.io/stefanprodan/podinfo:6.0.1"}]' ``` Flagger detects that the deployment revision changed and starts a new rollout: @@ -200,8 +200,8 @@ During the canary analysis you can generate HTTP 500 errors and high latency to Trigger another canary deployment: ```bash -kubectl -n test set image deployment/podinfo \ -podinfod=stefanprodan/podinfo:6.0.2 +kubectl -n test patch services.serving podinfo --type=json \ +-p '[{"op": "replace", "path": "/spec/template/spec/containers/0/image", "value": "ghcr.io/stefanprodan/podinfo:6.0.2"}]' ``` Exec into the load tester pod with: diff --git a/docs/gitbook/tutorials/kubernetes-blue-green.md b/docs/gitbook/tutorials/kubernetes-blue-green.md index ad8db524e..c5ed2feb1 100644 --- a/docs/gitbook/tutorials/kubernetes-blue-green.md +++ b/docs/gitbook/tutorials/kubernetes-blue-green.md @@ -72,7 +72,6 @@ metadata: name: podinfo namespace: test spec: - # service mesh provider can be: kubernetes, istio, appmesh, nginx, gloo provider: kubernetes # deployment reference targetRef: diff --git a/docs/gitbook/tutorials/osm-progressive-delivery.md b/docs/gitbook/tutorials/osm-progressive-delivery.md deleted file mode 100644 index 9f1edbe1d..000000000 --- a/docs/gitbook/tutorials/osm-progressive-delivery.md +++ /dev/null @@ -1,363 +0,0 @@ -# Open Service Mesh Canary 
Deployments - -This guide shows you how to use Open Service Mesh (OSM) and Flagger to automate canary deployments. - -![Flagger OSM Traffic Split](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/diagrams/flagger-osm-traffic-split.png) - -## Prerequisites - -Flagger requires a Kubernetes cluster **v1.16** or newer and Open Service Mesh **0.9.1** or newer. - -OSM must have permissive traffic policy enabled and have an instance of Prometheus for metrics. - -- If the OSM CLI is being used for installation, install OSM using the following command: - ```bash - osm install \ - --set=OpenServiceMesh.deployPrometheus=true \ - --set=OpenServiceMesh.enablePermissiveTrafficPolicy=true - ``` -- If a managed instance of OSM is being used: - - [Bring your own instance](docs.openservicemesh.io/docs/guides/observability/metrics/#byo-prometheus) of Prometheus, - setting the namespace to match the managed OSM controller namespace - - Enable permissive traffic policy after installation by updating the OSM MeshConfig resource: - ```bash - # Replace with OSM controller's namespace - kubectl patch meshconfig osm-mesh-config -n -p '{"spec":{"traffic":{"enablePermissiveTrafficPolicyMode":true}}}' --type=merge - ``` - -To install Flagger in the default `osm-system` namespace, use: -```bash -kubectl apply -k https://github.com/fluxcd/flagger//kustomize/osm?ref=main -``` - -Alternatively, if a non-default namespace or managed instance of OSM is in use, install Flagger with Helm, replacing the -values as appropriate. If a custom instance of Prometheus is being used, replace `osm-prometheus` with the relevant Prometheus service name. 
-```bash -helm upgrade -i flagger flagger/flagger \ ---namespace= \ ---set meshProvider=osm \ ---set metricsServer=http://osm-prometheus..svc:7070 -``` - -## Bootstrap - -Flagger takes a Kubernetes deployment and optionally a horizontal pod autoscaler (HPA), -then creates a series of objects (Kubernetes deployments, ClusterIP services and SMI traffic split). -These objects expose the application inside the mesh and drive the canary analysis and promotion. - -Create a `test` namespace and enable OSM namespace monitoring and metrics scraping for the namespace. - -```bash -kubectl create namespace test -osm namespace add test -osm metrics enable --namespace test -``` - -Create a `podinfo` deployment and a horizontal pod autoscaler: - -```bash -kubectl apply -k https://github.com/fluxcd/flagger//kustomize/podinfo?ref=main -``` - -Install the load testing service to generate traffic during the canary analysis: - -```bash -kubectl apply -k https://github.com/fluxcd/flagger//kustomize/tester?ref=main -``` - -Create a canary custom resource for the `podinfo` deployment. -The following `podinfo` canary custom resource instructs Flagger to: -1. monitor any changes to the `podinfo` deployment created earlier, -2. detect `podinfo` deployment revision changes, and -3. start a Flagger canary analysis, rollout, and promotion if there were deployment revision changes. 
- -```yaml -apiVersion: flagger.app/v1beta1 -kind: Canary -metadata: - name: podinfo - namespace: test -spec: - provider: osm - # deployment reference - targetRef: - apiVersion: apps/v1 - kind: Deployment - name: podinfo - # HPA reference (optional) - autoscalerRef: - apiVersion: autoscaling/v2 - kind: HorizontalPodAutoscaler - name: podinfo - # the maximum time in seconds for the canary deployment - # to make progress before it is rolled back (default 600s) - progressDeadlineSeconds: 60 - service: - # ClusterIP port number - port: 9898 - # container port number or name (optional) - targetPort: 9898 - analysis: - # schedule interval (default 60s) - interval: 30s - # max number of failed metric checks before rollback - threshold: 5 - # max traffic percentage routed to canary - # percentage (0-100) - maxWeight: 50 - # canary increment step - # percentage (0-100) - stepWeight: 5 - # OSM Prometheus checks - metrics: - - name: request-success-rate - # minimum req success rate (non 5xx responses) - # percentage (0-100) - thresholdRange: - min: 99 - interval: 1m - - name: request-duration - # maximum req duration P99 - # milliseconds - thresholdRange: - max: 500 - interval: 30s - # testing (optional) - webhooks: - - name: acceptance-test - type: pre-rollout - url: http://flagger-loadtester.test/ - timeout: 30s - metadata: - type: bash - cmd: "curl -sd 'test' http://podinfo-canary.test:9898/token | grep token" - - name: load-test - type: rollout - url: http://flagger-loadtester.test/ - timeout: 5s - metadata: - cmd: "hey -z 2m -q 10 -c 2 http://podinfo-canary.test:9898/" -``` - -Save the above resource as podinfo-canary.yaml and then apply it: - -```bash -kubectl apply -f ./podinfo-canary.yaml -``` - -When the canary analysis starts, Flagger will call the pre-rollout webhooks before routing traffic to the canary. -The canary analysis will run for five minutes while validating the HTTP metrics and rollout hooks every half a minute. 
- -After a couple of seconds Flagger will create the canary objects. - -```bash -# applied -deployment.apps/podinfo -horizontalpodautoscaler.autoscaling/podinfo -ingresses.extensions/podinfo -canary.flagger.app/podinfo - -# generated -deployment.apps/podinfo-primary -horizontalpodautoscaler.autoscaling/podinfo-primary -service/podinfo -service/podinfo-canary -service/podinfo-primary -trafficsplits.split.smi-spec.io/podinfo -``` - -After the bootstrap, the `podinfo` deployment will be scaled to zero and the traffic to `podinfo.test` will be routed to the primary pods. -During the canary analysis, the `podinfo-canary.test` address can be used to target directly the canary pods. - -## Automated Canary Promotion - -Flagger implements a control loop that gradually shifts traffic to the canary while measuring key performance indicators like HTTP requests success rate, requests average duration and pod health. -Based on analysis of the KPIs a canary is promoted or aborted. - -![Flagger Canary Stages](https://raw.githubusercontent.com/fluxcd/flagger/main/docs/diagrams/flagger-canary-steps.png) - -Trigger a canary deployment by updating the container image: - -```bash -kubectl -n test set image deployment/podinfo \ -podinfod=ghcr.io/stefanprodan/podinfo:6.0.1 -``` - -Flagger detects that the deployment revision changed and starts a new rollout. - - -```text -kubectl -n test describe canary/podinfo - -Status: - Canary Weight: 0 - Failed Checks: 0 - Phase: Succeeded -Events: - New revision detected! 
Scaling up podinfo.test - Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available - Pre-rollout check acceptance-test passed - Advance podinfo.test canary weight 5 - Advance podinfo.test canary weight 10 - Advance podinfo.test canary weight 15 - Advance podinfo.test canary weight 20 - Advance podinfo.test canary weight 25 - Waiting for podinfo.test rollout to finish: 1 of 2 updated replicas are available - Advance podinfo.test canary weight 30 - Advance podinfo.test canary weight 35 - Advance podinfo.test canary weight 40 - Advance podinfo.test canary weight 45 - Advance podinfo.test canary weight 50 - Copying podinfo.test template spec to podinfo-primary.test - Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available - Promotion completed! Scaling down podinfo.test -``` - -**Note** that if you apply any new changes to the `podinfo` deployment during the canary analysis, Flagger will restart the analysis. - -A canary deployment is triggered by changes in any of the following objects: - -* Deployment PodSpec \(container image, command, ports, env, resources, etc\) -* ConfigMaps mounted as volumes or mapped to environment variables -* Secrets mounted as volumes or mapped to environment variables - -You can monitor all canaries with: - -```bash -watch kubectl get canaries --all-namespaces - -NAMESPACE NAME STATUS WEIGHT LASTTRANSITIONTIME -test podinfo Progressing 15 2019-06-30T14:05:07Z -prod frontend Succeeded 0 2019-06-30T16:15:07Z -prod backend Failed 0 2019-06-30T17:05:07Z -``` - -## Automated Rollback - -During the canary analysis you can generate HTTP 500 errors and high latency to test if Flagger pauses and rolls back the faulted version. 
- -Trigger another canary deployment: - -```bash -kubectl -n test set image deployment/podinfo \ -podinfod=ghcr.io/stefanprodan/podinfo:6.0.2 -``` - -Exec into the load tester pod with: - -```bash -kubectl -n test exec -it flagger-loadtester-xx-xx sh -``` - -Repeatedly generate HTTP 500 errors until the `kubectl describe` output below shows canary rollout failure: - -```bash -watch -n 0.1 curl http://podinfo-canary.test:9898/status/500 -``` - -Repeatedly generate latency until canary rollout fails: - -```bash -watch -n 0.1 curl http://podinfo-canary.test:9898/delay/1 -``` - -When the number of failed checks reaches the canary analysis thresholds defined in the `podinfo` canary custom resource earlier, the traffic is routed back to the primary, the canary is scaled to zero and the rollout is marked as failed. - -```text -kubectl -n test describe canary/podinfo - -Status: - Canary Weight: 0 - Failed Checks: 10 - Phase: Failed -Events: - Starting canary analysis for podinfo.test - Pre-rollout check acceptance-test passed - Advance podinfo.test canary weight 5 - Advance podinfo.test canary weight 10 - Advance podinfo.test canary weight 15 - Halt podinfo.test advancement success rate 69.17% < 99% - Halt podinfo.test advancement success rate 61.39% < 99% - Halt podinfo.test advancement success rate 55.06% < 99% - Halt podinfo.test advancement request duration 1.20s > 0.5s - Halt podinfo.test advancement request duration 1.45s > 0.5s - Rolling back podinfo.test failed checks threshold reached 5 - Canary failed! Scaling down podinfo.test -``` - -## Custom Metrics - -The canary analysis can be extended with Prometheus queries. - -Let's define a check for 404 not found errors. -Edit the canary analysis (`podinfo-canary.yaml` file) and add the following metric. -For more information on creating additional custom metrics using OSM metrics, please check the [metrics available in OSM](https://docs.openservicemesh.io/docs/guides/observability/metrics/#available-metrics). 
- -```yaml - analysis: - metrics: - - name: "404s percentage" - threshold: 3 - query: | - 100 - ( - sum( - rate( - osm_request_total{ - destination_namespace="test", - destination_kind="Deployment", - destination_name="podinfo", - response_code!="404" - }[1m] - ) - ) - / - sum( - rate( - osm_request_total{ - destination_namespace="test", - destination_kind="Deployment", - destination_name="podinfo" - }[1m] - ) - ) * 100 - ) -``` - -The above configuration validates the canary version by checking if the HTTP 404 req/sec percentage is below three percent of the total traffic. -If the 404s rate reaches the 3% threshold, then the analysis is aborted and the canary is marked as failed. - -Trigger a canary deployment by updating the container image: - -```bash -kubectl -n test set image deployment/podinfo \ -podinfod=ghcr.io/stefanprodan/podinfo:6.0.3 -``` - -Exec into the load tester pod with: - -```bash -kubectl -n test exec -it flagger-loadtester-xx-xx sh -``` - -Repeatedly generate 404s until canary rollout fails: - -```bash -watch -n 0.1 curl http://podinfo-canary.test:9898/status/404 -``` - -Watch Flagger logs to confirm successful canary rollback. - -```text -kubectl -n osm-system logs deployment/flagger -f | jq .msg - -Starting canary deployment for podinfo.test -Pre-rollout check acceptance-test passed -Advance podinfo.test canary weight 5 -Halt podinfo.test advancement 404s percentage 6.20 > 3 -Halt podinfo.test advancement 404s percentage 6.45 > 3 -Halt podinfo.test advancement 404s percentage 7.22 > 3 -Halt podinfo.test advancement 404s percentage 6.50 > 3 -Halt podinfo.test advancement 404s percentage 6.34 > 3 -Rolling back podinfo.test failed checks threshold reached 5 -Canary failed! 
Scaling down podinfo.test -``` diff --git a/docs/gitbook/usage/deployment-strategies.md b/docs/gitbook/usage/deployment-strategies.md index 5dfdc2966..7df71f6f2 100644 --- a/docs/gitbook/usage/deployment-strategies.md +++ b/docs/gitbook/usage/deployment-strategies.md @@ -3,11 +3,11 @@ Flagger can run automated application analysis, promotion and rollback for the following deployment strategies: * **Canary Release** \(progressive traffic shifting\) - * Istio, Linkerd, App Mesh, NGINX, Skipper, Contour, Gloo Edge, Traefik, Open Service Mesh, Kuma, Gateway API, Apache APISIX, Knative + * Istio, Linkerd, App Mesh, NGINX, Skipper, Contour, Gloo Edge, Traefik, Kuma, Gateway API, Apache APISIX, Knative * **A/B Testing** \(HTTP headers and cookies traffic routing\) * Istio, App Mesh, NGINX, Contour, Gloo Edge, Gateway API * **Blue/Green** \(traffic switching\) - * Kubernetes CNI, Istio, Linkerd, App Mesh, NGINX, Contour, Gloo Edge, Open Service Mesh, Gateway API + * Kubernetes CNI, Istio, Linkerd, App Mesh, NGINX, Contour, Gloo Edge, Gateway API * **Blue/Green Mirroring** \(traffic shadowing\) * Istio, Gateway API * **Canary Release with Session Affinity** \(progressive traffic shifting combined with cookie based routing\) diff --git a/kustomize/README.md b/kustomize/README.md index 17c23b14a..e719ca5a8 100644 --- a/kustomize/README.md +++ b/kustomize/README.md @@ -34,14 +34,6 @@ kustomize build https://github.com/fluxcd/flagger/kustomize/linkerd?ref=main | k This deploys Flagger in the `linkerd` namespace and sets the metrics server URL to linkerd-viz extension's Prometheus instance which lives under `linkerd-viz` namespace by default. -Install Flagger for Open Service Mesh: - -```bash -kustomize build https://github.com/fluxcd/flagger/kustomize/osm?ref=main | kubectl apply -f - -``` - -This deploys Flagger in the `osm-system` namespace and sets the metrics server URL to OSM's Prometheus instance. 
- If you want to install a specific Flagger release, add the version number to the URL: ```bash @@ -76,7 +68,7 @@ metadata: name: app namespace: test spec: - # can be: kubernetes, istio, linkerd, appmesh, nginx, skipper, gloo, osm + # can be: kubernetes, istio, linkerd, appmesh, nginx, skipper, gloo # use the kubernetes provider for Blue/Green style deployments provider: nginx ``` diff --git a/kustomize/osm/kustomization.yaml b/kustomize/osm/kustomization.yaml deleted file mode 100644 index b69c136b8..000000000 --- a/kustomize/osm/kustomization.yaml +++ /dev/null @@ -1,5 +0,0 @@ -namespace: osm-system -bases: - - ../base/flagger/ -patchesStrategicMerge: - - patch.yaml diff --git a/kustomize/osm/patch.yaml b/kustomize/osm/patch.yaml deleted file mode 100644 index f2cd07595..000000000 --- a/kustomize/osm/patch.yaml +++ /dev/null @@ -1,27 +0,0 @@ -apiVersion: apps/v1 -kind: Deployment -metadata: - name: flagger -spec: - template: - spec: - containers: - - name: flagger - args: - - -log-level=info - - -include-label-prefix=app.kubernetes.io - - -mesh-provider=osm - - -metrics-server=http://osm-prometheus.osm-system.svc:7070 ---- -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRoleBinding -metadata: - name: flagger -roleRef: - apiGroup: rbac.authorization.k8s.io - kind: ClusterRole - name: flagger -subjects: - - kind: ServiceAccount - name: flagger - namespace: osm-system