|
1 | | -# Observability Stack with GKE, LGTM, and ArgoCD |
| 1 | +# Kubernetes Observability & Operations Platform |
2 | 2 |
|
3 | | -Complete infrastructure and application stack for observability on Google Kubernetes Engine (GKE). |
| 3 | +This repository provisions an observability and operations platform on **Google Kubernetes Engine (GKE)**. Its modular components cover **deployment**, **monitoring**, **logging**, **tracing**, and **certificate management**. |
4 | 4 |
|
5 | | -## Components |
| 5 | +## Core Components |
6 | 6 |
|
7 | | -- **GKE**: Google Kubernetes Engine cluster |
8 | | -- **LGTM Stack**: |
9 | | - - Loki (logs) |
10 | | - - Grafana (visualization) |
11 | | - - Tempo (traces) |
12 | | - - Mimir (metrics) |
13 | | -- **ArgoCD**: GitOps continuous deployment |
14 | | -- **Cert-Manager**: Automated certificate management |
15 | | -- **Ingress Controller**: Nginx ingress controller |
| 7 | +* **Observability (LGTM Stack)**: |
| 8 | + * **Loki**: Distributed logging. |
| 9 | + * **Grafana**: Visualization and dashboards. |
| 10 | + * **Tempo**: Distributed tracing. |
| 11 | + * **Mimir**: Scalable long-term storage for Prometheus metrics. |
| 12 | +* **GitOps (ArgoCD)**: |
| 13 | + * **ArgoCD**: Continuous delivery and declarative GitOps workflows. |
| 14 | +* **Infrastructure Essentials**: |
| 15 | + * **Cert-Manager**: Automated TLS certificate issuance (Let's Encrypt). |
| 16 | + * **Ingress Controller**: NGINX Ingress for external traffic management. |
16 | 17 |
|
17 | | -## REPO STRUCTURE |
| 18 | +## Project Structure |
| 19 | + |
| 20 | +This project is built with **Terraform** and **Helm** and is designed to be modular: you can deploy the entire stack or individual components as needed. |
| 21 | + |
| 22 | +* **[`lgtm-stack/`](lgtm-stack/README.md)**: The core internal monitoring platform. |
| 23 | +* **[`argocd/`](argocd/README.md)**: The GitOps delivery engine. |
| 24 | +* **[`cert-manager/`](cert-manager/README.md)**: Certificate management infrastructure. |
| 25 | +* **[`ingress-controller/`](ingress-controller/README.md)**: Ingress routing infrastructure. |
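
The "entire stack or individual components" workflow described above might look roughly like the sketch below. It assumes each component keeps its Terraform code in a `<component>/terraform/` directory; the `deploy_component` helper, the `DRY_RUN` switch, and the ordering of components are hypothetical and not part of this repository. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch only: deploy one component, or all of them, with Terraform.
# Assumption: every component has its Terraform code in <component>/terraform/.
set -euo pipefail

# Assumed ordering: shared infrastructure first, then ArgoCD and the LGTM stack.
COMPONENTS=(cert-manager ingress-controller argocd lgtm-stack)

# deploy_component is a hypothetical helper, not a script shipped in the repo.
deploy_component() {
  local dir="$1/terraform"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (the default): print the commands instead of executing them.
    echo "terraform -chdir=$dir init"
    echo "terraform -chdir=$dir apply"
  else
    # -chdir is Terraform's global option for running in another directory.
    terraform -chdir="$dir" init
    terraform -chdir="$dir" apply
  fi
}

# Deploy the components named on the command line, or all of them if none are given.
if [ "$#" -gt 0 ]; then
  targets=("$@")
else
  targets=("${COMPONENTS[@]}")
fi

for component in "${targets[@]}"; do
  deploy_component "$component"
done
```

Saved under a name of your choosing, the script previews the commands for a single component when called with that component's directory name, for example `argocd`; setting `DRY_RUN=0` would run Terraform for real. Each component's own README remains the authoritative source for required variables and prerequisites.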
| 26 | + |
| 27 | +## Documentation |
| 28 | + |
| 29 | +* **[Kubernetes Observability Guide](docs/kubernetes-observability.md)**: Deployment and architecture of the LGTM stack. |
| 30 | +* **[Cert-Manager Deployment](docs/cert-manager-terraform-deployment.md)**: Terraform guide for Cert-Manager. |
| 31 | +* **[Ingress Controller Deployment](docs/ingress-controller-terraform-deployment.md)**: Terraform guide for NGINX Ingress. |
| 32 | +* **[ArgoCD Documentation](argocd/README.md)**: Setup and configuration for GitOps. |
18 | 33 |
|
19 | | -``` |
20 | | -observability/ |
21 | | -├── README.md |
22 | | -│ └── USE: Project overview, quick start, and entry point for new users |
23 | | -│ |
24 | | -├── argocd/ |
25 | | -│ ├── README.md |
26 | | -│ │ └── USE: ArgoCD component overview and quick reference |
27 | | -│ └── terraform/ |
28 | | -│ ├── locals.tf |
29 | | -│ │ └── USE: Local variables and computed values within ArgoCD module |
30 | | -│ ├── main.tf |
31 | | -│ │ └── USE: Deploy ArgoCD using Helm to GKE cluster |
32 | | -│ ├── outputs.tf |
33 | | -│ │ └── USE: Export ArgoCD endpoint URLs and credentials |
34 | | -│ ├── variables.tf |
35 | | -│ │ └── USE: Define input parameters for ArgoCD deployment |
36 | | -│ └── values/ |
37 | | -│ ├── argocd-values.yaml |
38 | | -│ │ └── USE: Base Helm chart values for ArgoCD |
39 | | -│ ├── argocd-dev-values.yaml |
40 | | -│ │ └── USE: Development environment overrides (reduced resources) |
41 | | -│ └── argocd-prod-values.yaml |
42 | | -│ └── USE: Production environment overrides (HA, replicas) |
43 | | -│ |
44 | | -├── cert-manager/ |
45 | | -│ ├── README.md |
46 | | -│ │ └── USE: Cert-Manager component overview and reference |
47 | | -│ └── terraform/ |
48 | | -│ ├── locals.tf |
49 | | -│ │ └── USE: Local variables and computed values |
50 | | -│ ├── main.tf |
51 | | -│ │ └── USE: Deploy Cert-Manager using Helm to manage TLS certificates |
52 | | -│ ├── outputs.tf |
53 | | -│ │ └── USE: Export Cert-Manager service account and configuration details |
54 | | -│ ├── variables.tf |
55 | | -│ │ └── USE: Define customizable parameters for Cert-Manager |
56 | | -│ |
57 | | -├── docs/ |
58 | | -│ ├── ARCHITECTURE.md |
59 | | -│ │ └── USE: Explain system design, component interactions, and data flow |
60 | | -│ ├── GETTING_STARTED.md |
61 | | -│ │ └── USE: Step-by-step quick start guide for new users |
62 | | -│ ├── README.md |
63 | | -│ │ └── USE: Documentation index and navigation hub |
64 | | -│ ├── TUTORIAL_ARGOCD.md |
65 | | -│ │ └── USE: Manual ArgoCD installation guide (alternative to Terraform) |
66 | | -│ ├── TUTORIAL_CERT_MANAGER.md |
67 | | -│ │ └── USE: Manual Cert-Manager installation guide |
68 | | -│ ├── TUTORIAL_GKE_SETUP.md |
69 | | -│ │ └── USE: Manual GKE cluster creation using gcloud CLI |
70 | | -│ ├── TUTORIAL_INGRESS.md |
71 | | -│ │ └── USE: Manual Ingress Controller installation guide |
72 | | -│ ├── TUTORIAL_LGTM.md |
73 | | -│ │ └── USE: Manual LGTM stack deployment guide |
74 | | -│ └── images/ |
75 | | -│ ├── architecture-diagram.png |
76 | | -│ │ └── USE: Visual system architecture diagram |
77 | | -│ ├── argocd-workflow.png |
78 | | -│ │ └── USE: Visual GitOps deployment workflow diagram |
79 | | -│ └── lgtm-flow.png |
80 | | -│ └── USE: Visual LGTM component data flow diagram |
81 | | -│ |
82 | | -├── ingress-controller/ |
83 | | -│ ├── README.md |
84 | | -│ │ └── USE: Ingress Controller component overview |
85 | | -│ └── terraform/ |
86 | | -│ ├── locals.tf |
87 | | -│ │ └── USE: Local variables for ingress module |
88 | | -│ ├── main.tf |
89 | | -│ │ └── USE: Deploy Nginx Ingress Controller for HTTP/HTTPS routing |
90 | | -│ ├── outputs.tf |
91 | | -│ │ └── USE: Export load balancer endpoint and service information |
92 | | -│ ├── variables.tf |
93 | | -│ │ └── USE: Define customizable parameters for Ingress |
94 | | -│ └── values.yaml |
95 | | -│ └── USE: Helm chart configuration for Nginx Ingress Controller |
96 | | -│ |
97 | | -└── lgtm-stack/ |
98 | | - ├── README.md |
99 | | - │ └── USE: LGTM stack component overview and architecture |
100 | | - └── terraform/ |
101 | | - ├── locals.tf |
102 | | - │ └── USE: Local variables for LGTM module |
103 | | - ├── main.tf |
104 | | - │ └── USE: Deploy all LGTM components (Prometheus, Loki, Mimir, Tempo, Grafana) |
105 | | - ├── outputs.tf |
106 | | - │ └── USE: Export endpoints and credentials for all LGTM components |
107 | | - ├── variables.tf |
108 | | - │ └── USE: Define customizable parameters for LGTM deployment |
109 | | - └── values/ |
110 | | - ├── grafana-values.yaml |
111 | | - │ └── USE: Helm configuration for Grafana dashboards and datasources |
112 | | - ├── loki-values.yaml |
113 | | - │ └── USE: Helm configuration for Loki log storage and retention |
114 | | - ├── mimir-values.yaml |
115 | | - │ └── USE: Helm configuration for Mimir long-term metrics storage |
116 | | - ├── prometheus-values.yaml |
117 | | - │ └── USE: Helm configuration for Prometheus metrics scraping |
118 | | - └── tempo-values.yaml |
119 | | - └── USE: Helm configuration for Tempo distributed tracing |
120 | | -``` |