Commit 33a692b

Updates to K8s manifests and services for K8s compatibility (#117)
## Description

This update improves the K8s manifest files and makes changes to the services for better compatibility when running in K8s.

k8s datadog config:
* set `DD_ENV`
* Add `DD_CLOUD_PROVIDER_METADATA` setting

Cluster files:
* updated NGINX ingress controller version

Storedog k8s files:
* remove unnecessary envars
* add and update labels
* add files to setup AB testing

README files:
- Update AB Testing with K8s instructions
- Update Ads readme details
- K8s readme: review for clarity and flow

Docker Compose:
* Remove unnecessary envars
* Add `DD_CLOUD_PROVIDER_METADATA` setting
* Update Ruby commands to match Profiling docs

Services files:
- Discounts: remove variable for port
- Update dd-trace library versions
2 parents 01d557a + b10570e commit 33a692b

File tree

32 files changed

+529
-214
lines changed

README.md

Lines changed: 60 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -206,7 +206,24 @@ There are several features that can be enabled by setting environment variables
206206

207207
Run two Ads services and split traffic between them. The amount of traffic sent to each service is set with a percent value.
208208

209+
This requires running a second Ads service in addition to the default Java Ads service and setting environment variables in the `service-proxy` service. The Python Ads service is typically used as the secondary service.
210+
211+
1. Set an environment variable for the Python Ads service version.
212+
213+
```bash
214+
export DD_VERSION_ADS_PYTHON=1.0.0
215+
```
216+
217+
1. Set these environment variables for the `service-proxy` service:
218+
219+
- `ADS_A_UPSTREAM`: Host and port for the primary (A) ads service (default: `ads:3030`)
220+
- `ADS_B_UPSTREAM`: Host and port for the secondary (B) ads service (default: `ads-python:3030`)
221+
- `ADS_B_PERCENT`: Percentage of traffic to route to the B (Python) ads service (default: `0`). The remainder goes to the A (Java) ads service.
222+
- Set a value between `0` and `100` to control the split.
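
As a rough illustration of what `ADS_B_PERCENT` means, the sketch below picks an upstream for each request. It is not the `service-proxy` implementation, which this README does not show. The function name `pick_ads_upstream` and the use of a uniform random draw are assumptions made for the example; the default host and port values are the documented defaults above.

```python
import random


def pick_ads_upstream(b_percent, a_upstream="ads:3030",
                      b_upstream="ads-python:3030", rng=random):
    """Return the upstream for one request, sending roughly b_percent% to B.

    Hypothetical helper: illustrates the split semantics only.
    """
    if not 0 <= b_percent <= 100:
        raise ValueError("ADS_B_PERCENT must be between 0 and 100")
    # rng.random() is uniform in [0.0, 1.0), so 0 never selects B
    # and 100 always selects B.
    if rng.random() * 100 < b_percent:
        return b_upstream
    return a_upstream
```

With `b_percent=0` every request goes to the A (Java) service, and with `b_percent=100` every request goes to the B (Python) service; values in between split traffic proportionally over many requests.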
223+
209224
**How to use**
225+
226+
#### Docker Compose
210227
1. Add a second Ads service to the `docker-compose.yml`
211228

212229
```yaml
@@ -223,15 +240,19 @@ Run two Ads services and split traffic between them. The amount of traffic sent
223240
- POSTGRES_USER=${POSTGRES_USER:-postgres}
224241
- POSTGRES_HOST=postgres
225242
- DD_AGENT_HOST=dd-agent
243+
- DD_ENV=${DD_ENV:-production}
226244
- DD_SERVICE=store-ads-python
227245
- DD_VERSION=${DD_VERSION_ADS_PYTHON:-1.0.0}
246+
- DD_PROFILING_ENABLED=true
247+
- DD_PROFILING_TIMELINE_ENABLED=true
248+
- DD_PROFILING_ALLOCATION_ENABLED=true
228249
networks:
229250
- storedog-network
230251
labels:
231252
com.datadoghq.ad.logs: '[{"source": "python"}]'
232253
```
233254

234-
1. Add and set these environment variables to the `service-proxy` service:
255+
1. Add these environment variables to the `service-proxy` service in the `docker-compose.yml` file:
235256

236257
```yaml
237258
environment:
@@ -240,13 +261,40 @@ Run two Ads services and split traffic between them. The amount of traffic sent
240261
- ADS_B_PERCENT=${ADS_B_PERCENT:-0}
241262
```
242263

243-
- `ADS_A_UPSTREAM`: Host and port for the primary (A) ads service (default: `ads:3030`)
244-
- `ADS_B_UPSTREAM`: Host and port for the secondary (B) ads service (default: `ads-python:3030`)
245-
- `ADS_B_PERCENT`: Percentage of traffic to route to the B (Python) ads service (default: `0`). The remainder goes to the A (Java) ads service.
246-
- Set to a value between `0` and `100` to control the split.
247264
1. Start the app via `docker compose up -d`
248265

266+
#### Kubernetes
267+
268+
A Kubernetes manifest for the Python Ads service is available in the `services/ads/k8s-manifests/` directory.
269+
270+
1. Add the `ads-python.yaml` file to the `k8s-manifests/storedog-app/deployments/` directory.
271+
272+
1. Add the following environment variables to the `nginx.yaml` file and adjust as needed:
273+
274+
```yaml
275+
# A/B testing ads services
276+
- name: ADS_A_UPSTREAM
277+
value: "ads:3030"
278+
- name: ADS_B_UPSTREAM
279+
value: "ads-python:3030"
280+
- name: ADS_B_PERCENT
281+
value: "50"
282+
```
283+
284+
1. Follow the instructions in the [Kubernetes README](./k8s-manifests/README.md) to run Storedog in Kubernetes.
285+
286+
1. If Storedog is already running, apply the manifests to the cluster:
287+
288+
```bash
289+
envsubst < k8s-manifests/storedog-app/deployments/ads-python.yaml | kubectl apply -n storedog -f -
290+
envsubst < k8s-manifests/storedog-app/deployments/nginx.yaml | kubectl apply -n storedog -f -
291+
```
292+
293+
> [!IMPORTANT]
294+
> Be sure to set the `DD_VERSION_ADS_PYTHON` environment variable before running these commands so that `envsubst` can substitute it into the manifest.
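
`envsubst` replaces references to unset variables with an empty string rather than failing, so a missing `DD_VERSION_ADS_PYTHON` can go unnoticed. One way to catch this before applying a manifest is a small pre-check like the sketch below. The helper name `unset_variables` is hypothetical, and it only recognizes the plain `$NAME` and `${NAME}` forms.

```python
import re

# Matches $NAME and ${NAME}, the two reference forms envsubst replaces.
_VAR_PATTERN = re.compile(
    r"\$(?:\{([A-Za-z_][A-Za-z0-9_]*)\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def unset_variables(template_text, env):
    """Return the sorted names referenced in template_text that are
    missing or empty in env (hypothetical pre-check helper)."""
    names = {m.group(1) or m.group(2)
             for m in _VAR_PATTERN.finditer(template_text)}
    return sorted(name for name in names if not env.get(name))
```

For example, calling `unset_variables(open(path).read(), os.environ)` on a manifest and refusing to continue when the result is non-empty would flag the missing version variable before `kubectl apply` runs.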
295+
249296
### Feature flags
297+
250298
Some capabilities are hidden behind feature flags, which can be controlled via `services/frontend/site/featureFlags.config.json`.
251299

252300
> [!NOTE]
@@ -258,6 +306,7 @@ Some capabilities are hidden behind feature flags, which can be controlled via `
258306
> ```
259307
260308
#### dbm
309+
261310
Enables a product ticker on the homepage with a long-running query to demonstrate DBM.
262311
263312
**How to use**:
@@ -270,6 +319,7 @@ Enables a product ticker on the homepage with a long-running query to demonstrat
270319
You can modify the ticker functionality in `services/frontend/components/common/NavBar.tsx`.
271320
272321
#### error-tracking
322+
273323
Introduces an exception in the Ads services to demonstrate Error Tracking by setting a header to a value that the Ads service does not expect.
274324
275325
**How to use**:
@@ -281,6 +331,7 @@ Introduces an exception in the Ads services to demonstrate Error Tracking by set
281331
Modify this functionality in `services/frontend/components/common/Ad/Ad.tsx` and in the Ads service being used.
282332
283333
#### api-errors
334+
284335
This introduces random errors that occur in the frontend service's `/api` routes.
285336

286337
**How to use**:
@@ -291,6 +342,7 @@ This introduces random errors that occur in the frontend service's `/api` routes
291342
Modify this functionality in `services/frontend/pages/api/*`.
292343

293344
#### product-card-frustration
345+
294346
This will swap out the product card component with a version that doesn't have the thumbnails linked to the product page. When paired with the Puppeteer service, this can be used to demonstrate Frustration Signals in RUM.
295347
296348
**How to use**:
@@ -301,6 +353,7 @@ This will swap out the product card component with a version that doesn't have t
301353
Modify this functionality in `services/frontend/components/Product/ProductCard.tsx` and `services/frontend/components/Product/ProductCard-v2.tsx`.
302354
303355
## Image publication
356+
304357
Images are stored in GHCR. On PR merges, only the affected services are pushed to GHCR, using the `latest` tag. For example, if you only made changes to the `backend` service, then only the `backend` GitHub workflow will trigger and publish `ghcr.io/datadog/storedog/backend:latest`.
305358
306359
Separately, we tag and publish *all* images when a new release is created, using the corresponding release tag, e.g. `ghcr.io/datadog/storedog/backend:1.0.1`. New releases are made on an ad-hoc basis, depending on the features recently added.
@@ -312,6 +365,7 @@ All of the services in the Storedog application are Dockerized and run in contai
312365
Below is a breakdown of services and some instructions on how to use them.
313366

314367
### Ads
368+
315369
There are two advertisement services: the default service is built in Java, and an alternative is available in Python. These services do the same thing, have the same endpoints, run on the same port (`3030`), and have the same failure modes. These ads are served through the `Ads.tsx` component in the frontend service.
316370

317371
To switch between the Java and Python services, see the instructions in the [Ads service README](./services/ads/README.md).
@@ -340,6 +394,7 @@ sh ./scripts/backup-db.sh
340394
This will create a new `restore.sql` file in the `services/postgres/db/` directory and populate it with all of the SQL statements needed to prepare the database for Datadog monitoring. When the script finishes, rebuild the Postgres database image with the new restore point.
341395

342396
#### Worker
397+
343398
The Spree application has a worker process that runs in the background. A Datadog tracer configuration specific to the worker lives in the `services/worker/` directory and is mounted into the worker container.
344399

345400
### Discounts

docker-compose.dev.yml

Lines changed: 13 additions & 14 deletions
@@ -11,6 +11,7 @@ services:
1111
- DD_LOGS_ENABLED=true
1212
- DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
1313
- DD_CONTAINER_EXCLUDE=image:agent name:puppeteer
14+
- DD_CLOUD_PROVIDER_METADATA=[]
1415
- DD_HOSTNAME=${DD_HOSTNAME-development-host}
1516
- DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true
1617
networks:
@@ -38,9 +39,9 @@ services:
3839
- storedog-network
3940
environment:
4041
- DD_AGENT_HOST=dd-agent
42+
- DD_ENV=${DD_ENV:-development}
4143
- DD_SERVICE=store-frontend-api # used for Next.js API routes
4244
- DD_VERSION=${NEXT_PUBLIC_DD_VERSION_FRONTEND-1.0.0}
43-
- DD_LOGS_INJECTION=true
4445
- DD_RUNTIME_METRICS_ENABLED=true
4546
- DD_PROFILING_ENABLED=true
4647
- DD_PROFILING_TIMELINE_ENABLED=true
@@ -59,13 +60,13 @@ services:
5960
- NEXT_PUBLIC_SPREE_IMAGE_HOST=${NEXT_PUBLIC_SPREE_IMAGE_HOST:-/services/backend}
6061
- NEXT_PUBLIC_SPREE_ALLOWED_IMAGE_DOMAIN=${NEXT_PUBLIC_SPREE_ALLOWED_IMAGE_DOMAIN:-service-proxy}
6162
labels:
62-
com.datadoghq.ad.logs: '[{"source": "nodejs", "auto_multi_line_detection":true }]'
63+
com.datadoghq.ad.logs: '[{"source": "nodejs", "auto_multi_line_detection": true }]'
6364

6465
# Backend service (Ruby on Rails/Spree)
6566
backend:
6667
build:
6768
context: ./services/backend
68-
command: wait-for-it postgres:5432 -- bundle exec rails s -b 0.0.0.0 -p 4000 --pid /app/tmp/pids/server.pid
69+
command: wait-for-it postgres:5432 -- bundle exec ddprofrb exec rails s -b 0.0.0.0 -p 4000 --pid /app/tmp/pids/server.pid
6970
depends_on:
7071
- 'postgres'
7172
- 'redis'
@@ -85,21 +86,21 @@ services:
8586
- DB_POOL=${DB_POOL:-25}
8687
- MAX_THREADS=${MAX_THREADS:-5}
8788
- DD_AGENT_HOST=dd-agent
89+
- DD_ENV=${DD_ENV:-development}
8890
- DD_SERVICE=store-backend
8991
- DD_VERSION=${DD_VERSION_BACKEND:-1.0.0}
90-
- DD_LOGS_INJECTION=true
9192
- DD_RUNTIME_METRICS_ENABLED=true
9293
- DD_PROFILING_ENABLED=true
9394
- DD_PROFILING_ALLOCATION_ENABLED=true
9495
- DD_PROFILING_TIMELINE_ENABLED=true
9596
labels:
96-
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection":true }]'
97+
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection": true }]'
9798

9899
# Background job processor (Sidekiq)
99100
worker:
100101
build:
101102
context: ./services/backend
102-
command: wait-for-it postgres:5432 -- bundle exec sidekiq -C config/sidekiq.yml
103+
command: wait-for-it postgres:5432 -- bundle exec ddprofrb exec sidekiq -C config/sidekiq.yml
103104
depends_on:
104105
- 'postgres'
105106
- 'redis'
@@ -121,15 +122,15 @@ services:
121122
- DB_POOL=${DB_POOL:-25}
122123
- MAX_THREADS=${MAX_THREADS:-5}
123124
- DD_AGENT_HOST=dd-agent
125+
- DD_ENV=${DD_ENV:-development}
124126
- DD_SERVICE=store-worker
125127
- DD_VERSION=${DD_VERSION_BACKEND:-1.0.0}
126-
- DD_LOGS_INJECTION=true
127128
- DD_RUNTIME_METRICS_ENABLED=true
128129
- DD_PROFILING_ENABLED=true
129130
- DD_PROFILING_TIMELINE_ENABLED=true
130131
- DD_PROFILING_ALLOCATION_ENABLED=true
131132
labels:
132-
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection":true }]'
133+
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection": true }]'
133134

134135
# Discounts service (Python/Flask)
135136
discounts:
@@ -140,15 +141,13 @@ services:
140141
- postgres
141142
- dd-agent
142143
environment:
143-
- FLASK_APP=discounts.py
144-
- FLASK_DEBUG=0
145144
- POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-postgres}
146145
- POSTGRES_USER=${POSTGRES_USER:-postgres}
147146
- POSTGRES_HOST=postgres
148147
- DD_AGENT_HOST=dd-agent
148+
- DD_ENV=${DD_ENV:-development}
149149
- DD_SERVICE=store-discounts
150150
- DD_VERSION=${DD_VERSION_DISCOUNTS:-1.0.0}
151-
- DD_LOGS_INJECTION=true
152151
- DD_RUNTIME_METRICS_ENABLED=true
153152
- DD_PROFILING_ENABLED=true
154153
- DD_PROFILING_TIMELINE_ENABLED=true
@@ -172,9 +171,9 @@ services:
172171
- POSTGRES_USER=${POSTGRES_USER:-postgres}
173172
- POSTGRES_HOST=postgres
174173
- DD_AGENT_HOST=dd-agent
174+
- DD_ENV=${DD_ENV:-development}
175175
- DD_SERVICE=store-ads
176176
- DD_VERSION=${DD_VERSION_ADS:-1.0.0}
177-
- DD_LOGS_INJECTION=true
178177
- DD_PROFILING_ENABLED=true
179178
- DD_PROFILING_TIMELINE_ENABLED=true
180179
- DD_PROFILING_ALLOCATION_ENABLED=true
@@ -197,6 +196,7 @@ services:
197196
- dd-agent
198197
environment:
199198
- DD_AGENT_HOST=dd-agent
199+
- DD_ENV=${DD_ENV:-development}
200200
- DD_SERVICE=service-proxy
201201
- DD_VERSION=${DD_VERSION_NGINX:-1.28.0}
202202
labels:
@@ -225,7 +225,7 @@ services:
225225
com.datadoghq.ad.check_names: '["postgres"]'
226226
com.datadoghq.ad.init_configs: '[{}]'
227227
com.datadoghq.ad.instances: '[{"host":"%%host%%", "port":5432, "username":"datadog", "password":"datadog"}]'
228-
com.datadoghq.ad.logs: '[{"source": "postgresql", "auto_multi_line_detection":true, "path": "/var/log/pg_log/postgresql*.json", "type": "file"}]'
228+
com.datadoghq.ad.logs: '[{"source": "postgresql", "service":"store-db", "auto_multi_line_detection": true, "path": "/var/log/pg_log/postgresql*.json", "type": "file"}]'
229229

230230
# Cache and job queue
231231
redis:
@@ -238,7 +238,6 @@ services:
238238
- storedog-network
239239
labels:
240240
com.datadoghq.tags.service: 'redis'
241-
com.datadoghq.tags.env: '${DD_ENV:-development}'
242241
com.datadoghq.tags.version: '${DD_VERSION_REDIS:-6.2}'
243242
com.datadoghq.ad.check_names: '["redisdb"]'
244243
com.datadoghq.ad.init_configs: '[{}]'
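
Note that the dev Compose file above mixes two interpolation forms: `${DD_HOSTNAME-development-host}` (no colon) and `${DD_ENV:-development}`. In Compose, as in POSIX shells, `:-` falls back to the default when the variable is unset or empty, while `-` falls back only when it is unset. A minimal Python sketch of that rule follows; the function name `interpolate_default` is hypothetical and this is not Compose's own implementation.

```python
def interpolate_default(env, name, default, colon=True):
    """Emulate ${NAME:-default} (colon=True) and ${NAME-default} (colon=False).

    Hypothetical helper for illustration only.
    """
    if name not in env:
        # Both forms use the default when the variable is unset.
        return default
    value = env[name]
    if colon and value == "":
        # Only the ':-' form also treats an empty value as missing.
        return default
    return value
```

In practice this means an exported but empty `DD_HOSTNAME` would yield an empty hostname with the no-colon form, whereas an empty `DD_ENV` would still resolve to `development`.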

docker-compose.yml

Lines changed: 13 additions & 12 deletions
@@ -8,6 +8,7 @@ services:
88
- DD_LOGS_ENABLED=true
99
- DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
1010
- DD_CONTAINER_EXCLUDE=image:agent name:puppeteer
11+
- DD_CLOUD_PROVIDER_METADATA=[]
1112
- DD_HOSTNAME=${DD_HOSTNAME:-production-host}
1213
- DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true
1314
networks:
@@ -29,9 +30,9 @@ services:
2930
- storedog-network
3031
environment:
3132
- DD_AGENT_HOST=dd-agent
33+
- DD_ENV=${DD_ENV:-production}
3234
- DD_SERVICE=store-frontend-api # used for Next.js API routes
3335
- DD_VERSION=${NEXT_PUBLIC_DD_VERSION_FRONTEND:-1.0.0}
34-
- DD_LOGS_INJECTION=true
3536
- DD_RUNTIME_METRICS_ENABLED=true
3637
- DD_PROFILING_ENABLED=true
3738
- DD_PROFILING_TIMELINE_ENABLED=true
@@ -43,10 +44,10 @@ services:
4344
- NEXT_PUBLIC_DD_ENV=${DD_ENV:-production}
4445
- NEXT_PUBLIC_DD_VERSION=${NEXT_PUBLIC_DD_VERSION_FRONTEND:-1.0.0}
4546
labels:
46-
com.datadoghq.ad.logs: '[{"source": "nodejs", "auto_multi_line_detection":true }]'
47+
com.datadoghq.ad.logs: '[{"source": "nodejs", "auto_multi_line_detection": true }]'
4748
backend:
4849
image: ghcr.io/datadog/storedog/backend:${STOREDOG_IMAGE_VERSION:-latest}
49-
command: wait-for-it postgres:5432 -- bundle exec rails s -b 0.0.0.0 -p 4000 --pid /app/tmp/pids/server.pid
50+
command: wait-for-it postgres:5432 -- bundle exec ddprofrb exec rails s -b 0.0.0.0 -p 4000 --pid /app/tmp/pids/server.pid
5051
depends_on:
5152
- 'postgres'
5253
- 'redis'
@@ -57,18 +58,18 @@ services:
5758
- './services/backend:/app'
5859
environment:
5960
- DD_AGENT_HOST=dd-agent
61+
- DD_ENV=${DD_ENV:-production}
6062
- DD_SERVICE=store-backend
6163
- DD_VERSION=${DD_VERSION_BACKEND:-1.0.0}
62-
- DD_LOGS_INJECTION=true
6364
- DD_RUNTIME_METRICS_ENABLED=true
6465
- DD_PROFILING_ENABLED=true
6566
- DD_PROFILING_TIMELINE_ENABLED=true
6667
- DD_PROFILING_ALLOCATION_ENABLED=true
6768
labels:
68-
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection":true }]'
69+
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection": true }]'
6970
worker:
7071
image: ghcr.io/datadog/storedog/backend:${STOREDOG_IMAGE_VERSION:-latest}
71-
command: wait-for-it postgres:5432 -- bundle exec sidekiq -C config/sidekiq.yml
72+
command: wait-for-it postgres:5432 -- bundle exec ddprofrb exec sidekiq -C config/sidekiq.yml
7273
depends_on:
7374
- 'postgres'
7475
- 'redis'
@@ -79,15 +80,15 @@ services:
7980
environment:
8081
- WORKER=true
8182
- DD_AGENT_HOST=dd-agent
83+
- DD_ENV=${DD_ENV:-production}
8284
- DD_SERVICE=store-worker
8385
- DD_VERSION=${DD_VERSION_BACKEND:-1.0.0}
84-
- DD_LOGS_INJECTION=true
8586
- DD_RUNTIME_METRICS_ENABLED=true
8687
- DD_PROFILING_ENABLED=true
8788
- DD_PROFILING_TIMELINE_ENABLED=true
8889
- DD_PROFILING_ALLOCATION_ENABLED=true
8990
labels:
90-
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection":true }]'
91+
com.datadoghq.ad.logs: '[{"source": "ruby", "auto_multi_line_detection": true }]'
9192
discounts:
9293
image: ghcr.io/datadog/storedog/discounts:${STOREDOG_IMAGE_VERSION:-latest}
9394
command: wait-for-it postgres:5432 -- ddtrace-run flask run --port=2814 --host=0.0.0.0
@@ -96,9 +97,9 @@ services:
9697
- dd-agent
9798
environment:
9899
- DD_AGENT_HOST=dd-agent
100+
- DD_ENV=${DD_ENV:-production}
99101
- DD_SERVICE=store-discounts
100102
- DD_VERSION=${DD_VERSION_DISCOUNTS:-1.0.0}
101-
- DD_LOGS_INJECTION=true
102103
- DD_RUNTIME_METRICS_ENABLED=true
103104
- DD_PROFILING_ENABLED=true
104105
- DD_PROFILING_TIMELINE_ENABLED=true
@@ -117,9 +118,9 @@ services:
117118
- POSTGRES_USER=${POSTGRES_USER:-postgres}
118119
- POSTGRES_HOST=postgres
119120
- DD_AGENT_HOST=dd-agent
121+
- DD_ENV=${DD_ENV:-production}
120122
- DD_SERVICE=store-ads
121123
- DD_VERSION=${DD_VERSION_ADS:-1.0.0}
122-
- DD_LOGS_INJECTION=true
123124
- DD_PROFILING_ENABLED=true
124125
- DD_PROFILING_TIMELINE_ENABLED=true
125126
- DD_PROFILING_ALLOCATION_ENABLED=true
@@ -141,6 +142,7 @@ services:
141142
- dd-agent
142143
environment:
143144
- DD_AGENT_HOST=dd-agent
145+
- DD_ENV=${DD_ENV:-production}
144146
- DD_SERVICE=service-proxy
145147
- DD_VERSION=${DD_VERSION_NGINX:-1.28.0}
146148
labels:
@@ -166,7 +168,7 @@ services:
166168
com.datadoghq.ad.check_names: '["postgres"]'
167169
com.datadoghq.ad.init_configs: '[{}]'
168170
com.datadoghq.ad.instances: '[{"host":"%%host%%", "port":5432, "username":"datadog", "password":"datadog"}]'
169-
com.datadoghq.ad.logs: '[{"source": "postgresql", "auto_multi_line_detection":true, "path": "/var/log/pg_log/postgresql*.json", "type": "file"}]'
171+
com.datadoghq.ad.logs: '[{"source": "postgresql", "service":"store-db", "auto_multi_line_detection": true, "path": "/var/log/pg_log/postgresql*.json", "type": "file"}]'
170172
redis:
171173
image: redis:6.2-alpine
172174
depends_on:
@@ -177,7 +179,6 @@ services:
177179
- storedog-network
178180
labels:
179181
com.datadoghq.tags.service: 'redis'
180-
com.datadoghq.tags.env: '${DD_ENV:-production}'
181182
com.datadoghq.tags.version: '${DD_VERSION_REDIS:-6.2}'
182183
com.datadoghq.ad.check_names: '["redisdb"]'
183184
com.datadoghq.ad.init_configs: '[{}]'
