Merged
Changes from 81 commits
Commits
82 commits
24ba37e
NAIS deploy
mallport Nov 29, 2024
e535d40
Merge branch 'master' into nais-deploy
mallport Nov 29, 2024
92a13a3
Merge branch 'master' into nais-deploy
mallport Nov 29, 2024
6213fa5
Merge branch 'master' into nais-deploy
mallport Nov 29, 2024
c3eb29b
Temporarily deploy on PR commit
mallport Dec 4, 2024
86887cf
Add NAIS Keycloak as trusted issuer
mallport Dec 4, 2024
5b40d46
Fix PR deploy branch
mallport Dec 4, 2024
3d5d3bc
Merge branch 'master' into nais-deploy
mallport Dec 4, 2024
933325b
Forgor to save
mallport Dec 4, 2024
7aeae39
Fix application config. Add variable for templating
mallport Dec 4, 2024
ccc50e5
Add team as templated variable
mallport Dec 4, 2024
0092d99
yaml -> yml
mallport Dec 4, 2024
e806f4d
Add Keycloak for egress
mallport Dec 4, 2024
60eae1b
Add Keycloak BIP for egres
mallport Dec 4, 2024
74c3ab7
add prod release
mallport Dec 10, 2024
36fcc0c
Use pseudo users
mallport Dec 10, 2024
20fc430
Use pseudo admins
mallport Dec 10, 2024
aee750a
Lower resources in test
mallport Dec 10, 2024
fe8395e
Add internal ingress for prod. Add external egress for test
mallport Dec 10, 2024
0cd43b3
Remove test subdomain from ingress URL
mallport Dec 10, 2024
f641eeb
add alerts for pseudo-service (#119)
ssb-jnk Jan 10, 2025
94a5601
alert-deploy.yml (#120)
ssb-jnk Jan 10, 2025
772643d
change high memory usage to fetch memory dynamically
ssb-jnk Jan 10, 2025
3ae7013
edit high memory alert
ssb-jnk Jan 10, 2025
3ee443c
revert to putting max memory manually
ssb-jnk Jan 10, 2025
49cd6d7
change expression for HighMemoryUsage
ssb-jnk Jan 13, 2025
6104e91
use container_memory_working_set_bytes instead
ssb-jnk Jan 13, 2025
cef7587
Add port and protocol
mallport Jan 15, 2025
2e9d7fb
revert environment detection
mallport Jan 15, 2025
bef19c6
Remove 'service' block
mallport Jan 15, 2025
2985ae7
add outbound egress
mallport Jan 15, 2025
edea32c
fix URL
mallport Jan 17, 2025
b1359cd
test with VirtualService host URL
mallport Jan 17, 2025
5ef77d7
Add quotes for URL
mallport Jan 17, 2025
e749f51
Use public alertconfig
mallport Feb 10, 2025
03686fd
Run alertconfig deploy from branch
mallport Feb 10, 2025
78c6664
Add custom alert deploy
mallport Feb 10, 2025
b0c9d69
Add cluster info
mallport Feb 10, 2025
bdb5444
Remove quotes
mallport Feb 10, 2025
55f2afc
fix templating
mallport Feb 10, 2025
cae88a9
add qutoes for cluster label
mallport Feb 10, 2025
0a5c5cf
Add cluster to label
mallport Feb 10, 2025
690995c
remove cluster var
mallport Feb 10, 2025
db2014d
Add var cluster
mallport Feb 10, 2025
609f435
Add team info for alerts
mallport Feb 14, 2025
c398549
fix capitalization
mallport Feb 14, 2025
5d17176
trigger alert deploy
mallport Feb 26, 2025
1f5f5c3
New deploy of app
mallport Feb 26, 2025
335b983
Trigger deploy
mallport Feb 26, 2025
064c2df
redeploy
mallport Feb 26, 2025
f26a738
add new workflow
mallport Feb 26, 2025
9849fe4
Add egress rule to sid lookup service
mallport Feb 27, 2025
2492e01
Also add outbound rule for prod
mallport Feb 27, 2025
5da78f5
Update SID URL
mallport Feb 27, 2025
44580d4
Fix double http
mallport Feb 27, 2025
8265aff
Attempt without ingress
mallport Feb 27, 2025
3d38307
add 8080
mallport Feb 27, 2025
80e4ded
use service discovery
mallport Feb 27, 2025
c57b0a9
Set correct service discovery for prod. Increase resources
mallport Feb 28, 2025
6ee6140
Use protected configmap for app roles
mallport Mar 28, 2025
66e3453
Fix SID service URL
mallport Mar 28, 2025
c5722a0
Deploy to prod
mallport Mar 28, 2025
0bc0602
Fix indent
mallport Mar 28, 2025
683fd94
Remove elevated pseudo users
mallport Mar 28, 2025
3d93570
Debug why KMS URI is not being read
mallport Mar 28, 2025
f239860
Change name of configmaps
mallport Mar 28, 2025
0bdba1b
Add quotes around env variable
mallport Mar 28, 2025
2094250
Explicitly convert to URI
mallport Mar 28, 2025
54b8abb
fix compilation error
mallport Mar 28, 2025
ee777bd
reset debugging chnages
mallport Mar 28, 2025
8317fed
Merge branch 'master' into nais-deploy
mallport Apr 3, 2025
d6adab5
Add login for Swagger
mallport Apr 3, 2025
0e731f5
trigger deploy
mallport Apr 25, 2025
e517df0
Increase proxy body size
mallport May 7, 2025
c6a8a9f
deploy to both prod and test
mallport May 7, 2025
07443ea
Remove refernces to BIP keycloak
mallport May 7, 2025
818ab69
add timeouts
mallport May 7, 2025
33b44e6
Add health endpoint
mallport May 8, 2025
fd3119e
increase replica
mallport May 8, 2025
e05c913
prepare for release
mallport May 15, 2025
04f99cf
remove alert depploy
mallport May 15, 2025
7ad1218
re-add release
mallport May 15, 2025
65 changes: 65 additions & 0 deletions .github/workflows/alert-deploy.yml
@@ -0,0 +1,65 @@
name: Deploy alerts
run-name: Deploy alerts for Pseudo Service to test and prod

on:
  push:
    branches:
      - master
    paths:
      - ".nais/alerts.yaml"
      - ".github/workflows/alert-deploy.yml"
  workflow_dispatch:
permissions:
  id-token: write
env:
  TEAM: dapla-stat

jobs:
  test-deploy:
    name: Deploy alerts to test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - uses: actions/checkout@v4
        name: Retrieve AlertManager configuration
        with:
          repository: "statisticsnorway/nais-alert-config"
          path: "ext_alertconfig"
          sparse-checkout: |
            alertconfig.yaml
          sparse-checkout-cone-mode: false

      - name: Deploy to test
        uses: nais/deploy/actions/deploy@v2
        env:
          CLUSTER: test
          RESOURCE: .nais/alerts.yaml,ext_alertconfig/alertconfig.yaml
          VAR: cluster=test,team=${{ env.TEAM }}
          DEPLOY_SERVER: deploy.ssb.cloud.nais.io:443

  prod-deploy:
    name: Deploy alerts to prod
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - uses: actions/checkout@v4
        name: Retrieve AlertManager configuration
        with:
          repository: "statisticsnorway/nais-alert-config"
          path: "ext_alertconfig"
          sparse-checkout: |
            alertconfig.yaml
          sparse-checkout-cone-mode: false

      - name: Deploy to prod
        uses: nais/deploy/actions/deploy@v2
        env:
          CLUSTER: prod
          RESOURCE: .nais/alerts.yaml,ext_alertconfig/alertconfig.yaml
          VAR: cluster=prod,team=${{ env.TEAM }}
          DEPLOY_SERVER: deploy.ssb.cloud.nais.io:443
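
Note: the test-deploy and prod-deploy jobs in alert-deploy.yml differ only in the cluster name. A possible way to remove the duplication, shown here only as a sketch and not as part of this PR, is a matrix strategy. The job id `deploy` and the matrix key `cluster` are names chosen for illustration; everything else is copied from the workflow in this diff.

```yaml
# Hypothetical alternative to the two near-identical jobs (not in the PR).
jobs:
  deploy:
    name: Deploy alerts to ${{ matrix.cluster }}
    runs-on: ubuntu-latest
    strategy:
      matrix:
        cluster: [test, prod]   # one job instance per cluster
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Retrieve AlertManager configuration
        uses: actions/checkout@v4
        with:
          repository: "statisticsnorway/nais-alert-config"
          path: "ext_alertconfig"
          sparse-checkout: |
            alertconfig.yaml
          sparse-checkout-cone-mode: false

      - name: Deploy to ${{ matrix.cluster }}
        uses: nais/deploy/actions/deploy@v2
        env:
          CLUSTER: ${{ matrix.cluster }}
          RESOURCE: .nais/alerts.yaml,ext_alertconfig/alertconfig.yaml
          VAR: cluster=${{ matrix.cluster }},team=${{ env.TEAM }}
          DEPLOY_SERVER: deploy.ssb.cloud.nais.io:443
```

As in the original, the two deployments have no ordering between them. If prod should wait for test, keeping two jobs linked with `needs:` may be the simpler choice.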

4 changes: 0 additions & 4 deletions .github/workflows/build-deploy-app.yml
@@ -1,8 +1,4 @@
on:
  release:
    types: [ published ]
  pull_request: ## ONLY FOR TESTING, SHOULD BE REMOVED AFTER DEPLOY PR IS MERGED
    branches: [nais-deploy]
  push:
    branches:
      - master
26 changes: 24 additions & 2 deletions .github/workflows/deploy-to-nais.yml
@@ -74,11 +74,33 @@ jobs:
       id-token: "write"
     steps:
       - uses: actions/checkout@v4

+      - uses: actions/create-github-app-token@v1
+        id: app-token
+        with:
+          app-id: ${{ secrets.DAPLA_BOT_APP_ID }}
+          private-key: ${{ secrets.DAPLA_BOT_PRIVATE_KEY }}
+          owner: statisticsnorway
+          repositories: dapla-pseudo-iac
+
+      - uses: actions/checkout@v4
+        name: Retrieve protected configuration
+        with:
+          repository: "statisticsnorway/dapla-pseudo-iac"
+          path: "ext-config"
+          token: ${{ steps.app-token.outputs.token }}
+          sparse-checkout: |
+            apps/nais/pseudo-service/${{ inputs.cluster }}
+
+      - name: Configure environment variables
+        run: |
+          ext_config_dir="ext-config/apps/nais/pseudo-service/${{ inputs.cluster }}"
+          echo "ext_config=${ext_config_dir}" >> $GITHUB_ENV
+
       - uses: nais/deploy/actions/deploy@v2
         env:
           CLUSTER: ${{ inputs.cluster }}
-          RESOURCE: ${{ inputs.nais-config-path }}
-          VAR: image=${{ inputs.registry }}/${{ secrets.NAIS_MANAGEMENT_PROJECT_ID }}/${{ inputs.repository }}/${{ inputs.image-name }}:${{ inputs.image-tag }}
+          RESOURCE: ${{ inputs.nais-config-path }},${{ env.ext_config }}/app-roles.yml
+          VAR: image=${{ inputs.registry }}/${{ secrets.NAIS_MANAGEMENT_PROJECT_ID }}/${{ inputs.repository }}/${{ inputs.image-name }}:${{ inputs.image-tag }},team=dapla-stat
           DEPLOY_SERVER: deploy.ssb.cloud.nais.io:443
           REF: ${{ inputs.ref }}
86 changes: 86 additions & 0 deletions .nais/alerts.yaml
@@ -0,0 +1,86 @@
apiVersion: "monitoring.coreos.com/v1"
kind: PrometheusRule
metadata:
  name: alert-pseudo-service
  namespace: dapla-stat
  labels:
    team: dapla-stat
    cluster: "{{cluster}}"
spec:
  groups:
    - name: dapla-stat
      rules:
        # This alert fires when no replicas of pseudo-service are available, indicating the service is unavailable.
        - alert: PseudoServiceUnavailable
          expr: kube_deployment_status_replicas_available{deployment="pseudo-service"} == 0
          for: 1m
          annotations:
            title: "Pseudo-service is unavailable"
            consequence: "The service is unavailable to users. Immediate investigation required."
            action: "Check the deployment status and logs for issues."
          labels:
            service: pseudo-service
            namespace: dapla-stat
            severity: critical
            alertmanager_custom_config: dapla-stat
            alert_type: custom

        # This alert detects high CPU usage: the per-second rate of CPU time consumed, averaged over 5 minutes, exceeds 0.8 (roughly 80% of one core).
        - alert: HighCPUUsage
          expr: rate(process_cpu_seconds_total{app="pseudo-service"}[5m]) > 0.8
          for: 5m
          annotations:
            title: "High CPU usage detected"
            consequence: "The service might experience performance degradation."
            action: "Investigate the cause of high CPU usage and optimize if necessary."
          labels:
            service: pseudo-service
            namespace: dapla-stat
            severity: warning
            alertmanager_custom_config: dapla-stat
            alert_type: custom

        # This alert checks if a pod's working-set memory exceeds 90% of its memory limit (12GB), which could cause instability.
        # The limit is read from kube_pod_container_resource_limits_memory_bytes rather than hard-coded.
        - alert: HighMemoryUsage
          expr: sum by (namespace, pod) (container_memory_working_set_bytes{namespace="dapla-stat", pod=~"pseudo-service-.*"}) > 0.9 * sum by (namespace, pod) (kube_pod_container_resource_limits_memory_bytes{namespace="dapla-stat", pod=~"pseudo-service-.*"})
          for: 5m
          annotations:
            title: "High memory usage detected"
            consequence: "The service might experience instability due to high memory usage."
            action: "Check memory utilization and consider increasing resources or optimizing the service."
          labels:
            service: pseudo-service
            namespace: dapla-stat
            severity: warning
            alertmanager_custom_config: dapla-stat
            alert_type: custom

        # This alert fires when more than 10% of the messages logged by pseudo-service over 3 minutes are errors.
        - alert: HighNumberOfErrors
          expr: (100 * sum by (app, namespace) (rate(log_messages_errors{app="pseudo-service", level=~"Error"}[3m])) / sum by (app, namespace) (rate(log_messages_total{app="pseudo-service"}[3m]))) > 10
          for: 3m
          annotations:
            title: "High number of errors logged in pseudo-service"
            consequence: "The application is logging a significant number of errors."
            action: "Check the service logs for errors and address the root cause."
          labels:
            service: pseudo-service
            namespace: dapla-stat
            severity: critical
            alertmanager_custom_config: dapla-stat
            alert_type: custom

        # This alert monitors the number of pod restarts for pseudo-service and triggers if more than 3 restarts occur within 15 minutes.
        - alert: HighPodRestarts
          expr: increase(kube_pod_container_status_restarts_total{namespace="dapla-stat", app="pseudo-service"}[15m]) > 3
          for: 15m
          annotations:
            title: "High number of pod restarts"
            consequence: "The service may be unstable or misconfigured."
            action: "Investigate the cause of pod restarts and fix configuration or resource issues."
          labels:
            service: pseudo-service
            namespace: dapla-stat
            severity: warning
            alertmanager_custom_config: dapla-stat
            alert_type: custom
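
Note: rules like the ones in .nais/alerts.yaml can be exercised offline with `promtool test rules`. The following is only a sketch of such a unit test for PseudoServiceUnavailable and is not part of this PR. It assumes the `groups:` section under `spec` has been copied into a plain Prometheus rule file, here given the hypothetical name `pseudo-service-rules.yaml`, because promtool reads rule files and not the PrometheusRule resource. The input series and its values are made up for the test.

```yaml
# alerts_test.yaml (hypothetical file name); run with:
#   promtool test rules alerts_test.yaml
rule_files:
  - pseudo-service-rules.yaml   # assumption: spec.groups extracted from .nais/alerts.yaml

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Synthetic data: zero available replicas for three consecutive samples.
      - series: 'kube_deployment_status_replicas_available{deployment="pseudo-service"}'
        values: '0 0 0'
    alert_rule_test:
      # The rule has "for: 1m", so the alert is expected to be firing by 2m.
      - eval_time: 2m
        alertname: PseudoServiceUnavailable
        exp_alerts:
          - exp_labels:
              deployment: pseudo-service
              service: pseudo-service
              namespace: dapla-stat
              severity: critical
              alertmanager_custom_config: dapla-stat
              alert_type: custom
            exp_annotations:
              title: "Pseudo-service is unavailable"
              consequence: "The service is unavailable to users. Immediate investigation required."
              action: "Check the deployment status and logs for issues."
```

The expected labels combine the label on the input series (`deployment`) with the labels set by the rule. The other four alerts could be covered the same way with their own synthetic series.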