- 
                Notifications
    
You must be signed in to change notification settings  - Fork 32
 
♻️🎨Dockerfile: fix start period/start interval healthchecks #7557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
♻️🎨Dockerfile: fix start period/start interval healthchecks #7557
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR fixes the misconfiguration of Docker healthchecks by updating the start period and adding a start interval to allow for faster initial checks while maintaining overall stability. Key changes include:
- Updating --start-period from 1s to 20s and adding --start-interval=1s in multiple Docker healthcheck configurations.
 - Changing the endpoint for the healthcheck command from "http://localhost:8000/" to "http://localhost:8080/v0/" in most services.
 - Notably, while nearly all services update the endpoint, one instance remains unchanged.
 
Reviewed Changes
Copilot reviewed 22 out of 33 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description | 
|---|---|
| services/director/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/director-v2/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/datcore-adapter/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/clusters-keeper/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/catalog/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/autoscaling/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/api-server/docker/healthcheck.py | Updated healthcheck parameters and endpoint | 
| services/agent/docker/healthcheck.py | Updated healthcheck parameters; endpoint remains unchanged compared to others | 
| scripts/docker/healthcheck_curl_host.py | Updated healthcheck parameters and endpoint | 
Files not reviewed (11)
- services/agent/Dockerfile: Language not supported
 - services/api-server/Dockerfile: Language not supported
 - services/autoscaling/Dockerfile: Language not supported
 - services/catalog/Dockerfile: Language not supported
 - services/clusters-keeper/Dockerfile: Language not supported
 - services/dask-sidecar/Dockerfile: Language not supported
 - services/datcore-adapter/Dockerfile: Language not supported
 - services/director-v2/Dockerfile: Language not supported
 - services/director/Dockerfile: Language not supported
 - services/docker-api-proxy/Dockerfile: Language not supported
 - services/dynamic-scheduler/Dockerfile: Language not supported
 
Comments suppressed due to low confidence (1)
services/agent/docker/healthcheck.py:12
- The CMD endpoint in this file is not updated to match the uniform endpoint 'http://localhost:8080/v0/' used in the other services. Please update it for consistency.
 
CMD python3 docker/healthcheck.py http://localhost:8000/
          Codecov ReportAll modified and coverable lines are covered by tests ✅ 
 Additional details and impacted files@@            Coverage Diff             @@
##           master    #7557      +/-   ##
==========================================
+ Coverage   87.60%   88.74%   +1.14%     
==========================================
  Files        1752     1207     -545     
  Lines       67864    51449   -16415     
  Branches     1121      170     -951     
==========================================
- Hits        59453    45660   -13793     
+ Misses       8103     5731    -2372     
+ Partials      308       58     -250     
 
 Continue to review full report in Codecov by Sentry. 
 🚀 New features to boost your workflow:
  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
9c71e7c    to
    d637e76      
    Compare
  
    | 
           Please retry analysis of this Pull-Request directly on SonarQube Cloud  | 
    
What do these changes do?
It was discovered that we are misusing the Docker
HEALTHCHECKin our services Dockerfiles.setting
--start-period=1sis useless since the default--start-intervalis set to5s.Therefore this PR uniformizes the Dockerfile in that regard and sets the following values:
all services:
--interval=10s# the interval with which at runtime the container is checked for healthyness--timeout=5s# the timeout to define a container as unhealthy--retries=5# number of retries before a consider is considered as definitely unhealthy--start-period=20s#NOTE: as soon as the healthcheck returns a0exit code, then it is considered as running by Docker engine--start-interval=1s# this is the interval of checking, that means to have a fast startup this should be small (default is 5s)special cases:
dynamic-sidecar
--start-period=180s@GitHK not sure why...Related issue/s
How to test
Dev-ops checklist