Commit a726f88

3-tier implementation of manifest caching (#57)
* implement manifest caching; refactor config with includes, and generate from ENVs in entrypoint.sh
- disabled by default; enable with -e ENABLE_MANIFEST_CACHE=true
- default times and regexes are a wild guess, make sure to tune for your use case.
- add manifest caching/anti-ratelimit usage note to README
- add -e ENABLE_MANIFEST_CACHE=true to examples, some wording changes
- add -e ENABLE_MANIFEST_CACHE=true to one of the steps in test workflow.
1 parent 227a397 commit a726f88

7 files changed: +138 -47 lines changed

.github/workflows/test.yaml

Lines changed: 4 additions & 4 deletions
@@ -53,10 +53,10 @@ jobs:
  cache-from: type=local,src=/tmp/.buildx-cache/release
  # this only reads from the cache

- - name: Start proxy instance in docker
+ - name: Start proxy instance in docker (ENABLE_MANIFEST_CACHE=false)
  run: |
  docker run -d --rm --name docker_registry_proxy \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=false \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  sanity-check/docker-registry-proxy:latest
@@ -115,10 +115,10 @@ jobs:
  run: |
  sudo systemctl restart docker.service

- - name: Start proxy instance in docker again
+ - name: Start proxy instance in docker again (ENABLE_MANIFEST_CACHE=true)
  run: |
  docker run -d --rm --name docker_registry_proxy \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=true \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  sanity-check/docker-registry-proxy:latest

Dockerfile

Lines changed: 24 additions & 0 deletions
@@ -42,6 +42,8 @@ VOLUME /ca

  # Add our configuration
  ADD nginx.conf /etc/nginx/nginx.conf
+ ADD nginx.manifest.common.conf /etc/nginx/nginx.manifest.common.conf
+ ADD nginx.manifest.stale.conf /etc/nginx/nginx.manifest.stale.conf

  # Add our very hackish entrypoint and ca-building scripts, make them executable
  ADD entrypoint.sh /entrypoint.sh
@@ -70,5 +72,27 @@ ENV DEBUG_HUB="false"
  # Enable nginx debugging mode; this uses nginx-debug binary and enabled debug logging, which is VERY verbose so separate setting
  ENV DEBUG_NGINX="false"

+ # Manifest caching tiers. Disabled by default, to mimick 0.4/0.5 behaviour.
+ # Setting it to true enables the processing of the ENVs below.
+ # Once enabled, it is valid for all registries, not only DockerHub.
+ # The envs *_REGEX represent a regex fragment, check entrypoint.sh to understand how they're used (nginx ~ location, PCRE syntax).
+ ENV ENABLE_MANIFEST_CACHE="false"
+
+ # 'Primary' tier defaults to 10m cache for frequently used/abused tags.
+ # - People publishing to production via :latest (argh) will want to include that in the regex
+ # - Heavy pullers who are being ratelimited but don't mind getting outdated manifests should (also) increase the cache time here
+ ENV MANIFEST_CACHE_PRIMARY_REGEX="(stable|nightly|production|test)"
+ ENV MANIFEST_CACHE_PRIMARY_TIME="10m"
+
+ # 'Secondary' tier defaults any tag that has 3 digits or dots, in the hopes of matching most explicitly-versioned tags.
+ # It caches for 60d, which is also the cache time for the large binary blobs to which the manifests refer.
+ # That makes them effectively immutable. Make sure you're not affected; tighten this regex or widen the primary tier.
+ ENV MANIFEST_CACHE_SECONDARY_REGEX="(.*)(\d|\.)+(.*)(\d|\.)+(.*)(\d|\.)+"
+ ENV MANIFEST_CACHE_SECONDARY_TIME="60d"
+
+ # The default cache duration for manifests that don't match either the primary or secondary tiers above.
+ # In the default config, :latest and other frequently-used tags will get this value.
+ ENV MANIFEST_CACHE_DEFAULT_TIME="1h"
+
  # Did you want a shell? Sorry, the entrypoint never returns, because it runs nginx itself. Use 'docker exec' if you need to mess around internally.
  ENTRYPOINT ["/entrypoint.sh"]
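
The tier comments above describe which tags land in which cache tier. The following is a rough sketch, in POSIX shell, of that classification using the default regex fragments. It is not part of the commit: `uri_matches` and `classify_manifest_uri` are hypothetical helpers, `\d` is rewritten as `[0-9]` because `grep -E` does not understand the PCRE shorthand, and it assumes the by-digest location is listed before the generated primary, secondary and default locations, since nginx picks the first regex location that matches.

```shell
#!/bin/sh
# Sketch only: mimics how the regex locations would pick a tier for a manifest URI.
# Defaults copied from the Dockerfile ENVs; \d rewritten as [0-9] for grep -E.
PRIMARY_REGEX='(stable|nightly|production|test)'
SECONDARY_REGEX='(.*)([0-9]|\.)+(.*)([0-9]|\.)+(.*)([0-9]|\.)+'

# Hypothetical helper: succeeds if URI $1 matches extended regex $2.
uri_matches() {
  printf '%s\n' "$1" | grep -Eq "$2"
}

# Hypothetical helper: prints the request type a URI would be assigned.
classify_manifest_uri() {
  if uri_matches "$1" '^/v2/(.*)/manifests/sha256:(.*)'; then
    echo "manifest-by-digest"
  elif uri_matches "$1" "^/v2/(.*)/manifests/${PRIMARY_REGEX}"; then
    echo "manifest-primary"
  elif uri_matches "$1" "^/v2/(.*)/manifests/${SECONDARY_REGEX}"; then
    echo "manifest-secondary"
  elif uri_matches "$1" '^/v2/(.*)/manifests/'; then
    echo "manifest-default"
  else
    echo "not-a-manifest"
  fi
}

# Example: classify a few tags of a made-up image path.
for tag in stable 1.19.3 latest; do
  echo "$tag -> $(classify_manifest_uri "/v2/library/nginx/manifests/$tag")"
done
```

Because the fragments are not anchored at the end, a tag such as `stable-alpine` also falls into the primary tier in this sketch; anchoring with `$` inside the fragment would be one way to tighten that.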

README.md

Lines changed: 49 additions & 6 deletions
@@ -6,6 +6,48 @@
  ## TL,DR

  A caching proxy for Docker; allows centralised management of (multiple) registries and their authentication; caches images from *any* registry.
+ Caches the potentially huge blob/layer requests (for bandwidth/time savings), and optionally caches manifest requests ("pulls") to avoid rate-limiting.
+
+ ### NEW: avoiding DockerHub Pull Rate Limits with Caching
+
+ Starting November 2nd, 2020, DockerHub will
+ [supposedly](https://www.docker.com/blog/docker-hub-image-retention-policy-delayed-and-subscription-updates/)
+ [start](https://www.docker.com/blog/scaling-docker-to-serve-millions-more-developers-network-egress/)
+ [rate-limiting pulls](https://docs.docker.com/docker-hub/download-rate-limit/),
+ also known as the _Docker Apocalypse_.
+ The main symptom is `Error response from daemon: toomanyrequests: Too Many Requests. Please see https://docs.docker.com/docker-hub/download-rate-limit/` during pulls.
+ Many unknowing Kubernetes clusters will hit the limit, and struggle to configure `imagePullSecrets` and `imagePullPolicy`.
+
+ Since version `0.6.0`, this proxy can be configured with the env var `ENABLE_MANIFEST_CACHE=true` which provides
+ configurable caching of the manifest requests that DockerHub throttles. You can then fine-tune other parameters to your needs.
+ Together with the possibility to centrally inject authentication (since 0.3x), this is probably one of the best ways to bring relief to your distressed cluster, while at the same time saving lots of bandwidth and time.
+
+ Note: enabling manifest caching, in its default config, effectively makes some tags **immutable**. Use with care. The configuration ENVs are explained in the [Dockerfile](./Dockerfile), relevant parts included below.
+
+ ```dockerfile
+ # Manifest caching tiers. Disabled by default, to mimick 0.4/0.5 behaviour.
+ # Setting it to true enables the processing of the ENVs below.
+ # Once enabled, it is valid for all registries, not only DockerHub.
+ # The envs *_REGEX represent a regex fragment, check entrypoint.sh to understand how they're used (nginx ~ location, PCRE syntax).
+ ENV ENABLE_MANIFEST_CACHE="false"
+
+ # 'Primary' tier defaults to 10m cache for frequently used/abused tags.
+ # - People publishing to production via :latest (argh) will want to include that in the regex
+ # - Heavy pullers who are being ratelimited but don't mind getting outdated manifests should (also) increase the cache time here
+ ENV MANIFEST_CACHE_PRIMARY_REGEX="(stable|nightly|production|test)"
+ ENV MANIFEST_CACHE_PRIMARY_TIME="10m"
+
+ # 'Secondary' tier defaults any tag that has 3 digits or dots, in the hopes of matching most explicitly-versioned tags.
+ # It caches for 60d, which is also the cache time for the large binary blobs to which the manifests refer.
+ # That makes them effectively immutable. Make sure you're not affected; tighten this regex or widen the primary tier.
+ ENV MANIFEST_CACHE_SECONDARY_REGEX="(.*)(\d|\.)+(.*)(\d|\.)+(.*)(\d|\.)+"
+ ENV MANIFEST_CACHE_SECONDARY_TIME="60d"
+
+ # The default cache duration for manifests that don't match either the primary or secondary tiers above.
+ # In the default config, :latest and other frequently-used tags will get this value.
+ ENV MANIFEST_CACHE_DEFAULT_TIME="1h"
+ ```
+

  ## What?

@@ -14,7 +56,7 @@ Essentially, it's a [man in the middle](https://en.wikipedia.org/wiki/Man-in-the
  The main feature is Docker layer/image caching, including layers served from S3, Google Storage, etc.

  As a bonus it allows for centralized management of Docker registry credentials, which can in itself be the main feature, eg in Kubernetes environments.
-
+
  You configure the Docker clients (_err... Kubernetes Nodes?_) once, and then all configuration is done on the proxy --
  for this to work it requires inserting a root CA certificate into system trusted root certs.

@@ -37,6 +79,7 @@ for this to work it requires inserting a root CA certificate into system trusted
  - Map volume `/docker_mirror_cache` for up to `CACHE_MAX_SIZE` (32gb by default) of cached images across all cached registries
  - Map volume `/ca`, the proxy will store the CA certificate here across restarts. **Important** this is security sensitive.
  - Env `CACHE_MAX_SIZE` (default `32g`): set the max size to be used for caching local Docker image layers. Use [Nginx sizes](http://nginx.org/en/docs/syntax.html).
+ - Env `ENABLE_MANIFEST_CACHE`, see the section on pull rate limiting.
  - Env `REGISTRIES`: space separated list of registries to cache; no need to include DockerHub, its already done internally.
  - Env `AUTH_REGISTRIES`: space separated list of `hostname:username:password` authentication info.
  - `hostname`s listed here should be listed in the REGISTRIES environment as well, so they can be intercepted.
@@ -46,7 +89,7 @@ for this to work it requires inserting a root CA certificate into system trusted
  ### Simple (no auth, all cache)
  ```bash
  docker run --rm --name docker_registry_proxy -it \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=true \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  rpardini/docker-registry-proxy:0.5.0
@@ -60,7 +103,7 @@ For Docker Hub authentication:

  ```bash
  docker run --rm --name docker_registry_proxy -it \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=true \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  -e REGISTRIES="k8s.gcr.io gcr.io quay.io your.own.registry another.public.registry" \
@@ -88,7 +131,7 @@ For GitLab.com itself the authentication domain should be `gitlab.com`.

  ```bash
  docker run --rm --name docker_registry_proxy -it \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=true \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  -e REGISTRIES="reg.example.com git.example.com" \
@@ -109,7 +152,7 @@ Example with GCR using credentials from a service account from a key file `servi

  ```bash
  docker run --rm --name docker_registry_proxy -it \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=true \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  -e REGISTRIES="k8s.gcr.io gcr.io quay.io your.own.registry another.public.registry" \
@@ -172,7 +215,7 @@ This allows very in-depth debugging. Use sparingly, and definitely not in produc
  ```bash
  docker run --rm --name docker_registry_proxy -it
  -e DEBUG_NGINX=true -e DEBUG=true -e DEBUG_HUB=true -p 0.0.0.0:8081:8081 -p 0.0.0.0:8082:8082 \
- -p 0.0.0.0:3128:3128 \
+ -p 0.0.0.0:3128:3128 -e ENABLE_MANIFEST_CACHE=true \
  -v $(pwd)/docker_mirror_cache:/docker_mirror_cache \
  -v $(pwd)/docker_mirror_certs:/ca \
  rpardini/docker-registry-proxy:0.5.0-debug

entrypoint.sh

Lines changed: 43 additions & 0 deletions
@@ -78,6 +78,49 @@ CACHE_MAX_SIZE=${CACHE_MAX_SIZE:-32g}
  # Set to 32gb which should be enough
  echo "proxy_cache_path /docker_mirror_cache levels=1:2 max_size=$CACHE_MAX_SIZE inactive=60d keys_zone=cache:10m use_temp_path=off;" > /etc/nginx/conf.d/cache_max_size.conf

+ # Manifest caching configuration. We generate config based on the environment vars.
+ echo -n "" >/etc/nginx/nginx.manifest.caching.config.conf
+
+ [[ "a${ENABLE_MANIFEST_CACHE}" == "atrue" ]] && [[ "a${MANIFEST_CACHE_PRIMARY_REGEX}" != "a" ]] && cat <<EOD >>/etc/nginx/nginx.manifest.caching.config.conf
+ # First tier caching of manifests; configure via MANIFEST_CACHE_PRIMARY_REGEX and MANIFEST_CACHE_PRIMARY_TIME
+ location ~ ^/v2/(.*)/manifests/${MANIFEST_CACHE_PRIMARY_REGEX} {
+ set \$docker_proxy_request_type "manifest-primary";
+ proxy_cache_valid ${MANIFEST_CACHE_PRIMARY_TIME};
+ include "/etc/nginx/nginx.manifest.stale.conf";
+ }
+ EOD
+
+ [[ "a${ENABLE_MANIFEST_CACHE}" == "atrue" ]] && [[ "a${MANIFEST_CACHE_SECONDARY_REGEX}" != "a" ]] && cat <<EOD >>/etc/nginx/nginx.manifest.caching.config.conf
+ # Secondary tier caching of manifests; configure via MANIFEST_CACHE_SECONDARY_REGEX and MANIFEST_CACHE_SECONDARY_TIME
+ location ~ ^/v2/(.*)/manifests/${MANIFEST_CACHE_SECONDARY_REGEX} {
+ set \$docker_proxy_request_type "manifest-secondary";
+ proxy_cache_valid ${MANIFEST_CACHE_SECONDARY_TIME};
+ include "/etc/nginx/nginx.manifest.stale.conf";
+ }
+ EOD
+
+ [[ "a${ENABLE_MANIFEST_CACHE}" == "atrue" ]] && cat <<EOD >>/etc/nginx/nginx.manifest.caching.config.conf
+ # Default tier caching for manifests. Caches for ${MANIFEST_CACHE_DEFAULT_TIME} (from MANIFEST_CACHE_DEFAULT_TIME)
+ location ~ ^/v2/(.*)/manifests/ {
+ set \$docker_proxy_request_type "manifest-default";
+ proxy_cache_valid ${MANIFEST_CACHE_DEFAULT_TIME};
+ include "/etc/nginx/nginx.manifest.stale.conf";
+ }
+ EOD
+
+ [[ "a${ENABLE_MANIFEST_CACHE}" != "atrue" ]] && cat <<EOD >>/etc/nginx/nginx.manifest.caching.config.conf
+ # Manifest caching is disabled. Enable it with ENABLE_MANIFEST_CACHE=true
+ location ~ ^/v2/(.*)/manifests/ {
+ set \$docker_proxy_request_type "manifest-default-disabled";
+ proxy_cache_valid 0s;
+ include "/etc/nginx/nginx.manifest.stale.conf";
+ }
+ EOD
+
+ echo "Manifest caching config: ---"
+ cat /etc/nginx/nginx.manifest.caching.config.conf
+ echo "---"
+
  # normally use non-debug version of nginx
  NGINX_BIN="/usr/sbin/nginx"

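
The four heredocs above differ only in the regex fragment, the request-type label, and the cache time. As a sketch only (this is not what the commit does), the same kind of output could come from one function. `emit_manifest_location` and `generate_manifest_config` are hypothetical names, and the fallback times are assumptions that mirror the Dockerfile defaults.

```shell
#!/bin/sh
# Sketch only: prints the same kind of location blocks the entrypoint generates.

# Hypothetical helper.
# $1 = regex fragment (may be empty), $2 = request type label, $3 = cache time.
emit_manifest_location() {
  cat <<EOD
location ~ ^/v2/(.*)/manifests/$1 {
    set \$docker_proxy_request_type "$2";
    proxy_cache_valid $3;
    include "/etc/nginx/nginx.manifest.stale.conf";
}
EOD
}

# Hypothetical helper: writes the whole manifest caching config to stdout.
generate_manifest_config() {
  if [ "${ENABLE_MANIFEST_CACHE:-false}" != "true" ]; then
    # Caching disabled: a single catch-all location that expires immediately.
    emit_manifest_location "" "manifest-default-disabled" "0s"
    return 0
  fi
  # Order matters: nginx uses the first regex location that matches.
  if [ -n "${MANIFEST_CACHE_PRIMARY_REGEX:-}" ]; then
    emit_manifest_location "$MANIFEST_CACHE_PRIMARY_REGEX" \
      "manifest-primary" "${MANIFEST_CACHE_PRIMARY_TIME:-10m}"
  fi
  if [ -n "${MANIFEST_CACHE_SECONDARY_REGEX:-}" ]; then
    emit_manifest_location "$MANIFEST_CACHE_SECONDARY_REGEX" \
      "manifest-secondary" "${MANIFEST_CACHE_SECONDARY_TIME:-60d}"
  fi
  emit_manifest_location "" "manifest-default" "${MANIFEST_CACHE_DEFAULT_TIME:-1h}"
}

# Print the config for the current environment.
generate_manifest_config
```

In the entrypoint, the output of such a function would be redirected to `/etc/nginx/nginx.manifest.caching.config.conf`, the file that `nginx.conf` includes.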
nginx.conf

Lines changed: 7 additions & 37 deletions
@@ -267,57 +267,27 @@ echo "Docker configured with HTTPS_PROXY=$scheme://$http_host/"
  # For blob requests by digest, do cache, and treat redirects.
  location ~ ^/v2/(.*)/blobs/sha256:(.*) {
  set $docker_proxy_request_type "blob-by-digest";
- add_header X-Docker-Registry-Proxy-Cache-Upstream-Status "$upstream_cache_status";
- add_header X-Docker-Registry-Proxy-Cache-Type "$docker_proxy_request_type";
- proxy_pass https://$targetHost;
- proxy_cache cache;
- proxy_cache_key $uri;
- proxy_intercept_errors on;
- error_page 301 302 307 = @handle_redirects;
+ include "/etc/nginx/nginx.manifest.common.conf";
  }

  # For manifest requests by digest, do cache, and treat redirects.
  # These are some of the requests that DockerHub will throttle.
  location ~ ^/v2/(.*)/manifests/sha256:(.*) {
  set $docker_proxy_request_type "manifest-by-digest";
- add_header X-Docker-Registry-Proxy-Cache-Upstream-Status "$upstream_cache_status";
- add_header X-Docker-Registry-Proxy-Cache-Type "$docker_proxy_request_type";
- proxy_pass https://$targetHost;
- proxy_cache cache;
- proxy_cache_key $uri;
- proxy_intercept_errors on;
- error_page 301 302 307 = @handle_redirects;
+ include "/etc/nginx/nginx.manifest.common.conf";
  }

- # Cache manifest requests that are not by digest (e.g. tags)
- # Since these are mutable, we invalidate them immediately and keep them only in case the backend is down
- # These are some of the requests that DockerHub will throttle.
- location ~ ^/v2/(.*)/manifests/ {
- set $docker_proxy_request_type "manifest-mutable";
- add_header X-Docker-Registry-Proxy-Cache-Upstream-Status "$upstream_cache_status";
- add_header X-Docker-Registry-Proxy-Cache-Type "$docker_proxy_request_type";
- proxy_pass https://$targetHost;
- proxy_cache cache;
- proxy_cache_key $uri;
- proxy_intercept_errors on;
- proxy_cache_use_stale error timeout http_500 http_502 http_504 http_429;
- proxy_cache_valid 0s;
- error_page 301 302 307 = @handle_redirects;
- }
+ # Config for manifest URL caching is generated by the entrypoint based on ENVs.
+ # Go check it out, entrypoint.sh
+ include "/etc/nginx/nginx.manifest.caching.config.conf";
+

  # Cache blobs requests that are not by digest
  # Since these are mutable, we invalidate them immediately and keep them only in case the backend is down
  location ~ ^/v2/(.*)/blobs/ {
  set $docker_proxy_request_type "blob-mutable";
- add_header X-Docker-Registry-Proxy-Cache-Upstream-Status "$upstream_cache_status";
- add_header X-Docker-Registry-Proxy-Cache-Type "$docker_proxy_request_type";
- proxy_pass https://$targetHost;
- proxy_cache cache;
- proxy_cache_key $uri;
- proxy_intercept_errors on;
- proxy_cache_use_stale error timeout http_500 http_502 http_504 http_429;
  proxy_cache_valid 0s;
- error_page 301 302 307 = @handle_redirects;
+ include "/etc/nginx/nginx.manifest.stale.conf";
  }

  location @handle_redirects {

nginx.manifest.common.conf

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+ # nginx config fragment included in every manifest-related location{} block.
+ add_header X-Docker-Registry-Proxy-Cache-Upstream-Status "$upstream_cache_status";
+ add_header X-Docker-Registry-Proxy-Cache-Type "$docker_proxy_request_type";
+ proxy_pass https://$targetHost;
+ proxy_cache cache;
+ proxy_cache_key $uri;
+ proxy_intercept_errors on;
+ error_page 301 302 307 = @handle_redirects;
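
The two `add_header` lines make the cache decision visible to clients. As an illustration under assumptions, the helper below (a hypothetical `proxy_cache_headers`, not part of the commit) reads raw HTTP response headers on stdin and prints the request type followed by the upstream cache status. How the headers are captured from a request through the proxy is left open here.

```shell
#!/bin/sh
# Sketch only: extract the proxy's two diagnostic headers from a raw header dump.

# Hypothetical helper: reads HTTP response headers on stdin,
# prints "<request type> <upstream cache status>" ("unknown" if a header is absent).
proxy_cache_headers() {
  tr -d '\r' | awk -F': ' '
    tolower($1) == "x-docker-registry-proxy-cache-type" { type = $2 }
    tolower($1) == "x-docker-registry-proxy-cache-upstream-status" { status = $2 }
    END {
      if (type == "") type = "unknown"
      if (status == "") status = "unknown"
      print type, status
    }'
}

# Example with a hand-written header dump (values are made up for illustration).
printf 'HTTP/1.1 200 OK\r\nX-Docker-Registry-Proxy-Cache-Upstream-Status: MISS\r\nX-Docker-Registry-Proxy-Cache-Type: manifest-default\r\n\r\n' \
  | proxy_cache_headers
```

The status value comes from nginx's `$upstream_cache_status` variable, which takes values such as `MISS`, `HIT`, `EXPIRED` or `STALE`.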

nginx.manifest.stale.conf

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+ # Just like the common block, but adds proxy_cache_use_stale
+ include "/etc/nginx/nginx.manifest.common.conf";
+ proxy_cache_use_stale error timeout http_500 http_502 http_504 http_429;
