Description
Bug Overview
I see a lot of connection attempts to the Let's Encrypt API endpoint. Each one completes the TLS handshake but then seems to fail. This happens around 15 times, then one attempt works and the hostname validation check for the certificate runs.
The check is successful:
accept4(7, {sa_family=AF_INET, sin_port=htons(39783), sin_addr=inet_addr("66.133.109.36")}, [112 => 16], SOCK_NONBLOCK) = 3
epoll_ctl(14, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=2185423569, u64=123933466749649}}) = 0
accept4(7, 0x7ffd3f31d7c8, [112], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
epoll_pwait(14, [{events=EPOLLIN, data={u32=2185423569, u64=123933466749649}}], 512, 100, NULL, 8) = 1
recvfrom(3, "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1\r\nHost: stats.patrickdk.com\r\nUser-Agent: Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)\r\nAccept: */*\r\nAccept-Encoding: gzip\r\nConnection: close\r\n\r\n", 1024, 0, NULL, NULL) = 271
writev(3, [{iov_base="HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Thu, 14 Aug 2025 06:06:46 GMT\r\nContent-Length: 87\r\nConnection: close\r\nconnection: close\r\ncontent-type: text/plain\r\nalt-svc: h3=\":443\"; ma=86400;\r\n\r\n", iov_len=186}, {iov_base="l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28.qizQ8cqeXnwEpu23lb97sfzAWZEIe5YnHuprRDZjyNM", iov_len=87}], 2) = 273
write(5, "stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:06:46 -0400] \"GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1\" 200 87 \"-\" \"Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)\" \"-\"\n", 259) = 259
close(3) = 0
It then goes to fetch the completed certificate from the Let's Encrypt API, falls back into the same pattern of failing connections, and logs this message:
write(4, "2025/08/14 02:06:46 [warn] 29#29: acme certificate \"letsencrypt/stats.patrickdk.com-d38f26420234f63f\" request failed: unknown error\n", 132) = 132
The raw logs are not much more helpful. From reading the code, it looked like it might be a filesystem permission issue, but the behavior looks more like a TLS problem:
2025/08/14 02:05:21 [notice] 1#1: using the "epoll" event method
2025/08/14 02:05:21 [notice] 1#1: nginx/1.29.1
2025/08/14 02:05:21 [notice] 1#1: built by gcc 14.2.0 (Alpine 14.2.0)
2025/08/14 02:05:21 [notice] 1#1: OS: Linux 6.8.0-71-generic
2025/08/14 02:05:21 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2025/08/14 02:05:21 [notice] 1#1: start worker processes
2025/08/14 02:05:21 [notice] 1#1: start worker process 29
2025/08/14 02:05:24 [info] 29#29: epoll_wait() failed (4: Interrupted system call)
stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:05:34 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
2025/08/14 02:05:34 [warn] 29#29: acme certificate "letsencrypt/stats.patrickdk.com-d38f26420234f63f" request failed: unknown error
stats.patrickdk.com 54.244.175.230 - - [14/Aug/2025:02:05:44 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.223.133.83 - - [14/Aug/2025:02:05:44 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.143.145.34 - - [14/Aug/2025:02:05:45 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 16.171.234.54 - - [14/Aug/2025:02:05:46 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:06:46 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
2025/08/14 02:06:46 [warn] 29#29: acme certificate "letsencrypt/stats.patrickdk.com-d38f26420234f63f" request failed: unknown error
2025/08/14 02:06:51 [info] 29#29: epoll_wait() failed (4: Interrupted system call)
stats.patrickdk.com 16.171.234.54 - - [14/Aug/2025:02:06:56 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.223.133.83 - - [14/Aug/2025:02:06:57 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.143.145.34 - - [14/Aug/2025:02:06:57 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 54.244.175.230 - - [14/Aug/2025:02:06:57 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
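Since the log above only reports "unknown error", raising the error log level may show where in the TLS or ACME exchange the request fails. This is a minimal sketch, not a confirmed fix: the log path is an assumption, the `debug` level needs an nginx binary built with `--with-debug`, and it is not confirmed here that the ACME module logs anything more specific at this level.

```
# Main (top-level) context of nginx.conf.
# The path below is an assumption; adjust it to the actual deployment.
# "debug" requires an nginx build with --with-debug; otherwise use "info".
error_log /var/log/nginx/error.log debug;
```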
Expected Behavior
This is beta code, so there are no good error messages yet.
Steps to Reproduce the Bug
Not sure; something inside the TLS session seems to be going wrong.
Environment Details
- Target deployment platform: not specified
- Target OS: Linux 6.8.0-71-generic; nginx built by gcc 14.2.0 (Alpine 14.2.0), per the startup log above
- Version of this project or specific commit: not specified; running under nginx/1.29.1, per the startup log above
- Version of any relevant project languages: not specified
Additional Context
Tried both the staging and production endpoints. Configuration:
acme_issuer letsencrypt {
    uri https://acme-staging-v02.api.letsencrypt.org/directory;
    state_path /etc/nginx/acme;
    accept_terms_of_service;
    ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
    ssl_verify off;
}
server {
    server_name stats.patrickdk.com;
    access_log /var/log/nginx/access.log vhost;
    http2 on;
    listen 80;
    listen [::]:80;
    listen 443 ssl;
    listen [::]:443 ssl;
    ssl_session_timeout 5m;
    ssl_session_cache shared:SSL:50m;
    ssl_session_tickets off;
    ssl_certificate_cache max=2;
    acme_certificate letsencrypt key=rsa;
    ssl_certificate $acme_certificate;
    ssl_certificate_key $acme_certificate_key;
    location / {
        proxy_pass http://stats.patrickdk.com;
        set $upstream_keepalive true;
    }
}