
Not sure: the code that prints the error suggests it cannot save the cert, but I think something else is going on #29

@patrickdk77

Bug Overview

I see many connection attempts to the Let's Encrypt API endpoint. Each one completes the TLS handshake but then appears to fail. This repeats around 15 times, then one attempt works, and the certificate hostname verification check (the HTTP-01 challenge request) runs.

The check is successful:
accept4(7, {sa_family=AF_INET, sin_port=htons(39783), sin_addr=inet_addr("66.133.109.36")}, [112 => 16], SOCK_NONBLOCK) = 3
epoll_ctl(14, EPOLL_CTL_ADD, 3, {events=EPOLLIN|EPOLLRDHUP|EPOLLET, data={u32=2185423569, u64=123933466749649}}) = 0
accept4(7, 0x7ffd3f31d7c8, [112], SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
epoll_pwait(14, [{events=EPOLLIN, data={u32=2185423569, u64=123933466749649}}], 512, 100, NULL, 8) = 1
recvfrom(3, "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1\r\nHost: stats.patrickdk.com\r\nUser-Agent: Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)\r\nAccept: /\r\nAccept-Encoding: gzip\r\nConnection: close\r\n\r\n", 1024, 0, NULL, NULL) = 271
writev(3, [{iov_base="HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Thu, 14 Aug 2025 06:06:46 GMT\r\nContent-Length: 87\r\nConnection: close\r\nconnection: close\r\ncontent-type: text/plain\r\nalt-svc: h3=":443"; ma=86400;\r\n\r\n", iov_len=186}, {iov_base="l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28.qizQ8cqeXnwEpu23lb97sfzAWZEIe5YnHuprRDZjyNM", iov_len=87}], 2) = 273
write(5, "stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:06:46 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)\" "-"\n", 259) = 259
close(3) = 0

It then goes to fetch the completed certificate from the Let's Encrypt API, falls back into the same failing-connection pattern, and logs this message:
write(4, "2025/08/14 02:06:46 [warn] 29#29: acme certificate "letsencrypt/stats.patrickdk.com-d38f26420234f63f" request failed: unknown error\n", 132) = 132

The raw logs are not much more helpful. From the code, it looked like this might be a filesystem permission issue, but the behavior looks more like a TLS problem.

2025/08/14 02:05:21 [notice] 1#1: using the "epoll" event method
2025/08/14 02:05:21 [notice] 1#1: nginx/1.29.1
2025/08/14 02:05:21 [notice] 1#1: built by gcc 14.2.0 (Alpine 14.2.0)
2025/08/14 02:05:21 [notice] 1#1: OS: Linux 6.8.0-71-generic
2025/08/14 02:05:21 [notice] 1#1: getrlimit(RLIMIT_NOFILE): 1048576:1048576
2025/08/14 02:05:21 [notice] 1#1: start worker processes
2025/08/14 02:05:21 [notice] 1#1: start worker process 29
2025/08/14 02:05:24 [info] 29#29: epoll_wait() failed (4: Interrupted system call)
stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:05:34 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
2025/08/14 02:05:34 [warn] 29#29: acme certificate "letsencrypt/stats.patrickdk.com-d38f26420234f63f" request failed: unknown error
stats.patrickdk.com 54.244.175.230 - - [14/Aug/2025:02:05:44 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.223.133.83 - - [14/Aug/2025:02:05:44 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.143.145.34 - - [14/Aug/2025:02:05:45 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 16.171.234.54 - - [14/Aug/2025:02:05:46 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:06:46 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
2025/08/14 02:06:46 [warn] 29#29: acme certificate "letsencrypt/stats.patrickdk.com-d38f26420234f63f" request failed: unknown error
2025/08/14 02:06:51 [info] 29#29: epoll_wait() failed (4: Interrupted system call)
stats.patrickdk.com 16.171.234.54 - - [14/Aug/2025:02:06:56 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.223.133.83 - - [14/Aug/2025:02:06:57 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.143.145.34 - - [14/Aug/2025:02:06:57 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 54.244.175.230 - - [14/Aug/2025:02:06:57 -0400] "GET /.well-known/acme-challenge/l667YdErp3YSl_r60VqXJmKqOq4liHAayJcuQZEEm28 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
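
To make the pattern in the access log easier to see, here is a minimal Python sketch that tallies challenge requests by token, status, and the final quoted field. It assumes the `vhost` log format shown above. The regex, the `summarize` function, and the truncation of tokens to 8 characters are illustrative choices, not part of nginx or the ACME module.

```python
import re
from collections import Counter

# Matches an access-log line in the format shown above and captures the
# challenge token, the HTTP status, and the last quoted field on the line.
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<client>\S+) .*?'
    r'"GET /\.well-known/acme-challenge/(?P<token>\S+) HTTP/[\d.]+" '
    r'(?P<status>\d{3}) \d+ .*"(?P<last>[^"]*)"\s*$'
)


def summarize(lines):
    """Count challenge requests per (token prefix, status, last field)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:  # error-log lines and blank lines are skipped
            counts[(m["token"][:8], m["status"], m["last"])] += 1
    return counts


# A few lines copied from the log excerpt above.
SAMPLE = '''
stats.patrickdk.com 66.133.109.36 - - [14/Aug/2025:02:05:34 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 200 87 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "-"
2025/08/14 02:05:34 [warn] 29#29: acme certificate "letsencrypt/stats.patrickdk.com-d38f26420234f63f" request failed: unknown error
stats.patrickdk.com 54.244.175.230 - - [14/Aug/2025:02:05:44 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
stats.patrickdk.com 18.223.133.83 - - [14/Aug/2025:02:05:44 -0400] "GET /.well-known/acme-challenge/zUuIwccs2FCewpbN_CgI4_t4neSLyLY3bRkHl0aaJa0 HTTP/1.1" 404 118 "-" "Mozilla/5.0 (compatible; Let's Encrypt validation server; +https://www.letsencrypt.org)" "172.30.250.81:80"
'''

for key, n in sorted(summarize(SAMPLE.splitlines()).items()):
    print(key, n)
```

On the excerpt above, this kind of tally separates the requests answered with 200 (last field `-`) from the later 404 responses for the same token, whose last field is 172.30.250.81:80.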

Expected Behavior

This is beta code, so there are no good error messages yet.

Steps to Reproduce the Bug

Not sure; something inside the TLS session seems to be going wrong.
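
One way to check whether the TLS failures are specific to nginx is to repeat the handshake from the same host with an independent client. The following is a minimal sketch using Python's standard `ssl` module. `endpoint_from_uri` and `probe_tls` are hypothetical helper names, and the sketch assumes Python is available in the same network environment as nginx, which may not hold on a minimal Alpine-based image. Unlike the `ssl_verify off` setting in the configuration, the default context here verifies the server certificate.

```python
import socket
import ssl
from urllib.parse import urlsplit


def endpoint_from_uri(uri):
    """Return (host, port) for an https ACME directory URI."""
    parts = urlsplit(uri)
    return parts.hostname, parts.port or 443


def probe_tls(uri, attempts=5, timeout=10.0):
    """Attempt several independent TLS handshakes and record each outcome.

    Returns a list of (ok, detail) tuples: on success detail is the
    negotiated protocol version, on failure it is the exception text.
    """
    host, port = endpoint_from_uri(uri)
    ctx = ssl.create_default_context()  # verifies the certificate chain
    results = []
    for _ in range(attempts):
        try:
            with socket.create_connection((host, port), timeout=timeout) as sock:
                with ctx.wrap_socket(sock, server_hostname=host) as tls:
                    results.append((True, tls.version()))
        except OSError as exc:  # ssl.SSLError and socket errors derive from OSError
            results.append((False, repr(exc)))
    return results
```

Calling `probe_tls("https://acme-staging-v02.api.letsencrypt.org/directory", attempts=15)` would show whether a comparable share of handshakes fails outside nginx.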

Environment Details

  • Target OS: Linux 6.8.0-71-generic (from the startup log above)
  • nginx version: nginx/1.29.1, built by gcc 14.2.0 (Alpine 14.2.0)

Additional Context

I tried both the staging and production endpoints. The configuration below uses staging:

acme_issuer letsencrypt {
    uri https://acme-staging-v02.api.letsencrypt.org/directory;
    state_path /etc/nginx/acme;
    accept_terms_of_service;
    ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;
    ssl_verify off;
}
server {
    server_name stats.patrickdk.com;
    access_log /var/log/nginx/access.log vhost;
    http2 on;
    listen 80;
    listen [::]:80;
    listen 443 ssl;
    listen [::]:443 ssl;
    ssl_session_timeout 5m;
    ssl_session_cache shared:SSL:50m;
    ssl_session_tickets off;
    ssl_certificate_cache max=2;
    acme_certificate letsencrypt key=rsa;
    ssl_certificate $acme_certificate;
    ssl_certificate_key $acme_certificate_key;
    location / {
        proxy_pass http://stats.patrickdk.com;
        set $upstream_keepalive true;
    }
}
