Skip to content

Conversation

bavshin-f5
Copy link
Member

This has been slowly progressing since before the initial release, and while I'm not completely satisfied, it is time to publish something.

Fixes lots of issues with insufficient or just plain bad log messages, incorrect error reporting etc.
Also, fixes handling of rate limit errors and large Retry-After values.

I'm still planning to review and adjust the individual messages while testing this.

@bavshin-f5 bavshin-f5 added this to the 0.2.0 milestone Oct 6, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors logging and error handling throughout the ACME certificate management system to provide better error messages, proper error categorization, and improved handling of rate limits and retry scenarios.

  • Introduces structured error types to replace generic error handling
  • Adds detailed logging for certificate operations and account management
  • Implements proper rate limit handling with server-provided retry intervals

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/state/issuer.rs Adds error state tracking and exponential backoff for issuer failures
src/state/certificate.rs Refactors certificate validity checking with separate is_valid() method
src/lib.rs Replaces generic error handling with structured error types and detailed logging
src/conf/order.rs Refactors order identification with new PrintableOrderId type
src/conf/issuer.rs Adds issuer-level error handling method
src/acme/error.rs Introduces comprehensive error type hierarchy for ACME operations
src/acme.rs Major refactoring of ACME client with structured error handling and improved logging

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@ensh63 ensh63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for me.

@bavshin-f5 bavshin-f5 force-pushed the error-handling branch 2 times, most recently from 3f46ef6 to 003a368 Compare October 8, 2025 00:14
@bavshin-f5
Copy link
Member Author

Example messages:

[info] 253427#0: account "https://acme.test:9000/my-account/66f25ec83859005f" created for acme issuer "default"
[info] 253574#0: acme certificate "default/example.test-ad2a129263f78529" issued, next update in 18s
[debug] 253574#0: account "https://acme.test:9000/my-account/170f6e80a537b690" found for acme issuer "default"
[warn] 253130#0: directory update failed (cannot deserialize response (invalid type: integer `404`, expected struct Directory at line 1 column 3)) while creating account for acme issuer "example"
[error] 253132#0: urn:ietf:params:acme:error:malformed: Order included DNS identifier with a value containing an illegal character: '_' while updating certificate "example/example.test-7bc5e05ebec0b7"
[warn] 253134#0: urn:ietf:params:acme:error:connection: Get "http://bad.example.test:8080/.well-known/acme-challenge/Wtc7EcUVSaf_n2dnQFMkh0Ms9m-lHQQ2IH0pjLmneAc": dial tcp 127.0.0.2:8080: connect: connection refused while updating certificate "example/example.test-cae37428fde76446"

@bavshin-f5 bavshin-f5 marked this pull request as ready for review October 8, 2025 00:27
Copy link
Contributor

@xeioex xeioex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise looks good.

#[error("csr generation failed ({0})")]
Csr(openssl::error::ErrorStack),

#[error("no supported challenge")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"no supported challenge" -> "no supported challenges"?
to match the Error name

.to_string();
self.account = Some(key_id);
let account: types::Account = deserialize_body(res.body())?;
if !matches!(account.status, AccountStatus::Valid) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought
This part looks like a small functional change, not just logging refactoring.

return Err(err);
}
order.status = OrderStatus::Processing
Err(RequestError::Protocol(problem))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we need a helper method here, in order to not repeat ourselves.
Something like:

diff --git a/src/acme.rs b/src/acme.rs
index 6876326..d0d9dd5 100644
--- a/src/acme.rs
+++ b/src/acme.rs
@@ -18,7 +18,7 @@ use ngx::collections::Vec;
 use ngx::ngx_log_debug;
 use openssl::pkey::{PKey, PKeyRef, Private};
 use openssl::x509::{self, extension as x509_ext, X509Req};
-use types::{AccountStatus, ProblemCategory};
+use types::AccountStatus;
 
 use self::account_key::{AccountKey, AccountKeyError};
 use self::types::{AuthorizationStatus, ChallengeKind, ChallengeStatus, OrderStatus};
@@ -423,12 +423,7 @@ where
                 res = x;
                 order = deserialize_body(res.body())?;
             }
-            Err(RequestError::Protocol(problem))
-                if matches!(
-                    problem.category(),
-                    ProblemCategory::Order | ProblemCategory::Malformed
-                ) =>
-            {
+            Err(RequestError::Protocol(problem)) if problem.is_order_related() => {
                 return Err(problem.into())
             }
             _ => order.status = OrderStatus::Processing,
diff --git a/src/acme/error.rs b/src/acme/error.rs
index c7fd3c8..cff0d4b 100644
--- a/src/acme/error.rs
+++ b/src/acme/error.rs
@@ -10,7 +10,7 @@ use ngx::allocator::{unsize_box, Box};
 use thiserror::Error;
 
 use super::solvers::SolverError;
-use super::types::{self, Problem, ProblemCategory};
+use super::types::{self, Problem};
 use crate::net::http::HttpClientError;
 
 #[derive(Debug, Error)]
@@ -47,10 +47,7 @@ impl NewAccountError {
     pub fn is_invalid(&self) -> bool {
         match self {
             Self::ExternalAccount => true,
-            Self::Protocol(err) => matches!(
-                err.category(),
-                ProblemCategory::Account | ProblemCategory::Malformed
-            ),
+            Self::Protocol(err) => err.is_account_related(),
             Self::Status(_) => true,
             _ => false,
         }
@@ -105,10 +102,7 @@ impl From<RequestError> for NewCertificateError {
 impl NewCertificateError {
     pub fn is_invalid(&self) -> bool {
         match self {
-            Self::Protocol(err) => matches!(
-                err.category(),
-                ProblemCategory::Order | ProblemCategory::Malformed
-            ),
+            Self::Protocol(err) => err.is_order_related(),
             _ => false,
         }
     }
diff --git a/src/acme/types.rs b/src/acme/types.rs
index 7889e59..ef80f3b 100644
--- a/src/acme/types.rs
+++ b/src/acme/types.rs
@@ -355,6 +355,20 @@ impl Problem {
             _ => ProblemCategory::Other,
         }
     }
+
+    pub fn is_order_related(&self) -> bool {
+        matches!(
+            self.category(),
+            ProblemCategory::Order | ProblemCategory::Malformed
+        )
+    }
+
+    pub fn is_account_related(&self) -> bool {
+        matches!(
+            self.category(),
+            ProblemCategory::Account | ProblemCategory::Malformed
+        )
+    }
 }
 
 fn deserialize_vec_of_uri<'de, D>(deserializer: D) -> Result<Vec<Uri>, D::Error>

const MAX_RETRY_INTERVAL: Duration = Duration::from_secs(8);
static REPLAY_NONCE: http::HeaderName = http::HeaderName::from_static("replay-nonce");

pub enum NewAccountOutput<'a> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment about the commit log
ACME account login refactoring: -> ACME account login refactoring.?

let issuer_next = match ngx_http_acme_update_certificates_for_issuer(amcf, issuer).await {
Ok(x) => x,
Err(err) => {
// Check if the server rejected this ACME account configuration.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question:
after the patchset ngx_http_acme_update_certificates_for_issuer() is the only function with a prototype anyhow::Result<Time>. Any plans to get rid of it as well?

Clean up error handling in the main update loop.
* Implement backoff for login errors
* Log new account URLs
We already log the URL, but on INFO level and only once.  And while it
is possible to find the URL from private key, the operation is not
trivial.

Let's make this simpler by writing a file named "account.url" to the
state directory whenever the issuer tells us it created a new account.

Fixes nginx#54
Previously, it was possible to suspend the module execution by sending a
sufficiently high Retry-After value.  Now we will refuse to wait longer
than one minute.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants