Skip to content

Conversation

bavshin-f5
Copy link
Member

This has been slowly progressing since before the initial release, and while I'm not completely satisfied, it is time to publish something.

Fixes lots of issues with insufficient or just plain bad log messages, incorrect error reporting etc.
Also, fixes handling of rate limit errors and large Retry-After values.

I'm still planning to review and adjust the individual messages while testing this.

@bavshin-f5 bavshin-f5 added this to the 0.2.0 milestone Oct 6, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors logging and error handling throughout the ACME certificate management system to provide better error messages, proper error categorization, and improved handling of rate limits and retry scenarios.

  • Introduces structured error types to replace generic error handling
  • Adds detailed logging for certificate operations and account management
  • Implements proper rate limit handling with server-provided retry intervals

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/state/issuer.rs Adds error state tracking and exponential backoff for issuer failures
src/state/certificate.rs Refactors certificate validity checking with separate is_valid() method
src/lib.rs Replaces generic error handling with structured error types and detailed logging
src/conf/order.rs Refactors order identification with new PrintableOrderId type
src/conf/issuer.rs Adds issuer-level error handling method
src/acme/error.rs Introduces comprehensive error type hierarchy for ACME operations
src/acme.rs Major refactoring of ACME client with structured error handling and improved logging

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

@ensh63 ensh63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good for me.

@bavshin-f5 bavshin-f5 force-pushed the error-handling branch 2 times, most recently from 3f46ef6 to 003a368 Compare October 8, 2025 00:14
@bavshin-f5
Copy link
Member Author

Example messages:

[info] 253427#0: account "https://acme.test:9000/my-account/66f25ec83859005f" created for acme issuer "default"
[info] 253574#0: acme certificate "default/example.test-ad2a129263f78529" issued, next update in 18s
[debug] 253574#0: account "https://acme.test:9000/my-account/170f6e80a537b690" found for acme issuer "default"
[warn] 253130#0: directory update failed (cannot deserialize response (invalid type: integer `404`, expected struct Directory at line 1 column 3)) while creating account for acme issuer "example"
[error] 253132#0: urn:ietf:params:acme:error:malformed: Order included DNS identifier with a value containing an illegal character: '_' while updating certificate "example/example.test-7bc5e05ebec0b7"
[warn] 253134#0: urn:ietf:params:acme:error:connection: Get "http://bad.example.test:8080/.well-known/acme-challenge/Wtc7EcUVSaf_n2dnQFMkh0Ms9m-lHQQ2IH0pjLmneAc": dial tcp 127.0.0.2:8080: connect: connection refused while updating certificate "example/example.test-cae37428fde76446"

@bavshin-f5 bavshin-f5 marked this pull request as ready for review October 8, 2025 00:27
Copy link
Contributor

@xeioex xeioex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise looks good.

Clean up error handling in the main update loop.
* Implement backoff for login errors
* Log new account URLs
We already log the URL, but on INFO level and only once.  And while it
is possible to find the URL from private key, the operation is not
trivial.

Let's make this simpler by writing a file named "account.url" to the
state directory whenever the issuer tells us it created a new account.

Fixes nginx#54
Previously, it was possible to suspend the module execution by sending a
sufficiently high Retry-After value.  Now we will refuse to wait longer
than one minute.
We don't care about the error kind here, we just need to pass it to the
caller and print. anyhow seems a bit too heavy for the task.

While we're at it, remove unused futures-channel.
Copy link
Contributor

@xeioex xeioex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@bavshin-f5 bavshin-f5 merged commit 3b5459d into nginx:main Oct 8, 2025
15 checks passed
@bavshin-f5 bavshin-f5 deleted the error-handling branch October 8, 2025 05:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants