fix: Remove persistent error guard to enable automatic certificate request retry #202

Open
Copilot wants to merge 2 commits into main from copilot/add-clear-certificate-error-action

Copilot AI commented Feb 16, 2026

Description

Fixes #194
Certificate requests that failed with persistent errors (DOMAIN_NOT_ALLOWED, IP_NOT_ALLOWED, etc.) were permanently skipped, requiring operators to remove and re-add relations to retry after fixing underlying issues.

Removed the persistent error guard from _configure_certificates(). All outstanding certificate requests now retry on each hook execution. Errors remain in relation data for requirer visibility but no longer block retry attempts.

Changes:

  • Removed get_provider_certificate_errors() call (line 207)
  • Removed persistent_error_requests set creation and filtering logic (lines 223-227)
  • Removed skip guard for persistent errors (lines 233-242)
  • Updated docstring to reflect automatic retry behavior
# Before: persistent errors blocked retry
persistent_error_requests = {
    (error.relation_id, error.certificate_signing_request.raw)
    for error in provider_errors
    if error.error.code != CertificateRequestErrorCode.SERVER_NOT_AVAILABLE.value
}
if request_key in persistent_error_requests:
    continue

# After: all outstanding requests retry
for certificate_request, assigned_certificates in certificate_pair_map.items():
    if assigned_certificates:
        continue
    # Process request
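
The behaviour change can be illustrated with a small standalone simulation. This is only a sketch, not the charm's code: `Outcome`, `configure_certificates`, `fake_issuer`, and `allowed_domains` are hypothetical stand-ins for the charm's request handling and for the certificate backend, and the error code string is used purely as example data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical request key: (relation_id, raw CSR), mirroring the tuple
# used by the removed guard.
RequestKey = Tuple[int, str]


@dataclass
class Outcome:
    """Hypothetical result of one issuance attempt."""
    certificate: Optional[str] = None
    error_code: Optional[str] = None


def configure_certificates(
    outstanding: List[RequestKey],
    state: Dict[RequestKey, Outcome],
    issue: Callable[[str], Outcome],
) -> None:
    """Simulate one hook execution: retry every request without a certificate."""
    for key in outstanding:
        previous = state.get(key)
        if previous is not None and previous.certificate is not None:
            continue  # already fulfilled, nothing to do
        # No check of previous.error_code: a recorded error no longer blocks retry.
        state[key] = issue(key[1])


# Hypothetical backend that rejects CSRs until the operator allows them.
allowed_domains = set()


def fake_issuer(csr: str) -> Outcome:
    if csr in allowed_domains:
        return Outcome(certificate=f"cert-for-{csr}")
    return Outcome(error_code="DOMAIN_NOT_ALLOWED")


key = (1, "csr-grafana")
state: Dict[RequestKey, Outcome] = {}

configure_certificates([key], state, fake_issuer)  # first hook: request fails
first_error = state[key].error_code

allowed_domains.add("csr-grafana")                 # operator fixes the underlying issue
configure_certificates([key], state, fake_issuer)  # next hook: request is retried
```

In this simulation the second call replaces the recorded error with a certificate, without removing and re-adding the relation.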

Retries happen only when a hook runs, so the hook execution frequency (the periodic update-status hook fires every 5 minutes by default) provides natural rate limiting.
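
As a rough, back-of-the-envelope estimate of the retry volume this implies, the sketch below counts only the periodic hook and assumes the 5-minute default interval mentioned above. The function name `retries_per_day` is hypothetical, and other hooks (for example, relation or config events) would add to the total.

```python
from datetime import timedelta


def retries_per_day(
    outstanding_requests: int,
    hook_interval: timedelta = timedelta(minutes=5),
) -> int:
    """Estimate daily retry attempts caused by the periodic hook alone."""
    hooks_per_day = timedelta(days=1) // hook_interval  # integer number of hook runs
    return outstanding_requests * hooks_per_day


# Two permanently failing requests, as in the log attached to the issue.
two_failing = retries_per_day(2)
```

Under these assumptions each failing request is retried 288 times per day, so two failing requests produce 576 attempts per day.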

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that validate the behaviour of the software
  • I validated that new and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
Original prompt

This section details the original issue you should resolve

<issue_title>No way to clear certificate request errors</issue_title>
<issue_description>### Bug Description

When we fail to provision a certificate for various reasons, we encounter an error: Skipping certificate request with persistent error

lego-operator/src/charm.py

Lines 186 to 191 in d735ae7

if request_key in persistent_error_requests:
    logger.debug(
        "Skipping certificate request with persistent error: %s",
        certificate_request.certificate_signing_request.raw,
    )
    continue

After we fix the underlying issue, there is no way for us to tell LEGO to re-attempt requesting certificates. We need, for example, a Juju action such as clear-certificate-request-errors. Right now we must remove the relation and re-relate the application, which requests ALL certificates again.

To Reproduce

N/A

Environment

PS7, lego rev274

Relevant log output

2026-02-05 17:13:14 DEBUG unit.lego-ingress-ps7-snapstore/1.juju-log server.go:405 Skipping certificate request with persistent error: -----BEGIN CERTIFICATE REQUEST-----
MIIC5TCCAc0CAQAwXjEtMCsGA1UEAwwkZ3JhZmFuYS5zdGFnaW5nLnVidW5ldC5j
YW5vbmljYWwuY29tMS0wKwYDVQQtDCQyNzUxYWI0My0zNWZiLTQzMDctYjhkNC01
YmJlYjE1Zjk4ZDAwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEKAoIBAQCTBein
CDyP6+djipom2HagGmUYisGrjP1ZnVvQY7KD6nIOSRVqfwelVwHdquV/QL499Aay
iXxqi3mLloOBstndttguPB/w/icQb9FWbHM4wJ9qm9m9USQMtgT25YSuuV41wASn
yyaL2ZY8j7K61brYqWF/R6htYMTGw/yNg2kTDAy7K9u4/sNSPdPg4Ve9fLuH7GyV
VX+RStP7JDj1oYSqCt1UCBOPdJ5oHT0VD3BzNOb903/AJwYQuB41SWLMscny7GNA
C756uD9UHrgIkQSXx4qx/4vJo+fNtMRQD2rkTv2e2Bg1s+TtiBivVSU0y/xJ120f
MxiK2KCZqozyXJyvAgMBAAGgQjBABgkqhkiG9w0BCQ4xMzAxMC8GA1UdEQQoMCaC
JGdyYWZhbmEuc3RhZ2luZy51YnVuZXQuY2Fub25pY2FsLmNvbTANBgkqhkiG9w0B
AQsFAAOCAQEAgz/8f2Snw4NHxzbvKAuCtXjPhA10sTV/I6oQO3crmu6Mp3OcQ+Ur
X2w3ejfSH3QBtd5EmLUJaY+cqzaOtwKtwtPlnx2ZpIAeBuKBAqYIBUuUzK26I2T+
2rUcPlgvPPBMGPJC9S8I5afXrcG0hraUgYzmC4FCAMK4Ksmfol7GdsPi4h9H9LsR
azsOZRCrdGwWOHxfYAm9lY7WG8Dy/ML7PWUcJws/qX4Pmcfas4c4pPJHwAzQ/Q4L
O3CJ9yOV+k7b2iK0zV8WoY1XBd+xM28DPzSnwZBu2FkwPoy/XpSftxaNAqE/60do
K6teZvmAx5j4dt/miEYDj1OQFCfVqxWTOg==
-----END CERTIFICATE REQUEST-----
2026-02-05 17:13:14 DEBUG unit.lego-ingress-ps7-snapstore/1.juju-log server.go:405 Skipping certificate request with persistent error: -----BEGIN CERTIFICATE REQUEST-----
MIIC1TCCAb0CAQAwVjElMCMGA1UEAwwcZ3JhZmFuYS51YnVuZXQuY2Fub25pY2Fs
LmNvbTEtMCsGA1UELQwkZWJlYjk4YzMtYjFkOC00MWQxLWI4YjUtMjcwYjY0OTI2
ODVmMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAkwXopwg8j+vnY4qa
Jth2oBplGIrBq4z9WZ1b0GOyg+pyDkkVan8HpVcB3arlf0C+PfQGsol8aot5i5aD
gbLZ3bbYLjwf8P4nEG/RVmxzOMCfapvZvVEkDLYE9uWErrleNcAEp8smi9mWPI+y
utW62Klhf0eobWDExsP8jYNpEwwMuyvbuP7DUj3T4OFXvXy7h+xslVV/kUrT+yQ4
9aGEqgrdVAgTj3SeaB09FQ9wczTm/dN/wCcGELgeNUlizLHJ8uxjQAu+erg/VB64
CJEEl8eKsf+LyaPnzbTEUA9q5E79ntgYNbPk7YgYr1UlNMv8SddtHzMYitigmaqM
8lycrwIDAQABoDowOAYJKoZIhvcNAQkOMSswKTAnBgNVHREEIDAeghxncmFmYW5h
LnVidW5ldC5jYW5vbmljYWwuY29tMA0GCSqGSIb3DQEBCwUAA4IBAQBFYN7n2kyS
HHTUvj7G+8we3AQwAXXhFTStb1CZ6D5+ZG9lcnE4JDYnZFyWWu5NyMX4PjfByH5s
LrjfdrZvukKKmo57WusHgYp2udjjDAJDp5h0yt2FwnhZ0XAX5n0URLqGcDyskTb4
mi25758tRdbw2Jtx6coKlOAAIOIk23Q+4jRsT1QwoY0hiJuGB4VcJACRl8t8NtIk
w0FGTmVI/6hzhb82r8PwS3fS6THS1bcdKTYFi8Z9F/AQP+zvtJAxOaSfPcDQfcZa
8HHrjCskWUM7BWupVmSfcvumqK6MCWioAd7/zFUky7Y/dRKF2FGpiKWCWYYHEFQl
k2sI2ipgVt67
-----END CERTIFICATE REQUEST-----
2026-02-05 17:13:14 INFO juju.worker.uniter.operation runhook.go:186 ran "update-status" hook (via hook dispatching script: dispatch)

Additional context

No response</issue_description>

<agent_instructions>Fix issue #194: Remove persistent error guard to enable automatic retry of failed certificate requests

PROBLEM:
The _configure_certificates() method in src/charm.py skips certificate requests that have persistent errors (lines 186-191). When a certificate request fails due to DNS misconfiguration, firewall rules, or other issues (DOMAIN_NOT_ALLOWED, IP_NOT_ALLOWED, etc.), the charm never retries it, even after the underlying issue is fixed. The only workaround is removing and re-adding the entire relation.

SOLUTION:
Remove the persistent error skipping logic. Let the charm automatically retry ALL outstanding certificate requests on each hook execution. When the underlying issue is resolved, the request succeeds and replaces the error with a certificate.

IMPLEMENTATION:
In the _configure_certificates() method:

  1. Remove line that gets provider_errors:
    provider_errors = self._tls_certificates.get_provider_certificate_errors()

  2. Remove the persistent_error_requests set creation (lines ~202-206):
    persistent_error_requests = {
        (error.relation_id, error.certificate_signing_request.raw)
        for error in provider_errors
        if error.error.code != CertificateRequestErrorCode.SERVER_NOT_AVAILABLE.value
    }

  3. Remove the guard that skips requests (lines ~213-219):
    request_key = (
        certificate_request.relation_id,
        certificate_request.certificate_signing_request.raw,
    )
    if request_key in persistent_error_requests:
        logger.debug("Skipping certificate request with persistent error: %s", ...)
        continue

4....



…tificate requests

Co-authored-by: saltiyazan <26838825+saltiyazan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add action to clear certificate request errors" to "Remove persistent error guard to enable automatic certificate request retry" Feb 16, 2026
Copilot AI requested a review from saltiyazan February 16, 2026 10:58
saltiyazan marked this pull request as ready for review February 16, 2026 11:39
saltiyazan requested a review from a team as a code owner February 16, 2026 11:39
saltiyazan changed the title from "Remove persistent error guard to enable automatic certificate request retry" to "fix: Remove persistent error guard to enable automatic certificate request retry" Feb 16, 2026
Development

Successfully merging this pull request may close these issues.

No way to clear certificate request errors

2 participants