Granular exception handling for SML/SMP lookup failures #68

Open
harsha-amarasiri wants to merge 1 commit into OxalisCommunity:master from harsha-amarasiri:fix/586-lookup-client-exceptions

@harsha-amarasiri harsha-amarasiri commented Feb 17, 2026

Pull Request Description

Resolves #586, relates to oxalis#586, oxalis#666, Oxalis-AS4#158, vefa-peppol#56

Lookup failures are reported as a generic LookupException, sometimes wrapping a FileNotFoundException, which makes it difficult to distinguish resource/configuration failures from network-related failures. As a result, implementing retry logic is also challenging.

This change introduces a domain-specific exception taxonomy. The exceptions are derived from the discussion in Oxalis-AS4#158:

  LookupException (base - existing, backward compatible)
  ├── PeppolResourceException - the Peppol resource is not available; configuration updates are usually required
  │       e.g. participant not in SML, SMP returns 404
  ├── PeppolInfrastructureException - recoverable error; the document can be retried/resent
  │       e.g. SMP 5xx, DNS lookup result TRY_AGAIN or UNRECOVERABLE
  └── NetworkFailureException - transient network failure; retry may help
          e.g. socket timeout, connection refused, unknown host at the HTTP layer
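
A minimal sketch of how the taxonomy above could be declared. The class names come from this description; the constructors, the stand-in `LookupException`, and the `TaxonomyDemo.classify` helper are assumptions made for illustration and are not the code in this PR.

```java
// Stand-in for the existing base type so the sketch compiles on its own.
// Constructor signatures are assumptions.
class LookupException extends Exception {
    LookupException(String message, Throwable cause) { super(message, cause); }
}

// Peppol resource is not available; configuration updates are usually required.
class PeppolResourceException extends LookupException {
    PeppolResourceException(String message, Throwable cause) { super(message, cause); }
}

// Recoverable infrastructure error; the document can be retried/resent.
class PeppolInfrastructureException extends LookupException {
    PeppolInfrastructureException(String message, Throwable cause) { super(message, cause); }
}

// Transient network failure; retry may help.
class NetworkFailureException extends LookupException {
    NetworkFailureException(String message, Throwable cause) { super(message, cause); }
}

class TaxonomyDemo {
    // Hypothetical helper: callers can branch on the subtype, while code that
    // only knows LookupException keeps working unchanged.
    static String classify(LookupException e) {
        if (e instanceof PeppolResourceException) return "resource";
        if (e instanceof PeppolInfrastructureException) return "infrastructure";
        if (e instanceof NetworkFailureException) return "network";
        return "generic";
    }
}
```

Because every new type extends `LookupException`, an existing `catch (LookupException e)` block still catches all of them, which is what keeps the change backward compatible.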

DNS layer (BdxlLocator, BusdoxLocator):

  • HOST_NOT_FOUND / TYPE_NOT_FOUND -> PeppolResourceException
  • TRY_AGAIN -> PeppolInfrastructureException (with UDP->TCP fallback)
  • UNRECOVERABLE -> PeppolInfrastructureException
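
The DNS mapping above can be sketched as a single function. The `DnsFailure` enum and `DnsResultMapper` are hypothetical names that mirror the result codes listed here; the stub exception classes stand in for the real ones, and the UDP->TCP fallback for TRY_AGAIN is not shown.

```java
// Stub exception types so the sketch compiles on its own (constructors assumed).
class LookupException extends Exception {
    LookupException(String message) { super(message); }
}
class PeppolResourceException extends LookupException {
    PeppolResourceException(String message) { super(message); }
}
class PeppolInfrastructureException extends LookupException {
    PeppolInfrastructureException(String message) { super(message); }
}

// Hypothetical enum mirroring the DNS failure result codes named above.
enum DnsFailure { HOST_NOT_FOUND, TYPE_NOT_FOUND, TRY_AGAIN, UNRECOVERABLE }

class DnsResultMapper {
    // Returns the exception a locator would throw for the given DNS failure.
    static LookupException toException(DnsFailure failure, String hostname) {
        switch (failure) {
            case HOST_NOT_FOUND:
            case TYPE_NOT_FOUND:
                // The record does not exist: a resource/configuration problem.
                return new PeppolResourceException(
                        "No DNS record for " + hostname + " (" + failure + ")");
            case TRY_AGAIN:
            case UNRECOVERABLE:
                // The resolver itself failed: an infrastructure problem.
                return new PeppolInfrastructureException(
                        "DNS lookup failed for " + hostname + " (" + failure + ")");
            default:
                throw new IllegalArgumentException("Unknown DNS result: " + failure);
        }
    }
}
```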

HTTP layer (ApacheFetcher, UrlFetcher):

  • 404 -> PeppolResourceException
  • 500/502/503/504 -> PeppolInfrastructureException
  • Socket/timeout/DNS errors -> NetworkFailureException
  • FileNotFoundException is no longer thrown anywhere
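
The HTTP mapping above, sketched without any HTTP client dependency. `FetchFailureMapper` is a hypothetical helper, not a class in this PR. Falling back to the base `LookupException` for statuses and I/O errors that the list does not mention is an assumption of this sketch.

```java
// Stub exception types so the sketch compiles on its own (constructors assumed).
class LookupException extends Exception {
    LookupException(String message, Throwable cause) { super(message, cause); }
}
class PeppolResourceException extends LookupException {
    PeppolResourceException(String message, Throwable cause) { super(message, cause); }
}
class PeppolInfrastructureException extends LookupException {
    PeppolInfrastructureException(String message, Throwable cause) { super(message, cause); }
}
class NetworkFailureException extends LookupException {
    NetworkFailureException(String message, Throwable cause) { super(message, cause); }
}

class FetchFailureMapper {
    // Maps a non-success HTTP status to the exception type listed above.
    static LookupException forStatus(int status, String url) {
        switch (status) {
            case 404:
                return new PeppolResourceException("Not found: " + url, null);
            case 500:
            case 502:
            case 503:
            case 504:
                return new PeppolInfrastructureException(
                        "Server error " + status + " for " + url, null);
            default:
                // Assumption: unlisted statuses stay a plain LookupException.
                return new LookupException(
                        "Unexpected status " + status + " for " + url, null);
        }
    }

    // Maps socket, timeout, and DNS failures raised during the fetch.
    static LookupException forIoFailure(java.io.IOException e, String url) {
        if (e instanceof java.net.SocketTimeoutException
                || e instanceof java.net.SocketException      // includes ConnectException
                || e instanceof java.net.UnknownHostException) {
            return new NetworkFailureException("Network failure fetching " + url, e);
        }
        // Assumption: other I/O errors stay a plain LookupException.
        return new LookupException("I/O failure fetching " + url, e);
    }
}
```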

LookupClient:

  • Removed catch(FileNotFoundException) blocks
  • Retained existing API contracts - backwards compatible.
  • All other exceptions propagate with correct type preserved
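
Since the exception type is preserved up to the caller, a caller could use it to drive a short-window retry. The following is only a sketch of that idea: `LookupCall` and `RetryingLookup` are hypothetical names, there is no backoff, and treating `PeppolResourceException` as not retryable within the window is one possible policy rather than something this PR enforces.

```java
// Stub exception types so the sketch compiles on its own (constructors assumed).
class LookupException extends Exception {
    LookupException(String message) { super(message); }
}
class PeppolResourceException extends LookupException {
    PeppolResourceException(String message) { super(message); }
}
class PeppolInfrastructureException extends LookupException {
    PeppolInfrastructureException(String message) { super(message); }
}
class NetworkFailureException extends LookupException {
    NetworkFailureException(String message) { super(message); }
}

// Hypothetical functional interface wrapping one lookup attempt.
interface LookupCall<T> {
    T run() throws LookupException;
}

class RetryingLookup {
    // Retries infrastructure and network failures up to maxAttempts times.
    // Resource failures and plain LookupExceptions propagate immediately.
    static <T> T withRetry(LookupCall<T> call, int maxAttempts) throws LookupException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        LookupException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.run();
            } catch (PeppolInfrastructureException | NetworkFailureException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // all attempts failed with a retryable error
    }
}
```

A caller that prefers to re-queue resource failures on a longer schedule can still do so by catching `PeppolResourceException` outside `withRetry`.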

Design Notes

  • Extracted the lookup logic into individual private helper methods for readability
  • Lazy initialization of HttpClient for resource optimization

Test Coverage

  • noSML/noSMP tests now throw PeppolResourceException
  • BdxlLocatorTest / BusdoxLocatorTest: DNS result code -> exception type mapping
  • ApacheFetcherTest: HTTP status code -> exception type mapping (WireMock)
  • FileNotFoundException regression guards across all layers

Type of Pull Request

  • New feature/Enhancement - non-breaking change which adds functionality
  • Bug fix
  • Breaking change (Require Major version change?)

Type of Change

  • OpenPeppol AS2/AS4 specification
  • OpenPeppol Spring/Fall release
  • Oxalis software change or enhancement
  • CEF change

Pull Request Checklist:

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas, but did not add unnecessary annotations/comments such as @author name
  • I have checked my code for variable and method name and corrected grammar/spelling mistakes if any
  • I have made corresponding changes to the documentation where needed
  • My changes generate no new/additional warnings
  • My change is not breaking or creating conflict with associated dependencies
  • I have performed a self-review of my own code
  • I ran mvn clean install before commit and all tests run successfully
  • I conducted basic QA to assure all features are working fine
  • My pull request generates no conflicts with the master branch
  • I requested code review from other team members

@harsha-amarasiri harsha-amarasiri force-pushed the fix/586-lookup-client-exceptions branch from 9673be5 to 1d39cdf Compare February 17, 2026 12:03
@dladlk
Collaborator

dladlk commented Feb 17, 2026

Did you notice the error at the ELMA SMP yesterday? It returned a 404 code, but it was actually a Tomcat exception, which was fixed this morning. The world is not ideal.

So even if SMP says 404 - it DOES NOT mean permanent error. Actually, only NotFoundException can be considered "permanent" - because in 15 minutes it can actually be published.

Everything is temporary :) Never give up and always retry :)

@harsha-amarasiri
Author

Hi @dladlk ,

Thanks for the feedback. Sorry I did not know about the ELMA issue.

I was looking at it from the perspective of retrying / re-queuing within a short time window, for resiliency. So "permanent" actually meant unavailable within that window, until the config changes are made and published. I understand the wording is ambiguous and can be misinterpreted. I will improve the documentation and comments to be more precise.

Is there a design/implementation level concern that you see?

Resolves #586, relates to Oxalis-AS4#158, #497, #666, vefa-peppol#56.

Production monitoring indicates that lookup/delivery failures were thrown as a generic LookupException, some pointing to FileNotFoundException, making it difficult to distinguish permanent failures from transient ones. As a result, implementing or attempting retry is challenging.

This change introduces a domain-specific exception taxonomy:
Exceptions are derived from the discussion in Oxalis-AS4#158
  LookupException (base — existing, backward compatible)
  ├── PeppolResourceException — permanent, do not retry
  │   e.g. participant not in SML, SMP returns 404
  ├── PeppolInfrastructureException — transient, retry may help
  │   e.g. SMP 5xx, DNS SERVFAIL, DNS REFUSED
  └── NetworkFailureException — transient, retry may help
      e.g. socket timeout, connection refused, unknown host

DNS layer (BdxlLocator, BusdoxLocator):
- HOST_NOT_FOUND / TYPE_NOT_FOUND → PeppolResourceException
- TRY_AGAIN → PeppolInfrastructureException (with UDP→TCP fallback)
- UNRECOVERABLE → PeppolInfrastructureException

HTTP layer (ApacheFetcher, UrlFetcher):
- 404 → PeppolResourceException
- 500/502/503/504 → PeppolInfrastructureException
- Socket/timeout/DNS errors → NetworkFailureException
- FileNotFoundException is no longer thrown anywhere

LookupClient:
- Removed dead catch(FileNotFoundException) blocks
- PeppolResourceException caught and enriched with receiver/doctype context
- All other exceptions propagate with correct type preserved
@harsha-amarasiri harsha-amarasiri force-pushed the fix/586-lookup-client-exceptions branch from 1d39cdf to 59c2821 Compare February 17, 2026 16:47