@spacebear21
Collaborator

@spacebear21 spacebear21 commented Dec 15, 2025

This PR introduces a unified payjoin-service binary that combines the OHTTP relay and payjoin directory services in one binary, as discussed in https://github.com/orgs/payjoin/discussions/775 and tracked in #941.

This approach refactors the relay & directory as tower Services, and simply routes requests to those existing services based on URL path discrimination, to introduce payjoin-service with minimal code changes. The idea is to merge this PR ~as-is as a foundation, then fold individual components into payjoin-service one by one in follow ups.

Much of this PR was written by Claude, with close supervision and scrutiny by me. I smoke-tested the binary with some basic curl to ensure it was routing properly, and updated payjoin-test-utils directory and relay test services to instances of payjoin-service. This way, we'll have some regression testing already in place as we fold more components into payjoin-service.

Pull Request Checklist

Please confirm the following before requesting review:

@coveralls
Collaborator

coveralls commented Dec 15, 2025

Pull Request Test Coverage Report for Build 21080158958

Details

  • 240 of 379 (63.32%) changed or added relevant lines in 12 files are covered.
  • 40 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.8%) to 82.301%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| payjoin-directory/src/main.rs | 0 | 1 | 0.0% |
| payjoin-test-utils/src/lib.rs | 32 | 33 | 96.97% |
| ohttp-relay/src/sentinel.rs | 30 | 33 | 90.91% |
| ohttp-relay/src/main.rs | 0 | 7 | 0.0% |
| payjoin-service/src/main.rs | 0 | 12 | 0.0% |
| ohttp-relay/src/lib.rs | 54 | 70 | 77.14% |
| payjoin-service/src/config.rs | 0 | 16 | 0.0% |
| payjoin-service/src/lib.rs | 65 | 83 | 78.31% |
| payjoin-directory/src/lib.rs | 46 | 70 | 65.71% |
| ohttp-relay/src/bootstrap/ws.rs | 0 | 41 | 0.0% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| ohttp-relay/src/bootstrap/ws.rs | 2 | 0.0% |
| ohttp-relay/src/lib.rs | 7 | 73.37% |
| payjoin-directory/src/lib.rs | 31 | 49.9% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 21044982881: | -0.8% |
| Covered Lines: | 9914 |
| Relevant Lines: | 12046 |

💛 - Coveralls

@nothingmuch
Collaborator

concept ACK.

IMO we should also let go of all of the config/cli boilerplate coming from the directory (which in turn borrows from payjoin-cli), since that mainly grew with not breaking compatibility in mind. A clean slate approach would be simpler and cleaner, and we can take our time with it.

I think main.rs can be removed entirely from this PR, focusing on making this a library crate that can be cleanly used in our test utils crate.

Then for deployment, a minimal main.rs can be added in a followup PR, tailored for container usage only.

The majority of the boilerplate in the existing code arises from the interface between clap and config. Changing this code in a way that actually makes sense is tricky because of the many interactions between the different layers, but the CLI API was mainly designed around the existing environment variables.

So I guess the next thing is to figure out how to configure such a minimal main.rs. The config crate with only a mandatory config file is easiest on the Rust side but may be a hassle to set up with e.g. docker compose. The config crate using only environment variables may be a better fit.
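
To make the environment-variable option concrete, here is a minimal std-only sketch. The variable names `PJ_LISTEN_ADDR` and `PJ_STORAGE_DIR`, the `Config` fields, and the default address are all hypothetical; the thread does not define them, and the `config` crate mentioned above is deliberately not used here.

```rust
use std::net::SocketAddr;

/// Hypothetical settings for a minimal service binary.
#[derive(Debug, PartialEq)]
pub struct Config {
    pub listen_addr: SocketAddr,
    pub storage_dir: String,
}

#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Missing(&'static str),
    Invalid(&'static str),
}

impl Config {
    /// Builds the config from any key lookup, so tests can pass a closure
    /// instead of mutating the process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Result<Self, ConfigError> {
        // PJ_LISTEN_ADDR (hypothetical) is optional and has a placeholder default.
        let listen_addr = lookup("PJ_LISTEN_ADDR")
            .unwrap_or_else(|| "0.0.0.0:8080".to_string())
            .parse::<SocketAddr>()
            .map_err(|_| ConfigError::Invalid("PJ_LISTEN_ADDR"))?;
        // PJ_STORAGE_DIR (hypothetical) is mandatory in this sketch.
        let storage_dir =
            lookup("PJ_STORAGE_DIR").ok_or(ConfigError::Missing("PJ_STORAGE_DIR"))?;
        Ok(Config { listen_addr, storage_dir })
    }

    /// Reads the same keys from the process environment.
    pub fn from_env() -> Result<Self, ConfigError> {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```

With this shape a container only needs `environment:` entries in its compose file, and a missing mandatory variable fails at startup with the name of the offending key.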

And we should probably look at how other projects handle this stuff.

@spacebear21 spacebear21 force-pushed the unified-payjoin-service branch 3 times, most recently from d6fe5ef to dc01a20 Compare December 16, 2025 22:01
@spacebear21 spacebear21 marked this pull request as ready for review December 16, 2025 22:17
@spacebear21
Collaborator Author

Good points about the cli boilerplate. Instead of removing main.rs entirely I stripped it to a minimal binary which accepts a single optional argument for the config file path (inspired by cdk-mintd).
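
A rough sketch of what "a single optional argument for the config file path" could look like without any CLI framework. The default path `config.toml`, the function name, and the choice to reject extra arguments are assumptions for illustration, not taken from the PR.

```rust
use std::path::PathBuf;

/// Placeholder default; the real default path is not stated in the thread.
pub const DEFAULT_CONFIG_PATH: &str = "config.toml";

/// Picks the config path from the argument list (program name already skipped).
/// Returns a usage message if more than one argument is given.
pub fn config_path_from_args<I>(mut args: I) -> Result<PathBuf, String>
where
    I: Iterator<Item = String>,
{
    // No argument: fall back to the default path.
    let path = args.next().unwrap_or_else(|| DEFAULT_CONFIG_PATH.to_string());
    // Anything after the first argument is treated as a usage error.
    if args.next().is_some() {
        return Err("usage: payjoin-service [CONFIG_PATH]".to_string());
    }
    Ok(PathBuf::from(path))
}
```

A `main` function would call it as `config_path_from_args(std::env::args().skip(1))` and hand the resulting path to whatever loads the configuration.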

Since your last review, the latest push contains the above change, moves all the core logic to lib.rs, adds manual TLS support for tests, and replaces the payjoin-directory + ohttp-relay direct dependencies in test utils with two instances of payjoin-service.

@spacebear21 spacebear21 requested a review from zealsham December 16, 2025 22:22
@zealsham
Collaborator

are we opting to keep hyper @nothingmuch ? asking because the initial plan was to move away from hyper and do all of our http stuff with axum

@nothingmuch
Collaborator

> are we opting to keep hyper @nothingmuch ? asking because the initial plan was to move away from hyper and do all of our http stuff with axum

@spacebear21 said that the challenges with hyper/tower service traits were easy enough to fix, so the original motivation for short cutting to axum might not be as strong as we thought (recall that originally i thought it would be easier to first use only the tower service trait and then integrate axum, but that was causing some headaches)

IMO there's no need to keep hyper: it's more low level than we need and doesn't have a native concept of routers, which forces us to have more boilerplate. But if still allowing the hyper.Service to be in use is easy and doesn't get in the way (which kinda makes sense since axum uses it under the hood anyway, and both in turn rely on tower's service trait IIRC) then we don't need to remove it

@spacebear21
Collaborator Author

@zealsham Indeed, instead of rewriting all the routing from scratch in axum I figured we could keep the existing logic in payjoin-directory/ohttp-relay mostly untouched while we introduce payjoin-service for the unified binary. In follow-ups we can rewrite individual components as tower services in chunks as we see fit, and softly deprecate hyper. For example we might want to start with a metrics service via axum middleware, replacing the current payjoin-directory /metrics endpoint. It should also be possible to do this work in parallel with other components.

Collaborator Author

@spacebear21 spacebear21 left a comment

Comments from mob programming session. Small fixes will go in immediately (including blocking self requests with HMAC of the body) and we will follow up in future PRs.


#[derive(Clone)]
pub struct Service {
config: Arc<RelayConfig>,
Collaborator Author

idiomatic approach: make RelayConfig public and make it a Builder pattern. Rename RelayConfig to InternalConfig, use RelayConfig for the Builder that builds that type.
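
One possible reading of this suggestion, sketched with std only. The fields `default_gateway` and `max_body_bytes`, the 65,536-byte default, and the error type are hypothetical; only the names `RelayConfig`, `InternalConfig`, and `Service` come from the comment and the snippet above.

```rust
use std::sync::Arc;

/// Internal, validated configuration (the role `RelayConfig` plays today).
#[derive(Debug)]
struct InternalConfig {
    default_gateway: String, // hypothetical field
    max_body_bytes: usize,   // hypothetical field
}

/// Public builder that takes over the `RelayConfig` name.
#[derive(Debug, Default)]
pub struct RelayConfig {
    default_gateway: Option<String>,
    max_body_bytes: Option<usize>,
}

impl RelayConfig {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn default_gateway(mut self, uri: impl Into<String>) -> Self {
        self.default_gateway = Some(uri.into());
        self
    }

    pub fn max_body_bytes(mut self, limit: usize) -> Self {
        self.max_body_bytes = Some(limit);
        self
    }

    /// Validates the settings and freezes them behind an `Arc`.
    pub fn build(self) -> Result<Service, &'static str> {
        let default_gateway = self.default_gateway.ok_or("default_gateway is required")?;
        let max_body_bytes = self.max_body_bytes.unwrap_or(65_536); // placeholder default
        Ok(Service { config: Arc::new(InternalConfig { default_gateway, max_body_bytes }) })
    }
}

#[derive(Clone, Debug)]
pub struct Service {
    config: Arc<InternalConfig>,
}

impl Service {
    pub fn default_gateway(&self) -> &str {
        &self.config.default_gateway
    }

    pub fn max_body_bytes(&self) -> usize {
        self.config.max_body_bytes
    }
}
```

Callers would write something like `RelayConfig::new().default_gateway("https://gateway.example").build()`, and `InternalConfig` stays private so its layout can change without breaking the public API.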

}
}

/// Routes incoming requests to ohttp-relay or payjoin-directory based on the path.
Collaborator Author

We could have the gateway enforce that the relay isn't itself with an HMAC header.

@nothingmuch
Collaborator

nothingmuch commented Jan 7, 2026

my mob review conclusion: concept ACK, made some minor requests for changes in the call (posted by spacebear):

  • handling of / path (GET -> directory, POST -> relay)
  • remove authority / short ID distinction as it is irrelevant
  • Debug trait bound for req type, we should only have 2 concrete ones and both should be Debug (Incoming and boxed Bytes)
  • best effort self loop detection with sentinel header
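
The first request in the list (root path: GET goes to the directory, POST goes to the relay) can be sketched as a small pure function. The `Backend` enum and `route_root` name are hypothetical, and the sketch only covers the `/` rule; every other path returns `None` here, standing in for the remaining path-based rules that this thread does not spell out.

```rust
/// Which sub-service should handle a request (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    Relay,
    Directory,
}

/// Applies only the root-path rule from the review notes.
/// `None` means "not decided here": another routing rule must handle it.
pub fn route_root(method: &str, path: &str) -> Option<Backend> {
    match (method, path) {
        ("GET", "/") => Some(Backend::Directory),
        ("POST", "/") => Some(Backend::Relay),
        _ => None,
    }
}
```

Keeping the decision in a function of `(method, path)` makes it testable without constructing real HTTP requests.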

@spacebear21 spacebear21 force-pushed the unified-payjoin-service branch 2 times, most recently from af351cd to 7eb51af Compare January 15, 2026 00:57
This is a preparatory refactor ahead of introducing the unified
`payjoin-service`.

This makes ohttp relay modular as a tower Service, in preparation for
the unified payjoin-service.

Ensure that if a GATEWAY_URI is set, it must point to the hardcoded
default gateway. This ensures backwards-compatibility for existing
payjoin implementations. Alternate gateways can still be specified in
incoming requests via gateway opt-in.

Accept any Body implementation instead of only hyper::body::Incoming.
This enables integration with axum and other frameworks that use
different body types.

Accept any Body implementation instead of only hyper::body::Incoming.
This enables integration with axum and other frameworks that use
different body types.

Replace hyper_tungstenite with manual WebSocket upgrade handling
since hyper_tungstenite::upgrade() requires Request<Incoming>.
The generic hyper::upgrade::on() combined with tokio_tungstenite
provides equivalent functionality with generic body support.

This introduces the payjoin-service binary crate, which lives outside of
the workspace for now to enable independent testing and Cargo.lock
changes without causing conflicts.

To introduce payjoin-service, it can simply route requests to the
ohttp-relay or payjoin-directory sub-services based on URL path
discrimination. Individual components (e.g. health checks, metrics, ...)
can then be migrated to axum in follow-ups, and use `tower` middleware
where appropriate to reduce boilerplate.

This replaces the direct dependencies on ohttp-relay and
payjoin-directory with a dependency on payjoin-service. The test
services still spin up two instances of the payjoin-service to simulate
a relay and directory running on isolated infrastructure.
@spacebear21 spacebear21 force-pushed the unified-payjoin-service branch 4 times, most recently from 4ad90dc to aeab555 Compare January 16, 2026 18:49
OHTTP privacy guarantees rely on the assumption that the relay and
gateway operate independently from each other. The payjoin service runs
both OHTTP relay and the directory gateway in the same process, so we
make a best-effort attempt to detect requests from the same instance via
a sentinel header. This check eliminates a potential footgun for service
operators and payjoin implementers.
@spacebear21 spacebear21 force-pushed the unified-payjoin-service branch from aeab555 to 89351fc Compare January 16, 2026 20:38
Collaborator

@nothingmuch nothingmuch left a comment

only reviewed the last commit for self loop detection, i think the API surface can be shrunk quite a bit without losing any desired behavior

use hex::{DisplayHex, FromHex};

/// HTTP header name for the sentinel tag.
pub const HEADER_NAME: &str = "x-pj-sentinel";
Collaborator

bikeshedding: maybe "x-ohttp-self-loop-tag" is more descriptive of what this is for?


/// Generate random sentinel tag at startup.
/// The relay and directory share this tag in a best-effort attempt
/// at preventing collusion from the same instance.
Collaborator

this doesn't actually prevent collusion so i think just "at detecting self loops" is enough?

db: D,
ohttp: ohttp::Server,
metrics: Metrics,
sentinel_tag: Option<SentinelTag>,
Collaborator

i would prefer if this was unconditional, and either a tag of all 0s or preferably a random tag was generated instead, in order to reduce the conditional nesting a little in the handling

db: D,
ohttp: ohttp::Server,
metrics: Metrics,
sentinel_tag: Option<SentinelTag>,
Collaborator

if the internal tag is non optional, this could still be optional to allow overriding with a specific tag for testing, but i think i would prefer if we just eliminated the argument entirely

a test that ensures self loops are detected and rejected can confirm the desired behavior without setting this value or saying anything about the loop detection mechanism, and a white-box test for the mechanism can just rewrite the request and ensure that that is still allowed, so this can be eliminated from the API

let path_segments: Vec<&str> = path.split('/').collect();
debug!("Service::serve_request: {:?}", &path_segments);

let sentinel_header = parts
Collaborator

why not check the header here?

i think any self loop can be rejected even if it isn't destined for the OHTTP gateway, there is no valid reason to handle a self looping request as far as i can tell

this way handle_ohttp_gateway can remain agnostic of this detail
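
A sketch of the structure being proposed: the sentinel check runs once at the top of the request handler, before any path dispatch, so the gateway handler never sees the header. The header name comes from the snippet under review; `Outcome`, `GATEWAY_PATH`, the simplified `serve_request` signature, and the use of a plain `HashMap` for headers are hypothetical stand-ins for the real types.

```rust
use std::collections::HashMap;

/// Header name as it appears in the code under review.
pub const HEADER_NAME: &str = "x-pj-sentinel";

/// Hypothetical path standing in for the OHTTP gateway endpoint.
pub const GATEWAY_PATH: &str = "/hypothetical-gateway";

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    RejectedSelfLoop,
    Gateway,
    Other,
}

/// Simplified stand-in for `serve_request`: reject self loops first,
/// then dispatch on the path.
pub fn serve_request(headers: &HashMap<String, String>, path: &str, own_tag: &str) -> Outcome {
    // Any request carrying our own tag is rejected, whatever its destination.
    if headers.get(HEADER_NAME).map(String::as_str) == Some(own_tag) {
        return Outcome::RejectedSelfLoop;
    }
    match path {
        GATEWAY_PATH => handle_ohttp_gateway(),
        _ => Outcome::Other,
    }
}

/// The gateway handler stays agnostic of the sentinel mechanism.
fn handle_ohttp_gateway() -> Outcome {
    Outcome::Gateway
}
```

With the check hoisted like this, a black-box test can assert that a looped request is rejected on any path, without referring to the header at all.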

///
/// Note that incoming requests should be **rejected** when this function returns `Ok(true)`,
/// as that would indicate the relay and gateway are the same instance.
pub fn verify(tag: &SentinelTag, header_value: &str) -> Result<bool, InvalidHeader> {
Collaborator

this is no longer verifying an HMAC so should probably be renamed to is_recognized or check_self_loop maybe?

/// Note that incoming requests should be **rejected** when this function returns `Ok(true)`,
/// as that would indicate the relay and gateway are the same instance.
pub fn verify(tag: &SentinelTag, header_value: &str) -> Result<bool, InvalidHeader> {
let header_bytes = <[u8; 32]>::from_hex(header_value).map_err(|_| InvalidHeader)?;
Collaborator

if instead of parsing the header, the tag was just stored already encoded in its internal representation then it could be compared directly which can simplify the return type to just a bool
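
A minimal sketch of that idea, assuming the tag is 32 bytes rendered as lowercase hex (as the `<[u8; 32]>::from_hex` call above suggests). The method names `from_bytes`, `header_value`, and `is_self_loop` are hypothetical, and real code would fill the bytes from a random source at startup rather than take them as input.

```rust
/// Per-instance tag, stored in its header (hex) representation.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct SentinelTag(String);

impl SentinelTag {
    /// Encodes the raw bytes once, at construction time.
    pub fn from_bytes(bytes: [u8; 32]) -> Self {
        SentinelTag(bytes.iter().map(|b| format!("{b:02x}")).collect())
    }

    /// Value the relay side would put in the sentinel header.
    pub fn header_value(&self) -> &str {
        &self.0
    }

    /// True when the incoming header carries this instance's own tag.
    /// A malformed header simply compares unequal, so no error type is needed.
    pub fn is_self_loop(&self, header_value: &str) -> bool {
        self.0 == header_value
    }
}
```

Because no parsing happens on the request path, the `InvalidHeader` error and the `Result<bool, _>` return type both disappear. One consequence of plain string equality is that an uppercase rendering of the same bytes would not match, which seems acceptable when the same process both writes and checks the header.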
