Implement Bulk Submit data provider (submitter) functionality #2556

@johngrimes

Description

Pathling currently implements the Data Warehouse (recipient) side of the Argonaut $bulk-submit specification. This issue tracks implementing the Data Provider (submitter) side — enabling Pathling to push its data to a remote server's $bulk-submit endpoint.

This would complement the existing $export operation: Pathling could not only export data on demand but also actively submit it to a configured Data Recipient.

Scope

The submitter implementation should handle the full submission lifecycle:

  1. Export data — Produce bulk export manifests and their NDJSON data files from Pathling's data warehouse, leveraging the existing $export infrastructure.
  2. Submit manifests — Send one or more $bulk-submit requests with submissionStatus: in-progress and manifestUrl to the Data Recipient.
  3. Mark complete — Send a $bulk-submit request with submissionStatus: complete once all manifests have been submitted.
  4. Poll status — Use $bulk-submit-status to monitor processing progress and retrieve results (including errors).
  5. Handle errors — Parse the status manifest error section and surface issues appropriately.
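
As a rough illustration of steps 2 and 3, the request bodies for one submission could be assembled as FHIR Parameters resources along these lines. This is only a sketch: submissionStatus, manifestUrl and replacesManifestUrl are named in this issue, but the submissionId parameter, the choice of valueCoding and valueString, and the function names are assumptions that should be checked against the $bulk-submit specification.

```python
from typing import Any, Optional

ALLOWED_STATUSES = ("in-progress", "complete", "aborted")


def bulk_submit_parameters(
    submission_id: str,
    status: str,
    manifest_url: Optional[str] = None,
    replaces_manifest_url: Optional[str] = None,
) -> dict[str, Any]:
    """Build a FHIR Parameters body for a single $bulk-submit request."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unsupported submissionStatus: {status}")
    parameters: list[dict[str, Any]] = [
        # Assumed parameter name that ties the requests of one submission together.
        {"name": "submissionId", "valueString": submission_id},
        # Assumed to be carried as a Coding; the specification is authoritative.
        {"name": "submissionStatus", "valueCoding": {"code": status}},
    ]
    if manifest_url is not None:
        parameters.append({"name": "manifestUrl", "valueString": manifest_url})
    if replaces_manifest_url is not None:
        parameters.append(
            {"name": "replacesManifestUrl", "valueString": replaces_manifest_url}
        )
    return {"resourceType": "Parameters", "parameter": parameters}


def submission_requests(
    submission_id: str, manifest_urls: list[str]
) -> list[dict[str, Any]]:
    """One in-progress request per manifest, followed by a final complete request."""
    requests = [
        bulk_submit_parameters(submission_id, "in-progress", url)
        for url in manifest_urls
    ]
    requests.append(bulk_submit_parameters(submission_id, "complete"))
    return requests
```

Keeping the body construction separate from the HTTP client would make the lifecycle easy to unit test, and the same builder could produce the aborted and replacement requests listed under the additional considerations.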

Additional considerations

  • Authentication — Support OAuth 2.0 client credentials for authenticating with the Data Recipient's $bulk-submit endpoint (both symmetric and asymmetric/JWT).
  • Manifest hosting — The submitter needs to make its exported files available at URLs the Data Recipient can fetch. This may involve serving files via an HTTP endpoint or uploading to object storage.
  • Retry and resilience — Handle transient failures, HTTP 429 rate limiting (with Retry-After), and network errors during submission and polling.
  • Abort — Support aborting an in-progress submission (submissionStatus: aborted).
  • Replace — Support the replacesManifestUrl parameter for correcting previously submitted manifests.
  • Configuration — Define configuration for target Data Recipient endpoints, submitter identity, and OAuth credentials.
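
For the retry and resilience item, one possible approach is to honour Retry-After when the Data Recipient sends it and fall back to capped exponential backoff otherwise. The sketch below only computes the wait time. The function name, the base delay and the cap are illustrative assumptions rather than anything defined by Pathling or the specification.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_delay_seconds(
    retry_after: Optional[str],
    attempt: int,
    now: Optional[datetime] = None,
    base: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Return how long to wait before retrying a failed or rate-limited request."""
    if retry_after:
        value = retry_after.strip()
        # Retry-After may be a non-negative integer number of seconds...
        if value.isdigit():
            return float(value)
        # ...or an HTTP-date, in which case we wait until that instant.
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None  # Unparseable header: fall through to backoff.
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            current = now or datetime.now(timezone.utc)
            return max((when - current).total_seconds(), 0.0)
    # No usable header: exponential backoff (base * 2^attempt), capped.
    return min(base * (2 ** attempt), cap)
```

The same helper could serve both submission and status polling. Adding random jitter to the backoff branch would be a reasonable refinement if several submissions run concurrently.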

Metadata

Assignees: No one assigned
Labels: new feature, server
Type: No type
Project status: Backlog
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests