
Commit 1f5250c

docs: integration of more detailed documentation
1 parent 58bd13b commit 1f5250c

8 files changed: +355 -73 lines changed

.github/workflows/docs.yml

Lines changed: 28 additions & 0 deletions

@@ -0,0 +1,28 @@
name: Deploy docs to GitHub Pages

on:
  push:
    branches:
      - main # or 'master' if that's your default branch

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install mkdocs mkdocs-material mkdocs-jupyter mkdocs-mermaid2-plugin

      - name: Deploy to GitHub Pages
        run: |
          mkdocs gh-deploy --force

README.md

Lines changed: 8 additions & 70 deletions

@@ -1,76 +1,14 @@
-# APEx Dispatch API (FastAPI)
 
-This repository contains the implementation of the APEx Upscaling Service API using FastAPI.
+# APEx Dispatch API
 
-## Getting Started: Running the API Locally
+The **APEx Dispatch API** provides a service for **executing and upscaling EO-based services** across multiple platforms.
+It is built with **FastAPI** and is designed for scalable, asynchronous processing.
 
-1. **Install dependencies:**
+## 📦 Getting Started
 
-   ```bash
-   pip install -r requirements.txt
-   ```
+Learn how to get started [here](docs/getting_started.md).
 
-2. **Configure environment variables:**
+## 🤝 Contributing
 
-   Create a `.env` file and set your environment variables accordingly (e.g., `DATABASE_URL`).
-
-3. **Set up the database:**
-
-   Follow the [Database Setup](#database-setup) instructions below to prepare your local PostgreSQL instance.
-
-4. **Run the FastAPI application:**
-
-   ```bash
-   uvicorn app.main:app --reload
-   ```
-
-## Running Tests
-
-Execute the test suite using:
-
-```bash
-pytest
-```
-
-## Database Setup
-
-1. **(Optional) Create a Docker volume to persist PostgreSQL data:**
-
-   ```bash
-   docker volume create local-postgres-data
-   ```
-
-2. **(Optional) Inspect the volume mount point:**
-
-   ```bash
-   docker volume inspect local-postgres-data
-   ```
-
-   This shows the physical location of your data on the host machine.
-
-3. **Start a PostgreSQL container linked to the volume:**
-
-   ```bash
-   docker run -d --name postgres -p 5432:5432 \
-     -e POSTGRES_USER=testuser \
-     -e POSTGRES_PASSWORD=secret \
-     -e POSTGRES_DB=testdb \
-     -v local-postgres-data:/var/lib/docker/volumes/local-postgres-data \
-     postgres:latest
-   ```
-
-4. **Set your database connection string:**
-
-   Add the following to your `.env.local` (or `.env`) file:
-
-   ```env
-   DATABASE_URL=postgresql+psycopg2://testuser:secret@localhost:5432/testdb
-   ```
-
-5. **Apply database migrations:**
-
-   Make sure your database schema is up-to-date by running:
-
-   ```bash
-   alembic upgrade head
-   ```
+We welcome contributions!
+Please read our [contributing guidelines](docs/contributing.md) before submitting pull requests.

docs/architecture.md

Lines changed: 159 additions & 0 deletions

@@ -0,0 +1,159 @@
# Architecture Overview

The **APEx Dispatch API** acts as a **broker service** that allows clients to trigger job executions on external Earth Observation platforms.
Instead of interacting directly with platform-specific APIs, clients can use the **uniform Dispatch API interface**, while the dispatcher handles the translation and job management.

## Key Concepts

### Dispatch API

The **Dispatch API** is the core component of the system. It acts as the entry point for clients who want to execute jobs or perform upscaling tasks. When a job request is submitted, the dispatcher translates it into a **platform-specific request**, based on standards such as [openEO](https://openeo.org/) or [OGC API – Processes](https://ogcapi.ogc.org/processes/), and sends it to an existing EO platform such as CDSE or the Geohazards Exploitation Platform. Beyond handling the translation, the dispatcher is also responsible for storing all relevant job information, including metadata and references to the external platform where the job is ultimately executed.

```mermaid
flowchart LR
    C["Client"] --> D["APEx Dispatch API"]
    D -.-> C
    D --> P1["Platform 1"]
    P1 -.-> D
    D --> P2["Platform 2"]
    P2 -.-> D
    D --> P3["Platform 3"]
    P3 -.-> D
```
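
The brokering and record keeping described above can be pictured with a small sketch. This is illustrative only: the `Dispatcher` and `JobRecord` names, their fields, and the in-memory registry are assumptions made for this example, not the project's actual classes.

```python
import itertools
import uuid
from dataclasses import dataclass, field


@dataclass
class JobRecord:
    """Internal record the dispatcher could keep for every forwarded job (illustrative)."""
    internal_id: str
    service: str
    platform: str
    platform_job_id: str
    status: str = "submitted"
    parameters: dict = field(default_factory=dict)


class Dispatcher:
    """Toy broker: forwards uniform job requests and tracks the external references."""

    def __init__(self) -> None:
        self._records: dict[str, JobRecord] = {}
        self._counter = itertools.count(1)

    def submit(self, platform: str, service: str, parameters: dict) -> JobRecord:
        # In the real service, this step would translate the uniform request into an
        # openEO or OGC API – Processes call against the chosen platform.
        platform_job_id = f"{platform}-job-{next(self._counter)}"
        record = JobRecord(
            internal_id=str(uuid.uuid4()),
            service=service,
            platform=platform,
            platform_job_id=platform_job_id,
            parameters=parameters,
        )
        # The internal ID is the client's single point of reference; the platform
        # job ID is only used when talking to the external platform.
        self._records[record.internal_id] = record
        return record


dispatcher = Dispatcher()
job = dispatcher.submit("CDSE", "sentinel2-ndvi", {"bbox": [5.0, 51.0, 5.5, 51.5]})
print(job.internal_id, "->", job.platform_job_id)
```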

### Processing Job Execution

When a client wants to perform a task, it submits a job to the Dispatch API. A job request typically contains two main pieces of information: the service that needs to be executed and the parameters required for that service. Once received, the dispatcher forwards this request to the chosen external platform, which carries out the execution. In response, the platform provides a job identifier, which the dispatcher records internally to keep track of the execution.

After a job has been submitted and forwarded to an external platform, the dispatcher maintains an internal record of it. This record includes a unique internal job identifier, which the client can use for reference, as well as the mapping to the external platform’s job ID. Additional metadata, such as the job status, the creation timestamp, and the parameters used during submission, is also stored. This internal tracking mechanism ensures that the client has a single point of reference for all jobs, regardless of where they are executed.

```mermaid
sequenceDiagram
    participant UI as Client
    box APEx
        participant API as APEx Dispatch API
    end
    box Platform
        participant Platform as API (openEO / OGC API Process)
    end

    UI->>API: POST /unit_jobs

    API->>API: Create processing job
    API->>Platform: Submit processing job
    Platform-->>API: Return platform job ID
    API->>API: Store platform job ID
    API->>API: Set job status as "submitted"

    API-->>UI: Return processing job summary
```
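
As an illustration of the flow above, a client submission could look like the sketch below. Only the `/unit_jobs` path comes from the diagram; the base URL, payload fields, and response shape are assumptions, not the documented API contract.

```python
import httpx

# Hypothetical payload: the exact field names are not defined in this document.
job_request = {
    "service": "sentinel2-ndvi",   # service to execute (illustrative)
    "process_type": "openeo",      # target platform type (illustrative)
    "parameters": {
        "bbox": [5.0, 51.0, 5.5, 51.5],
        "start_date": "2024-01-01",
        "end_date": "2024-01-31",
    },
}

response = httpx.post("http://localhost:8000/unit_jobs", json=job_request)
response.raise_for_status()

# The dispatcher returns a processing job summary, including its internal job ID
# that maps to the platform job ID it stored.
print(response.json())
```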

### Upscaling Task Execution

In addition to individual job submissions, the dispatcher also supports **upscaling** activities. In this case, a client submits a request that includes not just the target service and execution parameters, but also a **parameter dimension with multiple values**. The dispatcher uses this information to generate multiple job requests, each corresponding to one value in the parameter dimension, and forwards them to the external platform. From the client’s perspective, however, this entire batch of jobs is managed as a single **upscaling task**. The dispatcher keeps track of the execution of all related jobs and exposes them as part of one unified task, simplifying monitoring and retrieval for the user.

```mermaid
sequenceDiagram
    participant UI as Client
    box APEx
        participant API as APEx Dispatch API
    end
    box Platform
        participant Platform as API (openEO / OGC API Process)
    end

    UI->>API: POST /upscale_tasks

    API->>API: Create upscaling task

    loop For each job of upscaling task
        API->>API: Create processing job
        API->>Platform: Submit processing job
        Platform-->>API: Return platform job ID
        API->>API: Store platform job ID
        API->>API: Set job status as "submitted"
    end

    API-->>UI: Return upscaling task summary
```
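
Continuing the same illustration, an upscaling request would additionally carry a parameter dimension with multiple values. As before, only the `/upscale_tasks` path comes from the diagram; the field names below, including the `dimension` object, are assumptions made for the example.

```python
import httpx

# Hypothetical payload: one parameter dimension ("tile") with several values, which
# the dispatcher would expand into one processing job per value.
upscale_request = {
    "service": "sentinel2-ndvi",
    "process_type": "openeo",
    "parameters": {"start_date": "2024-01-01", "end_date": "2024-01-31"},
    "dimension": {"name": "tile", "values": ["31UFS", "31UGS", "31UFT"]},
}

response = httpx.post("http://localhost:8000/upscale_tasks", json=upscale_request)
response.raise_for_status()

# The response summarises a single upscaling task that groups the generated jobs.
print(response.json())
```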

### Status Retrieval

To check the progress of their jobs and upscaling tasks, clients use a single status endpoint exposed by the Dispatch API. When such a request arrives, the dispatcher looks up the corresponding external job reference stored in its internal records. It then queries the external platform to obtain the most up-to-date status. This status information is returned to the client, allowing them to monitor their job execution transparently through the dispatcher without needing to interact with the external platform directly.

```mermaid
sequenceDiagram
    participant UI as Client
    box APEx
        participant API as APEx Dispatch API
    end
    box Platform
        participant Platform as API (openEO / OGC API Process)
    end

    UI->>+API: Set up websocket to /job_status

    loop Every X minutes
        loop For each running processing job
            API->>Platform: Request job status
            Platform-->>API: Send job status
            API->>API: Update job status
        end

        loop For each running upscaling task
            loop For each running processing job in upscaling task
                API->>Platform: Request job status
                Platform-->>API: Send job status
                API->>API: Update job status
            end
            API->>API: Compute upscaling task status
        end
    end
    API-->>-UI: Return summary list of processing jobs and upscaling tasks
```
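
A client might consume these updates over the websocket roughly as sketched below. This uses the third-party `websockets` package purely as an example; the URL, the polling behaviour, and the message format (assumed here to be a JSON summary list) are not specified by this document.

```python
import asyncio
import json

import websockets  # third-party package: pip install websockets


async def watch_status() -> None:
    # Hypothetical URL; the document only names the /job_status websocket endpoint.
    async with websockets.connect("ws://localhost:8000/job_status") as ws:
        while True:
            message = await ws.recv()
            summary = json.loads(message)
            # Assumed message shape: a summary list of processing jobs and upscaling tasks.
            for item in summary:
                print(item.get("id"), item.get("status"))


if __name__ == "__main__":
    asyncio.run(watch_status())
```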

## Authentication and Authorization

Authentication and authorization are critical components of the APEx Dispatch API, as jobs launched through the API result in resource consumption on external platforms. To support remote job execution and manage this resource usage effectively, the project has identified two distinct scenarios:

### APEx Service Account (Current Implementation)

In this scenario, all jobs are executed on the external platforms using a generic APEx service account that has access to them. This means that each job or upscaling task triggered through the API is executed on the platform under the APEx account, rather than under the actual user’s identity. However, the Dispatch API maintains the link between the platform job ID and the user who initiated the request in its database.

```mermaid
flowchart LR
    C["Alice"] -- Request job --> D["APEx Dispatch API"]
    D -- Launch job as user APEx --> P1["Platform"]
```

**Pros:**

* Provides a seamless user experience: users do not need to create or manage platform-specific accounts.
* Simplifies integration for clients using the Dispatch API.

**Cons:**

* For each new platform, a dedicated APEx account must be created and funded appropriately. Estimating the required funding in advance is challenging, especially since the account is shared across all users.
* No user-level auditing or accounting is available. Users can continue triggering jobs as long as the APEx account has sufficient funds. This poses a risk of misuse, potentially leading to service disruption for all users if the APEx account is depleted.
* Implementing safeguards would require advanced accounting features within the Dispatch API, including translating existing platform business models into a uniform business logic. This adds complexity and may introduce additional costs by layering over existing platform models.

### User Impersonation (Preferred Approach)

The preferred solution is to execute jobs on behalf of the user who initiates the request via the APEx Dispatch API. In this model, all accounting and access control are handled directly by the platform, and users are responsible for maintaining sufficient access and funding, potentially supported through the ESA Network of Resources (NoR).

```mermaid
flowchart LR
    C["Alice"] -- Request job --> D["APEx Dispatch API"]
    D -- Launch job as user Alice --> P1["Platform"]
```

**Pros:**

* No need for custom accounting logic in the APEx Dispatch API, as platforms handle this natively.
* APEx avoids introducing a layer over the platform’s existing business model, preserving operational simplicity.

**Cons:**

* Propagating user identity across platforms is a technical challenge and currently lacks a proven, ready-to-use solution.
* May require modifications on the target platform to support user impersonation, depending on the chosen implementation strategy.

CONTRIBUTE.md renamed to docs/contributing.md

Lines changed: 3 additions & 3 deletions

@@ -1,4 +1,4 @@
-# Contributing to the APEx Dispatch API
+# Contributing
 
 ## Making Contributions
 
@@ -27,7 +27,7 @@ Contributions to the APEx Dispatch API are welcome! If you have suggestions for
 
 ## Registration of a new Platform Implementation
 
-To add a new platform implementation, you will need to create a new class that inherits from the `BaseProcessingPlatform` class located at [`app/platforms/base.py`](app/platforms/base.py). In this new class, you will need to implement all the abstract methods defined in the [`BaseProcessingPlatform`](app/platforms/base.py) class. This will ensure that your new platform implementation adheres to the expected interface and functionality.
+To add a new platform implementation, you will need to create a new class that inherits from the `BaseProcessingPlatform` class located at `app/platforms/base.py`. In this new class, you will need to implement all the abstract methods defined in the `BaseProcessingPlatform` class. This will ensure that your new platform implementation adheres to the expected interface and functionality.
 
 To register the new implementation, it is important to add the following directive right above the class definition:
 
@@ -40,6 +40,6 @@ class OGCAPIProcessPlatform(BaseProcessingPlatform):
     ...
 ```
 
-The processing type, defined by `ProcessTypeEnum`, is the unique identifier for the platform implementation. It is used to distinguish between different platform implementations in the system. This value is used by the different request endpoints to determine which platform implementation to use for processing the request. To add a new platform implementation, you will need to define a new `ProcessTypeEnum` value in the [`app/schemas/enum.py`](app/schemas/enum.py) file. This value should be unique and descriptive of the platform you are implementing.
+The processing type, defined by `ProcessTypeEnum`, is the unique identifier for the platform implementation. It is used to distinguish between different platform implementations in the system. This value is used by the different request endpoints to determine which platform implementation to use for processing the request. To add a new platform implementation, you will need to define a new `ProcessTypeEnum` value in the `app/schemas/enum.py` file. This value should be unique and descriptive of the platform you are implementing.
 
 Once you have completed the above steps, the new platform implementation will be registered automatically and made available for use in the APEx Dispatch API. You can then proceed to implement the specific functionality required for your platform.
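
As a rough, standalone illustration of these registration steps, the sketch below mimics the pattern with stand-in definitions. The `register_platform` decorator, the registry dict, the `MY_PLATFORM` enum member, and the `submit_job` signature are all assumptions made for this example; the real directive, abstract methods, and enum values are the ones defined in `app/platforms/base.py` and `app/schemas/enum.py`.

```python
from abc import ABC, abstractmethod
from enum import Enum


class ProcessTypeEnum(str, Enum):
    # Hypothetical new value; the real members live in app/schemas/enum.py.
    MY_PLATFORM = "my_platform"


class BaseProcessingPlatform(ABC):
    """Stand-in for the base class in app/platforms/base.py."""

    @abstractmethod
    def submit_job(self, service: str, parameters: dict) -> str:
        """Abstract method name and signature are assumptions for this sketch."""


# Stand-in registry mirroring the automatic registration described above.
PLATFORM_REGISTRY: dict[ProcessTypeEnum, type[BaseProcessingPlatform]] = {}


def register_platform(process_type: ProcessTypeEnum):
    """Stand-in for the registration directive placed right above the class definition."""
    def decorator(cls: type[BaseProcessingPlatform]) -> type[BaseProcessingPlatform]:
        PLATFORM_REGISTRY[process_type] = cls
        return cls
    return decorator


@register_platform(ProcessTypeEnum.MY_PLATFORM)
class MyPlatform(BaseProcessingPlatform):
    def submit_job(self, service: str, parameters: dict) -> str:
        # Translate the uniform request into the new platform's own API call here.
        return "my-platform-job-001"


print(PLATFORM_REGISTRY)
```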
