Commit 4259229

Merge remote-tracking branch 'origin/main' into nextcloudclient
# Conflicts:
#	README.md
#	databusclient/client.py
#	poetry.lock
#	pyproject.toml
#	test.sh
2 parents a56f01d + cfdca3b commit 4259229

9 files changed

+544
-151
lines changed

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+name: Build and push Docker image to DockerHub
+
+on:
+  push:
+    branches: [ "main" ]
+  workflow_dispatch: # allows manual trigger
+
+jobs:
+  push_to_registry:
+    name: Push Docker image to Docker Hub
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Check out the repo
+        uses: actions/checkout@v3
+
+      - name: Log in to Docker Hub
+        uses: docker/login-action@v3
+        with:
+          username: ${{ secrets.DBP_DOCKERHUB_CREDENTIAL_USERNAME }}
+          password: ${{ secrets.DBP_DOCKERHUB_CREDENTIAL_TOKEN_PUSHIMAGES }}
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v6
+        with:
+          context: .
+          push: true
+          tags: dbpedia/databus-python-client:latest

Dockerfile

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+FROM python:3.10-slim
+
+WORKDIR /data
+
+COPY . .
+
+# Install dependencies
+RUN pip install .
+
+# Use ENTRYPOINT for the CLI
+ENTRYPOINT ["databusclient"]

README.md

Lines changed: 168 additions & 12 deletions
@@ -1,10 +1,69 @@
 # Databus Client Python
 
-## Install
+## Quickstart Example
+Commands to download the DBpedia Knowledge Graphs generated by Live Fusion.
+DBpedia Live Fusion publishes two different kinds of KGs:
+
+1. Open Core Knowledge Graphs under CC-BY-SA license, open with copyleft/share-alike, no registration needed
+2. Industry Knowledge Graphs under BUSL 1.1 license, unrestricted for research and experimentation, commercial license for productive use, free registration needed.
+
+
+### Registration (Access Token)
+
+1. If you do not have a DBpedia Account yet (Forum/Databus), please register at https://account.dbpedia.org
+2. Login at https://account.dbpedia.org and create your token.
+3. Save the token to a file `vault-token.dat`.
+
+### Docker vs. Python
+The databus-python-client comes as **docker** or **python** with these patterns.
+`$DOWNLOADTARGET` can be any Databus URI including collections OR SPARQL query (or several thereof). Details are documented below.
 ```bash
+# Docker
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download $DOWNLOADTARGET --token vault-token.dat
+# Python
 python3 -m pip install databusclient
+databusclient download $DOWNLOADTARGET --token vault-token.dat
+```
+
+### Download Live Fusion KG Snapshot (BUSL 1.1, registration needed)
+TODO One slogan sentence. [More information](https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-kg-snapshot)
+```bash
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-kg-snapshot --token vault-token.dat
+```
+
+### Download Enriched Knowledge Graphs (BUSL 1.1, registration needed)
+**DBpedia Wikipedia Extraction Enriched**
+TODO One slogan sentence and link
+Currently EN DBpedia only.
+
+```bash
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikipedia-kg-enriched-snapshot --token vault-token.dat
+```
+**DBpedia Wikidata Extraction Enriched**
+TODO One slogan sentence and link
+
+```bash
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/dbpedia-wikidata-kg-enriched-snapshot --token vault-token.dat
+```
+
+### Download DBpedia Wikipedia Knowledge Graphs (CC-BY-SA, no registration needed)
+TODO One slogan sentence and link
+
+```bash
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikipedia-kg-snapshot
+```
+### Download DBpedia Wikidata Knowledge Graphs (CC-BY-SA, no registration needed)
+TODO One slogan sentence and link
+
+```bash
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/dbpedia-wikidata-kg-snapshot
 ```
 
+## Docker Image Usage
+
+A docker image is available at [dbpedia/databus-python-client](https://hub.docker.com/r/dbpedia/databus-python-client). See [download section](#usage-of-docker-image) for details.
+
+
 ## Deploy to Databus
 Please add databus API_KEY to .env file
 Use metadata.json file to list all files which should be added to the databus
@@ -48,6 +107,13 @@ python deploy.py \
 
 ```
 ## CLI Usage
+
+**Installation**
+```bash
+python3 -m pip install databusclient
+```
+
+**Running**
 ```bash
 databusclient --help
 ```
@@ -65,15 +131,80 @@ Options:
 
 Commands:
   deploy
-  downoad
+  download
 ```
-### Deploy command
+
+
+
+### Download command
 ```
-databusclient deploy --help
+databusclient download --help
 ```
+
 ```
+Usage: databusclient download [OPTIONS] DATABUSURIS...
 
+Arguments:
+  DATABUSURIS...  databus uris to download from https://databus.dbpedia.org,
+                  or a query statement that returns databus uris from https://databus.dbpedia.org/sparql
+                  to be downloaded [required]
+
+  Download datasets from databus, optionally using vault access if vault
+  options are provided.
+
+Options:
+  --localdir TEXT  Local databus folder (if not given, databus folder
+                   structure is created in current working directory)
+  --databus TEXT   Databus URL (if not given, inferred from databusuri, e.g.
+                   https://databus.dbpedia.org/sparql)
+  --token TEXT     Path to Vault refresh token file
+  --authurl TEXT   Keycloak token endpoint URL [default:
+                   https://auth.dbpedia.org/realms/dbpedia/protocol/openid-
+                   connect/token]
+  --clientid TEXT  Client ID for token exchange [default: vault-token-
+                   exchange]
+  --help           Show this message and exit.
+```
 
+Examples of using download command
+
+**File**: download of a single file
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01/mappingbased-literals_lang=az.ttl.bz2
+```
+
+**Version**: download of all files of a specific version
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
+```
+
+**Artifact**: download of all files with latest version of an artifact
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals
+```
+
+**Group**: download of all files with latest version of all artifacts of a group
+```
+databusclient download https://databus.dbpedia.org/dbpedia/mappings
+```
+
+If no `--localdir` is provided, the current working directory is used as base directory. The downloaded files will be stored in the working directory in a folder structure according to the databus structure, i.e. `./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/`.
+
+**Collection**: download of all files within a collection
+```
+databusclient download https://databus.dbpedia.org/dbpedia/collections/dbpedia-snapshot-2022-12
+```
+
+**Query**: download of all files returned by a query (sparql endpoint must be provided with `--databus`)
+```
+databusclient download 'PREFIX dcat: <http://www.w3.org/ns/dcat#> SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10' --databus https://databus.dbpedia.org/sparql
+```
+
+### Deploy command
+```
+databusclient deploy --help
+```
+```
 Usage: databusclient deploy [OPTIONS] DISTRIBUTIONS...
 
 Arguments:
@@ -82,23 +213,23 @@ Arguments:
 content variants of a distribution, fileExt and Compression can be set, if not they are inferred from the path [required]
 
 Options:
-  --versionid TEXT  target databus version/dataset identifier of the form <h
+  --version-id TEXT  Target databus version/dataset identifier of the form <h
 ttps://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VE
 RSION> [required]
-  --title TEXT  dataset title [required]
-  --abstract TEXT  dataset abstract max 200 chars [required]
-  --description TEXT  dataset description [required]
-  --license TEXT  license (see dalicc.net) [required]
-  --apikey TEXT  apikey [required]
+  --title TEXT  Dataset title [required]
+  --abstract TEXT  Dataset abstract max 200 chars [required]
+  --description TEXT  Dataset description [required]
+  --license TEXT  License (see dalicc.net) [required]
+  --apikey TEXT  API key [required]
   --help  Show this message and exit.
 ```
 Examples of using deploy command
 ```
-databusclient deploy --versionid https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
+databusclient deploy --version-id https://databus.dbpedia.org/user1/group1/artifact1/2022-05-18 --title title1 --abstract abstract1 --description description1 --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
 ```
 
 ```
-databusclient deploy --versionid https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
+databusclient deploy --version-id https://dev.databus.dbpedia.org/denis/group1/artifact1/2022-05-18 --title "Client Testing" --abstract "Testing the client...." --description "Testing the client...." --license http://dalicc.net/licenselibrary/AdaptivePublicLicense10 --apikey MYSTERIOUS 'https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger'
 ```
 
 A few more notes for CLI usage:
@@ -107,6 +238,31 @@ A few more notes for CLI usage:
 * For complete inferred: Just use the URL with `https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml`
 * If other parameters are used, you need to leave them empty like `https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml||yml|7a751b6dd5eb8d73d97793c3c564c71ab7b565fa4ba619e4a8fd05a6f80ff653:367116`
 
+
+
+#### Authentication with vault
+
+For downloading files from the vault, you need to provide a vault token. See [getting-the-access-refresh-token](https://github.com/dbpedia/databus-vault-access?tab=readme-ov-file#step-1-getting-the-access-refresh-token) for details. You can come back here once you have a `vault-token.dat` file. To use it, just provide the path to the file with `--token /path/to/vault-token.dat`.
+
+Example:
+```
+databusclient download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23 --token vault-token.dat
+```
+
+If vault authentication is required for downloading a file, the client will use the token. If no vault authentication is required, the token will not be used.
+
+#### Usage of docker image
+
+A docker image is available at [dbpedia/databus-python-client](https://hub.docker.com/r/dbpedia/databus-python-client). You can use it like this:
+
+```
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01
+```
+If using vault authentication, make sure the token file is available in the container, e.g. by placing it in the current working directory.
+```
+docker run --rm -v $(pwd):/data dbpedia/databus-python-client download https://databus.dbpedia.org/dbpedia-enterprise/live-fusion-snapshots/fusion/2025-08-23/fusion_props=all_subjectns=commons-wikimedia-org_vocab=all.ttl.gz --token vault-token.dat
+```
+
 ## Module Usage
 
 ### Step 1: Create lists of distributions for the dataset
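
Note on the README's `--localdir` paragraph: it says downloads land in `./$ACCOUNT/$GROUP/$ARTIFACT/$VERSION/` when no local directory is given. The sketch below only illustrates that mapping for a version URI. It is not taken from `databusclient/client.py`; the helper name `local_version_dir` and the strict four-segment check are assumptions made for the example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_version_dir(version_uri: str, base: str = ".") -> PurePosixPath:
    """Map a Databus version URI to ACCOUNT/GROUP/ARTIFACT/VERSION under base.

    Hypothetical helper: the real client may derive paths differently.
    """
    # Keep only the non-empty path segments of the URI.
    parts = [p for p in urlparse(version_uri).path.split("/") if p]
    if len(parts) != 4:
        raise ValueError(f"expected ACCOUNT/GROUP/ARTIFACT/VERSION, got {parts}")
    account, group, artifact, version = parts
    return PurePosixPath(base, account, group, artifact, version)


if __name__ == "__main__":
    uri = "https://databus.dbpedia.org/dbpedia/mappings/mappingbased-literals/2022.12.01"
    print(local_version_dir(uri, base="/data"))
```

With the Docker image, `/data` is the container's working directory, so passing `base="/data"` mirrors where files would be expected inside the mounted volume under this assumption.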

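
Note on the README's **Query** example: it passes a SPARQL `SELECT` together with `--databus https://databus.dbpedia.org/sparql`. How the client sends that query is not shown in this diff. As a rough illustration only, the SPARQL 1.1 Protocol allows a query to be sent as a URL-encoded `query` parameter on a GET request; the function name `sparql_get_url` below is hypothetical and nothing here contacts the network.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL-protocol GET URL carrying the query as ?query=...

    Illustrative only: the client may use POST or a SPARQL library instead.
    """
    return f"{endpoint}?{urlencode({'query': query})}"


if __name__ == "__main__":
    query = (
        "PREFIX dcat: <http://www.w3.org/ns/dcat#> "
        "SELECT ?x WHERE { ?sub dcat:downloadURL ?x . } LIMIT 10"
    )
    print(sparql_get_url("https://databus.dbpedia.org/sparql", query))
```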
databusclient/cli.py

Lines changed: 50 additions & 32 deletions
@@ -1,43 +1,61 @@
 #!/usr/bin/env python3
-import typer
+import click
 from typing import List
 from databusclient import client
 
-app = typer.Typer()
+
+@click.group()
+def app():
+    """Databus Client CLI"""
+    pass
 
 
 @app.command()
-def deploy(
-    version_id: str = typer.Option(
-        ...,
-        help="target databus version/dataset identifier of the form "
-        "<https://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VERSION>",
-    ),
-    title: str = typer.Option(..., help="dataset title"),
-    abstract: str = typer.Option(..., help="dataset abstract max 200 chars"),
-    description: str = typer.Option(..., help="dataset description"),
-    license_uri: str = typer.Option(..., help="license (see dalicc.net)"),
-    apikey: str = typer.Option(..., help="apikey"),
-    distributions: List[str] = typer.Argument(
-        ...,
-        help="distributions in the form of List[URL|CV|fileext|compression|sha256sum:contentlength] where URL is the "
-        "download URL and CV the "
-        "key=value pairs (_ separated) content variants of a distribution. filext and compression are optional "
-        "and if left out inferred from the path. If the sha256sum:contentlength part is left out it will be "
-        "calcuted by downloading the file.",
-    ),
-):
-    typer.echo(version_id)
-    dataid = client.create_dataset(
-        version_id, title, abstract, description, license_uri, distributions
-    )
+@click.option(
+    "--version-id", "version_id",
+    required=True,
+    help="Target databus version/dataset identifier of the form "
+         "<https://databus.dbpedia.org/$ACCOUNT/$GROUP/$ARTIFACT/$VERSION>",
+)
+@click.option("--title", required=True, help="Dataset title")
+@click.option("--abstract", required=True, help="Dataset abstract max 200 chars")
+@click.option("--description", required=True, help="Dataset description")
+@click.option("--license", "license_url", required=True, help="License (see dalicc.net)")
+@click.option("--apikey", required=True, help="API key")
+@click.argument(
+    "distributions",
+    nargs=-1,
+    required=True,
+)
+def deploy(version_id, title, abstract, description, license_url, apikey, distributions: List[str]):
+    """
+    Deploy a dataset version with the provided metadata and distributions.
+    """
+    click.echo(f"Deploying dataset version: {version_id}")
+    dataid = client.create_dataset(version_id, title, abstract, description, license_url, distributions)
     client.deploy(dataid=dataid, api_key=apikey)
 
 
 @app.command()
-def download(
-    localDir: str = typer.Option(..., help="local databus folder"),
-    databus: str = typer.Option(..., help="databus URL"),
-    databusuris: List[str] = typer.Argument(...,help="any kind of these: databus identifier, databus collection identifier, query file")
-):
-    client.download(localDir=localDir,endpoint=databus,databusURIs=databusuris)
+@click.argument("databusuris", nargs=-1, required=True)
+@click.option("--localdir", help="Local databus folder (if not given, databus folder structure is created in current working directory)")
+@click.option("--databus", help="Databus URL (if not given, inferred from databusuri, e.g. https://databus.dbpedia.org/sparql)")
+@click.option("--token", help="Path to Vault refresh token file")
+@click.option("--authurl", default="https://auth.dbpedia.org/realms/dbpedia/protocol/openid-connect/token", show_default=True, help="Keycloak token endpoint URL")
+@click.option("--clientid", default="vault-token-exchange", show_default=True, help="Client ID for token exchange")
+def download(databusuris: List[str], localdir, databus, token, authurl, clientid):
+    """
+    Download datasets from databus, optionally using vault access if vault options are provided.
+    """
+    client.download(
+        localDir=localdir,
+        endpoint=databus,
+        databusURIs=databusuris,
+        token=token,
+        auth_url=authurl,
+        client_id=clientid,
+    )
+
+
+if __name__ == "__main__":
+    app()
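
Note on the `distributions` argument of `deploy`: the removed Typer help text described each entry as `URL|CV|fileext|compression|sha256sum:contentlength`, with `CV` being `_`-separated `key=value` pairs, and the Click version no longer carries that description. The sketch below shows one way such a string could be split. It is an assumption-laden illustration, not the parsing done in `client.create_dataset`: the names `Distribution` and `parse_distribution` are hypothetical, and treating a trailing `<64 hex chars>:<digits>` field as the checksum (so the README's four-field example also parses) is a guess.

```python
import re
from typing import Dict, NamedTuple, Optional

# Assumed shape of the optional last field: sha256 hex digest, colon, byte length.
_CHECKSUM = re.compile(r"^[0-9a-f]{64}:\d+$")


class Distribution(NamedTuple):
    url: str
    variants: Dict[str, str]
    file_ext: Optional[str]
    compression: Optional[str]
    sha256_and_length: Optional[str]


def parse_distribution(spec: str) -> Distribution:
    """Split 'URL|CV|fileext|compression|sha256sum:contentlength' (hypothetical parser)."""
    url, *rest = spec.split("|")
    checksum = None
    # Peel off a trailing checksum field if the last field looks like one.
    if rest and _CHECKSUM.match(rest[-1]):
        checksum = rest.pop()
    # Pad the remaining optional fields so unpacking always succeeds.
    rest += [""] * (3 - len(rest))
    cv, ext, comp = rest[:3]
    # Content variants: 'k1=v1_k2=v2' becomes {'k1': 'v1', 'k2': 'v2'}.
    variants = {k: v for k, _, v in (p.partition("=") for p in cv.split("_") if p)}
    return Distribution(url, variants, ext or None, comp or None, checksum)


if __name__ == "__main__":
    print(parse_distribution(
        "https://raw.githubusercontent.com/dbpedia/databus/master/server/app/api/swagger.yml|type=swagger"
    ))
```

Empty fields come back as `None`, which matches the README's remark that file extension and compression are inferred from the path when left out.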
