Commit cf89311

Merge branch 'develop' into enh/add-asset-type
2 parents b9bceda + 375781a commit cf89311

File tree

58 files changed

+1096
-703
lines changed

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -40,7 +40,7 @@ repos:
4040
hooks:
4141
- id: pytest-check
4242
name: pytest-check
43-
entry: pytest src/tests
43+
entry: pytest src/tests --versions ''
4444
language: system
4545
pass_filenames: false
4646
exclude: ".*.md"

docs/README.md

Lines changed: 9 additions & 10 deletions
@@ -1,21 +1,20 @@
1-
# 🇪🇺AI-on-Demand Metadata Catalogue
1+
# 🇪🇺 AI-on-Demand Metadata Catalogue
22

3-
This repository contains code and configurations for the AI-on-Demand Metadata Catalogue.
4-
The metadata catalogue provides a unified view of AI assets and resources stored across the AI landscape.
5-
It collects metadata from platforms such as [_Zendodo_](https://zenodo.org), [_Hugging Face_](https://huggingface.co) and [_OpenML_](https://openml.org),
3+
The [AI-on-Demand](https://aiod.eu) Metadata Catalogue provides a unified view of AI assets and resources stored across the AI landscape.
4+
It collects metadata from platforms such as [_Hugging Face_](https://huggingface.co), [_OpenML_](https://openml.org), and [_Zenodo_](https://zenodo.org),
65
and is connected to European projects like [Bonsapps](https://bonsapps.eu) and [AIDA](https://www.i-aida.org).
7-
Metadata of datasets, models, papers, news, and more from all of these sources is available through a REST API at [api.aiod.eu](https://api.aiod.eu/).
6+
Metadata of datasets, models, papers, news, and more from all of these sources is available through a REST API at [https://api.aiod.eu](https://api.aiod.eu/).
87

98
**🧑‍🔬 For most users:**
109
Many users will only use the REST API indirectly, for example;
11-
through [My Resources](https://github.com/aiondemand/AIOD-marketplace-frontend/) to browse assets,
12-
through [RAIL](https://github.com/aiondemand/aiod-rail) to conduct ML experiments,
13-
or through the [Python SDK](https://github.com/aiondemand/aiondemand) to access the metadata in Python scripts.
14-
For documentation on how to use the REST API directly, visit the ["Using the API"](https://aiondemand.github.io/AIOD-rest-api/Using/).
10+
through the [Python SDK](https://aiondemand.github.io/aiondemand/) to access all (meta)data in Python scripts, or through the
11+
[AIoD website](https://aiod.eu), including services such as [My Resources](https://github.com/aiondemand/AIOD-marketplace-frontend/) to browse assets,
12+
and [RAIL](https://github.com/aiondemand/aiod-rail) to conduct reproducible ML experiments.
13+
For documentation on how to use the REST API directly, visit the ["Using the API"](https://aiondemand.github.io/AIOD-rest-api/using/) guide.
1514

1615
**🧑‍💻 For service developers:**
1716
To use the metadata catalogue from your service, use the [Python SDK](https://github.com/aiondemand/aiondemand)
18-
or use the REST API directly as detailed in the ["Using the API"](https://aiondemand.github.io/AIOD-rest-api/Using/) documentation.
17+
or use the REST API directly as detailed in the ["Using the API"](https://aiondemand.github.io/AIOD-rest-api/using/) documentation.
1918

2019
**🌍 Hosting:** For information on how to host the metadata catalogue, see the ["Hosting" documentation](https://aiondemand.github.io/AIOD-rest-api/hosting/).
2120

docs/developer/releases.md

Lines changed: 12 additions & 9 deletions
@@ -45,19 +45,22 @@ This information can also be extracted using the Github REST API.
4545

4646

4747
## Creating a release
48-
To create a new release,
48+
To create a new release:
49+
4950
1. Make sure all requested functionality is merged with the `develop` branch.
50-
2. From develop: `git checkout -b release/[VERSION]`. Example of version: `1.1.20231129`
51-
3. Update the version in `pyproject.toml`.
51+
2. Create a release branch from develop: `git checkout -b release/[VERSION]`. Example of version: `1.1.20231129`
52+
3. Update the version in `pyproject.toml` on the release branch.
5253
4. Test all (most of) the functionality. Check out the project in a new directory and remove all
5354
your local images, and make sure it works out of the box.
5455
5. Go to https://github.com/aiondemand/AIOD-rest-api/releases and draft a new release from the
55-
release branch. Look at all closed PRs and create a changelog
56-
6. Create a PR from release branch to master
57-
7. After that's merged, create a PR from master to develop
58-
8. Deploy on the server(s):
56+
release branch. Look at all closed PRs and create a changelog.
57+
6. Create a PR from release branch to develop. Make sure that when it is merged, the release branch is preserved.
58+
7. Deploy on the server(s):
5959
- Check which services currently work (before the update). This gives you a baseline for comparison if a service _doesn't_ work later.
60-
- Update the code on the server by checking out the release
61-
- Merge configurations as necessary
60+
- Bring the services down.
61+
- Update the code on the server by checking out the release.
62+
- If the release contains new configuration options or configuration defaults, make sure that the override files are updated as needed.
63+
- Start the services without connectors.
6264
- Make sure the latest database migrations are applied: see ["Schema Migrations"](schema/migration.md#update-the-database)
65+
- Restart the services with connectors.
6366
9. Notify everyone (e.g., in the API channel in Slack).

docs/developer/users.md

Lines changed: 1 addition & 0 deletions
@@ -31,6 +31,7 @@ There is no special privilege for user A, and this also means that e.g., user B
3131

3232
!!! info
3333
The upload and review process is described from a user perspective in ["Uploading"](../using/upload.md).
34+
The review process may be disabled by setting the `--disable-reviews` flag when starting the server.
3435

3536
An asset uploaded by a user is by default in `draft` state.
3637
The user may request the asset to be `published` by submitting it for review through the REST API.

docs/hosting/index.md

Lines changed: 1 addition & 0 deletions
@@ -114,6 +114,7 @@ Then run the startup commands again (either `up.sh` or `docker compose`).
114114
By default, the server will create a database on the provided MySQL server if it does not yet exist.
115115
You can change this behavior through the **build-db** command-line parameter,
116116
it takes the following options:
117+
117118
* never: *never* creates the database, not even if one does not exist yet.
118119
Use this only if you expect the database to be created through other means, such
119120
as MySQL group replication.

docs/media/upload_and_review.svg

Lines changed: 1 addition & 1 deletion

docs/stylesheets/extra.css

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
:root > * {
2+
--md-primary-fg-color: #0047BB; /* banner */
3+
}
4+
5+
[data-md-color-scheme=slate] {
6+
--md-default-fg-color: #C5C6C8;
7+
--md-default-fg-color--light: #C5C6C8;
8+
}
9+
10+
[data-md-color-scheme=default] {
11+
--md-default-fg-color: #545557;
12+
}
13+
14+
15+
/* AIOD colors
16+
dark blue: #0047BB
17+
alternative dark blue: #003399:
18+
light blue: #41B6E6
19+
yellow: #FFED00
20+
dark gray: #646567
21+
light gray: #C5C6C8
22+
*/
23+
24+
/* At the default height and margin, the logo for AIoD is not recognizable. */
25+
.md-header__button.md-logo {
26+
margin: 0;
27+
padding: 0;
28+
}
29+
30+
.md-header__button.md-logo img, .md-header__button.md-logo svg {
31+
height: 2.4rem;
32+
width: 2.4rem;
33+
}

docs/using/index.md

Lines changed: 8 additions & 2 deletions
@@ -4,11 +4,17 @@ The REST API allows you to retrieve, update, or remove asset metadata in the met
44
The assets are indexed from many different platforms, such as educational resources from [AIDA](https://www.i-aida.org),
55
datasets from [HuggingFace](https://huggingface.co), models from [OpenML](https://openml.org), and many more.
66

7-
The REST API is available at [`https://api.aiod.eu`](https://api.aiod.eu) and documentation on endpoints
7+
The REST API is available at [https://api.aiod.eu](https://api.aiod.eu) and documentation on endpoints
88
is available on complementary [Swagger](https://api.aiod.eu/docs) and [ReDoc](https://api.aiod.eu/redoc) pages.
99

1010
To use the REST API, simply make HTTP requests to the different endpoints.
11-
Generally, these are `GET` requests when retrieving data, `PUT` requests when modifying data, `POST` requests when adding data, and `DEL` requests when deleting data.
11+
Generally, these are `GET` requests when retrieving data, `PUT` requests when modifying data, `POST` requests when adding data, and `DELETE` requests when deleting data. The video and text below show examples of how to use the REST API.
12+
13+
## Introduction Video
14+
15+
<iframe width="560" height="315" src="https://www.youtube.com/embed/2nDj1_VjcWM?si=ncD7xaifSmbU5uzZ" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
16+
17+
## A Quick Example
1218
Here are some examples on how to list datasets in different environments:
1319

1420
=== "Python (requests)"

docs/using/migration-v1-v2.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
1+
# Migration Guide
2+
3+
4+
This is the migration guide for migrating from version 1 to version 2 of the AI-on-Demand REST API.
5+
We provide some context for these changes, and provide concrete advice on what to change.
6+
The current API is planned to sunset on June 10th (this date may be postponed, but will not be moved earlier).
7+
8+
## Changes for API v2 Versioning of the REST API
9+
10+
Previously, versions would be specified at the concept level, for example:
11+
12+
```
13+
https://api.aiod.eu/datasets/v1/24
14+
```
15+
16+
This indicated you wanted dataset metadata in the schema of version 1 (`v1`).
17+
This can be confusing when you have requests that provide mixed results, such as the endpoints to retrieve generic AI resources
18+
(for example, `https://api.aiod.eu/ai_resources/v1/1`).
19+
Instead, we will start versioning the catalogue as a whole.
20+
For this reason, we are moving the location version identifier to the start to more explicitly signal it is API-wide, for example:
21+
22+
```
23+
https://api.aiod.eu/v2/datasets/24
24+
```
25+
26+
Additionally, we now also provide an unversioned endpoint that always fetches the latest schema:
27+
28+
```
29+
https://api.aiod.eu/datasets/24
30+
```
31+
32+
Here are some other examples:
33+
34+
| Old URL | New Versioned URL | Latest Version |
35+
|------------------------|------------------------|---------------------|
36+
| /ai_resources/v1/1 | /v2/ai_resources/1 | /ai_resources/1 |
37+
| /counts/v1 | /v2/counts | /counts |
38+
| /datasets/v1/1/content | /v2/datasets/1/content | /datasets/1/content |
39+
| /search/events/v1 | /v2/search/events | /search/events |
40+
| /user/resources/v1 | /v2/user/resources | /user/resources |
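The rewrites in the table follow one pattern: the `v1` segment is removed from the path and, optionally, `v2` is added at the front. The following is a minimal sketch of that pattern for updating stored URLs in bulk. The function name `migrate_path` and its `pin_version` parameter are hypothetical and not part of the API or the Python SDK.

```python
import re

# Matches a "/v1" path segment, e.g. in "/datasets/v1/24" or "/counts/v1".
# The lookahead avoids matching longer segments such as "/v10".
_OLD_VERSION = re.compile(r"/v1(?=/|$)")


def migrate_path(old_path: str, pin_version: bool = True) -> str:
    """Rewrite a v1-style path to the v2 layout (illustrative helper).

    With pin_version=True the result is the versioned path ("/v2/...").
    Otherwise it is the unversioned path, which serves the latest schema.
    """
    if not _OLD_VERSION.search(old_path):
        return old_path  # no v1 segment found, leave the path untouched
    stripped = _OLD_VERSION.sub("", old_path, count=1)
    return f"/v2{stripped}" if pin_version else stripped


for path in ["/ai_resources/v1/1", "/counts/v1", "/datasets/v1/1/content"]:
    print(path, "->", migrate_path(path), "or", migrate_path(path, pin_version=False))
```

Paths without a `v1` segment are returned unchanged, so running the helper twice on the same value is harmless.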
41+
42+
So you need to update the URL to either the new v2 url (e.g., `/v2/datasets`) or the unversioned URL (e.g., `/datasets`).
43+
Which you pick is up to you. Here are some considerations:
44+
45+
- The versioned endpoints will work exactly the same way as long as they are available. There will be no breaking changes.
46+
- The versioned endpoints will eventually be deprecated when we need to make breaking changes to the API.
47+
We will employ a deprecation cycle, and you will need to update your script within that period; otherwise it will stop working, even if the breaking changes that caused the version bump do not affect your script.
48+
- The unversioned endpoint doesn’t sunset. As long as your script is, and remains, compatible with the latest schema (for example, because the schema did not change), your code will continue to work.
49+
- The unversioned endpoint may change at any time without warning (for a new release), so your script may also suddenly stop working. At that point, you could pin it to the previous version and work on supporting the latest version again.
50+
51+
You could also consider using both endpoints, for example using the unversioned endpoint and falling back to a versioned endpoint if your requests fail.
52+
53+
## Identifiers of the assets on the platform
54+
55+
!!! warning "Important"
56+
57+
This only applies if you have saved identifiers of assets somewhere.
58+
If you do not store identifiers, you can safely ignore this section.
59+
60+
When you fetch assets on the metadata catalogue, you will find multiple identifiers in their descriptions, for example:
61+
62+
```json
63+
{
64+
"platform": "huggingface",
65+
"platform_resource_identifier": "621ffdd236468d709f181d6f",
66+
"name": "bigIR/ar_cov19",
67+
"ai_asset_identifier": 53,
68+
"ai_resource_identifier": 63,
69+
"identifier": 24,
70+
71+
},
72+
```
73+
74+
Having different identifiers can lead to confusing situations.
75+
Identifiers are frequently used to establish relationships between assets
76+
(e.g., this dataset is related to that publication, or this person is a member of that organisation).
77+
However, depending on the nature of the relationship, you would need to use a different identifier.
78+
This is error-prone and may lead to users accidentally linking the wrong assets.
79+
80+
To address this issue, we are unifying the identifiers to make sure that every asset in the metadata catalogue has one single unique identifier for the AI-on-Demand platform. For example:
81+
82+
```json
83+
{
84+
"platform": "huggingface",
85+
"platform_resource_identifier": "621ffdd236468d709f181d6f",
86+
"name": "bigIR/ar_cov19",
87+
"ai_asset_identifier": 63, # instead of 53
88+
"ai_resource_identifier": 63, # remained the same
89+
"identifier": 63, # instead of 24
90+
91+
},
92+
```
93+
94+
In this example, referring to the dataset within AI-on-Demand always uses identifier ‘63’.
95+
The only identifier we do not update is the `platform_resource_identifier`, as that specifies where to find the original resource the metadata describes.
96+
This identifier is never used internally within AI-on-Demand for linking assets, and so is not subject to the accidental misuse described above.
97+
98+
If you have stored identifiers, you can find tables which map the old identifiers to the new identifiers here: (link to be added)
99+
100+
For technical reasons, we cannot support a transitional period where both identifiers are compatible with the API.
101+
We plan to migrate to the new identifiers on June 11th.
102+
So if you access assets using identifiers after that date, make sure to convert them first or you will likely receive the wrong assets or errors.
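Converting stored identifiers could look roughly like the sketch below. The format of the mapping tables is not specified in this guide, so the sketch assumes the table for one asset type has already been loaded into a plain dictionary from old to new identifier. The function name `convert_identifiers` is hypothetical, and the only mapping shown (24 to 63) is the dataset example from above.

```python
def convert_identifiers(
    stored_ids: list[int], old_to_new: dict[int, int]
) -> tuple[list[int], list[int]]:
    """Translate old identifiers to new ones for a single asset type.

    Returns the converted identifiers and, separately, any old identifiers
    that have no entry in the mapping so they can be reviewed by hand.
    """
    converted: list[int] = []
    unmapped: list[int] = []
    for old_id in stored_ids:
        if old_id in old_to_new:
            converted.append(old_to_new[old_id])
        else:
            unmapped.append(old_id)
    return converted, unmapped


# Assumed mapping for datasets, using the example from this guide.
dataset_mapping = {24: 63}
new_ids, needs_review = convert_identifiers([24, 7], dataset_mapping)
print(new_ids, needs_review)
```

Keeping unmapped identifiers separate, rather than passing them through unchanged, avoids silently using an old identifier against the new scheme.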
103+
104+
### Why is the no deprecation cycle for the change to identifiers?
105+
106+
While we provide a migration period for the URLs, we cannot provide one for the migration of identifiers.
107+
Allowing both identifiers to be used in a transitional period adds a lot of complexity and possibilities for errors when linking assets.
108+
Most crucially, there may be cases when a user wants to link assets by identifiers where we cannot tell if they are using old identifiers or new identifiers.
109+
While we can support them under different endpoints, there is no way for us to ensure a user does not (accidentally) link assets referencing old identifiers on a new endpoint, or vice versa.
110+
Maintaining the integrity of the data in the metadata catalogue is our highest priority, and so we decided that unfortunately we cannot support a grace period.
111+
We hope for your understanding and will do our best to avoid such a scenario in the future.
