This repository was archived by the owner on Jul 14, 2025. It is now read-only.

Commit 15adc8b

Add training metadata and make distinction between training & inferencing image links
1 parent 1f579a0 commit 15adc8b

4 files changed: 94 additions and 38 deletions

README.md

Lines changed: 57 additions & 34 deletions
@@ -55,24 +55,67 @@ these models for the following types of use-cases:
5555

5656
| Field Name | Type | Description |
5757
| -------------------------- | ------------------------- | ----------- |
58-
| ml-model:learning_approach | string | **REQUIRED**. The learning approach used to train the model. It is STRONGLY RECOMMENDED that you use one of the values described below, but other values are allowed. |
59-
| ml-model:prediction_type | string | **REQUIRED.** The type of prediction that the model makes. It is STRONGLY RECOMMENDED that you use one of the values described below, but other values are allowed. |
58+
| ml-model:learning_approach | string | **REQUIRED**. The learning approach used to train the model. It is STRONGLY RECOMMENDED that you use one of the values [described below](#ml-modellearning_approach), but other values are allowed. |
59+
| ml-model:prediction_type | string | **REQUIRED.** The type of prediction that the model makes. It is STRONGLY RECOMMENDED that you use one of the values [described below](#ml-modelprediction_type), but other values are allowed. |
6060
| ml-model:architecture | string | **REQUIRED.** Identifies the architecture employed by the model (e.g. RCNN, U-Net, etc.). This may be any string identifier, but publishers are encouraged to use well-known identifiers whenever possible. |
61+
| ml-model:training-environment | [Training Environment Object](#training-environment-object) | Describes the environment used to train the model. See the Link [relation types](#relation-types) defined below for definitions of the data used during training. |
62+
63+
### Training Environment Object
64+
65+
| Field Name | Type | Description |
66+
| -------------------------- | ------------------------- | ----------- |
67+
| operating-system | string | Identifies the operating system on which the model was trained. See the [Operating System](#operating-system) description below for recommended values. |
68+
| processor-type | string | The type of processor used during training. Must be one of `"cpu"` or `"gpu"`. |
69+
70+
#### Operating System
71+
72+
It is STRONGLY RECOMMENDED that one of the following operating system identifiers (taken from the Python [`sys.platform`
73+
values](https://docs.python.org/3/library/sys.html#sys.platform)) be used whenever possible:
74+
75+
- `aix`
76+
- `linux`
77+
- `win32`
78+
- `cygwin`
79+
- `darwin`
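
As a rough illustration of the Training Environment Object rules above, the following Python sketch checks a candidate object. The helper name `check_training_environment` is hypothetical and not part of the extension, and the sketch assumes both fields are optional because the table does not mark them as required. An unrecognized `operating-system` is only reported as a warning, since those values are recommended rather than mandatory.

```python
# Recommended identifiers listed above (taken from Python's sys.platform values).
RECOMMENDED_OPERATING_SYSTEMS = {"aix", "linux", "win32", "cygwin", "darwin"}
# The only processor types the field table allows.
ALLOWED_PROCESSOR_TYPES = {"cpu", "gpu"}


def check_training_environment(env):
    """Return (errors, warnings) for a Training Environment Object.

    Hypothetical helper for illustration only. Assumes both fields are
    optional, because the spec table does not mark them REQUIRED.
    """
    errors = []
    warnings = []

    processor = env.get("processor-type")
    if processor is not None and processor not in ALLOWED_PROCESSOR_TYPES:
        errors.append(f"processor-type must be 'cpu' or 'gpu', got {processor!r}")

    operating_system = env.get("operating-system")
    if operating_system is not None:
        if not isinstance(operating_system, str):
            errors.append("operating-system must be a string")
        elif operating_system not in RECOMMENDED_OPERATING_SYSTEMS:
            # Other values are allowed, so this is a warning, not an error.
            warnings.append(
                f"operating-system {operating_system!r} is not a recommended value"
            )

    return errors, warnings


errors, warnings = check_training_environment(
    {"processor-type": "gpu", "operating-system": "linux"}
)
print(errors, warnings)
```
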
80+
81+
### Additional Field Information
82+
83+
#### ml-model:learning_approach
84+
85+
Describes the learning approach used to train the model. It is STRONGLY RECOMMENDED that you use one of the
86+
following values, but other values are allowed.
87+
88+
- `"supervised"`
89+
- `"unsupervised"`
90+
- `"semi-supervised"`
91+
- `"reinforcement-learning"`
92+
93+
#### ml-model:prediction_type
94+
95+
Describes the type of predictions made by the model. It is STRONGLY RECOMMENDED that you use one of the
96+
following values, but other values are allowed. Note that not all Prediction Type values are valid
97+
for a given [Learning Approach](#ml-modellearning_approach).
98+
99+
- `"object-detection"`
100+
- `"classification"`
101+
- `"segmentation"`
102+
- `"regression"`
61103

62104
## Asset Objects
63105

64106
### Roles
65107

66-
| Role Name | Description |
67-
| ------------------------ | ----------- |
68-
| ml-model:inference-runtime | Represents a file containing instructions for running a containerized version of the model to generate inferences. See the [Inference Runtimes](#inference-runtimes) section below for details on related fields. |
108+
| Role Name | Description |
109+
| -------------------------- | ----------- |
110+
| ml-model:inference-runtime | Represents a file containing instructions for running a containerized version of the model to generate inferences. See the [Inference/Training Runtimes](#inferencetraining-runtimes) section below for details on related fields. |
111+
| ml-model:training-runtime | Represents a file containing instructions for running a container to train the model. See the [Inference/Training Runtimes](#inferencetraining-runtimes) section below for details on related fields. |
69112

70-
### Inference Runtimes
113+
### Inference/Training Runtimes
71114

72-
An Asset with the `ml-model:inference-runtime` role represents a file containing instructions for running a containerized version of the model to
73-
generate inferences. Currently, only [Compose files](https://github.com/compose-spec/compose-spec/blob/master/spec.md#compose-file) are supported,
74-
but support is planned for other formats, including [Common Workflow Language (CWL)](https://www.commonwl.org/) and [Workflow Description Language
75-
(WDL)](https://openwdl.org/).
115+
Assets with the `ml-model:inference-runtime` or `ml-model:training-runtime` role represent files containing instructions for running a containerized
116+
version of the model to either generate inferences or train the model, respectively. Currently, only [Compose
117+
files](https://github.com/compose-spec/compose-spec/blob/master/spec.md#compose-file) are supported, but support is planned for other formats,
118+
including [Common Workflow Language (CWL)](https://www.commonwl.org/) and [Workflow Description Language (WDL)](https://openwdl.org/).
76119

77120
The `"type"` field should be used to indicate the format of this asset. Assets in the Compose format should have a `"type"` value of
78121
`"text/x-yaml; application=compose"`.
@@ -110,37 +153,17 @@ $ INPUT_DATA=/local/path/to/model/inputs; \
110153
It is RECOMMENDED that model publishers use the Asset `description` field to describe any other requirements or constraints for running the model
111154
container.
112155

113-
### Additional Field Information
114-
115-
#### ml-model:learning_approach
116-
117-
Describes the learning approach used to train the model. It is STRONGLY RECOMMENDED that you use one of the
118-
following values, but other values are allowed.
119-
120-
- `"supervised"`
121-
- `"unsupervised"`
122-
- `"semi-supervised"`
123-
- `"reinforcement-learning"`
124-
125-
#### ml-model:prediction_type
126-
127-
Describes the type of predictions made by the model. It is STRONGLY RECOMMENDED that you use one of the
128-
following values, but other values are allowed. Note that not all Prediction Type values are valid
129-
for a given [Learning Approach](#ml-modellearning_approach).
130-
131-
- `"object-detection"`
132-
- `"classification"`
133-
- `"segmentation"`
134-
- `"regression"`
135-
136156
## Relation types
137157

138158
The following types should be used as applicable `rel` types in the
139159
[Link Object](https://github.com/radiantearth/stac-spec/tree/master/item-spec/item-spec.md#link-object).
140160

141161
| Type | Description |
142162
| ---------------------------- | ----------- |
143-
| ml-model:image | Links with this relation type refer to Docker images built using the model. The `href` value for links of this type should contain a fully-qualified URI for the image as would be required for a command like `docker pull`. These URIs should be of the form `<registry_domain>/<user_or_organization_name>/<image_name>:<tag>`. Links with this relation type should have a `"type"` value of `"docker-image"` to indicate a Docker image. |
163+
| ml-model:inferencing-image | Links with this relation type refer to Docker images that may be used to generate inferences using the model. The `href` value for links of this type should contain a fully-qualified URI for the image as would be required for a command like `docker pull`. These URIs should be of the form `<registry_domain>/<user_or_organization_name>/<image_name>:<tag>`. Links with this relation type should have a `"type"` value of `"docker-image"` to indicate a Docker image. |
164+
| ml-model:training-image | Links with this relation type refer to Docker images that may be used to train the model. The `href` value for links of this type should contain a fully-qualified URI for the image as would be required for a command like `docker pull`. These URIs should be of the form `<registry_domain>/<user_or_organization_name>/<image_name>:<tag>`. Links with this relation type should have a `"type"` value of `"docker-image"` to indicate a Docker image. |
165+
| ml-model:train-data | Links with this relation type refer to datasets used to train the model. It is STRONGLY RECOMMENDED that these links refer to a STAC Collection implementing the [Label Extension](https://github.com/stac-extensions/label). |
166+
| ml-model:test-data | Links with this relation type refer to datasets used to test the model during training. It is STRONGLY RECOMMENDED that these links refer to a STAC Collection implementing the [Label Extension](https://github.com/stac-extensions/label). |
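
The image `href` form described in the table above, `<registry_domain>/<user_or_organization_name>/<image_name>:<tag>`, could be checked with a sketch like the one below. The function name `parse_image_href` and the regular expression are assumptions made for illustration; the extension does not define a parser, and real image references can take other shapes that this pattern deliberately rejects.

```python
import re

# Hypothetical pattern for the four-part form given in the table:
# <registry_domain>/<user_or_organization_name>/<image_name>:<tag>
IMAGE_HREF_PATTERN = re.compile(
    r"(?P<registry>[^/]+)/(?P<owner>[^/]+)/(?P<image>[^/:]+):(?P<tag>[^/:]+)"
)


def parse_image_href(href):
    """Split an image href into its parts, or return None if it does not match.

    Illustrative helper only; it accepts nothing beyond the documented form.
    """
    match = IMAGE_HREF_PATTERN.fullmatch(href)
    return match.groupdict() if match else None


parts = parse_image_href("registry.hub.docker.com/my-user/my-inferencing-model:v1")
print(parts)
```

An href that lacks a registry, owner, or tag (for example a bare image name) yields `None`, which a caller could treat as a hint that the link does not follow the recommended form.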
144167

145168
## Interpretation of STAC Fields
146169

examples/dummy/inferencing.yml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
11
services:
22
model-inference:
3-
image: docker.io/someusername/some_model_image:1
3+
image: registry.hub.docker.com/my-user/my-inferencing-model:v1
44
volumes:
55
- "${INPUT_VOLUME}:/var/data/input"
66
- "${OUTPUT_VOLUME}:/var/data/output"

examples/dummy/item.json

Lines changed: 25 additions & 3 deletions
@@ -54,7 +54,11 @@
5454
],
5555
"ml-model:learning_approach": "supervised",
5656
"ml-model:prediction_type": "object-detection",
57-
"ml-model:architecture": "RCNN"
57+
"ml-model:architecture": "RCNN",
58+
"ml-model:training-environment": {
59+
"processor-type": "gpu",
60+
"operating-system": "linux"
61+
}
5862
},
5963
"links": [
6064
{
@@ -76,10 +80,28 @@
7680
"title": "Containing Collection"
7781
},
7882
{
79-
"rel": "ml-model:image",
80-
"href": "registry.hub.docker.com/my-user/my-model:v1",
83+
"rel": "ml-model:inferencing-image",
84+
"href": "registry.hub.docker.com/my-user/my-inferencing-model:v1",
8185
"type": "docker-image",
8286
"title": "My Model (v1)"
87+
},
88+
{
89+
"rel": "ml-model:training-image",
90+
"href": "registry.hub.docker.com/my-user/my-training-model:v1",
91+
"type": "docker-image",
92+
"title": "Image for Training Model"
93+
},
94+
{
95+
"rel": "ml-model:train-data",
96+
"href": "https://some-domain.com/training-data/collection.json",
97+
"type": "application/json",
98+
"title": "Training Data"
99+
},
100+
{
101+
"rel": "ml-model:test-data",
102+
"href": "https://some-domain.com/test-data/collection.json",
103+
"type": "application/json",
104+
"title": "Test Data"
83105
}
84106
],
85107
"assets": {

json-schema/schema.json

Lines changed: 11 additions & 0 deletions
@@ -177,6 +177,17 @@
177177
},
178178
"ml-model:architecture": {
179179
"type": "string"
180+
},
181+
"ml-model:training-environment": {
182+
"type": "object",
183+
"properties": {
184+
"operating-system": {
185+
"type": "string"
186+
},
187+
"processor-type": {
188+
"type": "string"
189+
}
190+
}
180191
}
181192
},
182193
"patternProperties": {
