Add inference metadata [WIP] (#3)

duckontheweb · web-flow · commit 1f579a0822ca · 2021-12-12T13:39:26.000-05:00
* Add inferencing asset fields

* Include model asset in JSON schema & examples

* Use Compose file instead of Docker image asset

* Rename Inferencing Images -> Inferencing Runtimes

* Add rel type for Docker image links

* Clarify media and relation types
diff --git a/README.md b/README.md
@@ -59,6 +59,57 @@ these models for the following types of use-cases:
 | ml-model:prediction_type   | string                    | **REQUIRED.** The type of prediction that the model makes. It is STRONGLY RECOMMENDED that you use one of the values described below, but other values are allowed.   |
 | ml-model:architecture      | string                    | **REQUIRED.** Identifies the architecture employed by the model (e.g. RCNN, U-Net, etc.). This may be any string identifier, but publishers are encouraged to use well-known identifiers whenever possible. |
 
+## Asset Objects
+
+### Roles
+
+| Role Name                | Description |
+| ------------------------ | ----------- |
+| ml-model:inference-runtime | Represents a file containing instructions for running a containerized version of the model to generate inferences. See the [Inference Runtimes](#inference-runtimes) section below for details on related fields. |
+
+### Inference Runtimes
+
+An Asset with the `ml-model:inference-runtime` role represents a file containing instructions for running a containerized version of the model to
+generate inferences. Currently, only [Compose files](https://github.com/compose-spec/compose-spec/blob/master/spec.md#compose-file) are supported,
+but support is planned for other formats, including [Common Workflow Language (CWL)](https://www.commonwl.org/) and [Workflow Description Language
+(WDL)](https://openwdl.org/).
+
+The `"type"` field should be used to indicate the format of this asset. Assets in the Compose format should have a `"type"` value of
+`"text/x-yaml; application=compose"`.
+
+While the Compose file defines nearly all of the parameters required to run the containerized model image, we still need a way to define which host
+directory containing input data should be mounted to the container and to which host directory the output predictions should be written. The Compose
+file MUST define volume mounts for input and output data using the Compose
+[Interpolation syntax](https://github.com/compose-spec/compose-spec/blob/master/spec.md#interpolation). The input data volume MUST be defined by an
+`INPUT_DATA` variable and the output data volume MUST be defined by an `OUTPUT_DATA` variable.
+
+For example, the following Compose file snippet would mount the host input directory to `/var/data/input` in the container and would mount the host
+output data directory to `/var/data/output` in the container. In this contrived example, the script to run the model takes 2 arguments: the
+location of the input data directory and the location of the output data directory.
+
+```yaml
+services:
+  ...
+  model_runtime:
+    ...
+    volumes:
+      - "${INPUT_DATA}:/var/data/input"
+      - "${OUTPUT_DATA}:/var/data/output"
+    command: "run-model.sh /var/data/input /var/data/output"
+```
+
+A user would then set the `INPUT_DATA` and `OUTPUT_DATA` environment variables when running the model. An example using `docker-compose` might look
+like the following:
+
+```console
+$ INPUT_DATA=/local/path/to/model/inputs \
+  OUTPUT_DATA=/local/path/to/model/outputs \
+  docker-compose -f path/to/inference-runtime.yml up
+```
+
+It is RECOMMENDED that model publishers use the Asset `description` field to describe any other requirements or constraints for running the model
+container.
+
 ### Additional Field Information
 
 #### ml-model:learning_approach
@@ -82,6 +133,15 @@ for a given [Learning Approach](#ml-modellearning_approach).
 - `"segmentation"`
 - `"regression"`
 
+## Relation types
+
+The following types should be used as applicable `rel` types in the
+[Link Object](https://github.com/radiantearth/stac-spec/tree/master/item-spec/item-spec.md#link-object).
+
+| Type                         | Description |
+| ---------------------------- | ----------- |
+| ml-model:image               | Links with this relation type refer to Docker images built using the model. The `href` value for links of this type should contain a fully-qualified URI for the image as would be required for a command like `docker pull`. These URIs should be of the form `<registry_domain>/<user_or_organization_name>/<image_name>:<tag>`. Links with this relation type should have a `"type"` value of `"docker-image"` to indicate a Docker image. |
+
 ## Interpretation of STAC Fields
 
 The semantics of ML model metadata can sometimes differ significantly from the use-cases for which STAC was originally intended (Earth observation
@@ -108,15 +168,6 @@ It is RECOMMENDED that following STAC Extensions be used in conjunction with the
 - [Scientific Citation Extension](https://github.com/stac-extensions/scientific): This extension should be used to describe how the model should be
   cited in publications, as well as to reference any existing publications associated with the model.
 
-## Relation types
-
-The following types should be used as applicable `rel` types in the
-[Link Object](https://github.com/radiantearth/stac-spec/tree/master/item-spec/item-spec.md#link-object).
-
-| Type                | Description |
-| ------------------- | ----------- |
-| TBD                 | More detail to come... |
-
 ## Contributing
 
 All contributions are subject to the
diff --git a/examples/dummy/inferencing.yml b/examples/dummy/inferencing.yml
@@ -0,0 +1,8 @@
+services:
+  model-inference:
+    image: docker.io/someusername/some_model_image:1
+    volumes:
+      - "${INPUT_DATA}:/var/data/input"
+      - "${OUTPUT_DATA}:/var/data/output"
+    entrypoint: bash /app/scripts/run-model.sh
+
diff --git a/examples/dummy/item.json b/examples/dummy/item.json
@@ -4,7 +4,7 @@
     "https://stac-extensions.github.io/ml-model/v1.0.0/schema.json"
   ],
   "type": "Feature",
-  "id": "item",
+  "id": "model-item",
   "bbox": [
     34.18,
     0.47,
@@ -74,16 +74,27 @@
       "href": "./collection.json",
       "type": "application/json",
       "title": "Containing Collection"
+    },
+    {
+      "rel": "ml-model:image",
+      "href": "registry.hub.docker.com/my-user/my-model:v1",
+      "type": "docker-image",
+      "title": "My Model (v1)"
     }
   ],
   "assets": {
     "model": {
-      "href": "https://github.com/m-mohr/gmlmc-hackathon-tree-detection/blob/main/epoch_019.ckpt",
-      "type": "application/x.ckpt",
-      "title": "Model checkpoint file",
+      "href": "./inferencing.yml",
+      "type": "text/x-yaml; application=compose",
+      "title": "Model inferencing runtime",
       "roles": [
-        "model"
+        "ml-model:inference-runtime"
       ]
+    },
+    "other": {
+      "href": "https://some-domain.com/another/thing.json",
+      "type": "application/json",
+      "title": "Some other asset"
     }
   }
 }
diff --git a/json-schema/schema.json b/json-schema/schema.json
@@ -32,15 +32,15 @@
                   ]
                 },
                 {
-                  "$ref": "#/definitions/fields"
+                  "$ref": "#/definitions/common_fields"
                 }
               ]
             },
             "assets": {
               "$comment": "This validates the fields in Item Assets, but does not require them.",
               "type": "object",
               "additionalProperties": {
-                "$ref": "#/definitions/fields"
+                "$ref": "#/definitions/common_fields"
               }
             }
           }
@@ -82,7 +82,7 @@
                         "$ref": "#/definitions/require_any_field"
                       },
                       {
-                        "$ref": "#/definitions/fields"
+                        "$ref": "#/definitions/common_fields"
                       }
                     ]
                   }
@@ -107,7 +107,7 @@
                         "$ref": "#/definitions/require_any_field"
                       },
                       {
-                        "$ref": "#/definitions/fields"
+                        "$ref": "#/definitions/common_fields"
                       }
                     ]
                   }
@@ -165,8 +165,8 @@
         }
       ]
     },
-    "fields": {
-      "$comment": "Add your new fields here. Don't require them here, do that above in the corresponding schema.",
+    "common_fields": {
+      "$comment": "Add your new Item fields here. Don't require them here, do that above in the corresponding schema.",
       "type": "object",
       "properties": {
         "ml-model:learning_approach": {

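Review note: the README text added in this change requires an inference-runtime Compose file to take its input and output host directories from interpolated variables. The Python sketch below is one hypothetical way a publisher might check that before attaching the file as an asset. It is not part of the extension: the function names are made up for illustration, the variable names `INPUT_DATA` and `OUTPUT_DATA` are taken from the README's Compose snippet, and the check only looks for the variables anywhere in the file text rather than confirming they appear under `volumes`.

```python
import re

# Variable names as used in the README's Compose snippet (an assumption;
# pass a different tuple if the spec settles on other names).
REQUIRED_VARIABLES = ("INPUT_DATA", "OUTPUT_DATA")

# Matches ${VAR}, ${VAR:-default}-style forms, and bare $VAR.
# The lookbehind skips the second "$" of "$$", which Compose treats as a literal "$".
_INTERPOLATION = re.compile(
    r"(?<!\$)\$(?:\{([A-Za-z_][A-Za-z0-9_]*)[^}]*\}|([A-Za-z_][A-Za-z0-9_]*))"
)


def referenced_variables(compose_text):
    """Return the set of variable names interpolated anywhere in the text."""
    return {braced or bare for braced, bare in _INTERPOLATION.findall(compose_text)}


def missing_variables(compose_text, required=REQUIRED_VARIABLES):
    """Return a sorted list of required variables the text never references."""
    return sorted(set(required) - referenced_variables(compose_text))


if __name__ == "__main__":
    sample = (
        "services:\n"
        "  model_runtime:\n"
        "    volumes:\n"
        '      - "${INPUT_DATA}:/var/data/input"\n'
        '      - "${OUTPUT_DATA}:/var/data/output"\n'
    )
    # An empty list means both required variables are referenced.
    print(missing_variables(sample))
```

A text-level scan like this avoids a YAML parser dependency, at the cost of not knowing where in the file a variable is used; a stricter check would parse the file and inspect each service's `volumes` entries.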