33 changes: 33 additions & 0 deletions docs-gb/SUMMARY.md
@@ -161,6 +161,39 @@
* [RClone Storage Initializer - upgrading your cluster (AWS S3 / MinIO)](notebooks/global-rclone-upgrade.md)

## Reference
* [Annotation Based Configuration](reference/annotations.md)
* [Benchmarking](reference/benchmarking.md)
* [General Availability](reference/ga.md)
* [Helm Charts](reference/helm_charts.md)
* [Images](reference/images.md)
* [Logging and Log Level](reference/log_level.md)
* [Private Docker Registry](reference/private_registries.md)
* [Prediction APIs]
* [Open Inference Protocol](reference/v2-protocol.md)
* [Scalar Value Types](reference/v2-protocol.md) *
* [Microservice API](reference/internal-api.md)
* [External API](reference/external-prediction.md)
* [Prediction Proto Buffer Spec](reference/prediction.md)
* [Prediction Open API Spec] *
* [Python API Reference]*
* [Release Highlights](reference/release-highlights/)
* [Release 1.7.0 Highlights](reference/release-highlights/release-1.7.0.md)
* [Release 1.6.0 Highlights](reference/release-highlights/release-1.6.0.md)
* [Release 1.5.0 Highlights](reference/release-highlights/release-1.5.0.md)
* [Release 1.1.0 Highlights](reference/release-highlights/release-1.1.0.md)
* [Release 1.0.0 Highlights](reference/release-highlights/release-1.0.0.md)
* [Release 0.4.1 Highlights](reference/release-highlights/release-0.4.1.md)
* [Release 0.4.0 Highlights](reference/release-highlights/release-0.4.0.md)
* [Release 0.3.0 Highlights](reference/release-highlights/release-0.3.0.md)
* [Release 0.2.7 Highlights](reference/release-highlights/release-0.2.7.md)
* [Release 0.2.6 Highlights](reference/release-highlights/release-0.2.6.md)
* [Release 0.2.5 Highlights](reference/release-highlights/release-0.2.5.md)
* [Release 0.2.3 Highlights](reference/release-highlights/release-0.2.3.md)
* [Seldon Deployment CRD]*
* [Service Orchestrator](reference/svcorch.md)
* [Kubeflow](reference/kubeflow.md)
* [Archived Docs](https://docs.seldon.io/projects/seldon-core/en/1.18/nav/archive.html)


## Contributing
* [Overview](developer/readme.md)
49 changes: 49 additions & 0 deletions docs-gb/reference/annotations.md
@@ -0,0 +1,49 @@
# Annotation Based Configuration

You can configure aspects of Seldon Core via annotations on the SeldonDeployment resource and on the optional API OAuth gateway. Please create an issue if you would like additional configuration exposed.

## SeldonDeployment Annotations

### gRPC API Control

* ```seldon.io/grpc-max-message-size``` : Maximum gRPC message size (bytes)
* Locations : SeldonDeployment.spec.annotations
* Default is MaxInt32
* [gRPC message size example](model_rest_grpc_settings.md)
* ```seldon.io/grpc-timeout``` : gRPC timeout (msecs)
* Locations : SeldonDeployment.spec.annotations
* Default is no timeout
* [gRPC timeout example](model_rest_grpc_settings.md)


### REST API Control

> **Note**: When using REST APIs, timeouts apply only to each node and not to the full inference graph. Therefore, each sub-request for each individual node in the graph can take up to `seldon.io/rest-timeout` milliseconds.

* ```seldon.io/rest-timeout``` : REST timeout (msecs)
* Locations : SeldonDeployment.spec.annotations
  * Default is no overall timeout, but Go's default transport settings apply, which include a 30-second connection timeout.
* [REST timeout example](model_rest_grpc_settings.md)
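
The timeout annotations above are set in the SeldonDeployment manifest. A minimal sketch, built as a Python dict for illustration; the annotation values (10 MB, 5 s) are example choices, not defaults:

```python
import json

# Illustrative SeldonDeployment fragment; the timeout values are examples only.
seldon_deployment = {
    "apiVersion": "machinelearning.seldon.io/v1",
    "kind": "SeldonDeployment",
    "metadata": {"name": "example"},
    "spec": {
        "annotations": {
            # gRPC: raise max message size to 10 MB and set a 5 s timeout.
            "seldon.io/grpc-max-message-size": "10485760",
            "seldon.io/grpc-timeout": "5000",
            # REST: per-node timeout of 5 s (applies to each graph node).
            "seldon.io/rest-timeout": "5000",
        },
        "predictors": [],
    },
}

print(json.dumps(seldon_deployment["spec"]["annotations"], indent=2))
```

Note that annotation values must be strings, as in any Kubernetes manifest.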


### Service Orchestrator

* ```seldon.io/engine-separate-pod``` : Use a separate pod for the service orchestrator
* Locations : SeldonDeployment.metadata.annotations, SeldonDeployment.spec.annotations
* [Separate svc-orc pod example](model_svcorch_sep.md)
* ```seldon.io/executor-logger-queue-size``` : Size of request logging worker queue
* Locations: SeldonDeployment.metadata.annotations, SeldonDeployment.spec.annotations
* ```seldon.io/executor-logger-write-timeout-ms``` : Write timeout for adding to logging work queue
* Locations: SeldonDeployment.metadata.annotations, SeldonDeployment.spec.annotations


### Misc

* ```seldon.io/svc-name``` : Custom service name for the predictor. You are responsible for ensuring it does not clash with any existing service name in the namespace of the deployed SeldonDeployment.
* Locations : SeldonDeployment.spec.predictors[].annotations
* [custom service name example](custom_svc_name.md)
60 changes: 60 additions & 0 deletions docs-gb/reference/benchmarking.md
@@ -0,0 +1,60 @@
# Seldon-core Benchmarking and Load Testing

This page is a work in progress covering benchmarking and load testing of Seldon Core.

This work is ongoing and we welcome feedback.

## Tools

* For REST tests we use [vegeta](https://github.com/tsenart/vegeta)
* For gRPC tests we use [ghz](https://ghz.sh/)

## Service Orchestrator

These benchmark tests evaluate the extra latency added by including the service orchestrator.

* [Service orchestrator benchmark](../examples/bench_svcOrch.html)

### Results

On a 3-node DigitalOcean cluster (24 vCPUs, 96 GB RAM), running the TensorFlow Flowers image classifier.

| Test | Additional latency |
| --- | ------------------ |
| REST | 9ms |
| gRPC | 4ms |

Further work:

* Statistical confidence test


## Tensorflow

Test the max throughput and HPA usage.

* [Tensorflow benchmark](../examples/bench_tensorflow.html)

### Results

On a 3-node DigitalOcean cluster (24 vCPUs, 96 GB RAM), running the TensorFlow Flowers image classifier with HPA at maximum throughput for a single model. There is no ramp-up, as vegeta does not support it. See the notebook for details.

```
Latencies:

mean: 259.990239 ms
50th: 131.917169 ms
90th: 310.053255 ms
95th: 916.684759 ms
99th: 2775.05271 ms

Throughput: 23.997572337989126/s
Errors: False
```

## Flexible Benchmarking with Argo Workflows

We also provide an example that shows how to leverage the batch processing workflow showcased in the examples to perform benchmarking against Seldon Core models.

* [Seldon deployment benchmark](../examples/vegeta_bench_argo_workflows.html)

85 changes: 85 additions & 0 deletions docs-gb/reference/external-prediction.md
@@ -0,0 +1,85 @@
# External Prediction API

![API](./api.png)

Seldon Core exposes a generic external API to connect your ML runtime predictions to external business applications.

## REST API

### Prediction

- endpoint : POST /api/v1.0/predictions
- payload : JSON representation of `SeldonMessage` - see [proto definition](./prediction.md#proto-buffer-and-grpc-definition)
- example payload :

```json
{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}
```
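
As a sketch of calling this endpoint, the payload above can be built and POSTed with the Python standard library; the host below is a placeholder for your ingress, not a real address:

```python
import json
import urllib.request

def build_seldon_message(names, values, shape):
    # SeldonMessage tensor payload, matching the example above.
    return {"data": {"names": names, "tensor": {"shape": shape, "values": values}}}

def predict(host, payload):
    # POST to the predictions endpoint; `host` is illustrative.
    req = urllib.request.Request(
        host + "/api/v1.0/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_seldon_message(["a", "b"], [0, 0, 1, 1], [2, 2])
# predict("http://<ingress-host>", payload)  # requires a running deployment
```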

### Feedback

- endpoint : POST /api/v1.0/feedback
- payload : JSON representation of `Feedback` - see [proto definition](./prediction.md#proto-buffer-and-grpc-definition)

### Metadata - Graph Level

- endpoint : GET /api/v1.0/metadata
- example response :

```json
{
"name": "example",
"models": {
"model-1": {
"name": "Model 1",
"platform": "platform-name",
"versions": ["model-version"],
"inputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 5]}],
"outputs": [{"datatype": "BYTES", "name": "output", "shape": [1, 3]}]
},
"model-2": {
"name": "Model 2",
"platform": "platform-name",
"versions": ["model-version"],
"inputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 3]}],
"outputs": [{"datatype": "BYTES", "name": "output", "shape": [3]}]
}
},
"graphinputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 5]}],
"graphoutputs": [{"datatype": "BYTES", "name": "output", "shape": [3]}]
}
```

See the metadata [documentation](./metadata.md) for more details.


### Metadata - Model Level

- endpoint : GET /api/v1.0/metadata/{MODEL_NAME}
- example response:

```json
{
"name": "Model 1",
"versions": ["model-version"],
"platform": "platform-name",
"inputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 5]}],
"outputs": [{"datatype": "BYTES", "name": "output", "shape": [1, 3]}],
}
```

See the metadata [documentation](./metadata.md) for more details.
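
Both metadata endpoints share a URL scheme; a small client sketch using only the standard library (the host argument is a placeholder for your ingress):

```python
import json
import urllib.request

def metadata_path(model_name=None):
    # Graph-level path by default; model-level when a model name is given.
    return "/api/v1.0/metadata" + ("/" + model_name if model_name else "")

def get_metadata(host, model_name=None):
    # GET the metadata endpoint; `host` is illustrative.
    with urllib.request.urlopen(host + metadata_path(model_name)) as resp:
        return json.load(resp)

# get_metadata("http://<ingress-host>")             # graph level
# get_metadata("http://<ingress-host>", "model-1")  # model level
```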


## gRPC

```protobuf
service Seldon {
rpc Predict(SeldonMessage) returns (SeldonMessage) {};
rpc SendFeedback(Feedback) returns (SeldonMessage) {};
rpc ModelMetadata(SeldonModelMetadataRequest) returns (SeldonModelMetadata) {};
rpc GraphMetadata(google.protobuf.Empty) returns (SeldonGraphMetadata) {};
}
```

See the full [proto definition](./prediction.md#proto-buffer-and-grpc-definition).
12 changes: 12 additions & 0 deletions docs-gb/reference/ga.md
@@ -0,0 +1,12 @@
# General Availability

The Seldon 1.0 release will be a GA release. However, only a subset of current Seldon Core functionality is covered by GA. The GA components are:

* Seldon Deployment v1 CRD
* Seldon v1 REST and gRPC API
* Seldon-Core Python wrapper 1.0
* Tensorflow, SKLearn, XGboost prepacked servers

Seldon offers commercial support for Seldon Core, as well as a commercial enterprise offering for running ML serving in production using Seldon Core. Please visit the [Seldon website](https://www.seldon.io/) for more information.


37 changes: 37 additions & 0 deletions docs-gb/reference/helm_charts.md
@@ -0,0 +1,37 @@
# Seldon Core Helm Charts

Helm charts are published to our official repo.

## Core Charts

The core charts for installing Seldon Core are shown below.

* [seldon-core-operator](../charts/seldon-core-operator)

For further details see [here](../workflow/install.md).

## Inference Graph Templates

A set of charts providing example templates for creating particular inference graphs using Seldon Core.

* [seldon-single-model](../charts/seldon-single-model)
* [seldon-abtest](../charts/seldon-abtest)
* [seldon-mab](../charts/seldon-mab)
* [seldon-od-model](../charts/seldon-od-model)
* [seldon-od-transformer](../charts/seldon-od-transformer)

[A notebook with examples of using the above charts](https://docs.seldon.io/projects/seldon-core/en/latest/examples/helm_examples.html) is provided.

## Misc

* [seldon-core-loadtesting](../charts/seldon-core-loadtesting)
* [seldon-core-analytics](../charts/seldon-core-analytics)
76 changes: 76 additions & 0 deletions docs-gb/reference/images.md
@@ -0,0 +1,76 @@
# Latest Seldon Images


## Core images

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Operator](../workflow/install.md) | [seldonio/seldon-core-operator](https://hub.docker.com/r/seldonio/seldon-core-operator/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Service Orchestrator (Go)](../graph/svcorch.md)| [seldonio/seldon-core-executor](https://hub.docker.com/r/seldonio/seldon-core-executor/tags) | 1.18.0 | 1.19.0-dev |

## Pre-packaged servers


| Description | Image URL | Version |
|-------------|-----------|---------|
| [MLFlow Server](../servers/mlflow.md) | [seldonio/mlflowserver](https://hub.docker.com/r/seldonio/mlflowserver/tags/) | 1.18.0 |
| [SKLearn Server](../servers/sklearn.md) | [seldonio/sklearnserver](https://hub.docker.com/r/seldonio/sklearnserver/tags/) | 1.18.0 |
| [XGBoost Server](../servers/xgboost.md) | [seldonio/xgboostserver](https://hub.docker.com/r/seldonio/xgboostserver/tags/) | 1.18.0 |

## Language wrappers

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Python 3 (3.8) Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python3](https://hub.docker.com/r/seldonio/seldon-core-s2i-python3/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.7 Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python37](https://hub.docker.com/r/seldonio/seldon-core-s2i-python37/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.8 Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python38](https://hub.docker.com/r/seldonio/seldon-core-s2i-python38/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.7 GPU Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python37-gpu](https://hub.docker.com/r/seldonio/seldon-core-s2i-python37-gpu/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.8 GPU Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python38-gpu](https://hub.docker.com/r/seldonio/seldon-core-s2i-python38-gpu/tags/) | 1.18.0 | 1.19.0-dev |

## Server proxies

| Description | Image URL | Stable Version |
|-------------|-----------|----------------|
| [SageMaker proxy](https://github.com/SeldonIO/seldon-core/tree/master/integrations/sagemaker) | [seldonio/sagemaker-proxy](https://hub.docker.com/r/seldonio/sagemaker-proxy/tags/) | 0.1 |
| [Tensorflow Serving proxy](../servers/tensorflow.md) | [seldonio/tfserving-proxy](https://hub.docker.com/r/seldonio/tfserving-proxy/tags/) | 1.18.0 |


## Python modules

| Description | Python Version | Version |
|-------------|----------------|---------|
| [seldon-core](https://pypi.org/project/seldon-core/) | >3.4,<3.9 | 1.18.0 |
| [seldon-core](https://pypi.org/project/seldon-core/) | 2,>=3,<3.7 | 0.2.6 (deprecated) |


## Incubating

### Language wrappers

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Python ONNX Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python3-ngraph-onnx](https://hub.docker.com/r/seldonio/seldon-core-s2i-python3-ngraph-onnx/tags/) | 0.3 | |
| [Seldon Java Build Wrapper for S2I](../java/README.md) | [seldonio/seldon-core-s2i-java-build](https://hub.docker.com/r/seldonio/seldon-core-s2i-java-build/tags/) | 0.1 | |
| [Seldon Java Runtime Wrapper for S2I](../java/README.md) | [seldonio/seldon-core-s2i-java-runtime](https://hub.docker.com/r/seldonio/seldon-core-s2i-java-runtime/tags/) | 0.1 | |
| [Seldon R Wrapper for S2I](../R/README.md) | [seldonio/seldon-core-s2i-r](https://hub.docker.com/r/seldonio/seldon-core-s2i-r/tags/) | 0.2 | |
| [Seldon NodeJS Wrapper for S2I](../nodejs/README.md) | [seldonio/seldon-core-s2i-nodejs](https://hub.docker.com/r/seldonio/seldon-core-s2i-nodejs/tags/) | 0.1 | 0.2-SNAPSHOT |


### Java packages

You can find these packages in the Maven repository.

| Description | Package | Version |
|-------------|---------|---------|
| [Seldon Core Wrapper](https://github.com/SeldonIO/seldon-java-wrapper) | seldon-core-wrapper | 0.1.5 |
| [Seldon Core JPMML](https://github.com/SeldonIO/JPMML-utils) | seldon-core-jpmml | 0.0.1 |



## Deprecated

### Language wrappers

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Python 2 Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python2](https://hub.docker.com/r/seldonio/seldon-core-s2i-python2/tags/) | 0.5.1 | deprecated |