33 changes: 33 additions & 0 deletions docs-gb/SUMMARY.md
@@ -161,6 +161,39 @@
* [RClone Storage Initializer - upgrading your cluster (AWS S3 / MinIO)](notebooks/global-rclone-upgrade.md)

## Reference
* [Annotation Based Configuration](reference/annotations.md)
* [Benchmarking](reference/benchmarking.md)
* [General Availability](reference/ga.md)
* [Helm Charts](reference/helm_charts.md)
* [Images](reference/images.md)
* [Logging and Log Level](reference/log_level.md)
* [Private Docker Registry](reference/private_registries.md)
* [Prediction APIs]
* [Open Inference Protocol](reference/v2-protocol.md)
* [Scalar Value Types](reference/v2-protocol.md) *
* [Microservice API](reference/internal-api.md)
* [External API](reference/external-prediction.md)
* [Prediction Proto Buffer Spec](reference/prediction.md)
* [Prediction Open API Spec] *
* [Python API Reference]*
* [Release Highlights](reference/release-highlights/)
* [Release 1.7.0 Highlights](reference/release-highlights/release-1.7.0.md)
* [Release 1.6.0 Highlights](reference/release-highlights/release-1.6.0.md)
* [Release 1.5.0 Highlights](reference/release-highlights/release-1.5.0.md)
* [Release 1.1.0 Highlights](reference/release-highlights/release-1.1.0.md)
* [Release 1.0.0 Highlights](reference/release-highlights/release-1.0.0.md)
* [Release 0.4.1 Highlights](reference/release-highlights/release-0.4.1.md)
* [Release 0.4.0 Highlights](reference/release-highlights/release-0.4.0.md)
* [Release 0.3.0 Highlights](reference/release-highlights/release-0.3.0.md)
* [Release 0.2.7 Highlights](reference/release-highlights/release-0.2.7.md)
* [Release 0.2.6 Highlights](reference/release-highlights/release-0.2.6.md)
* [Release 0.2.5 Highlights](reference/release-highlights/release-0.2.5.md)
* [Release 0.2.3 Highlights](reference/release-highlights/release-0.2.3.md)
* [Seldon Deployment CRD]*
* [Service Orchestrator](reference/svcorch.md)
* [Kubeflow](reference/kubeflow.md)
* [Archived Docs](https://docs.seldon.io/projects/seldon-core/en/1.18/nav/archive.html)


## Contributing
* [Overview](developer/readme.md)
49 changes: 49 additions & 0 deletions docs-gb/reference/annotations.md
@@ -0,0 +1,49 @@
# Annotation Based Configuration

You can configure aspects of Seldon Core via annotations on the SeldonDeployment resource and on the optional API OAuth gateway. Please create an issue if you would like additional configuration exposed.

## SeldonDeployment Annotations

### gRPC API Control

* ```seldon.io/grpc-max-message-size``` : Maximum gRPC message size (bytes)
* Locations : SeldonDeployment.spec.annotations
* Default is MaxInt32
* [gRPC message size example](model_rest_grpc_settings.md)
* ```seldon.io/grpc-timeout``` : gRPC timeout (msecs)
* Locations : SeldonDeployment.spec.annotations
* Default is no timeout
* [gRPC timeout example](model_rest_grpc_settings.md)


### REST API Control

> **Note**: When using REST APIs, timeouts apply only to each node and not to the full inference graph. Therefore, each sub-request for each individual node in the graph can take up to `seldon.io/rest-timeout` milliseconds.

* ```seldon.io/rest-timeout``` : REST timeout (msecs)
* Locations : SeldonDeployment.spec.annotations
  * Default is no overall timeout, but Go's default transport settings apply, which include a 30-second connection timeout.
* [REST timeout example](model_rest_grpc_settings.md)
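
The timeout annotations above are set in the SeldonDeployment manifest. A minimal sketch, built as a Python dict for illustration; the annotation values (10 MB, 5 s) are example choices, not defaults:

```python
import json

# Illustrative SeldonDeployment fragment; the timeout values are examples only.
seldon_deployment = {
    "apiVersion": "machinelearning.seldon.io/v1",
    "kind": "SeldonDeployment",
    "metadata": {"name": "example"},
    "spec": {
        "annotations": {
            # gRPC: raise max message size to 10 MB and set a 5 s timeout.
            "seldon.io/grpc-max-message-size": "10485760",
            "seldon.io/grpc-timeout": "5000",
            # REST: per-node timeout of 5 s (applies to each graph node).
            "seldon.io/rest-timeout": "5000",
        },
        "predictors": [],
    },
}

print(json.dumps(seldon_deployment["spec"]["annotations"], indent=2))
```

Note that annotation values must be strings, as in any Kubernetes manifest.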


### Service Orchestrator

* ```seldon.io/engine-separate-pod``` : Use a separate pod for the service orchestrator
* Locations : SeldonDeployment.metadata.annotations, SeldonDeployment.spec.annotations
* [Separate svc-orc pod example](model_svcorch_sep.md)
* ```seldon.io/executor-logger-queue-size``` : Size of request logging worker queue
* Locations: SeldonDeployment.metadata.annotations, SeldonDeployment.spec.annotations
* ```seldon.io/executor-logger-write-timeout-ms``` : Write timeout for adding to logging work queue
* Locations: SeldonDeployment.metadata.annotations, SeldonDeployment.spec.annotations


### Misc

* ```seldon.io/svc-name``` : Custom service name for the predictor. You are responsible for ensuring it does not clash with any existing service name in the namespace of the deployed SeldonDeployment.
* Locations : SeldonDeployment.spec.predictors[].annotations
* [custom service name example](custom_svc_name.md)
60 changes: 60 additions & 0 deletions docs-gb/reference/benchmarking.md
@@ -0,0 +1,60 @@
# Seldon-core Benchmarking and Load Testing

This page is a work in progress covering benchmarking and load testing of Seldon Core.

This work is ongoing and we welcome feedback.

## Tools

* For REST tests we use [vegeta](https://github.com/tsenart/vegeta)
* For gRPC tests we use [ghz](https://ghz.sh/)

## Service Orchestrator

These benchmark tests evaluate the extra latency added by including the service orchestrator.

* [Service orchestrator benchmark](../examples/bench_svcOrch.html)

### Results

On a 3-node DigitalOcean cluster (24 vCPUs, 96 GB RAM), running the TensorFlow Flowers image classifier.

| Test | Additional latency |
| --- | ------------------ |
| REST | 9ms |
| gRPC | 4ms |

Further work:

* Statistical confidence test


## Tensorflow

Test the max throughput and HPA usage.

* [Tensorflow benchmark](../examples/bench_tensorflow.html)

### Results

On a 3-node DigitalOcean cluster (24 vCPUs, 96 GB RAM), running the TensorFlow Flowers image classifier with HPA at maximum throughput for a single model. There is no ramp-up, as vegeta does not support it. See the notebook for details.

```
Latencies:

mean: 259.990239 ms
50th: 131.917169 ms
90th: 310.053255 ms
95th: 916.684759 ms
99th: 2775.05271 ms

Throughput: 23.997572337989126/s
Errors: False
```

## Flexible Benchmarking with Argo Workflows

We also provide an example that shows how to leverage the batch processing workflow showcased in the examples to perform benchmarking against Seldon Core models.

* [Seldon deployment benchmark](../examples/vegeta_bench_argo_workflows.html)

85 changes: 85 additions & 0 deletions docs-gb/reference/external-prediction.md
@@ -0,0 +1,85 @@
# External Prediction API

![API](./api.png)

Seldon Core exposes a generic external API to connect your ML runtime predictions to external business applications.

## REST API

### Prediction

- endpoint : POST /api/v1.0/predictions
- payload : JSON representation of `SeldonMessage` - see [proto definition](./prediction.md#proto-buffer-and-grpc-definition)
- example payload :

```json
{"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}
```
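
As a sketch of calling this endpoint, the payload above can be built and POSTed with the Python standard library; the host below is a placeholder for your ingress, not a real address:

```python
import json
import urllib.request

def build_seldon_message(names, values, shape):
    # SeldonMessage tensor payload, matching the example above.
    return {"data": {"names": names, "tensor": {"shape": shape, "values": values}}}

def predict(host, payload):
    # POST to the predictions endpoint; `host` is illustrative.
    req = urllib.request.Request(
        host + "/api/v1.0/predictions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_seldon_message(["a", "b"], [0, 0, 1, 1], [2, 2])
# predict("http://<ingress-host>", payload)  # requires a running deployment
```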

### Feedback

- endpoint : POST /api/v1.0/feedback
- payload : JSON representation of `Feedback` - see [proto definition](./prediction.md#proto-buffer-and-grpc-definition)

### Metadata - Graph Level

- endpoint : GET /api/v1.0/metadata
- example response :

```json
{
"name": "example",
"models": {
"model-1": {
"name": "Model 1",
"platform": "platform-name",
"versions": ["model-version"],
"inputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 5]}],
"outputs": [{"datatype": "BYTES", "name": "output", "shape": [1, 3]}]
},
"model-2": {
"name": "Model 2",
"platform": "platform-name",
"versions": ["model-version"],
"inputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 3]}],
"outputs": [{"datatype": "BYTES", "name": "output", "shape": [3]}]
}
},
"graphinputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 5]}],
"graphoutputs": [{"datatype": "BYTES", "name": "output", "shape": [3]}]
}
```

See the metadata [documentation](./metadata.md) for more details.


### Metadata - Model Level

- endpoint : GET /api/v1.0/metadata/{MODEL_NAME}
- example response:

```json
{
"name": "Model 1",
"versions": ["model-version"],
"platform": "platform-name",
"inputs": [{"datatype": "BYTES", "name": "input", "shape": [1, 5]}],
"outputs": [{"datatype": "BYTES", "name": "output", "shape": [1, 3]}],
}
```

See the metadata [documentation](./metadata.md) for more details.
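
Both metadata endpoints share a URL scheme; a small client sketch using only the standard library (the host argument is a placeholder for your ingress):

```python
import json
import urllib.request

def metadata_path(model_name=None):
    # Graph-level path by default; model-level when a model name is given.
    return "/api/v1.0/metadata" + ("/" + model_name if model_name else "")

def get_metadata(host, model_name=None):
    # GET the metadata endpoint; `host` is illustrative.
    with urllib.request.urlopen(host + metadata_path(model_name)) as resp:
        return json.load(resp)

# get_metadata("http://<ingress-host>")             # graph level
# get_metadata("http://<ingress-host>", "model-1")  # model level
```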


## gRPC

```protobuf
service Seldon {
rpc Predict(SeldonMessage) returns (SeldonMessage) {};
rpc SendFeedback(Feedback) returns (SeldonMessage) {};
rpc ModelMetadata(SeldonModelMetadataRequest) returns (SeldonModelMetadata) {};
rpc GraphMetadata(google.protobuf.Empty) returns (SeldonGraphMetadata) {};
}
```

See the full [proto definition](./prediction.md#proto-buffer-and-grpc-definition).
12 changes: 12 additions & 0 deletions docs-gb/reference/ga.md
@@ -0,0 +1,12 @@
# General Availability

The Seldon 1.0 release will be a GA release. However, only a subset of current Seldon Core functionality is covered by GA. The GA components are:

* Seldon Deployment v1 CRD
* Seldon v1 REST and gRPC API
* Seldon-Core Python wrapper 1.0
* Tensorflow, SKLearn, XGboost prepacked servers

Seldon offers commercial support for Seldon Core, as well as a commercial enterprise offering for running ML serving in production using Seldon Core. Please visit the [Seldon website](https://www.seldon.io/) for more information.


37 changes: 37 additions & 0 deletions docs-gb/reference/helm_charts.md
@@ -0,0 +1,37 @@
# Seldon Core Helm Charts

Helm charts are published to our official repo.

## Core Charts

The core charts for installing Seldon Core are shown below.

* [seldon-core-operator](../charts/seldon-core-operator)

For further details see [here](../workflow/install.md).

## Inference Graph Templates

A set of charts providing example templates for creating particular inference graphs using Seldon Core.

* [seldon-single-model](../charts/seldon-single-model)
* [seldon-abtest](../charts/seldon-abtest)
* [seldon-mab](../charts/seldon-mab)
* [seldon-od-model](../charts/seldon-od-model)
* [seldon-od-transformer](../charts/seldon-od-transformer)

[A notebook with examples of using the above charts](https://docs.seldon.io/projects/seldon-core/en/latest/examples/helm_examples.html) is provided.

## Misc

* [seldon-core-loadtesting](../charts/seldon-core-loadtesting)
* [seldon-core-analytics](../charts/seldon-core-analytics)
76 changes: 76 additions & 0 deletions docs-gb/reference/images.md
@@ -0,0 +1,76 @@
# Latest Seldon Images


## Core images

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Operator](../workflow/install.md) | [seldonio/seldon-core-operator](https://hub.docker.com/r/seldonio/seldon-core-operator/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Service Orchestrator (Go)](../graph/svcorch.md)| [seldonio/seldon-core-executor](https://hub.docker.com/r/seldonio/seldon-core-executor/tags) | 1.18.0 | 1.19.0-dev |

## Pre-packaged servers


| Description | Image URL | Version |
|-------------|-----------|---------|
| [MLFlow Server](../servers/mlflow.md) | [seldonio/mlflowserver](https://hub.docker.com/r/seldonio/mlflowserver/tags/) | 1.18.0 |
| [SKLearn Server](../servers/sklearn.md) | [seldonio/sklearnserver](https://hub.docker.com/r/seldonio/sklearnserver/tags/) | 1.18.0 |
| [XGBoost Server](../servers/xgboost.md) | [seldonio/xgboostserver](https://hub.docker.com/r/seldonio/xgboostserver/tags/) | 1.18.0 |

## Language wrappers

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Python 3 (3.8) Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python3](https://hub.docker.com/r/seldonio/seldon-core-s2i-python3/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.7 Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python37](https://hub.docker.com/r/seldonio/seldon-core-s2i-python37/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.8 Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python38](https://hub.docker.com/r/seldonio/seldon-core-s2i-python38/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.7 GPU Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python37-gpu](https://hub.docker.com/r/seldonio/seldon-core-s2i-python37-gpu/tags/) | 1.18.0 | 1.19.0-dev |
| [Seldon Python 3.8 GPU Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python38-gpu](https://hub.docker.com/r/seldonio/seldon-core-s2i-python38-gpu/tags/) | 1.18.0 | 1.19.0-dev |

## Server proxies

| Description | Image URL | Stable Version |
|-------------|-----------|----------------|
| [SageMaker proxy](https://github.com/SeldonIO/seldon-core/tree/master/integrations/sagemaker) | [seldonio/sagemaker-proxy](https://hub.docker.com/r/seldonio/sagemaker-proxy/tags/) | 0.1 |
| [Tensorflow Serving proxy](../servers/tensorflow.md) | [seldonio/tfserving-proxy](https://hub.docker.com/r/seldonio/tfserving-proxy/tags/) | 1.18.0 |


## Python modules

| Description | Python Version | Version |
|-------------|----------------|---------|
| [seldon-core](https://pypi.org/project/seldon-core/) | >3.4,<3.9 | 1.18.0 |
| [seldon-core](https://pypi.org/project/seldon-core/) | 2,>=3,<3.7 | 0.2.6 (deprecated) |


## Incubating

### Language wrappers

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Python ONNX Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python3-ngraph-onnx](https://hub.docker.com/r/seldonio/seldon-core-s2i-python3-ngraph-onnx/tags/) | 0.3 | |
| [Seldon Java Build Wrapper for S2I](../java/README.md) | [seldonio/seldon-core-s2i-java-build](https://hub.docker.com/r/seldonio/seldon-core-s2i-java-build/tags/) | 0.1 | |
| [Seldon Java Runtime Wrapper for S2I](../java/README.md) | [seldonio/seldon-core-s2i-java-runtime](https://hub.docker.com/r/seldonio/seldon-core-s2i-java-runtime/tags/) | 0.1 | |
| [Seldon R Wrapper for S2I](../R/README.md) | [seldonio/seldon-core-s2i-r](https://hub.docker.com/r/seldonio/seldon-core-s2i-r/tags/) | 0.2 | |
| [Seldon NodeJS Wrapper for S2I](../nodejs/README.md) | [seldonio/seldon-core-s2i-nodejs](https://hub.docker.com/r/seldonio/seldon-core-s2i-nodejs/tags/) | 0.1 | 0.2-SNAPSHOT |


### Java packages

You can find these packages in the Maven repository.

| Description | Package | Version |
|-------------|---------|---------|
| [Seldon Core Wrapper](https://github.com/SeldonIO/seldon-java-wrapper) | seldon-core-wrapper | 0.1.5 |
| [Seldon Core JPMML](https://github.com/SeldonIO/JPMML-utils) | seldon-core-jpmml | 0.0.1 |



## Deprecated

### Language wrappers

| Description | Image URL | Stable Version | Development |
|-------------|-----------|----------------|-------------|
| [Seldon Python 2 Wrapper for S2I](../python/python_wrapping_s2i.md) | [seldonio/seldon-core-s2i-python2](https://hub.docker.com/r/seldonio/seldon-core-s2i-python2/tags/) | 0.5.1 | deprecated |