diff --git a/docs-gb/SUMMARY.md b/docs-gb/SUMMARY.md
index 5dfd5c1bb..40e545978 100644
--- a/docs-gb/SUMMARY.md
+++ b/docs-gb/SUMMARY.md
@@ -24,15 +24,19 @@
 * [Alibi-Explain](runtimes/alibi-explain.md)
 * [HuggingFace](runtimes/huggingface.md)
 * [Custom](runtimes/custom.md)
-* [Reference](reference/README.md)
-  * [MLServer Settings](reference/settings.md)
-  * [Model Settings](reference/model-settings.md)
-  * [MLServer CLI](reference/cli.md)
-  * [Python API](reference/python-api/README.md)
-    * [MLModel](reference/api/model.md)
-    * [Types](reference/api/types.md)
-    * [Codecs](reference/api/codecs.md)
-    * [Metrics](reference/api/metrics.md)
+
+* [API Reference](api/api-reference.md)
+  * [MLServer Settings](api/Settings.md)
+  * [Model Settings](api/ModelSettings.md)
+  * [Model Parameters](api/ModelParameters.md)
+  * [MLServer CLI](api/CLI.md)
+
+  * [Python API](api/PythonAPI.md)
+    * [MLModel](api/MLModel.md)
+    * [Types](api/Types.md)
+    * [Codecs](api/Codecs.md)
+    * [Metrics](api/Metrics.md)
+
 * [Examples](examples/README.md)
   * [Serving Scikit-Learn models](examples/sklearn/README.md)
   * [Serving XGBoost models](examples/xgboost/README.md)
diff --git a/docs-gb/api/CLI.md b/docs-gb/api/CLI.md
new file mode 100644
index 000000000..1f06ea6e7
--- /dev/null
+++ b/docs-gb/api/CLI.md
@@ -0,0 +1,143 @@
+# MLServer CLI
+
+The MLServer package includes an `mlserver` CLI designed to help with common tasks in a model’s lifecycle. You can see a high-level outline at any time via:
+
+```bash
+mlserver --help
+```
+
+## root
+
+Command-line interface to manage MLServer models.
+
+```bash
+root [OPTIONS] COMMAND [ARGS]...
+```
+
+### Options
+
+- `--version` (Default: `False`)
+  Show the version and exit.
+
+## build
+
+Build a Docker image for a custom MLServer runtime.
+
+```bash
+root build [OPTIONS] FOLDER
+```
+
+### Options
+
+- `-t`, `--tag` ``
+
+- `--no-cache` (Default: `False`)
+
+### Arguments
+
+- `FOLDER`
+  Required argument
+
+## dockerfile
+
+Generate a Dockerfile.
+
+```bash
+root dockerfile [OPTIONS] FOLDER
+```
+
+### Options
+
+- `-i`, `--include-dockerignore` (Default: `False`)
+
+### Arguments
+
+- `FOLDER`
+  Required argument
+
+## infer
+
+Execute batch inference requests against a V2 inference server.
+
+> Deprecated: This experimental feature will be removed in future work.
+
+```bash
+root infer [OPTIONS]
+```
+
+### Options
+
+- `--url`, `-u` `` (Default: `localhost:8080`; Env: `MLSERVER_INFER_URL`)
+  URL of the MLServer to send inference requests to. Should not contain http or https.
+
+- `--model-name`, `-m` `` (Required; Env: `MLSERVER_INFER_MODEL_NAME`)
+  Name of the model to send inference requests to.
+
+- `--input-data-path`, `-i` `` (Required; Env: `MLSERVER_INFER_INPUT_DATA_PATH`)
+  Local path to the input file containing inference requests to be processed.
+
+- `--output-data-path`, `-o` `` (Required; Env: `MLSERVER_INFER_OUTPUT_DATA_PATH`)
+  Local path to the output file for the inference responses to be written to.
+
+- `--workers`, `-w` `` (Default: `10`; Env: `MLSERVER_INFER_WORKERS`)
+
+- `--retries`, `-r` `` (Default: `3`; Env: `MLSERVER_INFER_RETRIES`)
+
+- `--batch-size`, `-s` `` (Default: `1`; Env: `MLSERVER_INFER_BATCH_SIZE`)
+  Send inference requests grouped together as micro-batches.
+
+- `--binary-data`, `-b` (Default: `False`; Env: `MLSERVER_INFER_BINARY_DATA`)
+  Send inference requests as binary data (not fully supported).
+
+- `--verbose`, `-v` (Default: `False`; Env: `MLSERVER_INFER_VERBOSE`)
+  Verbose mode.
+
+- `--extra-verbose`, `-vv` (Default: `False`; Env: `MLSERVER_INFER_EXTRA_VERBOSE`)
+  Extra verbose mode (shows detailed requests and responses).
+
+- `--transport`, `-t` `` (Options: `rest` | `grpc`; Default: `rest`; Env: `MLSERVER_INFER_TRANSPORT`)
+  Transport type to use to send inference requests. Can be 'rest' or 'grpc' (not yet supported).
+
+- `--request-headers`, `-H` `` (Env: `MLSERVER_INFER_REQUEST_HEADERS`)
+  Headers to be set on each inference request sent to the server. Multiple options are allowed as: `-H 'Header1: Val1' -H 'Header2: Val2'`. When set via the environment variable, provide them as `'Header1:Val1 Header2:Val2'`.
+
+- `--timeout` `` (Default: `60`; Env: `MLSERVER_INFER_CONNECTION_TIMEOUT`)
+  Connection timeout to be passed to tritonclient.
+
+- `--batch-interval` `` (Default: `0`; Env: `MLSERVER_INFER_BATCH_INTERVAL`)
+  Minimum time interval (in seconds) between requests made by each worker.
+
+- `--batch-jitter` `` (Default: `0`; Env: `MLSERVER_INFER_BATCH_JITTER`)
+  Maximum random jitter (in seconds) added to the batch interval between requests.
+
+- `--use-ssl` (Default: `False`; Env: `MLSERVER_INFER_USE_SSL`)
+  Use SSL in communications with the inference server.
+
+- `--insecure` (Default: `False`; Env: `MLSERVER_INFER_INSECURE`)
+  Disable SSL verification in communications. Use with caution.
+
+## init
+
+Generate a base project template.
+
+```bash
+root init [OPTIONS]
+```
+
+### Options
+
+- `-t`, `--template` `` (Default: `https://github.com/EthicalML/sml-security/`)
+
+## start
+
+Start serving a machine learning model with MLServer.
+
+```bash
+root start [OPTIONS] FOLDER
+```
+
+### Arguments
+
+- `FOLDER`
+  Required argument
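+
+## Example
+
+A typical workflow is to scaffold a project, build an image and then serve the model. This is a minimal sketch; the folder and tag names below are illustrative placeholders rather than required values:
+
+```bash
+# generate a base project template in the current directory
+mlserver init
+
+# build a Docker image from a folder containing your runtime and settings
+mlserver build ./my-model -t my-model:0.1.0
+
+# serve the model(s) defined under that folder
+mlserver start ./my-model
+```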
diff --git a/docs-gb/api/Codecs.md b/docs-gb/api/Codecs.md
new file mode 100644
index 000000000..bd59c1daf
--- /dev/null
+++ b/docs-gb/api/Codecs.md
@@ -0,0 +1,542 @@
+# Codecs
+
+## Base64Codec
+
+Codec that converts to / from a base64 input.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_input()
+
+```python
+decode_input(request_input: RequestInput) -> List[bytes]
+```
+
+Decode a request input into a high-level Python type.
+
+### decode_output()
+
+```python
+decode_output(response_output: ResponseOutput) -> List[bytes]
+```
+
+Decode a response output into a high-level Python type.
+
+### encode_input()
+
+```python
+encode_input(name: str, payload: List[bytes], use_bytes: bool = True, **kwargs) -> RequestInput
+```
+
+Encode the given payload into a `RequestInput`.
+
+### encode_output()
+
+```python
+encode_output(name: str, payload: List[bytes], use_bytes: bool = True, **kwargs) -> ResponseOutput
+```
+
+Encode the given payload into a response output.
+
+## CodecError
+
+### Methods
+
+### add_note()
+
+```python
+add_note(...)
+```
+
+Exception.add_note(note) -- add a note to the exception.
+
+### with_traceback()
+
+```python
+with_traceback(...)
+```
+
+Exception.with_traceback(tb) -- set self.__traceback__ to tb and return self.
+
+## DatetimeCodec
+
+Codec that converts to / from a datetime input.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_input()
+
+```python
+decode_input(request_input: RequestInput) -> List[datetime]
+```
+
+Decode a request input into a high-level Python type.
+
+### decode_output()
+
+```python
+decode_output(response_output: ResponseOutput) -> List[datetime]
+```
+
+Decode a response output into a high-level Python type.
+
+### encode_input()
+
+```python
+encode_input(name: str, payload: List[Union[str, datetime]], use_bytes: bool = True, **kwargs) -> RequestInput
+```
+
+Encode the given payload into a `RequestInput`.
+
+### encode_output()
+
+```python
+encode_output(name: str, payload: List[Union[str, datetime]], use_bytes: bool = True, **kwargs) -> ResponseOutput
+```
+
+Encode the given payload into a response output.
+
+## InputCodec
+
+The InputCodec interface lets you define type conversions of your raw input
+data to / from the Open Inference Protocol.
+Note that this codec applies at the individual input (output) level.
+
+For request-wide transformations (e.g. dataframes), use the
+`RequestCodec` interface instead.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_input()
+
+```python
+decode_input(request_input: RequestInput) -> Any
+```
+
+Decode a request input into a high-level Python type.
+
+### decode_output()
+
+```python
+decode_output(response_output: ResponseOutput) -> Any
+```
+
+Decode a response output into a high-level Python type.
+
+### encode_input()
+
+```python
+encode_input(name: str, payload: Any, **kwargs) -> RequestInput
+```
+
+Encode the given payload into a `RequestInput`.
+
+### encode_output()
+
+```python
+encode_output(name: str, payload: Any, **kwargs) -> ResponseOutput
+```
+
+Encode the given payload into a response output.
+
+## NumpyCodec
+
+Decodes a request input (response output) as a NumPy array.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_input()
+
+```python
+decode_input(request_input: RequestInput) -> ndarray
+```
+
+Decode a request input into a high-level Python type.
+
+### decode_output()
+
+```python
+decode_output(response_output: ResponseOutput) -> ndarray
+```
+
+Decode a response output into a high-level Python type.
+
+### encode_input()
+
+```python
+encode_input(name: str, payload: ndarray, **kwargs) -> RequestInput
+```
+
+Encode the given payload into a `RequestInput`.
+
+### encode_output()
+
+```python
+encode_output(name: str, payload: ndarray, **kwargs) -> ResponseOutput
+```
+
+Encode the given payload into a response output.
+
+## NumpyRequestCodec
+
+Decodes the first input (output) of a request (response) as a NumPy array.
+This codec can be useful for cases where the whole payload is a single
+NumPy tensor.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_request()
+
+```python
+decode_request(request: InferenceRequest) -> Any
+```
+
+Decode an inference request into a high-level Python object.
+
+### decode_response()
+
+```python
+decode_response(response: InferenceResponse) -> Any
+```
+
+Decode an inference response into a high-level Python object.
+
+### encode_request()
+
+```python
+encode_request(payload: Any, **kwargs) -> InferenceRequest
+```
+
+Encode the given payload into an inference request.
+
+### encode_response()
+
+```python
+encode_response(model_name: str, payload: Any, model_version: Optional[str] = None, **kwargs) -> InferenceResponse
+```
+
+Encode the given payload into an inference response.
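+
+### Example: NumPy codecs
+
+As a brief sketch of the two levels: `NumpyCodec` encodes a single input head, while `NumpyRequestCodec` encodes a whole request. The tensor name `"x"` is an arbitrary example:
+
+```python
+import numpy as np
+
+from mlserver.codecs import NumpyCodec, NumpyRequestCodec
+
+payload = np.array([[1, 2], [3, 4]])
+
+# input-level: encode the array as one input of a request
+request_input = NumpyCodec.encode_input(name="x", payload=payload)
+
+# request-level: encode the array as a full InferenceRequest
+inference_request = NumpyRequestCodec.encode_request(payload)
+
+# decoding reverses the conversion
+decoded = NumpyRequestCodec.decode_request(inference_request)
+assert (decoded == payload).all()
+```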
+
+## PandasCodec
+
+Decodes a request (response) into a Pandas DataFrame, assuming each input
+(output) head corresponds to a column of the DataFrame.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_request()
+
+```python
+decode_request(request: InferenceRequest) -> DataFrame
+```
+
+Decode an inference request into a high-level Python object.
+
+### decode_response()
+
+```python
+decode_response(response: InferenceResponse) -> DataFrame
+```
+
+Decode an inference response into a high-level Python object.
+
+### encode_outputs()
+
+```python
+encode_outputs(payload: DataFrame, use_bytes: bool = True) -> List[ResponseOutput]
+```
+
+### encode_request()
+
+```python
+encode_request(payload: DataFrame, use_bytes: bool = True, **kwargs) -> InferenceRequest
+```
+
+Encode the given payload into an inference request.
+
+### encode_response()
+
+```python
+encode_response(model_name: str, payload: DataFrame, model_version: Optional[str] = None, use_bytes: bool = True, **kwargs) -> InferenceResponse
+```
+
+Encode the given payload into an inference response.
+
+## RequestCodec
+
+The `RequestCodec` interface lets you define request-level conversions
+between high-level Python types and the Open Inference Protocol.
+This can be useful where the encoding of your payload encompasses multiple
+input heads (e.g. dataframes, where each column can be thought of as a
+separate input head).
+
+For individual input-level encoding / decoding, use the `InputCodec`
+interface instead.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_request()
+
+```python
+decode_request(request: InferenceRequest) -> Any
+```
+
+Decode an inference request into a high-level Python object.
+
+### decode_response()
+
+```python
+decode_response(response: InferenceResponse) -> Any
+```
+
+Decode an inference response into a high-level Python object.
+
+### encode_request()
+
+```python
+encode_request(payload: Any, **kwargs) -> InferenceRequest
+```
+
+Encode the given payload into an inference request.
+
+### encode_response()
+
+```python
+encode_response(model_name: str, payload: Any, model_version: Optional[str] = None, **kwargs) -> InferenceResponse
+```
+
+Encode the given payload into an inference response.
+
+## StringCodec
+
+Encodes a list of Python strings as a BYTES input (output).
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_input()
+
+```python
+decode_input(request_input: RequestInput) -> List[str]
+```
+
+Decode a request input into a high-level Python type.
+
+### decode_output()
+
+```python
+decode_output(response_output: ResponseOutput) -> List[str]
+```
+
+Decode a response output into a high-level Python type.
+
+### encode_input()
+
+```python
+encode_input(name: str, payload: List[str], use_bytes: bool = True, **kwargs) -> RequestInput
+```
+
+Encode the given payload into a `RequestInput`.
+
+### encode_output()
+
+```python
+encode_output(name: str, payload: List[str], use_bytes: bool = True, **kwargs) -> ResponseOutput
+```
+
+Encode the given payload into a response output.
+
+## StringRequestCodec
+
+Decodes the first input (output) of a request (response) as a list of
+strings.
+This codec can be useful for cases where the whole payload is a single
+list of strings.
+
+### Methods
+
+### can_encode()
+
+```python
+can_encode(payload: Any) -> bool
+```
+
+Evaluate whether the codec can encode (decode) the payload.
+
+### decode_request()
+
+```python
+decode_request(request: InferenceRequest) -> Any
+```
+
+Decode an inference request into a high-level Python object.
+
+### decode_response()
+
+```python
+decode_response(response: InferenceResponse) -> Any
+```
+
+Decode an inference response into a high-level Python object.
+
+### encode_request()
+
+```python
+encode_request(payload: Any, **kwargs) -> InferenceRequest
+```
+
+Encode the given payload into an inference request.
+
+### encode_response()
+
+```python
+encode_response(model_name: str, payload: Any, model_version: Optional[str] = None, **kwargs) -> InferenceResponse
+```
+
+Encode the given payload into an inference response.
+
+## decode_args()
+
+```python
+decode_args(predict: Callable) -> Callable[[ForwardRef('MLModel'), ], Coroutine[Any, Any, InferenceResponse]]
+```
+
+_No description available._
+
+## decode_inference_request()
+
+```python
+decode_inference_request(inference_request: InferenceRequest, model_settings: Optional[ModelSettings] = None, metadata_inputs: Dict[str, MetadataTensor] = {}) -> Optional[Any]
+```
+
+_No description available._
+
+## decode_request_input()
+
+```python
+decode_request_input(request_input: RequestInput, metadata_inputs: Dict[str, MetadataTensor] = {}) -> Optional[Any]
+```
+
+_No description available._
+
+## encode_inference_response()
+
+```python
+encode_inference_response(payload: Any, model_settings: ModelSettings) -> Optional[InferenceResponse]
+```
+
+_No description available._
+
+## encode_response_output()
+
+```python
+encode_response_output(payload: Any, request_output: RequestOutput, metadata_outputs: Dict[str, MetadataTensor] = {}) -> Optional[ResponseOutput]
+```
+
+_No description available._
+
+## get_decoded()
+
+```python
+get_decoded(parametrised_obj: Union[InferenceRequest, RequestInput, RequestOutput, ResponseOutput, InferenceResponse]) -> Any
+```
+
+_No description available._
+
+## get_decoded_or_raw()
+
+```python
+get_decoded_or_raw(parametrised_obj: Union[InferenceRequest, RequestInput, RequestOutput, ResponseOutput, InferenceResponse]) -> Any
+```
+
+_No description available._
+
+## has_decoded()
+
+```python
+has_decoded(parametrised_obj: Union[InferenceRequest, RequestInput, RequestOutput, ResponseOutput, InferenceResponse]) -> bool
+```
+
+_No description available._
+
+## register_input_codec()
+
+```python
+register_input_codec(CodecKlass: Union[type[InputCodec], InputCodec])
+```
+
+_No description available._
+
+## register_request_codec()
+
+```python
+register_request_codec(CodecKlass: Union[type[RequestCodec], RequestCodec])
+```
+
+_No description available._
diff --git a/docs-gb/api/MLModel.md b/docs-gb/api/MLModel.md
new file mode 100644
index 000000000..969b6efaa
--- /dev/null
+++ b/docs-gb/api/MLModel.md
@@ -0,0 +1,127 @@
+# MLModel
+
+Abstract inference runtime which exposes the main interface to interact
+with ML models.
+
+## Methods
+
+### decode()
+
+```python
+decode(request_input: RequestInput, default_codec: Union[type[ForwardRef('InputCodec')], ForwardRef('InputCodec'), None] = None) -> Any
+```
+
+Helper to decode a **request input** into its corresponding high-level
+Python object.
+This method will find the most appropriate [input codec](./Codecs.md)
+based on the model's metadata and the input's content type.
+Otherwise, it will fall back to the codec specified in the
+`default_codec` kwarg.
+
+### decode_request()
+
+```python
+decode_request(inference_request: InferenceRequest, default_codec: Union[type[ForwardRef('RequestCodec')], ForwardRef('RequestCodec'), None] = None) -> Any
+```
+
+Helper to decode an **inference request** into its corresponding
+high-level Python object.
+This method will find the most appropriate [request codec](./Codecs.md)
+based on the model's metadata and the request's content type.
+Otherwise, it will fall back to the codec specified in the
+`default_codec` kwarg.
+
+### encode()
+
+```python
+encode(payload: Any, request_output: RequestOutput, default_codec: Union[type[ForwardRef('InputCodec')], ForwardRef('InputCodec'), None] = None) -> ResponseOutput
+```
+
+Helper to encode a high-level Python object into its corresponding
+**response output**.
+This method will find the most appropriate [input codec](./Codecs.md)
+based on the model's metadata, the request output's content type or the
+payload's type.
+Otherwise, it will fall back to the codec specified in the
+`default_codec` kwarg.
+
+### encode_response()
+
+```python
+encode_response(payload: Any, default_codec: Union[type[ForwardRef('RequestCodec')], ForwardRef('RequestCodec'), None] = None) -> InferenceResponse
+```
+
+Helper to encode a high-level Python object into its corresponding
+**inference response**.
+This method will find the most appropriate [request codec](./Codecs.md)
+based on the payload's type.
+Otherwise, it will fall back to the codec specified in the
+`default_codec` kwarg.
+
+### load()
+
+```python
+load() -> bool
+```
+
+Method responsible for loading the model from a model artefact.
+This method will be called on each of the parallel workers (when
+parallel inference is enabled).
+Its return value will represent the model's readiness status.
+A return value of `True` will mean the model is ready.
+
+**This method can be overridden to implement your custom load
+logic.**
+
+### metadata()
+
+```python
+metadata() -> MetadataModelResponse
+```
+
+_No description available._
+
+### predict()
+
+```python
+predict(payload: InferenceRequest) -> InferenceResponse
+```
+
+Method responsible for running inference on the model.
+
+**This method can be overridden to implement your custom inference
+logic.**
+
+### predict_stream()
+
+```python
+predict_stream(payloads: AsyncIterator[InferenceRequest]) -> AsyncIterator[InferenceResponse]
+```
+
+Method responsible for running generation on the model, streaming a set
+of responses back to the client.
+
+**This method can be overridden to implement your custom inference
+logic.**
+
+### unload()
+
+```python
+unload() -> bool
+```
+
+Method responsible for unloading the model, freeing any resources (e.g.
+CPU memory, GPU memory, etc.).
+This method will be called on each of the parallel workers (when
+parallel inference is enabled).
+A return value of `True` will mean the model is now unloaded.
+
+**This method can be overridden to implement your custom unload
+logic.**
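+
+## Example
+
+As a minimal sketch of how these pieces fit together, the runtime below overrides `load()` and `predict()`, using the `decode_args` helper from [Codecs](./Codecs.md) to work with NumPy arrays directly. The class name and model logic are illustrative assumptions:
+
+```python
+import numpy as np
+
+from mlserver import MLModel
+from mlserver.codecs import decode_args
+
+
+class TimesTwoModel(MLModel):
+    async def load(self) -> bool:
+        # load your model artefact here; a trivial stand-in is used instead
+        self._multiplier = 2
+        return True
+
+    @decode_args
+    async def predict(self, payload: np.ndarray) -> np.ndarray:
+        # decode_args decodes the request into NumPy and encodes the result back
+        return payload * self._multiplier
+```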
diff --git a/docs-gb/api/Metrics.md b/docs-gb/api/Metrics.md
new file mode 100644
index 000000000..4f86b5a2e
--- /dev/null
+++ b/docs-gb/api/Metrics.md
@@ -0,0 +1,53 @@
+# Metrics
+
+## MetricsServer
+
+### Methods
+
+### on_worker_stop()
+
+```python
+on_worker_stop(worker: Worker) -> None
+```
+
+### start()
+
+```python
+start()
+```
+
+### stop()
+
+```python
+stop(sig: Optional[int] = None)
+```
+
+## configure_metrics()
+
+```python
+configure_metrics(settings: Settings)
+```
+
+_No description available._
+
+## log()
+
+```python
+log(**metrics)
+```
+
+Logs a new set of metric values.
+Each kwarg of this method will be treated as a separate metric / value
+pair.
+If any of the metrics does not exist, a new one will be created with a
+default description.
+
+## register()
+
+```python
+register(name: str, description: str) -> Histogram
+```
+
+Registers a new metric with its description.
+If the metric already exists, it will just return the existing one.
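+
+## Example
+
+As a brief sketch, custom metrics can be logged from inside a runtime's `predict()` method. The metric name and the empty response below are illustrative assumptions:
+
+```python
+import mlserver
+from mlserver import MLModel
+from mlserver.types import InferenceRequest, InferenceResponse
+
+
+class MonitoredModel(MLModel):
+    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
+        # each kwarg becomes a separate metric / value pair
+        mlserver.log(input_count=len(payload.inputs))
+        return InferenceResponse(model_name=self.name, outputs=[])
+```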
diff --git a/docs-gb/api/ModelParameters.md b/docs-gb/api/ModelParameters.md
new file mode 100644
index 000000000..ede0c1fee
--- /dev/null
+++ b/docs-gb/api/ModelParameters.md
@@ -0,0 +1,24 @@
+# ModelParameters
+
+### Config
+
+| Attribute | Type | Default |
+|-----------|------|---------|
+| `extra` | `str` | `"allow"` |
+| `env_prefix` | `str` | `"MLSERVER_MODEL_"` |
+| `env_file` | `str` | `".env"` |
+| `protected_namespaces` | `tuple` | `('model_', 'settings_')` |
+
+### Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `autogenerate_inference_pool_gid` | `bool` | `False` | Flag to autogenerate the inference pool group id for this model. |
+| `content_type` | `Optional[str]` | `None` | Default content type to use for requests and responses. |
+| `environment_path` | `Optional[str]` | `None` | Path to a directory that contains the python environment to be used to load this model. |
+| `environment_tarball` | `Optional[str]` | `None` | Path to the environment tarball which should be used to load this model. |
+| `extra` | `Optional[dict]` | `-` | Arbitrary settings, dependent on the inference runtime implementation. |
+| `format` | `Optional[str]` | `None` | Format of the model (only available on certain runtimes). |
+| `inference_pool_gid` | `Optional[str]` | `None` | Inference pool group id to be used to serve this model. |
+| `uri` | `Optional[str]` | `None` | URI where the model artifacts can be found. This path must be either absolute or relative to where MLServer is running. |
+| `version` | `Optional[str]` | `None` | Version of the model. |
diff --git a/docs-gb/api/ModelSettings.md b/docs-gb/api/ModelSettings.md
new file mode 100644
index 000000000..f9fa46b02
--- /dev/null
+++ b/docs-gb/api/ModelSettings.md
@@ -0,0 +1,27 @@
+# ModelSettings
+
+### Config
+
+| Attribute | Type | Default |
+|-----------|------|---------|
+| `extra` | `str` | `"ignore"` |
+| `env_prefix` | `str` | `"MLSERVER_MODEL_"` |
+| `env_file` | `str` | `".env"` |
+| `protected_namespaces` | `tuple` | `('model_', 'settings_')` |
+
+### Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `cache_enabled` | `bool` | `False` | Enable caching for a specific model. This parameter can be used to disable the cache for a specific model when server-level caching is enabled. If server-level caching is disabled, this parameter has no effect. |
+| `implementation_` | `str` | `-` | *Python path* to the inference runtime to use to serve this model (e.g. `mlserver_sklearn.SKLearnModel`). |
+| `inputs` | `List[MetadataTensor]` | `-` | Metadata about the inputs accepted by the model. |
+| `max_batch_size` | `int` | `0` | When adaptive batching is enabled, maximum number of requests to group together in a single batch. |
+| `max_batch_time` | `float` | `0.0` | When adaptive batching is enabled, maximum amount of time (in seconds) to wait for enough requests to build a full batch. |
+| `name` | `str` | `''` | Name of the model. |
+| `outputs` | `List[MetadataTensor]` | `-` | Metadata about the outputs returned by the model. |
+| `parallel_workers` | `Optional[int]` | `None` | Deprecated: use the `parallel_workers` field in the server-wide settings instead. |
+| `parameters` | `Optional[ModelParameters]` | `None` | Extra parameters for each instance of this model. |
+| `platform` | `str` | `''` | Framework used to train and serialise the model (e.g. sklearn). |
+| `versions` | `List[str]` | `-` | Versions of dependencies used to train the model (e.g. sklearn/0.20.1). |
+| `warm_workers` | `bool` | `False` | Deprecated: inference workers will now always be warmed up at start time. |
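+
+### Example
+
+For reference, a minimal `model-settings.json` typically combines these fields with the nested [ModelParameters](./ModelParameters.md). The names and paths below are placeholders:
+
+```json
+{
+  "name": "my-model",
+  "implementation": "mlserver_sklearn.SKLearnModel",
+  "parameters": {
+    "uri": "./model.joblib",
+    "version": "v0.1.0"
+  }
+}
+```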
diff --git a/docs-gb/api/PythonAPI.md b/docs-gb/api/PythonAPI.md
new file mode 100644
index 000000000..45984fdc4
--- /dev/null
+++ b/docs-gb/api/PythonAPI.md
@@ -0,0 +1,30 @@
+# Python API
+
+MLServer exposes a Python framework to build custom inference runtimes, define request/response types, plug in codecs for payload conversion, and emit metrics. This page provides a high-level overview and links to the API docs.
+
+- [MLModel](./MLModel.md)
+  - Base class to implement custom inference runtimes.
+  - Core lifecycle: `load()`, `predict()`, `unload()`.
+  - Helpers for encoding/decoding requests and responses.
+  - Access to model metadata and settings.
+  - Extend this class to implement your own model logic.
+- [Types](./Types.md)
+  - Data structures and enums for the V2 inference protocol.
+  - Includes Pydantic models like `InferenceRequest`, `InferenceResponse`, `RequestInput`, `ResponseOutput`.
+  - See model fields (type and default) and JSON Schemas in the docs.
+- [Codecs](./Codecs.md)
+  - Encode/decode payloads between Open Inference Protocol types and Python types.
+  - Base classes: `InputCodec` (inputs/outputs) and `RequestCodec` (requests/responses).
+  - Built-ins include codecs such as `NumpyCodec`, `Base64Codec`, `StringCodec`, etc.
+- [Metrics](./Metrics.md)
+  - Emit and configure metrics within MLServer.
+  - Use `log()` to record custom metrics; see server lifecycle hooks and utilities.
+
+{% hint style="info" %}
+When creating a custom runtime, start by subclassing `MLModel`, use the structures from [Types](./Types.md) for requests/responses, pick or implement the appropriate [Codecs](./Codecs.md), and optionally emit [Metrics](./Metrics.md) from your model code.
+{% endhint %}
diff --git a/docs-gb/api/Settings.md b/docs-gb/api/Settings.md
new file mode 100644
index 000000000..8a4d5d6d0
--- /dev/null
+++ b/docs-gb/api/Settings.md
@@ -0,0 +1,46 @@
+# Settings
+
+### Config
+
+| Attribute | Type | Default |
+|-----------|------|---------|
+| `extra` | `str` | `"ignore"` |
+| `env_prefix` | `str` | `"MLSERVER_"` |
+| `env_file` | `str` | `".env"` |
+| `protected_namespaces` | `tuple` | `()` |
+
+### Fields
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `cache_enabled` | `bool` | `False` | Enable caching for the model predictions. |
+| `cache_size` | `int` | `100` | Cache size to be used if caching is enabled. |
+| `cors_settings` | `Optional[CORSSettings]` | `None` | - |
+| `debug` | `bool` | `True` | - |
+| `environments_dir` | `str` | `'-'` | - |
+| `extensions` | `List[str]` | `[]` | - |
+| `grpc_max_message_length` | `Optional[int]` | `None` | - |
+| `grpc_port` | `int` | `8081` | - |
+| `gzip_enabled` | `bool` | `True` | Enable GZipMiddleware. |
+| `host` | `str` | `'0.0.0.0'` | - |
+| `http_port` | `int` | `8080` | - |
+| `kafka_enabled` | `bool` | `False` | Enable Kafka integration for the server. |
+| `kafka_servers` | `str` | `'localhost:9092'` | Comma-separated list of Kafka servers. |
+| `kafka_topic_input` | `str` | `'mlserver-input'` | Kafka topic for input messages. |
+| `kafka_topic_output` | `str` | `'mlserver-output'` | Kafka topic for output messages. |
+| `load_models_at_startup` | `bool` | `True` | - |
+| `logging_settings` | `Union[str, Dict[Any, Any], None]` | `None` | Path to logging config file or dictionary configuration. |
+| `metrics_dir` | `str` | `'-'` | Directory used to share metrics across parallel workers. Equivalent to the `PROMETHEUS_MULTIPROC_DIR` env var in `prometheus-client`. Note that this won't be used if the `parallel_workers` flag is disabled. By default, the `.metrics` folder of the current working directory will be used. |
+| `metrics_endpoint` | `Optional[str]` | `'/metrics'` | Endpoint used to expose Prometheus metrics. Alternatively, can be set to `None` to disable it. |
+| `metrics_port` | `int` | `8082` | Port used to expose the metrics endpoint. |
+| `metrics_rest_server_prefix` | `str` | `'rest_server'` | Metrics rest server string prefix to be exported. |
+| `model_repository_implementation` | `Optional[ImportString]` | `None` | - |
+| `model_repository_implementation_args` | `dict` | `{}` | - |
+| `model_repository_root` | `str` | `'.'` | - |
+| `parallel_workers` | `int` | `1` | - |
+| `parallel_workers_timeout` | `int` | `5` | - |
+| `root_path` | `str` | `''` | - |
+| `server_name` | `str` | `'mlserver'` | - |
+| `server_version` | `str` | `'1.7.0.dev0'` | - |
+| `tracing_server` | `Optional[str]` | `None` | Server name used to export OpenTelemetry tracing to a collector service. |
+| `use_structured_logging` | `bool` | `False` | Use JSON-formatted structured logging instead of the default format. |
diff --git a/docs-gb/api/Types.md b/docs-gb/api/Types.md
new file mode 100644
index 000000000..83fd55250
--- /dev/null
+++ b/docs-gb/api/Types.md
@@ -0,0 +1,1443 @@
+# Types
+
+## Datatype
+
+An enumeration.
+
+## InferenceErrorResponse
+
+| Field | Type | Default | Description |
+|-------|------|---------|-------------|
+| `error` | `Optional[str]` | `None` | - |
JSON Schema + + +```json + +{ + "properties": { + "error": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Error" + } + }, + "title": "InferenceErrorResponse", + "type": "object" +} + +``` + + +
+ +## InferenceRequest + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `id` | `Optional[str]` | `None` | - | +| `inputs` | `List[RequestInput]` | `-` | - | +| `outputs` | `Optional[List[RequestOutput]]` | `None` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Datatype": { + "enum": [ + "BOOL", + "UINT8", + "UINT16", + "UINT32", + "UINT64", + "INT8", + "INT16", + "INT32", + "INT64", + "FP16", + "FP32", + "FP64", + "BYTES" + ], + "title": "Datatype", + "type": "string" + }, + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + }, + "RequestInput": { + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "shape": { + "items": { + "type": "integer" + }, + "title": "Shape", + "type": "array" + }, + "datatype": { + "$ref": "#/$defs/Datatype" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + }, + "data": { + "$ref": "#/$defs/TensorData" + } + }, + "required": [ + "name", + "shape", + "datatype", + "data" + ], + "title": "RequestInput", + "type": "object" + }, + "RequestOutput": { + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + } + }, + "required": [ + "name" + ], + "title": "RequestOutput", + "type": "object" + }, + "TensorData": { + "anyOf": [ + { + "items": {}, + "type": "array" + }, + {} + ], + "title": "TensorData" + } + }, + "properties": { + "id": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Id" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + }, + "inputs": { + "items": { + "$ref": "#/$defs/RequestInput" + }, + "title": "Inputs", + "type": "array" + }, + "outputs": { + "anyOf": [ + { + "items": { + "$ref": "#/$defs/RequestOutput" + }, + "type": "array" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Outputs" + } + }, + "required": [ + "inputs" + ], + "title": "InferenceRequest", + "type": "object" +} + +``` + + +
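As a short usage sketch, an `InferenceRequest` matching this schema can be built directly from the Pydantic models (the tensor name and values are arbitrary, and the serialisation call assumes Pydantic v2):

```python
from mlserver.types import InferenceRequest, RequestInput

inference_request = InferenceRequest(
    inputs=[
        RequestInput(
            name="input-0",
            shape=[1, 3],
            datatype="FP32",
            data=[1.0, 2.0, 3.0],
        )
    ]
)

print(inference_request.model_dump_json())
```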
+ +## InferenceResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `id` | `Optional[str]` | `None` | - | +| `model_name` | `str` | `-` | - | +| `model_version` | `Optional[str]` | `None` | - | +| `outputs` | `List[ResponseOutput]` | `-` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Datatype": { + "enum": [ + "BOOL", + "UINT8", + "UINT16", + "UINT32", + "UINT64", + "INT8", + "INT16", + "INT32", + "INT64", + "FP16", + "FP32", + "FP64", + "BYTES" + ], + "title": "Datatype", + "type": "string" + }, + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + }, + "ResponseOutput": { + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "shape": { + "items": { + "type": "integer" + }, + "title": "Shape", + "type": "array" + }, + "datatype": { + "$ref": "#/$defs/Datatype" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + }, + "data": { + "$ref": "#/$defs/TensorData" + } + }, + "required": [ + "name", + "shape", + "datatype", + "data" + ], + "title": "ResponseOutput", + "type": "object" + }, + "TensorData": { + "anyOf": [ + { + "items": {}, + "type": "array" + }, + {} + ], + "title": "TensorData" + } + }, + "properties": { + "model_name": { + "title": "Model Name", + "type": "string" + }, + "model_version": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Model Version" + }, + "id": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Id" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + }, + "outputs": { + "items": { + "$ref": "#/$defs/ResponseOutput" + }, + "title": "Outputs", + "type": "array" + } + }, + "required": [ + "model_name", + "outputs" + ], + "title": "InferenceResponse", + "type": "object" +} + +``` + + +
+ +## MetadataModelErrorResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `error` | `str` | `-` | - | +
JSON Schema + + +```json + +{ + "properties": { + "error": { + "title": "Error", + "type": "string" + } + }, + "required": [ + "error" + ], + "title": "MetadataModelErrorResponse", + "type": "object" +} + +``` + + +
+ +## MetadataModelResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `inputs` | `Optional[List[MetadataTensor]]` | `None` | - | +| `name` | `str` | `-` | - | +| `outputs` | `Optional[List[MetadataTensor]]` | `None` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +| `platform` | `str` | `-` | - | +| `versions` | `Optional[List[str]]` | `None` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Datatype": { + "enum": [ + "BOOL", + "UINT8", + "UINT16", + "UINT32", + "UINT64", + "INT8", + "INT16", + "INT32", + "INT64", + "FP16", + "FP32", + "FP64", + "BYTES" + ], + "title": "Datatype", + "type": "string" + }, + "MetadataTensor": { + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "datatype": { + "$ref": "#/$defs/Datatype" + }, + "shape": { + "items": { + "type": "integer" + }, + "title": "Shape", + "type": "array" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + } + }, + "required": [ + "name", + "datatype", + "shape" + ], + "title": "MetadataTensor", + "type": "object" + }, + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + } + }, + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "versions": { + "anyOf": [ + { + "items": { + "type": "string" + }, + "type": "array" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Versions" + }, + "platform": { + "title": "Platform", + "type": "string" + }, + "inputs": { + "anyOf": [ + { + "items": { + "$ref": "#/$defs/MetadataTensor" + }, + "type": "array" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Inputs" + }, + "outputs": { + "anyOf": [ + { + "items": { + "$ref": "#/$defs/MetadataTensor" + }, + "type": "array" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Outputs" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + } + }, + "required": [ + "name", + "platform" + ], + "title": "MetadataModelResponse", + "type": "object" +} + +``` + + +
+ +## MetadataServerErrorResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `error` | `str` | `-` | - | +
JSON Schema + + +```json + +{ + "properties": { + "error": { + "title": "Error", + "type": "string" + } + }, + "required": [ + "error" + ], + "title": "MetadataServerErrorResponse", + "type": "object" +} + +``` + + +
+ +## MetadataServerResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `extensions` | `List[str]` | `-` | - | +| `name` | `str` | `-` | - | +| `version` | `str` | `-` | - | +
JSON Schema + + +```json + +{ + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "version": { + "title": "Version", + "type": "string" + }, + "extensions": { + "items": { + "type": "string" + }, + "title": "Extensions", + "type": "array" + } + }, + "required": [ + "name", + "version", + "extensions" + ], + "title": "MetadataServerResponse", + "type": "object" +} + +``` + + +
+ +## MetadataTensor + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `datatype` | `Datatype` | `-` | - | +| `name` | `str` | `-` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +| `shape` | `List[int]` | `-` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Datatype": { + "enum": [ + "BOOL", + "UINT8", + "UINT16", + "UINT32", + "UINT64", + "INT8", + "INT16", + "INT32", + "INT64", + "FP16", + "FP32", + "FP64", + "BYTES" + ], + "title": "Datatype", + "type": "string" + }, + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + } + }, + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "datatype": { + "$ref": "#/$defs/Datatype" + }, + "shape": { + "items": { + "type": "integer" + }, + "title": "Shape", + "type": "array" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + } + }, + "required": [ + "name", + "datatype", + "shape" + ], + "title": "MetadataTensor", + "type": "object" +} + +``` + + +
+ +## Parameters + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `content_type` | `Optional[str]` | `None` | - | +| `headers` | `Optional[Dict[str, Any]]` | `None` | - | +
JSON Schema + + +```json + +{ + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" +} + +``` + + +
+ +## RepositoryIndexRequest + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `ready` | `Optional[bool]` | `None` | - | +
JSON Schema + + +```json + +{ + "properties": { + "ready": { + "anyOf": [ + { + "type": "boolean" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Ready" + } + }, + "title": "RepositoryIndexRequest", + "type": "object" +} + +``` + + +
+ +## RepositoryIndexResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `root` | `List[RepositoryIndexResponseItem]` | `-` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "RepositoryIndexResponseItem": { + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "version": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Version" + }, + "state": { + "$ref": "#/$defs/State" + }, + "reason": { + "title": "Reason", + "type": "string" + } + }, + "required": [ + "name", + "state", + "reason" + ], + "title": "RepositoryIndexResponseItem", + "type": "object" + }, + "State": { + "enum": [ + "UNKNOWN", + "READY", + "UNAVAILABLE", + "LOADING", + "UNLOADING" + ], + "title": "State", + "type": "string" + } + }, + "items": { + "$ref": "#/$defs/RepositoryIndexResponseItem" + }, + "title": "RepositoryIndexResponse", + "type": "array" +} + +``` + + +
+ +## RepositoryIndexResponseItem + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `name` | `str` | `-` | - | +| `reason` | `str` | `-` | - | +| `state` | `State` | `-` | - | +| `version` | `Optional[str]` | `None` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "State": { + "enum": [ + "UNKNOWN", + "READY", + "UNAVAILABLE", + "LOADING", + "UNLOADING" + ], + "title": "State", + "type": "string" + } + }, + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "version": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Version" + }, + "state": { + "$ref": "#/$defs/State" + }, + "reason": { + "title": "Reason", + "type": "string" + } + }, + "required": [ + "name", + "state", + "reason" + ], + "title": "RepositoryIndexResponseItem", + "type": "object" +} + +``` + + +
+ +## RepositoryLoadErrorResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `error` | `Optional[str]` | `None` | - | +
JSON Schema + + +```json + +{ + "properties": { + "error": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Error" + } + }, + "title": "RepositoryLoadErrorResponse", + "type": "object" +} + +``` + + +
+ +## RepositoryUnloadErrorResponse + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `error` | `Optional[str]` | `None` | - | +
JSON Schema + + +```json + +{ + "properties": { + "error": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Error" + } + }, + "title": "RepositoryUnloadErrorResponse", + "type": "object" +} + +``` + + +
+ +## RequestInput + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `data` | `TensorData` | `-` | - | +| `datatype` | `Datatype` | `-` | - | +| `name` | `str` | `-` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +| `shape` | `List[int]` | `-` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Datatype": { + "enum": [ + "BOOL", + "UINT8", + "UINT16", + "UINT32", + "UINT64", + "INT8", + "INT16", + "INT32", + "INT64", + "FP16", + "FP32", + "FP64", + "BYTES" + ], + "title": "Datatype", + "type": "string" + }, + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + }, + "TensorData": { + "anyOf": [ + { + "items": {}, + "type": "array" + }, + {} + ], + "title": "TensorData" + } + }, + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "shape": { + "items": { + "type": "integer" + }, + "title": "Shape", + "type": "array" + }, + "datatype": { + "$ref": "#/$defs/Datatype" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + }, + "data": { + "$ref": "#/$defs/TensorData" + } + }, + "required": [ + "name", + "shape", + "datatype", + "data" + ], + "title": "RequestInput", + "type": "object" +} + +``` + + +
+ +## RequestOutput + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `name` | `str` | `-` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + } + }, + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + } + }, + "required": [ + "name" + ], + "title": "RequestOutput", + "type": "object" +} + +``` + + +
+ +## ResponseOutput + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `data` | `TensorData` | `-` | - | +| `datatype` | `Datatype` | `-` | - | +| `name` | `str` | `-` | - | +| `parameters` | `Optional[Parameters]` | `None` | - | +| `shape` | `List[int]` | `-` | - | +
JSON Schema + + +```json + +{ + "$defs": { + "Datatype": { + "enum": [ + "BOOL", + "UINT8", + "UINT16", + "UINT32", + "UINT64", + "INT8", + "INT16", + "INT32", + "INT64", + "FP16", + "FP32", + "FP64", + "BYTES" + ], + "title": "Datatype", + "type": "string" + }, + "Parameters": { + "additionalProperties": true, + "properties": { + "content_type": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Content Type" + }, + "headers": { + "anyOf": [ + { + "type": "object" + }, + { + "type": "null" + } + ], + "default": null, + "title": "Headers" + } + }, + "title": "Parameters", + "type": "object" + }, + "TensorData": { + "anyOf": [ + { + "items": {}, + "type": "array" + }, + {} + ], + "title": "TensorData" + } + }, + "properties": { + "name": { + "title": "Name", + "type": "string" + }, + "shape": { + "items": { + "type": "integer" + }, + "title": "Shape", + "type": "array" + }, + "datatype": { + "$ref": "#/$defs/Datatype" + }, + "parameters": { + "anyOf": [ + { + "$ref": "#/$defs/Parameters" + }, + { + "type": "null" + } + ], + "default": null + }, + "data": { + "$ref": "#/$defs/TensorData" + } + }, + "required": [ + "name", + "shape", + "datatype", + "data" + ], + "title": "ResponseOutput", + "type": "object" +} + +``` + + +
+ +## State + +An enumeration. + +## TensorData + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `root` | `Union[List[Any], Any]` | `-` | - | +
JSON Schema + + +```json + +{ + "anyOf": [ + { + "items": {}, + "type": "array" + }, + {} + ], + "title": "TensorData" +} + +``` + + +
diff --git a/docs-gb/api/api-reference.md b/docs-gb/api/api-reference.md
new file mode 100644
index 000000000..a24874cb5
--- /dev/null
+++ b/docs-gb/api/api-reference.md
@@ -0,0 +1,45 @@
+# API Reference Overview
+
+This page links to the key reference docs for configuring and using MLServer.
+
+## MLServer Settings
+
+Server-wide configuration (e.g., HTTP/gRPC ports) loaded from a `settings.json` in the working directory. Settings can also be provided via environment variables prefixed with `MLSERVER_` (e.g., `MLSERVER_GRPC_PORT`).
+
+- Scope: server-wide (independent from model-specific settings)
+- Sources: `settings.json` or env vars `MLSERVER_*`
+
+[Read the full reference →](./Settings.md)
+
+## Model Settings
+
+Each model has its own configuration (metadata, parallelism, etc.). Typically provided via a `model-settings.json` next to the model artifacts. Alternatively, use env vars prefixed with `MLSERVER_MODEL_` (e.g., `MLSERVER_MODEL_IMPLEMENTATION`). If no `model-settings.json` is found, MLServer will try to load a default model from these env vars. Note: these env vars are shared across models unless overridden by a `model-settings.json`.
+
+- Scope: per-model
+- Sources: `model-settings.json` or env vars `MLSERVER_MODEL_*`
+
+[Read the full reference →](./ModelSettings.md)
+
+## MLServer CLI
+
+The `mlserver` CLI helps with common model lifecycle tasks (build images, init projects, start serving, etc.). For a quick overview:
+
+```bash
+mlserver --help
+```
+
+- Commands include: `build`, `dockerfile`, `infer` (deprecated), `init`, `start`
+- Each command lists its options, arguments, and examples
+
+[Read the full CLI reference →](./CLI.md)
+
+## Python API
+
+Build custom runtimes and integrate with MLServer using Python:
+
+- MLModel: base class for custom inference runtimes
+- Types: request/response schemas and enums (Pydantic)
+- Codecs: payload conversions between protocol types and Python types
+- Metrics: emit and configure metrics
+
+[Browse the Python API →](./PythonAPI.md)
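+
+## Putting it together
+
+As a quick, illustrative sketch (ports and paths are placeholders, not required values), server-wide settings can come from a `settings.json` or from `MLSERVER_*` env vars, and a model folder is then served with the CLI:
+
+```json
+{
+  "http_port": 8080,
+  "grpc_port": 8081,
+  "parallel_workers": 1
+}
+```
+
+```bash
+# equivalent override via environment variables
+MLSERVER_HTTP_PORT=9090 MLSERVER_GRPC_PORT=9091 mlserver start ./my-model
+```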