Commit a53a725

Links 2025.1 (openvinotoolkit#3193)
* main->releases/2025/1
* 404s
* /releases/2025/0/->/releases/2025/1/
* Karol's fixes CVS-164825 CVS-165419
1 parent 897a0cd commit a53a725
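
The link changes listed in the commit message follow a few mechanical patterns, which can be sketched as a small rewrite script. This is only an illustration of those patterns as they appear in the diff below; the commit itself does not say how the edits were produced, and `RULES` and `pin_links` are hypothetical names.

```python
import re

# Hypothetical rewrite rules modelled on the patterns visible in this commit.
# Each entry is (compiled pattern, replacement template).
RULES = [
    # GitHub blob/tree links that pointed at the main branch
    (re.compile(r"(github\.com/openvinotoolkit/model_server/(?:blob|tree)/)main/"),
     r"\g<1>releases/2025/1/"),
    # raw.githubusercontent.com links that pointed at the main branch
    (re.compile(r"(raw\.githubusercontent\.com/openvinotoolkit/model_server/)main/"),
     r"\g<1>releases/2025/1/"),
    # links that still pointed at the previous release branch
    (re.compile(r"/releases/2025/0/"), "/releases/2025/1/"),
    # documentation links that pointed at the nightly docs
    (re.compile(r"(docs\.openvino\.ai/)nightly/"), r"\g<1>2025/"),
]


def pin_links(text: str) -> str:
    """Apply every rewrite rule to the text and return the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text
```

The `\g<1>` form is used instead of `\1` so that a group reference followed by digits (as in `2025/`) is not misread as a different group number. Running `pin_links` twice gives the same result as running it once, since none of the rewritten URLs match the patterns again.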

69 files changed, +157 -157 lines changed

README.md

Lines changed: 5 additions & 5 deletions
@@ -51,13 +51,13 @@ A demonstration on how to use OpenVINO Model Server can be found in our [quick-s

 Check also other instructions:

-[Preparing model repository](https://docs.openvino.ai/nightly/model-server/ovms_docs_models_repository.html)
+[Preparing model repository](https://docs.openvino.ai/2025/model-server/ovms_docs_models_repository.html)

-[Deployment](https://docs.openvino.ai/nightly/model-server/ovms_docs_deploying_server.html)
+[Deployment](https://docs.openvino.ai/2025/model-server/ovms_docs_deploying_server.html)

-[Writing client code](https://docs.openvino.ai/nightly/model-server/ovms_docs_server_app.html)
+[Writing client code](https://docs.openvino.ai/2025/model-server/ovms_docs_server_app.html)

-[Demos](https://docs.openvino.ai/nightly/model-server/ovms_docs_demos.html)
+[Demos](https://docs.openvino.ai/2025/model-server/ovms_docs_demos.html)

@@ -73,7 +73,7 @@ Check also other instructions:

 * [Inference Scaling with OpenVINO™ Model Server in Kubernetes and OpenShift Clusters](https://www.intel.com/content/www/us/en/developer/articles/technical/deploy-openvino-in-openshift-and-kubernetes.html)

-* [Benchmarking results](https://docs.openvino.ai/nightly/about-openvino/performance-benchmarks.html)
+* [Benchmarking results](https://docs.openvino.ai/2025/about-openvino/performance-benchmarks.html)

 ## Contact

client/go/kserve-api/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ RUN go install google.golang.org/protobuf/cmd/[email protected]
 RUN go install google.golang.org/grpc/cmd/[email protected]

 # Compile API
-RUN wget https://raw.githubusercontent.com/openvinotoolkit/model_server/main/src/kfserving_api/grpc_predict_v2.proto
+RUN wget https://raw.githubusercontent.com/openvinotoolkit/model_server/releases/2025/1/src/kfserving_api/grpc_predict_v2.proto
 RUN echo 'option go_package = "./grpc-client";' >> grpc_predict_v2.proto
 RUN protoc --go_out="./" --go-grpc_out="./" ./grpc_predict_v2.proto

client/python/ovmsclient/lib/README.md

Lines changed: 2 additions & 2 deletions
@@ -6,7 +6,7 @@ OVMS client library contains only the necessary dependencies, so the whole packa

 As OpenVINO Model Server API is compatible with TensorFlow Serving, it's possible to use `ovmsclient` with TensorFlow Serving instances on: Predict, GetModelMetadata and GetModelStatus endpoints.

-See [API documentation](https://github.com/openvinotoolkit/model_server/blob/main/client/python/ovmsclient/lib/docs/README.md) for details on what the library provides.
+See [API documentation](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/client/python/ovmsclient/lib/docs/README.md) for details on what the library provides.

 ```bash
 git clone https://github.com/openvinotoolkit/model_server.git
@@ -136,4 +136,4 @@ results = client.predict(inputs=inputs, model_name="model")
 #
 ```

-For more details on `ovmsclient` see [API reference](https://github.com/openvinotoolkit/model_server/blob/main/client/python/ovmsclient/lib/docs/README.md)
+For more details on `ovmsclient` see [API reference](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/client/python/ovmsclient/lib/docs/README.md)

client/python/ovmsclient/lib/docs/pypi_overview.md

Lines changed: 2 additions & 2 deletions
@@ -9,7 +9,7 @@ The `ovmsclient` package works both with OpenVINO™ Model Server and Tensor
 The `ovmsclient` can replace `tensorflow-serving-api` package with reduced footprint and simplified interface.

-See [API reference](https://github.com/openvinotoolkit/model_server/blob/main/client/python/ovmsclient/lib/docs/README.md) for usage details.
+See [API reference](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/client/python/ovmsclient/lib/docs/README.md) for usage details.

 ## Usage example
@@ -38,4 +38,4 @@ results = client.predict(inputs=inputs, model_name="model")

 ```

-Learn more on `ovmsclient` [documentation site](https://github.com/openvinotoolkit/model_server/tree/main/client/python/ovmsclient/lib).
+Learn more on `ovmsclient` [documentation site](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/client/python/ovmsclient/lib).

demos/README.md

Lines changed: 7 additions & 7 deletions
@@ -48,15 +48,15 @@ OpenVINO Model Server demos have been created to showcase the usage of the model
 - [VLM Text Generation with continuous batching](continuous_batching/vlm/README.md)
 - [OpenAI API text embeddings ](embeddings/README.md)
 - [Reranking with Cohere API](rerank/README.md)
-- [RAG with OpenAI API endpoint and langchain](https://github.com/openvinotoolkit/model_server/blob/main/demos/continuous_batching/rag/rag_demo.ipynb)
+- [RAG with OpenAI API endpoint and langchain](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/continuous_batching/rag/rag_demo.ipynb)

 Check out the list below to see complete step-by-step examples of using OpenVINO Model Server with real world use cases:

 ## With Traditional Models
 | Demo | Description |
 |---|---|
 |[Image Classification](image_classification/python/README.md)|Run prediction on a JPEG image using image classification model via gRPC API.|
-|[Using ONNX Model](using_onnx_model/python/README.md)|Run prediction on a JPEG image using image classification ONNX model via gRPC API in two preprocessing variants. This demo uses [pipeline](../docs/dag_scheduler.md) with [image_transformation custom node](https://github.com/openvinotoolkit/model_server/tree/main/src/custom_nodes/image_transformation). |
+|[Using ONNX Model](using_onnx_model/python/README.md)|Run prediction on a JPEG image using image classification ONNX model via gRPC API in two preprocessing variants. This demo uses [pipeline](../docs/dag_scheduler.md) with [image_transformation custom node](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/src/custom_nodes/image_transformation). |
 |[Using TensorFlow Model](image_classification_using_tf_model/python/README.md)|Run image classification using directly imported TensorFlow model. |
 |[Age gender recognition](age_gender_recognition/python/README.md) | Run prediction on a JPEG image using age gender recognition model via gRPC API.|
 |[Face Detection](face_detection/python/README.md)|Run prediction on a JPEG image using face detection model via gRPC API.|
@@ -86,13 +86,13 @@ Check out the list below to see complete step-by-step examples of using OpenVINO
 ## With DAG Pipelines
 | Demo | Description |
 |---|---|
-|[Horizontal Text Detection in Real-Time](horizontal_text_detection/python/README.md) | Run prediction on camera stream using a horizontal text detection model via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [horizontal_ocr custom node](https://github.com/openvinotoolkit/model_server/tree/main/src/custom_nodes/horizontal_ocr) and [demultiplexer](../docs/demultiplexing.md). |
-|[Optical Character Recognition Pipeline](optical_character_recognition/python/README.md) | Run prediction on a JPEG image using a pipeline of text recognition and text detection models with a custom node for intermediate results processing via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [east_ocr custom node](https://github.com/openvinotoolkit/model_server/tree/main/src/custom_nodes/east_ocr) and [demultiplexer](../docs/demultiplexing.md). |
+|[Horizontal Text Detection in Real-Time](horizontal_text_detection/python/README.md) | Run prediction on camera stream using a horizontal text detection model via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [horizontal_ocr custom node](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/src/custom_nodes/horizontal_ocr) and [demultiplexer](../docs/demultiplexing.md). |
+|[Optical Character Recognition Pipeline](optical_character_recognition/python/README.md) | Run prediction on a JPEG image using a pipeline of text recognition and text detection models with a custom node for intermediate results processing via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [east_ocr custom node](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/src/custom_nodes/east_ocr) and [demultiplexer](../docs/demultiplexing.md). |
 |[Single Face Analysis Pipeline](single_face_analysis_pipeline/python/README.md)|Run prediction on a JPEG image using a simple pipeline of age-gender recognition and emotion recognition models via gRPC API to analyze image with a single face. This demo uses [pipeline](../docs/dag_scheduler.md) |
-|[Multi Faces Analysis Pipeline](multi_faces_analysis_pipeline/python/README.md)|Run prediction on a JPEG image using a pipeline of age-gender recognition and emotion recognition models via gRPC API to extract multiple faces from the image and analyze all of them. This demo uses [pipeline](../docs/dag_scheduler.md) with [model_zoo_intel_object_detection custom node](https://github.com/openvinotoolkit/model_server/tree/main/src/custom_nodes/model_zoo_intel_object_detection) and [demultiplexer](../docs/demultiplexing.md) |
+|[Multi Faces Analysis Pipeline](multi_faces_analysis_pipeline/python/README.md)|Run prediction on a JPEG image using a pipeline of age-gender recognition and emotion recognition models via gRPC API to extract multiple faces from the image and analyze all of them. This demo uses [pipeline](../docs/dag_scheduler.md) with [model_zoo_intel_object_detection custom node](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/src/custom_nodes/model_zoo_intel_object_detection) and [demultiplexer](../docs/demultiplexing.md) |
 |[Model Ensemble Pipeline](model_ensemble/python/README.md)|Combine multiple image classification models into one [pipeline](../docs/dag_scheduler.md) and aggregate results to improve classification accuracy. |
-|[Face Blur Pipeline](face_blur/python/README.md)|Detect faces and blur image using a pipeline of object detection models with a custom node for intermediate results processing via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [face_blur custom node](https://github.com/openvinotoolkit/model_server/tree/main/src/custom_nodes/face_blur). |
-|[Vehicle Analysis Pipeline](vehicle_analysis_pipeline/python/README.md)|Detect vehicles and recognize their attributes using a pipeline of vehicle detection and vehicle attributes recognition models with a custom node for intermediate results processing via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [model_zoo_intel_object_detection custom node](https://github.com/openvinotoolkit/model_server/tree/main/src/custom_nodes/model_zoo_intel_object_detection). |
+|[Face Blur Pipeline](face_blur/python/README.md)|Detect faces and blur image using a pipeline of object detection models with a custom node for intermediate results processing via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [face_blur custom node](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/src/custom_nodes/face_blur). |
+|[Vehicle Analysis Pipeline](vehicle_analysis_pipeline/python/README.md)|Detect vehicles and recognize their attributes using a pipeline of vehicle detection and vehicle attributes recognition models with a custom node for intermediate results processing via gRPC API. This demo uses [pipeline](../docs/dag_scheduler.md) with [model_zoo_intel_object_detection custom node](https://github.com/openvinotoolkit/model_server/tree/releases/2025/1/src/custom_nodes/model_zoo_intel_object_detection). |

 ## With C++ Client
 | Demo | Description |

demos/age_gender_recognition/python/README.md

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ Install python dependencies:
 ```console
 pip3 install -r requirements.txt
 ```
-Run [age_gender_recognition.py](https://github.com/openvinotoolkit/model_server/blob/main/demos/age_gender_recognition/python/age_gender_recognition.py) script to make an inference:
+Run [age_gender_recognition.py](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/age_gender_recognition/python/age_gender_recognition.py) script to make an inference:
 ```console
 python age_gender_recognition.py --image_input_path age-gender-recognition-retail-0001.jpg --rest_port 8000
 ```

demos/benchmark/python/README.md

Lines changed: 1 addition & 1 deletion
@@ -379,4 +379,4 @@ docker run -v ${PWD}/workspace:/workspace --network host benchmark_client -a loc
 ```

 Many other client options together with benchmarking examples are presented in
-[an additional PDF document](https://github.com/openvinotoolkit/model_server/blob/main/docs/python-benchmarking-client-16feb.pdf).
+[an additional PDF document](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/docs/python-benchmarking-client-16feb.pdf).

demos/bert_question_answering/python/README.md

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@

 This document demonstrates how to run inference requests for [BERT model](https://github.com/openvinotoolkit/open_model_zoo/tree/2022.1.0/models/intel/bert-small-uncased-whole-word-masking-squad-int8-0002) with OpenVINO Model Server. It provides questions answering functionality.

-In this example docker container with [bert-client image](https://github.com/openvinotoolkit/model_server/blob/main/demos/bert_question_answering/python/Dockerfile) runs the script [bert_question_answering.py](https://github.com/openvinotoolkit/model_server/blob/main/demos/bert_question_answering/python/bert_question_answering.py). It runs inference request for each paragraph on a given page in order to answer the provided question. Since each paragraph can have different size the functionality of dynamic shape is used.
+In this example docker container with [bert-client image](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/bert_question_answering/python/Dockerfile) runs the script [bert_question_answering.py](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/bert_question_answering/python/bert_question_answering.py). It runs inference request for each paragraph on a given page in order to answer the provided question. Since each paragraph can have different size the functionality of dynamic shape is used.

 NOTE: With `min_request_token_num` parameter you can specify the minimum size of the request. If the paragraph has too short, it is concatenated with the next one until it has required length. When there is no paragraphs left to concatenate request is created with the remaining content.
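
The NOTE in the context lines above describes a simple grouping rule for `min_request_token_num`. A minimal sketch of that rule, assuming each paragraph is already tokenized into a list, might look like the following. The function name `build_requests` and its arguments are hypothetical; this is not the code from `bert_question_answering.py`.

```python
from typing import List


def build_requests(paragraph_tokens: List[List[int]],
                   min_request_token_num: int) -> List[List[int]]:
    """Group tokenized paragraphs into requests of at least the minimum size.

    A paragraph that is too short is concatenated with the following ones
    until the request reaches min_request_token_num tokens. Whatever is left
    at the end is sent as a final, possibly shorter, request.
    """
    requests: List[List[int]] = []
    current: List[int] = []
    for tokens in paragraph_tokens:
        current.extend(tokens)
        if len(current) >= min_request_token_num:
            requests.append(current)
            current = []
    if current:
        # No paragraphs left to concatenate: emit the remaining content as is.
        requests.append(current)
    return requests
```

Because the resulting requests differ in length, a grouping like this relies on the server accepting dynamic input shapes, as the README text in the hunk mentions.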

demos/continuous_batching/README.md

Lines changed: 5 additions & 5 deletions
@@ -33,8 +33,8 @@ LLM engine parameters will be defined inside the `graph.pbtxt` file.

 Download export script, install it's dependencies and create directory for the models:
 ```console
-curl https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/0/demos/common/export_models/export_model.py -o export_model.py
-pip3 install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/0/demos/common/export_models/requirements.txt
+curl https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/1/demos/common/export_models/export_model.py -o export_model.py
+pip3 install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/1/demos/common/export_models/requirements.txt
 mkdir models
 ```

@@ -321,16 +321,16 @@ P99 TPOT (ms): 246.48

 The service deployed above can be used in RAG chain using `langchain` library with OpenAI endpoint as the LLM engine.

-Check the example in the [RAG notebook](https://github.com/openvinotoolkit/model_server/blob/main/demos/continuous_batching/rag/rag_demo.ipynb)
+Check the example in the [RAG notebook](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/continuous_batching/rag/rag_demo.ipynb)

 ## Scaling the Model Server

-Check this simple [text generation scaling demo](https://github.com/openvinotoolkit/model_server/blob/main/demos/continuous_batching/scaling/README.md).
+Check this simple [text generation scaling demo](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/continuous_batching/scaling/README.md).

 ## Testing the model accuracy over serving API

-Check the [guide of using lm-evaluation-harness](https://github.com/openvinotoolkit/model_server/blob/main/demos/continuous_batching/accuracy/README.md)
+Check the [guide of using lm-evaluation-harness](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/continuous_batching/accuracy/README.md)

 ## Use Speculative Decoding

demos/continuous_batching/rag/README.md

Lines changed: 3 additions & 3 deletions
@@ -4,8 +4,8 @@
 ## Creating models repository for all the endpoints

 ```console
-curl https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/0/demos/common/export_models/export_model.py -o export_model.py
-pip3 install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/0/demos/common/export_models/requirements.txt
+curl https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/1/demos/common/export_models/export_model.py -o export_model.py
+pip3 install -r https://raw.githubusercontent.com/openvinotoolkit/model_server/refs/heads/releases/2025/1/demos/common/export_models/requirements.txt

 mkdir models
 python export_model.py text_generation --source_model meta-llama/Meta-Llama-3-8B-Instruct --weight-format int8 --kv_cache_precision u8 --config_file_path models/config_all.json --model_repository_path models
@@ -28,4 +28,4 @@ ovms --rest_port 8000 --config_path ./models/config_all.json

 ## Using RAG

-When the model server is deployed and serving all 3 endpoints, run the [jupyter notebook](https://github.com/openvinotoolkit/model_server/blob/main/demos/continuous_batching/rag/rag_demo.ipynb) to use RAG chain with a fully remote execution.
+When the model server is deployed and serving all 3 endpoints, run the [jupyter notebook](https://github.com/openvinotoolkit/model_server/blob/releases/2025/1/demos/continuous_batching/rag/rag_demo.ipynb) to use RAG chain with a fully remote execution.
