add http way for calling the service (#4325) (#4330)
* add http serve way
* fix copy error
* update http code
* fix en code
* update content
* update content again
* update content again and again
* update http content
* update serving content
* Polish doc
* update en version
* update en version
* update en version
* update en version
---------
Co-authored-by: Bobholamovic <[email protected]>
`docs/pipeline_deploy/serving.en.md` (70 additions, 2 deletions)
@@ -350,7 +350,13 @@ I1216 11:37:21.643494 35 http_server.cc:167] Started Metrics Service at 0.0.0.0:
### 2.4 Invoke the Service

Users can call the pipeline service through the Python client provided by the SDK or by manually constructing HTTP requests (with no restriction on the client programming language).

The services deployed using the high-stability serving solution offer the same primary operations as the basic serving solution. For each primary operation, the endpoint names and the request and response data fields are consistent with the basic serving solution. Please refer to the "Development Integration/Deployment" section of the tutorial for each pipeline; the pipeline tutorials can be found [here](../pipeline_usage/pipeline_develop_guide.en.md).
#### 2.4.1 Use the Python Client

The Python client currently supports Python versions 3.8 to 3.12.

Navigate to the `client` directory of the high-stability serving SDK, and run the following command to install dependencies:
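(The exact install command ships with the SDK; the following is a minimal sketch, assuming the dependencies are listed in a `requirements.txt` under `client`.)

```bash
# Assumption: the client directory provides a requirements.txt;
# substitute the SDK's actual install command if it differs.
python -m pip install -r requirements.txt
```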
The `client.py` script in the `client` directory contains examples of how to call the service and provides a command-line interface.
#### 2.4.2 Manually Construct HTTP Requests

In scenarios where the Python client is not applicable, the service can be called by constructing HTTP requests directly, as demonstrated below.

First, manually construct the HTTP request body. The request body must be in JSON format and contain the following fields:

- `inputs`: Input tensor information. The input tensor name `name` is uniformly set to `input`, the shape is `[1, 1]`, and the data type `datatype` is `BYTES`. The tensor data `data` contains a single JSON string, and the content of this JSON string should follow the pipeline-specific format (consistent with the basic serving solution).
- `outputs`: Output tensor information. The output tensor name `name` is uniformly set to `output`.

Taking the general OCR pipeline as an example, the constructed request body is as follows:
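(A sketch; the inner payload fields `file` and `fileType` follow the general OCR pipeline's request schema in the basic serving solution, so verify them against the OCR pipeline tutorial.)

```json
{
  "inputs": [
    {
      "name": "input",
      "shape": [1, 1],
      "datatype": "BYTES",
      "data": [
        "{\"file\": \"<image URL or Base64-encoded image data>\", \"fileType\": 1}"
      ]
    }
  ],
  "outputs": [
    {
      "name": "output"
    }
  ]
}
```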
Send the constructed request body to the service's HTTP inference endpoint. By default, the service listens on HTTP port `8000`, and the inference request URL follows the format `http://{hostname}:8000/v2/models/{endpoint name}/infer`.

Using the general OCR pipeline as an example, the following `curl` command sends the request:

```bash
# Assuming `REQUEST_JSON` holds the request body constructed in the previous step
curl -s -X POST http://localhost:8000/v2/models/ocr/infer \
    -H 'Content-Type: application/json' \
    -d "${REQUEST_JSON}"
```
Finally, the response from the service needs to be parsed. The raw response body has the following structure:
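(A sketch of that structure, assuming the KServe v2 inference protocol response returned by the underlying Triton server; exact top-level fields may vary with the server version.)

```json
{
  "model_name": "ocr",
  "model_version": "1",
  "outputs": [
    {
      "name": "output",
      "shape": [1, 1],
      "datatype": "BYTES",
      "data": [
        "<JSON string containing the pipeline-specific result>"
      ]
    }
  ]
}
```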
In this response, `outputs[0].data[0]` is a JSON string whose internal fields follow the same format as the response body in the basic serving solution. For detailed parsing rules, please refer to the usage guide for each specific pipeline.
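For example, assuming the raw response was saved to `response.json` and `jq` is available (any JSON tooling works equally well), the embedded result can be unwrapped like this:

```bash
# Pull the JSON string embedded in outputs[0].data[0] (-r emits it without
# quoting), then pretty-print the decoded pipeline result with a second pass.
jq -r '.outputs[0].data[0]' response.json | jq .
```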