* Update cache_messager.py
* fix too many open files problem
* fix too many open files problem
* fix too many open files problem
* fix ci bugs
* Update api_server.py
* add parameter
* format
* format
* format
* format
* Update parameters.md
* Update parameters.md
* Update serving_completion.py
* Update serving_chat.py
* Update envs.py
---------
Co-authored-by: Jiang-Jia-Jun <[email protected]>
docs/parameters.md: 2 additions & 0 deletions
@@ -8,6 +8,8 @@ When using FastDeploy to deploy models (including offline inference and service
 |:--------------|:----|:-----------|
 |```port```|`int`| Only required for service deployment, HTTP service port number, default: 8000 |
 |```metrics_port```|`int`| Only required for service deployment, metrics monitoring port number, default: 8001 |
+|```max_waiting_time```|`int`| Only required for service deployment, maximum wait time for establishing a connection upon service request, default: -1 (no wait time limit) |
+|```max_concurrency```|`int`| Only required for service deployment, maximum number of concurrent connections the service will accept, default: 512 |
 |```engine_worker_queue_port```|`int`| FastDeploy internal engine communication port, default: 8002 |
 |```cache_queue_port```|`int`| FastDeploy internal KVCache process communication port, default: 8003 |
 |```max_model_len```|`int`| Default maximum supported context length for inference, default: 2048 |
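The two new parameters work together as an admission gate: `max_concurrency` caps how many connections may be in flight at once, and `max_waiting_time` bounds how long a new request waits for a free slot (`-1` meaning wait forever). The sketch below illustrates those semantics with a counting semaphore; it is an assumption-laden illustration, not FastDeploy's actual implementation, and the class name `ConnectionGate` is hypothetical.

```python
import threading

class ConnectionGate:
    """Illustrative sketch of max_concurrency / max_waiting_time
    semantics (hypothetical; not FastDeploy's implementation)."""

    def __init__(self, max_concurrency=512, max_waiting_time=-1):
        # One semaphore slot per allowed concurrent connection.
        self._slots = threading.Semaphore(max_concurrency)
        # -1 mirrors the documented default: block with no time limit.
        self._timeout = None if max_waiting_time == -1 else max_waiting_time

    def try_connect(self):
        # True if a slot was obtained within the wait limit, else False.
        return self._slots.acquire(timeout=self._timeout)

    def disconnect(self):
        # Return the slot so a waiting request can proceed.
        self._slots.release()

# Tiny limits so the rejection path is easy to observe.
gate = ConnectionGate(max_concurrency=2, max_waiting_time=0.01)
print(gate.try_connect())  # True: first slot
print(gate.try_connect())  # True: second slot
print(gate.try_connect())  # False: gate full, timed out after 0.01s
gate.disconnect()
print(gate.try_connect())  # True: slot freed, admission succeeds again
```

With the default `max_waiting_time=-1`, a caller past the concurrency cap simply blocks until another connection releases its slot, which matches the "no wait time limit" wording in the table.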