1 parent b08533e commit a9d4f7e
protobuf/model_config.proto
@@ -1662,7 +1662,7 @@ message ModelEnsembling
//@@ .. cpp:var:: uint32 max_inflight_requests
//@@
- //@@ The maximum number of concurrent inflight requests to ensemble steps.
+ //@@ The maximum number of concurrent inflight requests at each ensemble step.
//@@ This limit prevents unbounded memory growth when decoupled models
//@@ produce responses faster than downstream models can consume them.
//@@ Default value is 0, which indicates that no limit is enforced.
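
The reworded comment says the limit applies at each ensemble step, not to the ensemble as a whole. The toy simulation below is only a sketch of that reading and is not the server's scheduler: `run_step`, `simulate`, the step names `step_a` and `step_b`, and the use of an `asyncio.Semaphore` are all hypothetical, chosen to show a per-step cap with `0` meaning no limit.

```python
import asyncio


async def run_step(name, limit, n_requests, work_s, peaks):
    """Run n_requests through one hypothetical ensemble step.

    limit > 0 caps concurrent inflight requests for this step only.
    limit == 0 mirrors the documented default: no limit is enforced.
    """
    sem = asyncio.Semaphore(limit) if limit > 0 else None
    inflight = 0

    async def one_request():
        nonlocal inflight
        if sem is not None:
            await sem.acquire()  # wait here while the step is at its cap
        inflight += 1
        peaks[name] = max(peaks.get(name, 0), inflight)
        try:
            await asyncio.sleep(work_s)  # stand-in for the step's model execution
        finally:
            inflight -= 1
            if sem is not None:
                sem.release()

    await asyncio.gather(*(one_request() for _ in range(n_requests)))


def simulate(limit, n_requests=20):
    """Return the peak number of inflight requests observed at each step."""
    peaks = {}

    async def main():
        # Each step gets its own semaphore, so the cap is per step,
        # not shared across the whole ensemble.
        await asyncio.gather(
            run_step("step_a", limit, n_requests, 0.001, peaks),
            run_step("step_b", limit, n_requests, 0.001, peaks),
        )

    asyncio.run(main())
    return peaks
```

Calling `simulate(3)` reports a peak for each step separately, and neither peak exceeds the cap. Calling `simulate(0)` removes the cap, which is the unbounded case the comment warns about.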