Commit e83e1a8

Clarify failure injection rate documentation
Removed a duplicated README paragraph and updated the README option description, code comment, and flag help text to clarify that 'failure-injection-rate' is the probability of injecting failures and is not tied to a separate failure mode.
1 parent 972092a commit e83e1a8

2 files changed: 3 additions and 5 deletions

README.md

Lines changed: 1 addition & 3 deletions
@@ -35,8 +35,6 @@ The simulator supports two modes of operation:
 
 Additionally, the simulator can inject OpenAI API compatible error responses for testing error handling using the `failure-injection-rate` parameter.
 
-Additionally, the simulator can inject OpenAI API compatible error responses for testing error handling using the `failure-injection-rate` parameter.
-
 Timing of the response is defined by the `time-to-first-token` and `inter-token-latency` parameters. In case P/D is enabled for a request, `kv-cache-transfer-latency` will be used instead of `time-to-first-token`.
 
 For a request with `stream=true`: `time-to-first-token` or `kv-cache-transfer-latency` defines the delay before the first token is returned, `inter-token-latency` defines the delay between subsequent tokens in the stream.
@@ -126,7 +124,7 @@ For more details see the <a href="https://docs.vllm.ai/en/stable/getting_started
 - `tokenizers-cache-dir`: the directory for caching tokenizers
 - `hash-seed`: seed for hash generation (if not set, is read from PYTHONHASHSEED environment variable)
 - `zmq-endpoint`: ZMQ address to publish events
-- `failure-injection-rate`: probability (0-100) of injecting failures when in failure mode, optional, default is 10
+- `failure-injection-rate`: probability (0-100) of injecting failures, optional, default is 10
 - `failure-types`: list of specific failure types to inject (rate_limit, invalid_api_key, context_length, server_error, invalid_request, model_not_found), optional, if empty all types are used
 - `event-batch-size`: the maximum number of kv-cache events to be sent together, defaults to 16
 -->

pkg/common/config.go

Lines changed: 2 additions & 2 deletions
@@ -136,7 +136,7 @@ type Configuration struct {
 	// EventBatchSize is the maximum number of kv-cache events to be sent together, defaults to 16
 	EventBatchSize int `yaml:"event-batch-size"`
 
-	// FailureInjectionRate is the probability (0-100) of injecting failures when in failure mode
+	// FailureInjectionRate is the probability (0-100) of injecting failures
 	FailureInjectionRate int `yaml:"failure-injection-rate"`
 	// FailureTypes is a list of specific failure types to inject (empty means all types)
 	FailureTypes []string `yaml:"failure-types"`
@@ -386,7 +386,7 @@ func ParseCommandParamsAndLoadConfig() (*Configuration, error) {
 	f.StringVar(&config.ZMQEndpoint, "zmq-endpoint", config.ZMQEndpoint, "ZMQ address to publish events")
 	f.IntVar(&config.EventBatchSize, "event-batch-size", config.EventBatchSize, "Maximum number of kv-cache events to be sent together")
 
-	f.IntVar(&config.FailureInjectionRate, "failure-injection-rate", config.FailureInjectionRate, "Probability (0-100) of injecting failures when in failure mode")
+	f.IntVar(&config.FailureInjectionRate, "failure-injection-rate", config.FailureInjectionRate, "Probability (0-100) of injecting failures")
 
 	failureTypes := getParamValueFromArgs("failure-types")
 	var dummyFailureTypes multiString
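
The hunk above registers `failure-injection-rate` with `f.IntVar`, using the current config value as the default. The sketch below shows the same registration pattern in isolation, using Go's standard `flag` package (an assumption: the diff does not show which flag library `f` comes from). The `simConfig` type, the `parseArgs` function, and the 0-100 range check are all hypothetical and added only for illustration; the default of 10 is the one stated in the README.

```go
package main

import (
	"flag"
	"fmt"
)

// simConfig is a hypothetical, trimmed-down stand-in for Configuration.
type simConfig struct {
	FailureInjectionRate int
}

// parseArgs registers the flag the same way the diff does and then
// applies an assumed range check on the parsed value.
func parseArgs(args []string) (*simConfig, error) {
	cfg := &simConfig{FailureInjectionRate: 10} // README default

	f := flag.NewFlagSet("simulator", flag.ContinueOnError)
	f.IntVar(&cfg.FailureInjectionRate, "failure-injection-rate",
		cfg.FailureInjectionRate, "Probability (0-100) of injecting failures")
	if err := f.Parse(args); err != nil {
		return nil, err
	}

	// Assumed validation: the help text implies a 0-100 range.
	if cfg.FailureInjectionRate < 0 || cfg.FailureInjectionRate > 100 {
		return nil, fmt.Errorf(
			"failure-injection-rate must be between 0 and 100, got %d",
			cfg.FailureInjectionRate)
	}
	return cfg, nil
}

func main() {
	cfg, err := parseArgs([]string{"--failure-injection-rate", "25"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("failure-injection-rate:", cfg.FailureInjectionRate)
}
```

Passing the existing field value as the flag default means a value already loaded from a YAML config file survives unless the command line overrides it.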
