Description
What happened:
According to the documentation, by default the seed is different for each instance because it is taken from the nanoseconds part of the time the instance is started.
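As I understand it, that kind of nanosecond-based seeding would look roughly like the Go sketch below (illustrative only, not the simulator's actual code); two instances started at different instants should get different seeds and therefore different random latency sequences:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	// Seed derived from the nanosecond-resolution start time,
	// so each process start normally yields a different sequence.
	seed := time.Now().UnixNano()
	r := rand.New(rand.NewSource(seed))
	fmt.Println(seed, r.NormFloat64())
}
```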
I'm running 3 instances and I'm getting exactly the same TTFT for all 3. More precisely, using the following PromQL query:
histogram_quantile(0.3,
sum by(le, instance) (
rate(vllm:time_to_first_token_seconds_bucket[30s])
)
)
I get the same values, or values with only minimal differences. I have tried changing the percentile and the time window, but I always get the same TTFT values across all instances. How is this possible?
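The only explanation I can think of is that all instances somehow ended up with the same seed: with Go's math/rand, two generators seeded identically produce exactly the same sample stream, which would make the per-instance TTFT histograms match. A quick hypothetical check:

```go
package main

import (
	"fmt"
	"math/rand"
)

func main() {
	a := rand.New(rand.NewSource(42))
	b := rand.New(rand.NewSource(42))
	for i := 0; i < 3; i++ {
		// Prints "true" every time: identical seeds give identical samples.
		fmt.Println(a.NormFloat64() == b.NormFloat64())
	}
}
```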
What you expected to happen:
Different TTFT values per instance over the last X seconds
How to reproduce it (as minimally and precisely as possible):
I'm running 3 instances with the following parameters:
- args:
- --model
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
#- --max-model-len
#- "2048"
- --served-model-name=HighEndLLM
- --port
- "8000"
- --mode=random
- --time-to-first-token=5000
- --enable-kvcache
- --max-num-seqs=25
- --time-factor-under-load=3
- --inter-token-latency=100
# only if prefill/decode disaggregation enabled
#- --kv-cache-transfer-latency=10
# can't be more than 30%
- --time-to-first-token-std-dev=1500
- --inter-token-latency-std-dev=30
#- --kv-cache-transfer-time-std-dev=3
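For context, my understanding of the latency knobs above (a hypothetical sketch, not the simulator's actual implementation) is that each request's TTFT is drawn from a normal distribution with mean --time-to-first-token and standard deviation --time-to-first-token-std-dev:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// sampleTTFT draws a TTFT from N(meanMs, stdDevMs), clamped at zero.
// With the flags above: meanMs = 5000, stdDevMs = 1500.
func sampleTTFT(r *rand.Rand, meanMs, stdDevMs float64) time.Duration {
	ms := meanMs + stdDevMs*r.NormFloat64()
	if ms < 0 {
		ms = 0
	}
	return time.Duration(ms) * time.Millisecond
}

func main() {
	r := rand.New(rand.NewSource(time.Now().UnixNano()))
	fmt.Println(sampleTTFT(r, 5000, 1500))
}
```

With a different seed per instance, these draws should differ across instances, which is why identical percentiles are so surprising.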
And sending the following workload:
ab -v 1 -n 10000 -c 200 -T application/json -p /tmp/request.json http://$m/v1/completions
Anything else we need to know?:
Thanks!
Environment:
ghcr.io/llm-d/llm-d-inference-sim:v0.6.1