[Bug]: max batched tokens not compatible with max model length on non-X86 CPU Backend #28981

@fadara01

Description

Your current environment

The output of python collect_env.py
Your output of `python collect_env.py` here

🐛 Describe the bug

For non-x86 CPU backends, chunked prefill isn't supported (`[arg_utils.py:1376] Chunked prefill is not supported for ARM and POWER, S390X and RISC-V CPUs; disabling it for V1 backend.`), but there is no code that raises `max_num_batched_tokens` to stay compatible with `max_model_len`:

```
(APIServer pid=63635) pydantic_core._pydantic_core.ValidationError: 1 validation error for SchedulerConfig
(APIServer pid=63635)   Value error, max_num_batched_tokens (2048) is smaller than max_model_len (40960). This effectively limits the maximum sequence length to max_num_batched_tokens and makes vLLM reject longer sequences. Please increase max_num_batched_tokens or decrease max_model_len. [type=value_error, input_value=ArgsKwargs((), {'runner_t..., 'stream_interval': 1}), input_type=ArgsKwargs]
```

This means the user currently has to set `max_num_batched_tokens` manually.
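For example, a minimal offline-inference sketch of the manual workaround (the model name and token value are illustrative; for `vllm serve` the equivalent knob is the `--max-num-batched-tokens` CLI flag):

```python
from vllm import LLM

# Manual workaround: explicitly raise max_num_batched_tokens to at least
# max_model_len so SchedulerConfig validation passes on CPU backends where
# chunked prefill has been disabled.
llm = LLM(
    model="Qwen/Qwen3-8B",         # illustrative model with a 40960-token context
    max_model_len=40960,
    max_num_batched_tokens=40960,  # must be >= max_model_len
)
```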

We should set `self.max_num_batched_tokens = model_config.max_model_len` when chunked prefill is disabled.
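A minimal sketch of the proposed change, assuming it sits next to the code in `vllm/engine/arg_utils.py` that disables chunked prefill for these CPU backends (attribute names follow the error message above and may not match the real code exactly):

```python
# Sketch only, not the actual vLLM implementation.
if not self.enable_chunked_prefill:
    # Without chunked prefill, an entire prompt must fit into a single batch,
    # so the batching limit has to cover the full context window.
    if (self.max_num_batched_tokens is None
            or self.max_num_batched_tokens < model_config.max_model_len):
        self.max_num_batched_tokens = model_config.max_model_len
```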

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Labels

bug: Something isn't working
cpu: Related to CPU backends
