JSON Serialization Error: NaN Values Causing Serialization Failure
Problem Description
When using the Llama-3.1-8B-Instruct model for inference, a JSON serialization error occurs: the `/v1/chat/completions/tokens` endpoint fails to serialize the `ChatCompletionResponse` object to JSON when returning a response.
Error Message
```
ValueError: Out of range float values are not JSON compliant: nan
```
Full Error Stack Trace
```
(ApiServer_0 pid=467463) ERROR: Exception in ASGI application
(ApiServer_0 pid=467463) Traceback (most recent call last):
...
  File "/proj-vertical-llms-pvc/users/zhihan/webarena/WebAgent-R1/WebAgent-R1-prime/prime-rl/src/prime_rl/inference/vllm/server.py", line 119, in _chat_with_tokens
    return JSONResponse(content=generator.model_dump())
...
  File "/root/.local/share/uv/python/cpython-3.12.12-linux-x86_64-gnu/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
(ApiServer_0 pid=467463) ValueError: Out of range float values are not JSON compliant: nan
```
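For context, this failure is reproducible outside the server: Starlette's `JSONResponse` (which FastAPI re-exports) serializes its content with `allow_nan=False`, so any `nan` left in the dumped response dict raises exactly this `ValueError`. A minimal sketch (the `logprob` key is illustrative, not the actual response schema):

```python
import json

# Plain json.dumps tolerates NaN by default and emits non-standard "NaN"...
print(json.dumps({"logprob": float("nan")}))  # -> {"logprob": NaN}

# ...but Starlette's JSONResponse passes allow_nan=False, which raises the
# same error seen in the trace above (message shown on Python 3.12).
try:
    json.dumps({"logprob": float("nan")}, allow_nan=False)
except ValueError as err:
    print(err)  # Out of range float values are not JSON compliant: nan
```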
Problem Analysis
- Error Location: The error occurs in the `_chat_with_tokens` function in `src/prime_rl/inference/vllm/server.py`, when the `ChatCompletionResponse` object is converted to a dictionary via `model_dump()` and serialized to JSON.
- Root Cause: The `ChatCompletionResponse` object contains `nan` (Not a Number) float values, which are not valid in standard JSON (along with `inf` and other special float values). A sketch for locating the offending fields follows this list.
- Model Differences:
  - ✅ Qwen series models: With Qwen3-4B-Instruct-2507 and Qwen2.5-7B-Instruct, this error was not observed
  - ❌ Llama-3.1-8B: After switching to Llama-3.1-8B-Instruct, the error started occurring
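To pinpoint which response fields actually carry the `nan` values, one option is to recursively scan the `model_dump()` output just before serialization. A minimal diagnostic sketch, assuming nothing about the response schema (`find_non_finite` is a hypothetical helper, not part of prime-rl or vLLM):

```python
import math

def find_non_finite(obj, path="$"):
    """Yield (json_path, value) for every NaN/inf float in a nested structure."""
    if isinstance(obj, float) and not math.isfinite(obj):
        yield path, obj
    elif isinstance(obj, dict):
        for key, val in obj.items():
            yield from find_non_finite(val, f"{path}.{key}")
    elif isinstance(obj, (list, tuple)):
        for i, val in enumerate(obj):
            yield from find_non_finite(val, f"{path}[{i}]")

# Usage inside _chat_with_tokens, just before building the JSONResponse:
#     content = generator.model_dump()
#     for json_path, value in find_non_finite(content):
#         print("non-finite float at", json_path, value)
```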
Configuration
Full configuration file content:
```toml
inference_gpu_ids = [0,1]
trainer_gpu_ids = [2,3,4,5,6,7]
max_steps = 200   # Number of training steps
seq_len = 16384   # Sequence length for training

[model]
name = "Llama-3.1-8B-Instruct"

[wandb]
project = "webarena-rl"
name = "webarena-debug"

[trainer.model]
impl = "liger_kernel"   # Training implementation
ac = { freq = 1 }       # freq=1 means full activation checkpointing (checkpoint every layer)

[trainer.optim]
lr = 1e-5

[orchestrator]
batch_size = 128            # Batch size (adjust based on GPU memory)
rollouts_per_example = 16   # Number of rollouts per example

[orchestrator.optim]
lr = 1e-5   # Learning rate (should match trainer.optim.lr)

[orchestrator.sampling]
max_tokens = 512   # Max tokens per generation

# WebArena environment configuration
[[orchestrator.env]]
id = "webarena-env"     # Environment ID (must match the installed package name)
name = "webarena-env"   # Display name for logs

[inference]
# Inference server configuration
api_server_count = 2   # Number of inference servers
gpu_memory_utilization = 0.9

[inference.parallel]
dp = 2

[inference.model]
```
Questions
- Why does this problem occur?
- How can it be fixed? (one possible workaround is sketched below)
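One possible mitigation, sketched under the assumption that the client can tolerate `null` in the affected fields (e.g., per-token logprobs): replace non-finite floats before handing the dict to `JSONResponse`. The `sanitize_floats` helper is hypothetical, not an existing prime-rl or vLLM API:

```python
import math

def sanitize_floats(obj):
    """Replace NaN/inf floats with None so json.dumps(allow_nan=False) succeeds."""
    if isinstance(obj, float) and not math.isfinite(obj):
        return None
    if isinstance(obj, dict):
        return {key: sanitize_floats(val) for key, val in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [sanitize_floats(val) for val in obj]
    return obj

# In _chat_with_tokens (src/prime_rl/inference/vllm/server.py:119), instead of
#     return JSONResponse(content=generator.model_dump())
# one could write
#     return JSONResponse(content=sanitize_floats(generator.model_dump()))
```

This only makes the response serializable; why Llama-3.1-8B-Instruct produces `nan` in the first place (while the Qwen models did not) would still need investigation upstream.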
Related Code Locations
- `src/prime_rl/inference/vllm/server.py:119` - `_chat_with_tokens` function
- `src/prime_rl/inference/vllm/serving_chat_with_tokens.py` - Chat completion with tokens implementation