[PROPOSAL] LLM judgment generation sends vector embedding fields to LLM, wasting tokens and bandwidth #403

@martin-gaievski

Description

When the LLM Judgment API (PUT /judgments with type: LLM_JUDGMENT) generates relevance ratings, it retrieves documents via search queries and sends the document _source content to the LLM for evaluation. When contextFields is not specified by the user, the entire _source is sent — including vector embedding fields (e.g., knn_vector, dense_vector).

Vector embedding fields are arrays of hundreds or thousands of floating-point numbers (commonly 768 or 1536 dimensions) that represent semantic meaning in a format only useful for vector similarity computation. These fields:

  • Add no value for LLM relevance judgment — the LLM cannot interpret raw embedding vectors
  • Consume significant tokens — a single 768-dim vector serializes to ~3,000+ tokens
  • Waste network bandwidth — especially impactful at scale
  • May cause token limit errors — pushing documents over the configured tokenLimit

Impact at scale

For a typical hybrid search setup (text + neural) with 768-dimension embeddings:

| Scenario | Without vectors | With vectors | Waste |
|---|---|---|---|
| Per document payload | ~500 bytes | ~6,500+ bytes | +1,200% |
| 1 query × 10 docs | ~5 KB | ~65 KB | +60 KB |
| 1,000 queries × 100 docs | ~50 MB | ~650 MB | ~600 MB |
| LLM tokens per query (100 docs) | ~12,500 | ~312,500 | ~300,000 wasted |
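The per-document figures above can be sanity-checked with a back-of-envelope sketch (the class and method names here are illustrative, not part of the plugin): serialize 768 typical 4-decimal floats the way the full `_source` carries them, as a JSON array, and measure the string length.

```java
import java.util.Locale;

// Back-of-envelope estimate: bytes occupied by a 768-dimension embedding
// once serialized as a JSON array of floats inside _source.
public class VectorPayloadEstimate {

    // Build a JSON-array string of `dims` typical 4-decimal floats and return its length.
    static int estimatedBytes(int dims) {
        StringBuilder json = new StringBuilder("[");
        for (int i = 0; i < dims; i++) {
            if (i > 0) json.append(',');
            json.append(String.format(Locale.ROOT, "%.4f", Math.sin(i) * 0.1));
        }
        return json.append(']').length();
    }

    public static void main(String[] args) {
        System.out.println("768-dim vector ~" + estimatedBytes(768) + " bytes as JSON");
    }
}
```

Each value serializes to 6–7 characters plus a comma, so 768 dimensions land in the ~6 KB range, consistent with the ~6,500+ bytes in the table.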

With expandCoverage=true (which ~doubles document count per query), the waste is amplified 2×.

Root cause

In LlmJudgmentsProcessor.getContextSource():

private String getContextSource(SearchHit hit, List<String> contextFields) {
    Map<String, Object> sourceAsMap = hit.getSourceAsMap();
    if (contextFields != null && !contextFields.isEmpty()) {
        // SAFE: only the specified fields are included
        Map<String, Object> filteredSource = new HashMap<>();
        for (String field : contextFields) {
            if (sourceAsMap.containsKey(field)) {
                filteredSource.put(field, sourceAsMap.get(field));
            }
        }
        return OBJECT_MAPPER.writeValueAsString(filteredSource);
    }
    // PROBLEM: returns the ENTIRE _source, including vector fields
    return hit.getSourceAsString();
}

When contextFields is not provided (which is common — it's an optional parameter), the full _source is serialized and sent to the LLM. For indices with neural search embeddings, this includes large float arrays like:

{
  "title": "Wireless Headphones",
  "title_embedding": [0.0234, -0.1567, 0.0891, ... /* 768 floats */],
  "description": "High quality noise cancelling..."
}

Existing mitigation

Users can work around this by specifying contextFields in their judgment request:

{
  "type": "LLM_JUDGMENT",
  "contextFields": ["title", "description", "category"],
  ...
}

However, this workaround requires users to know about the issue in advance and to manually enumerate every relevant field while leaving out the vector fields.

Proposed solution

Auto-exclude vector-like fields when contextFields is not specified.

In the getContextSource() fallback path, detect and skip fields whose values are large numeric arrays (heuristic: List<Number> with length > 32):

// When no contextFields specified, auto-exclude embedding/vector fields
Map<String, Object> sourceAsMap = hit.getSourceAsMap();
Map<String, Object> filteredSource = new HashMap<>();
for (Map.Entry<String, Object> entry : sourceAsMap.entrySet()) {
    Object value = entry.getValue();
    if (isLikelyVectorField(value)) {
        continue; // Skip embedding fields
    }
    filteredSource.put(entry.getKey(), value);
}
return OBJECT_MAPPER.writeValueAsString(filteredSource);

Where isLikelyVectorField checks:

private boolean isLikelyVectorField(Object value) {
    if (value instanceof List) {
        List<?> list = (List<?>) value;
        // size() > 32 already implies the list is non-empty,
        // so no separate isEmpty() check is needed
        return list.size() > 32 && list.get(0) instanceof Number;
    }
    return false;
}
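Putting the heuristic and the fallback filtering together, a standalone sketch of the proposed behavior could look like the following (class name, helper names, and sample documents are illustrative; only the heuristic mirrors the proposal):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone sketch of the proposed fallback filtering for LLM judgment context.
public class VectorFieldFilter {

    // Heuristic from the proposal: a list of more than 32 numbers is
    // assumed to be an embedding vector.
    static boolean isLikelyVectorField(Object value) {
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            return list.size() > 32 && list.get(0) instanceof Number;
        }
        return false;
    }

    // Fallback path: copy every _source field except likely vectors.
    static Map<String, Object> filterSource(Map<String, Object> sourceAsMap) {
        Map<String, Object> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : sourceAsMap.entrySet()) {
            if (!isLikelyVectorField(entry.getValue())) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        List<Double> embedding = new ArrayList<>();
        for (int i = 0; i < 768; i++) embedding.add(i * 0.001);

        Map<String, Object> source = new LinkedHashMap<>();
        source.put("title", "Wireless Headphones");
        source.put("title_embedding", embedding);
        source.put("description", "High quality noise cancelling...");

        // title_embedding is dropped; the text fields survive
        System.out.println(filterSource(source).keySet());
    }
}
```

Note that a short list of numbers (e.g. a list of prices or ratings) stays below the 32-element threshold and is kept, which is the point of the size cutoff.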

Additional improvements:

  • Log a warning when large array fields are detected and auto-excluded, for user visibility
  • Consider querying the index mapping to identify knn_vector typed fields for a more precise exclusion

Alternative approaches

  1. _source excludes in search request — modify search request builder to add _source: { excludes: ["*_embedding", "*_vector"] }. Relies on field naming conventions.

  2. Query index mapping — before searching, query the index mapping to discover all knn_vector/dense_vector typed fields and exclude them from _source. Most precise but adds an extra API call per index.

  3. Introduce excludeFields parameter — add a new optional parameter to the judgment API that lets users specify fields to exclude. Complementary to contextFields.
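For alternative 2, the mapping-based detection could be sketched as follows, assuming the JSON body of `GET <index>/_mapping` has already been deserialized into nested `Map`s (the class name, traversal, and sample mapping are illustrative):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: walk a parsed index mapping and collect the paths of all
// knn_vector / dense_vector typed fields, suitable for _source excludes.
public class VectorFieldMappingScan {

    @SuppressWarnings("unchecked")
    static void collectVectorFields(Map<String, Object> properties, String prefix, List<String> out) {
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            Map<String, Object> fieldDef = (Map<String, Object>) entry.getValue();
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            Object type = fieldDef.get("type");
            if ("knn_vector".equals(type) || "dense_vector".equals(type)) {
                out.add(path);
            } else if (fieldDef.get("properties") instanceof Map) {
                // recurse into object fields
                collectVectorFields((Map<String, Object>) fieldDef.get("properties"), path, out);
            }
        }
    }

    public static void main(String[] args) {
        // Minimal mapping fragment: a text field, a top-level knn_vector,
        // and a knn_vector nested inside an object field.
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("title", Map.of("type", "text"));
        props.put("title_embedding", Map.of("type", "knn_vector", "dimension", 768));
        props.put("nested", Map.of("properties",
                Map.of("inner_vec", Map.of("type", "knn_vector", "dimension", 768))));

        List<String> vectorFields = new ArrayList<>();
        collectVectorFields(props, "", vectorFields);
        System.out.println(vectorFields);
    }
}
```

The collected paths would then feed the search request's `_source.excludes`, trading one extra mapping call per index for exact (rather than heuristic) exclusion.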

Metadata

Labels: enhancement (New feature or request)
Status: 🆕 New