Description
When the LLM Judgment API (PUT /judgments with type: LLM_JUDGMENT) generates relevance ratings, it retrieves documents via search queries and sends the document _source content to the LLM for evaluation. When contextFields is not specified by the user, the entire _source is sent — including vector embedding fields (e.g., knn_vector, dense_vector).
Vector embedding fields are arrays of hundreds or thousands of floating-point numbers (commonly 768 or 1536 dimensions) that represent semantic meaning in a format only useful for vector similarity computation. These fields:
- Add no value for LLM relevance judgment — the LLM cannot interpret raw embedding vectors
- Consume significant tokens — a single 768-dim vector serializes to ~3,000+ tokens
- Waste network bandwidth — especially impactful at scale
- May cause token limit errors — pushing documents over the configured `tokenLimit`
Impact at scale
For a typical hybrid search setup (text + neural) with 768-dimension embeddings:
| Scenario | Without vectors | With vectors | Waste |
|---|---|---|---|
| Per document payload | ~500 bytes | ~6,500+ bytes | +1,200% |
| 1 query × 10 docs | ~5 KB | ~65 KB | +60 KB |
| 1,000 queries × 100 docs | ~50 MB | ~650 MB | ~600 MB |
| LLM tokens per query (100 docs) | ~12,500 | ~312,500 | ~300,000 wasted |
With expandCoverage=true (which ~doubles document count per query), the waste is amplified 2×.
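The per-vector token cost in the table can be reproduced with back-of-the-envelope arithmetic. The constants below (~16 characters per serialized float, ~4 characters per token) are illustrative assumptions, not measured values:

```java
public class TokenWasteEstimate {
    // Illustrative assumptions (not measured): a serialized float such as
    // "-0.15672345, " averages ~16 characters, and ~4 characters make 1 token.
    static final int CHARS_PER_FLOAT = 16;
    static final int CHARS_PER_TOKEN = 4;

    static int tokensPerVector(int dims) {
        return dims * CHARS_PER_FLOAT / CHARS_PER_TOKEN;
    }

    public static void main(String[] args) {
        int perVector = tokensPerVector(768);   // ~3,072 tokens for one 768-dim vector
        int per100Docs = perVector * 100;       // ~307,200 wasted tokens for a 100-doc query
        System.out.println(perVector + " tokens/vector, " + per100Docs + " tokens per 100 docs");
    }
}
```

This lines up with the "~3,000+ tokens" and "~300,000 wasted" figures above.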
Root cause
In LlmJudgmentsProcessor.getContextSource():
```java
private String getContextSource(SearchHit hit, List<String> contextFields) {
    Map<String, Object> sourceAsMap = hit.getSourceAsMap();
    if (contextFields != null && !contextFields.isEmpty()) {
        // SAFE: Only specified fields are included
        Map<String, Object> filteredSource = new HashMap<>();
        for (String field : contextFields) {
            if (sourceAsMap.containsKey(field)) {
                filteredSource.put(field, sourceAsMap.get(field));
            }
        }
        return OBJECT_MAPPER.writeValueAsString(filteredSource);
    }
    // PROBLEM: Returns the ENTIRE _source, including vector fields
    return hit.getSourceAsString();
}
```

When contextFields is not provided (which is common, since it is an optional parameter), the full _source is serialized and sent to the LLM. For indices with neural search embeddings, this includes large float arrays like:
```json
{
  "title": "Wireless Headphones",
  "title_embedding": [0.0234, -0.1567, 0.0891, ... /* 768 floats */],
  "description": "High quality noise cancelling..."
}
```

Existing mitigation
Users can work around this by specifying contextFields in their judgment request:
```json
{
  "type": "LLM_JUDGMENT",
  "contextFields": ["title", "description", "category"],
  ...
}
```

However, this requires users to know about the issue and to manually list every relevant field while excluding the vector fields.
Proposed solution
Auto-exclude vector-like fields when contextFields is not specified.
In the getContextSource() fallback path, detect and skip fields whose values are large numeric arrays (heuristic: List<Number> with length > 32):
```java
// When no contextFields are specified, auto-exclude embedding/vector fields
Map<String, Object> sourceAsMap = hit.getSourceAsMap();
Map<String, Object> filteredSource = new HashMap<>();
for (Map.Entry<String, Object> entry : sourceAsMap.entrySet()) {
    Object value = entry.getValue();
    if (isLikelyVectorField(value)) {
        continue; // Skip embedding fields
    }
    filteredSource.put(entry.getKey(), value);
}
return OBJECT_MAPPER.writeValueAsString(filteredSource);
```

where isLikelyVectorField checks:
```java
private boolean isLikelyVectorField(Object value) {
    if (value instanceof List) {
        List<?> list = (List<?>) value;
        // size() > 32 already implies the list is non-empty
        return list.size() > 32 && list.get(0) instanceof Number;
    }
    return false;
}
```

Additional improvements:
- Log a warning when large array fields are detected and auto-excluded, for user visibility
- Consider querying the index mapping to identify `knn_vector`-typed fields for a more precise exclusion
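The size/type heuristic described above can be exercised as a standalone sketch (the sample field values are made up for illustration):

```java
import java.util.Collections;
import java.util.List;

public class VectorFieldHeuristicDemo {
    // Same heuristic as proposed: a List of more than 32 elements whose
    // first element is a Number is treated as a likely embedding vector.
    static boolean isLikelyVectorField(Object value) {
        if (value instanceof List) {
            List<?> list = (List<?>) value;
            return list.size() > 32 && list.get(0) instanceof Number;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isLikelyVectorField(Collections.nCopies(768, 0.1f))); // true
        System.out.println(isLikelyVectorField("Wireless Headphones"));          // false: not a list
        System.out.println(isLikelyVectorField(List.of(1.0, 2.0, 3.0)));         // false: too short
    }
}
```

Note the 32-element threshold is a heuristic trade-off: short numeric lists (prices, IDs, small histograms) pass through untouched, while real embeddings (commonly 768+ dimensions) are well above it.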
Alternative approaches
- `_source` excludes in search request — modify the search request builder to add `_source: { "excludes": ["*_embedding", "*_vector"] }`. Relies on field naming conventions.
- Query index mapping — before searching, query the index mapping to discover all `knn_vector`/`dense_vector` typed fields and exclude them from `_source`. Most precise, but adds an extra API call per index.
- Introduce an `excludeFields` parameter — add a new optional parameter to the judgment API that lets users specify fields to exclude. Complementary to `contextFields`.
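The first and third alternatives both reduce to wildcard matching of field names against exclusion patterns. A minimal client-side sketch of that filtering (the glob-to-regex conversion and the pattern list are assumptions for illustration, not existing plugin code):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class SourceExcludeSketch {
    // Convert a simple "*"-style glob (e.g. "*_embedding") into a regex match.
    static boolean matchesAny(String field, List<String> globs) {
        for (String glob : globs) {
            String regex = glob.replace(".", "\\.").replace("*", ".*");
            if (Pattern.matches(regex, field)) {
                return true;
            }
        }
        return false;
    }

    // Drop any _source field whose name matches an exclusion pattern.
    static Map<String, Object> filterSource(Map<String, Object> source, List<String> excludes) {
        Map<String, Object> filtered = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : source.entrySet()) {
            if (!matchesAny(entry.getKey(), excludes)) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("title", "Wireless Headphones");
        source.put("title_embedding", List.of(0.02, -0.15, 0.08));
        source.put("description", "High quality noise cancelling...");
        System.out.println(filterSource(source, List.of("*_embedding", "*_vector")).keySet());
        // -> [title, description]
    }
}
```

Unlike the size heuristic, this approach depends entirely on naming conventions, which is why the mapping-based alternative remains the most precise option.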