BucketOperation doesn't expose _id field, causing IllegalArgumentException in subsequent operations #5046

@this-amine

Problem Description

Spring Data MongoDB's BucketOperation does not expose the _id field that MongoDB's $bucket stage always generates, so subsequent aggregation operations cannot reference it. When the pipeline is rendered against a strict context, an otherwise standard aggregation fails with an IllegalArgumentException.

Example That Demonstrates the Issue

// Fails with "Invalid reference '_id'" when rendered against a strict context
newAggregation(
    bucket("price").withBoundaries(10, 50, 100),
    addFields().addField("bucketLabel").withValueOf("_id").build()
).toDocument("collection", strictContext);

Error: IllegalArgumentException: Invalid reference '_id'

Why This Should Work

MongoDB's $bucket operation always generates an _id field in its output documents. When you run a bucket aggregation, the result looks like this:

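[
  { "_id": 10, "count": 3 },
  { "_id": 50, "count": 7 }
]

The _id field contains the inclusive lower boundary of each bucket, making it a natural field to reference in subsequent pipeline stages. With boundaries 10, 50, 100 there are two buckets, [10, 50) and [50, 100); 100 is only the exclusive upper bound of the last bucket, so it does not appear as an _id.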
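To make the meaning of _id concrete, here is a minimal plain-Java sketch of how $bucket assigns documents to buckets. It is not MongoDB or Spring Data code; the class BucketIdSketch and its methods are hypothetical names used only for illustration, and it models only the default count output.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Spring Data MongoDB or the MongoDB driver.
class BucketIdSketch {

    // Returns the _id that $bucket would assign: the inclusive lower bound of the
    // bucket [boundaries[i], boundaries[i + 1]) that contains the value.
    static int bucketId(int value, int[] boundaries) {
        for (int i = 0; i < boundaries.length - 1; i++) {
            if (value >= boundaries[i] && value < boundaries[i + 1]) {
                return boundaries[i];
            }
        }
        // Without a "default" bucket, $bucket rejects out-of-range values.
        throw new IllegalArgumentException("No bucket for value " + value);
    }

    // Counts values per bucket, keyed by _id, mirroring the default "count" output.
    static Map<Integer, Integer> countPerBucket(List<Integer> prices, int[] boundaries) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int price : prices) {
            counts.merge(bucketId(price, boundaries), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] boundaries = {10, 50, 100};
        // prints {10=3, 50=2}
        System.out.println(countPerBucket(List.of(10, 25, 49, 50, 99), boundaries));
    }
}
```

Every key in the resulting map is one of the lower boundaries, which is the value a later stage would read through a reference to _id.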

Root Cause Analysis

The issue lies in BucketOperation's field exposure logic. Currently, the asExposedFields() method doesn't include the _id field:

// Current (incorrect) implementation in BucketOperationSupport
protected ExposedFields asExposedFields() {
    // Missing _id field exposure!
    if (isEmpty()) {
        return ExposedFields.from(new ExposedField("count", true));
    }
    ExposedFields fields = ExposedFields.from();
    // ... only exposes user-defined outputs
}

This means Spring Data MongoDB doesn't know that the _id field exists, even though MongoDB will generate it.

Proposed Solution

Make BucketOperation expose the _id field that MongoDB actually generates:

// Proposed fix in BucketOperationSupport.asExposedFields():
protected ExposedFields asExposedFields() {
    // MongoDB's $bucket and $bucketAuto always generate _id field
    ExposedFields fields = ExposedFields.from(new ExposedField(Fields.UNDERSCORE_ID, true));

    // rest of the code
}
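The effect of the proposed change can be illustrated with a toy model of strict field-reference checking. This is a sketch under assumptions, not Spring Data MongoDB's implementation: ExposureSketch, exposedByBucket, and resolve are hypothetical names, and the model only covers the case where no custom output is configured.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of strict field-reference checking; class and method names are
// hypothetical and do not mirror Spring Data MongoDB's internals.
class ExposureSketch {

    // Fields a bucket stage advertises when no custom output is configured.
    static Set<String> exposedByBucket(boolean includeId) {
        Set<String> fields = new LinkedHashSet<>();
        if (includeId) {
            fields.add("_id"); // models the proposed change: always advertise _id
        }
        fields.add("count");
        return fields;
    }

    // A strict context rejects references to fields the previous stage did not advertise.
    static String resolve(String reference, Set<String> exposed) {
        if (!exposed.contains(reference)) {
            throw new IllegalArgumentException("Invalid reference '" + reference + "'");
        }
        return "$" + reference;
    }

    public static void main(String[] args) {
        // With _id advertised, the reference resolves.
        System.out.println(resolve("_id", exposedByBucket(true)));
        try {
            // Without it, the same reference is rejected.
            resolve("_id", exposedByBucket(false));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the only difference between the failing and the working case is whether the stage lists _id among its exposed fields, which is the behavior the proposal asks BucketOperation to adopt.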

Why This Bug Went Unnoticed

A previous fix (#3497) added a defensive fallback in ProjectionOperation that made project("_id") work with bucket operations. However, that fix only addressed the symptom for projection operations; it did not solve the root cause. As a result, other operations such as addFields() still fail when they reference the _id field that MongoDB actually generates.
