Add Terms aggregation in proto #411

Open
yiyuabc wants to merge 4 commits into opensearch-project:main from yiyuabc:yiyupan/terms-agg
@yiyuabc yiyuabc commented Mar 8, 2026

Description

Summary

Enables Terms bucket aggregation support in proto generation by removing Terms-related types from the exclusion list in spec-filter.yaml.

This PR builds on the Min/Max aggregation proto support and adds:

  • Request: TermsAggregation
  • Response: 5 Terms aggregate variants (Double, Long, String, UnsignedLong, Unmapped)

Changes

Modified tools/proto-convert/src/config/spec-filter.yaml to remove exclusions for:

  • TermsAggregation
  • DoubleTermsAggregate
  • LongTermsAggregate
  • StringTermsAggregate
  • UnmappedTermsAggregate
  • UnsignedLongTermsAggregate
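
The effect of dropping these names from the exclusion list can be sketched in Python. This is only an illustration under assumptions: the layout of spec-filter.yaml is not shown in this PR, so the flat set of excluded names, the helper `filter_schemas`, and the placeholder schema `MinAggregation` are all hypothetical and are not the converter's actual code.

```python
# Hypothetical sketch: the real spec-filter.yaml layout and filter logic may differ.
TERMS_TYPES = {
    "TermsAggregation",
    "DoubleTermsAggregate",
    "LongTermsAggregate",
    "StringTermsAggregate",
    "UnmappedTermsAggregate",
    "UnsignedLongTermsAggregate",
}


def filter_schemas(schema_names, excluded):
    """Return the schema names that are not excluded, preserving input order."""
    return [name for name in schema_names if name not in excluded]


# "MinAggregation" is a placeholder for a schema that was already enabled.
all_schemas = ["MinAggregation", *sorted(TERMS_TYPES)]

# Before this PR: the Terms types are excluded and never reach proto generation.
before = filter_schemas(all_schemas, excluded=TERMS_TYPES)

# After this PR: the exclusions are gone, so every Terms type passes through.
after = filter_schemas(all_schemas, excluded=set())
```

In this sketch `before` holds only the placeholder schema, while `after` also contains the six Terms types listed above.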

Generated Proto Definitions

Request - TermsAggregation:

message TermsAggregation {
  // Custom metadata to associate with the aggregation (optional)
  optional ObjectMap meta = 1;

  // Sub-aggregations for this bucket aggregation
  map<string, AggregationContainer> aggregations = 2;

  optional TermsAggregationCollectMode collect_mode = 3;
  repeated string exclude = 4;
  optional TermsAggregationExecutionHint execution_hint = 5;
  optional TermsInclude include = 6;

  // Only return values that are found in more than min_doc_count hits.
  optional int64 min_doc_count = 7;
  optional FieldValue missing = 8;

  // Coerce unmapped fields into the specified type.
  optional ValueType value_type = 9;
  repeated SortOrderSingleMap order = 10;

  // The number of candidate terms produced by each shard.
  optional int32 shard_size = 11;
  optional int64 shard_min_doc_count = 12;

  // Set to true to return the doc_count_error_upper_bound.
  optional bool show_term_doc_count_error = 13;

  // The number of buckets returned out of the overall terms list.
  optional int32 size = 14;
  optional string format = 15;
  optional string field = 16;
  optional Script script = 17;
}

message TermsInclude {
  oneof terms_include {
    StringArray terms = 1;
    TermsPartition partition = 2;
  }
}

message TermsPartition {
  int32 num_partitions = 1;
  int32 partition = 2;
}
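
The `oneof` in `TermsInclude` means a caller supplies either a list of terms or a partition, never both. A minimal Python sketch of that constraint follows. The dataclasses are hand-written stand-ins rather than generated protobuf classes, `StringArray` is simplified to a plain list of strings, and the `which` helper is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TermsPartition:
    num_partitions: int
    partition: int


@dataclass
class TermsInclude:
    # Stand-in for the proto oneof: at most one of these should be set.
    terms: Optional[List[str]] = None  # simplification of StringArray
    partition: Optional[TermsPartition] = None

    def which(self) -> str:
        """Return the name of the populated oneof member, or raise if not exactly one."""
        set_fields = [
            name for name in ("terms", "partition") if getattr(self, name) is not None
        ]
        if len(set_fields) != 1:
            raise ValueError("exactly one of 'terms' or 'partition' must be set")
        return set_fields[0]


by_terms = TermsInclude(terms=["red", "blue"])
by_partition = TermsInclude(partition=TermsPartition(num_partitions=20, partition=0))
```

Real generated protobuf code enforces the mutual exclusion itself (setting one member clears the other); the explicit check here only makes the rule visible.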

Response - DoubleTermsAggregate:
message DoubleTermsAggregate {
  optional ObjectMap meta = 1;
  repeated DoubleTermsBucket buckets = 2;
  optional int64 doc_count_error_upper_bound = 3;
  optional int64 sum_other_doc_count = 4;
}

message DoubleTermsBucket {
  int64 doc_count = 1;
  map<string, Aggregate> aggregate = 2;
  optional int64 doc_count_error_upper_bound = 3;
  double key = 4;
  optional string key_as_string = 5;
}

Response - LongTermsAggregate:
message LongTermsAggregate {
  optional ObjectMap meta = 1;
  repeated LongTermsBucket buckets = 2;
  optional int64 doc_count_error_upper_bound = 3;
  optional int64 sum_other_doc_count = 4;
}

message LongTermsBucket {
  int64 doc_count = 1;
  map<string, Aggregate> aggregate = 2;
  optional int64 doc_count_error_upper_bound = 3;
  LongTermsBucketKey key = 4;
  optional string key_as_string = 5;
}

message LongTermsBucketKey {
  oneof long_terms_bucket_key {
    int64 signed = 1;
    string unsigned = 2;
  }
}
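
`LongTermsBucketKey` carries either a signed `int64` or an unsigned value encoded as a string. A client that wants a single integer has to handle both branches, which could look like the Python sketch below. The function `decode_long_key` and its keyword arguments are hypothetical, and the range check assumes the string form exists to hold values that do not fit in a signed 64-bit integer, which this PR does not state explicitly.

```python
from typing import Optional

UINT64_MAX = 2**64 - 1


def decode_long_key(signed: Optional[int] = None, unsigned: Optional[str] = None) -> int:
    """Turn the populated oneof member into a Python int."""
    # Exactly one member of the oneof may be populated.
    if (signed is None) == (unsigned is None):
        raise ValueError("exactly one of 'signed' or 'unsigned' must be set")
    if signed is not None:
        return signed
    value = int(unsigned)  # the unsigned branch is a decimal string
    if not 0 <= value <= UINT64_MAX:
        raise ValueError(f"unsigned key out of range: {unsigned}")
    return value
```

Python integers are arbitrary precision, so the unsigned branch needs no special type; in languages with fixed-width integers the caller would parse into an unsigned 64-bit type instead.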

Response - StringTermsAggregate:
message StringTermsAggregate {
  optional ObjectMap meta = 1;
  repeated StringTermsBucket buckets = 2;
  optional int64 doc_count_error_upper_bound = 3;
  optional int64 sum_other_doc_count = 4;
}

message StringTermsBucket {
  int64 doc_count = 1;
  map<string, Aggregate> aggregate = 2;
  optional int64 doc_count_error_upper_bound = 3;
  string key = 4;
  optional string key_as_string = 5;
}

Response - UnsignedLongTermsAggregate:
message UnsignedLongTermsAggregate {
  optional ObjectMap meta = 1;
  repeated UnsignedLongTermsBucket buckets = 2;
  optional int64 doc_count_error_upper_bound = 3;
  optional int64 sum_other_doc_count = 4;
}

message UnsignedLongTermsBucket {
  int64 doc_count = 1;
  map<string, Aggregate> aggregate = 2;
  optional int64 doc_count_error_upper_bound = 3;
  Uint64 key = 4;
  optional string key_as_string = 5;
}

Response - UnmappedTermsAggregate:
message UnmappedTermsAggregate {
  optional ObjectMap meta = 1;
  repeated ObjectMap buckets = 2;
  optional int64 doc_count_error_upper_bound = 3;
  optional int64 sum_other_doc_count = 4;
}

Added to AggregationContainer:
message AggregationContainer {
  // ...existing aggregations...
  TermsAggregation terms = 4;
}

Added to Aggregate:
message Aggregate {
  // ...existing aggregates...
  optional DoubleTermsAggregate dterms = 3;
  optional LongTermsAggregate lterms = 4;
  optional StringTermsAggregate sterms = 5;
  optional UnsignedLongTermsAggregate ulterms = 6;
  optional UnmappedTermsAggregate umterms = 7;
}
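
Since `Aggregate` now exposes five Terms variants, a consumer has to find out which one a response populated. The Python sketch below shows one way to do that over a plain dictionary keyed by the field names above. The dictionary representation and the helper `terms_variant` are assumptions made for illustration, not part of the generated code.

```python
from typing import Optional

# Field names taken from the Aggregate message in this PR.
TERMS_VARIANTS = ("dterms", "lterms", "sterms", "ulterms", "umterms")


def terms_variant(aggregate: dict) -> Optional[str]:
    """Return the name of the populated Terms variant, or None if there is none."""
    present = [name for name in TERMS_VARIANTS if aggregate.get(name) is not None]
    if len(present) > 1:
        raise ValueError(f"multiple Terms variants set: {present}")
    return present[0] if present else None


# Hypothetical decoded response holding a string terms aggregate.
example = {"sterms": {"buckets": [{"key": "red", "doc_count": 3}]}}
variant = terms_variant(example)
```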

Testing

npm run preprocessing
npx --yes @openapitools/openapi-generator-cli generate -c tools/proto-convert/src/config/protobuf-generator-config.yaml
npm run postprocessing

All proto generation steps complete successfully.
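
For repeated local runs, the three commands above can be chained so that a failure in one step stops the rest. The Python sketch below is one way to do this. The `run_steps` wrapper and its injectable `run` parameter are hypothetical conveniences, and the commands are assumed to be run from the repository root.

```python
import subprocess

# The three generation steps from the Testing section, in order.
STEPS = [
    ["npm", "run", "preprocessing"],
    [
        "npx", "--yes", "@openapitools/openapi-generator-cli", "generate",
        "-c", "tools/proto-convert/src/config/protobuf-generator-config.yaml",
    ],
    ["npm", "run", "postprocessing"],
]


def run_steps(steps=STEPS, run=subprocess.run):
    """Run each command in order; check=True raises on the first non-zero exit."""
    completed = []
    for cmd in steps:
        run(cmd, check=True)
        completed.append(cmd)
    return completed
```

Calling `run_steps()` executes the real commands; passing a stub as `run` allows a dry run that only records what would be executed.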

Related

🤖 Generated with https://claude.com/claude-code

Issues Resolved

List any issues this PR will resolve, e.g. Closes [...].

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

yiyuabc and others added 3 commits March 7, 2026 05:55
Signed-off-by: Yiyu Pan <yypan14@gmail.com>
Co-Authored-By: Claude (claude-sonnet-4-5) <noreply@anthropic.com>
Signed-off-by: Yiyu Pan <yypan14@gmail.com>
Remove Terms aggregation types from exclusion list to enable
proto generation for Terms bucket aggregation support:

Request:
- TermsAggregation

Response:
- DoubleTermsAggregate
- LongTermsAggregate
- StringTermsAggregate
- UnmappedTermsAggregate
- UnsignedLongTermsAggregate

Signed-off-by: Yiyu Pan <yypan14@gmail.com>
Signed-off-by: Yiyu Pan <yypan14@gmail.com>