Add support of BinaryFieldValues for gRPC non-source primitive array indexing #1035

Open
laminelam wants to merge 1 commit into opensearch-project:main from laminelam:feature/binary_fields_float

@laminelam laminelam commented Jan 28, 2026

This PR introduces new proto messages to support indexing primitive arrays over gRPC outside of the source.

Related issue: opensearch-project/OpenSearch#19638

For now, only float arrays and BytesValue are supported.

Two options are offered:

  • packed array.
  • binary LE (little-endian), which is much faster (see the benchmark results in the related issue).
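As a rough illustration of the binary LE option, the sketch below packs a float vector into consecutive little-endian float32 values, which is the layout the "4 * dimension bytes" comment in the proto implies. The helper name `floats_to_le_bytes` is hypothetical and not part of the PR; it only assumes that a client builds the byte payload itself before sending it.

```python
import struct


def floats_to_le_bytes(values):
    """Pack floats as consecutive little-endian float32 values (4 bytes each).

    Hypothetical client-side helper; the PR does not define this function.
    """
    return struct.pack(f"<{len(values)}f", *values)


# Values chosen to be exactly representable in float32.
vector = [0.5, -1.25, 3.0]
payload = floats_to_le_bytes(vector)

# The payload length is always 4 * dimension.
print(len(payload), 4 * len(vector))
```

A client would then send `payload` together with `len(vector)` as the dimension, so that the receiver can check that the two agree.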

These are the expected proto messages:

message BulkRequestBody {

  // [required]
  OperationContainer operation_container = 1;

  // [optional]
  optional UpdateAction update_action = 2;

  // [optional]
  optional bytes object = 3;

  // Map of fully-qualified field path -> typed value.
  map<string, BinaryFieldValue> field_values = 4; // <-- NEW
}

message BinaryFieldValue {
  oneof binary_field_value {
    BytesValue bytes_value = 1;

    FloatArrayValue float_array_value = 2;

  }
}

message BytesValue {

  bytes bytes = 1;
}

message FloatArrayValue {
  oneof float_array_value {
    // fast path (4 * dimension bytes, little-endian)
    FloatBinaryLE binary_le = 1;

    // simple path, packed array
    FloatList values = 2;

  }
}

message FloatBinaryLE {

  bytes bytes_le = 1;

  // Vector dimension
  int32 dimension = 2;
}

message FloatList {

  repeated float values = 1;
}
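To make the two paths of FloatArrayValue concrete, here is a minimal sketch of how a receiver might resolve either variant into a list of floats. The dataclasses are hypothetical stand-ins for the generated protobuf classes, and `resolve_floats` is an illustrative name rather than code from this PR. The length check assumes, as the proto comment suggests, that `bytes_le` must hold exactly 4 * `dimension` bytes.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FloatBinaryLE:
    # Hypothetical stand-in for the FloatBinaryLE proto message.
    bytes_le: bytes
    dimension: int


@dataclass
class FloatArrayValue:
    # Models the oneof: exactly one of the two fields should be set.
    binary_le: Optional[FloatBinaryLE] = None
    values: Optional[List[float]] = None


def resolve_floats(value: FloatArrayValue) -> List[float]:
    """Return the floats carried by either variant, validating the input."""
    if (value.binary_le is None) == (value.values is None):
        raise ValueError("exactly one of binary_le or values must be set")
    if value.values is not None:
        # Simple path: the packed array is already a list of floats.
        return list(value.values)
    blob = value.binary_le
    # Fast path: the payload must be exactly 4 bytes per dimension.
    if blob.dimension < 0 or len(blob.bytes_le) != 4 * blob.dimension:
        raise ValueError(
            f"dimension {blob.dimension} does not match {len(blob.bytes_le)} bytes"
        )
    return list(struct.unpack(f"<{blob.dimension}f", blob.bytes_le))
```

Rejecting a mismatch between `dimension` and the payload length up front keeps a malformed request from being decoded into a vector of the wrong size.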

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Contributor

Proto Compatibility Report

Merge Report

Message Changes

| Message | Change | Field |
|---|---|---|
| BulkRequestBody | ADDED | map<string, BinaryFieldValue> field_values = 4 |
| QueryContainer | ADDED | BoostingQuery boosting = 33 |
| QueryContainer | ADDED | SimpleQueryStringQuery simple_query_string = 34 |

Legend

  • 🗑️ DEPRECATED - Field/value annotated as deprecated in protobufs and will be officially removed in the next major OpenSearch release
  • ADDED - New field/value added at the end of the message/enum
  • ✏️ RENAMED - Field renamed in-place
  • 🚨 BREAKING - This change will cause breaking change to Protobuf

Generated by Proto Compatibility Check

@github-actions
Contributor

Changes Analysis

Commit SHA: 559e3c9
Comparing To SHA: 2954600

API Changes

Summary

├─┬Paths
│ ├─┬/_bulk
│ │ ├─┬PUT
│ │ │ └─┬Request Body
│ │ │   └─┬Content
│ │ │     └─┬application/x-ndjson
│ │ │       └─┬Schema
│ │ │         └─┬Schema
│ │ │           ├──[➕] anyOf (30728:19)
│ │ │           └──ANYOF
│ │ └─┬POST
│ │   └─┬Request Body
│ │     └─┬Content
│ │       └─┬application/x-ndjson
│ │         └─┬Schema
│ │           └─┬Schema
│ │             ├──[➕] anyOf (30728:19)
│ │             └──ANYOF
│ └─┬/{index}/_bulk
│   ├─┬PUT
│   │ └─┬Request Body
│   │   └─┬Content
│   │     └─┬application/x-ndjson
│   │       └─┬Schema
│   │         └─┬Schema
│   │           ├──[➕] anyOf (30728:19)
│   │           └──ANYOF
│   └─┬POST
│     └─┬Request Body
│       └─┬Content
│         └─┬application/x-ndjson
│           └─┬Schema
│             └─┬Schema
│               ├──[➕] anyOf (30728:19)
│               └──ANYOF
└─┬Components
  ├──[➕] schemas/_core.bulk___BytesValue (50298:7)
  ├──[➕] schemas/_core.bulk___FloatList (50357:7)
  ├──[➕] schemas/_core.bulk___FloatBinaryLE (50344:7)
  ├──[➕] schemas/_core.bulk___BinaryFieldValue (50257:7)
  ├──[➕] schemas/_core.bulk___FloatArrayValue (50324:7)
  └──[➕] schemas/_core.bulk___BinaryFieldValues (50273:7)

| Document Element | Total Changes | Breaking Changes |
|---|---|---|
| paths | 4 | 0 |
| components | 6 | 0 |

  • Total Changes: 10
  • Additions: 10

Report

The full API changes report is available at: https://github.com/opensearch-project/opensearch-api-specification/actions/runs/21423612062/artifacts/5291578168

API Coverage

| | Before | After | Δ |
|---|---|---|---|
| Covered (%) | 666 (65.23 %) | 666 (65.23 %) | 0 (0 %) |
| Uncovered (%) | 355 (34.77 %) | 355 (34.77 %) | 0 (0 %) |
| Unknown | 145 | 145 | 0 |

@karenyrx
Collaborator

Hi @laminelam, will this change be needed on the REST side? If it's only for the gRPC side, I wonder if we should consider moving this to the opensearch-protobufs repository instead.

Just some preliminary thoughts on placement in the protobufs repo: I'm thinking it can be added manually to common.proto for now, which the tooling that autogenerates the protobufs should not overwrite (not until the next OpenSearch 4.0 major release at least). Meanwhile, the protobuf maintainers can look for a longer-term solution to support fields that exist only in the protobufs and not in the REST API. @lucy66hw @laminelam WDYT?

@github-actions
Contributor

Spec Test Coverage Analysis

| Total | Tested |
|---|---|
| 682 | 680 (99.71 %) |

@laminelam
Author

Hi @laminelam, will this change be needed on the REST side? If it's only for the gRPC side, I wonder if we should consider moving this to the opensearch-protobufs repository instead.

Just some preliminary thoughts on placement in the protobufs repo: I'm thinking it can be added manually to common.proto for now, which the tooling that autogenerates the protobufs should not overwrite (not until the next OpenSearch 4.0 major release at least). Meanwhile, the protobuf maintainers can look for a longer-term solution to support fields that exist only in the protobufs and not in the REST API. @lucy66hw @laminelam WDYT?

Hi @karenyrx,
I am fine with manually adding the new protos directly to the protobufs repo. We need to find a way to prevent them from being overwritten by the auto-generation process.

@karenyrx
Collaborator

karenyrx commented Feb 1, 2026

I am fine with manually adding the new protos directly to the protobufs repo. We need to find a way to prevent them from being overwritten by the auto-generation process.

The tooling will not overwrite a field if you add the annotation [(tooling_skip) = true] next to it, e.g.:

map<string, BinaryFieldValue> field_values = 4 [(tooling_skip) = true];

@laminelam Let's port this PR over to opensearch-protobufs then?

@laminelam
Author

The tooling will not overwrite a field if you add the annotation [(tooling_skip) = true] next to it, e.g.:

map<string, BinaryFieldValue> field_values = 4 [(tooling_skip) = true];

@laminelam Let's port this PR over to opensearch-protobufs then?

Thank you @karenyrx for adding this new feature
Here's the protobufs PR
