OpenSearch source with alternative sort/search_after #6332

@dlvenable

Description

Is your feature request related to a problem? Please describe.

The opensearch source runs search queries against the OpenSearch cluster, domain, or collection to find data. To paginate through the results, it uses the sort and search_after query parameters.

.sort(List.of(
    SortOptions.of(sortOptionsBuilder -> sortOptionsBuilder.doc(ScoreSort.of(scoreSort -> scoreSort.order(SortOrder.Asc)))),
    SortOptions.of(sortOptionsBuilder -> sortOptionsBuilder.field(FieldSort.of(fieldSort -> fieldSort.field("_id").order(SortOrder.Asc))))
))

The sort and search_after values currently use only a combination of _score and _id. This works well for many pipeline authors because it does not presume any specific field in the documents. However, some pipeline authors may want to sort and paginate on @timestamp or another field of their own.
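To make the pagination behavior concrete, here is a minimal in-memory sketch of how a search_after cursor walks a result set sorted by @timestamp descending with _id as a tie-breaker. It does not call OpenSearch; `SearchAfterSketch`, `Doc`, `page`, and `scanAllIds` are hypothetical names used only for illustration, and this is not the Data Prepper implementation.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model of search_after paging; not Data Prepper code.
class SearchAfterSketch {
    static final class Doc {
        final long timestamp;
        final String id;

        Doc(long timestamp, String id) {
            this.timestamp = timestamp;
            this.id = id;
        }
    }

    // Sort under discussion: @timestamp descending, then _id ascending as a tie-breaker.
    static final Comparator<Doc> SORT =
            Comparator.comparingLong((Doc d) -> d.timestamp).reversed()
                    .thenComparing(d -> d.id);

    // Returns up to `size` docs that sort strictly after `after` (null means first page).
    static List<Doc> page(List<Doc> index, Doc after, int size) {
        List<Doc> sorted = new ArrayList<>(index);
        sorted.sort(SORT);
        List<Doc> hits = new ArrayList<>();
        for (Doc d : sorted) {
            if (after != null && SORT.compare(d, after) <= 0) {
                continue; // at or before the cursor, already returned
            }
            hits.add(d);
            if (hits.size() == size) {
                break;
            }
        }
        return hits;
    }

    // Walks every page, feeding the last hit of each page back in as the cursor.
    static List<String> scanAllIds(List<Doc> index, int size) {
        List<String> ids = new ArrayList<>();
        Doc cursor = null;
        while (true) {
            List<Doc> hits = page(index, cursor, size);
            if (hits.isEmpty()) {
                break;
            }
            for (Doc d : hits) {
                ids.add(d.id);
            }
            cursor = hits.get(hits.size() - 1);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Doc> index = List.of(
                new Doc(100, "b"), new Doc(200, "a"), new Doc(100, "c"), new Doc(300, "d"));
        System.out.println(scanAllIds(index, 2));
    }
}
```

The two documents that share a timestamp are still returned exactly once each because the _id tie-breaker makes every sort key unique, which is why the sort and the search_after values have to be configured together.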

Describe the solution you'd like

I'd like a configuration option that sets the sort and search_after values. There should be a single option for both, since the two must stay in sync. Additionally, we should be able to control whether each field sorts ascending or descending.

source:
  opensearch:
    search_options:
      sort:
        - name: "@timestamp"
          order: descending
        - name: _id
          order: ascending
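As a rough sketch of how such a configuration could be modeled, the following maps each `name`/`order` pair to an entry of an OpenSearch sort clause. The class and method names (`SortConfigSketch`, `SortEntry`, `toSortClause`) are hypothetical, the default of ascending when `order` is omitted is an assumption, and the JSON is built by hand only to keep the example free of dependencies.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical model of the proposed sort configuration; not Data Prepper code.
class SortConfigSketch {
    enum Order {
        ASCENDING("asc"),
        DESCENDING("desc");

        final String dsl; // value used in the OpenSearch sort clause

        Order(String dsl) {
            this.dsl = dsl;
        }

        // Assumption: a missing order falls back to ascending.
        static Order parse(String value) {
            if (value == null) {
                return ASCENDING;
            }
            // Throws IllegalArgumentException for anything but ascending/descending.
            return Order.valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    static final class SortEntry {
        final String name;
        final Order order;

        SortEntry(String name, String order) {
            this.name = name;
            this.order = Order.parse(order);
        }

        String toDsl() {
            return "{\"" + name + "\":{\"order\":\"" + order.dsl + "\"}}";
        }
    }

    // Renders the entries, in configured order, as a JSON sort array.
    static String toSortClause(List<SortEntry> entries) {
        return entries.stream()
                .map(SortEntry::toDsl)
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        List<SortEntry> entries = List.of(
                new SortEntry("@timestamp", "descending"),
                new SortEntry("_id", "ascending"));
        System.out.println(toSortClause(entries));
    }
}
```

Keeping the entries in a single ordered list means the same list can drive both the sort clause and the order in which search_after values are read back from the last hit.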

Describe alternatives you've considered (Optional)

Some alternative syntax options:

Compact, but also not a pattern found in Data Prepper:

source:
  opensearch:
    search_options:
      sort:
        - "@timestamp": descending
        - _id: ascending

Split sort and order:

source:
  opensearch:
    search_options:
      sort:
        - "@timestamp"
        - _id
      order:
        "@timestamp": descending
        # _id: ascending  # Not required since ascending is the default
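One consequence of the split syntax is that the two sections can drift apart, so a reader of the configuration would need to reconcile them. The sketch below shows one way that could look, assuming ascending is the default and that an `order` entry for a field missing from `sort` is treated as an error; `SplitSortSketch` and `normalize` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reconciliation of the split sort/order syntax; not Data Prepper code.
class SplitSortSketch {
    // Returns field -> order in sort order, defaulting to "ascending".
    static Map<String, String> normalize(List<String> sort, Map<String, String> order) {
        for (String field : order.keySet()) {
            if (!sort.contains(field)) {
                // Assumption: an order for an unsorted field is a configuration error.
                throw new IllegalArgumentException("order given for unsorted field: " + field);
            }
        }
        Map<String, String> result = new LinkedHashMap<>(); // preserves sort priority
        for (String field : sort) {
            result.put(field, order.getOrDefault(field, "ascending"));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(normalize(
                List.of("@timestamp", "_id"),
                Map.of("@timestamp", "descending")));
    }
}
```

The list-of-objects syntax in the proposed solution avoids this validation step, since each field carries its own order.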

Additional context

N/A

Metadata

Assignees: No one assigned
Labels: enhancement (New feature or request)
Type: No type
Project status: Unplanned
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests