@IanHoang
Collaborator

@IanHoang IanHoang commented Sep 23, 2025

Description

This PR adds support for TimeSeries settings. Users can now add the following section to their SDG config to specify a timestamp field and the frequency of the generated timestamps. Output data is written in order. The PR also adds an option to start writing data at a specific filename suffix.

# SDG Config new section
  filename_suffix_begins_at: 0
  timeseries_enabled:
    timeseries_field: "@timestamp"
    timeseries_start_date: "1/1/2024"
    timeseries_end_date: "1/31/2024"
    timeseries_frequency: "ms"
    timeseries_format: "epoch_ms"

The example above instructs SDG to start generating file partitions at suffix 0 and to generate timestamps in epoch-millisecond format. Each entry is 1 ms apart, every generated timestamp falls between 1/1/2024 and 1/31/2024, and the value is written to the @timestamp field.
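To make the semantics of these settings concrete, here is a minimal, stdlib-only sketch of how such a config could translate into ordered timestamps. It is not the code in this PR: the function name `generate_timestamps`, the `FREQUENCY_STEPS` table, the `%m/%d/%Y` date parsing, the UTC assumption, and the inclusive end bound are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from itertools import islice

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Hypothetical mapping of frequency codes to step sizes.
# Only "ms" appears in the PR description; "s" is added for illustration.
FREQUENCY_STEPS = {
    "ms": timedelta(milliseconds=1),
    "s": timedelta(seconds=1),
}


def generate_timestamps(start_date, end_date, frequency, fmt):
    """Yield timestamps in ascending order from start_date to end_date.

    Assumes month/day/year date strings interpreted as UTC midnight and an
    inclusive end bound; the real SDG behavior may differ.
    """
    step = FREQUENCY_STEPS[frequency]
    current = datetime.strptime(start_date, "%m/%d/%Y").replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_date, "%m/%d/%Y").replace(tzinfo=timezone.utc)
    while current <= end:
        if fmt == "epoch_ms":
            # Integer division of timedeltas avoids float rounding errors.
            yield (current - EPOCH) // timedelta(milliseconds=1)
        else:
            # Fallback for any other format name: ISO 8601 string.
            yield current.isoformat()
        current += step


# Peek at the first few values for the config shown above.
first_three = list(islice(
    generate_timestamps("1/1/2024", "1/31/2024", "ms", "epoch_ms"), 3
))
print(first_three)
```

A generator is used here because a one-month window at 1 ms spacing contains billions of entries, so values have to be produced lazily rather than built up in memory.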

This was demo'd at OpenSearchCon NA.

Issues Resolved

#862

Testing

  • New functionality includes testing

  • Tested with custom logic

  • Tested with index mapping

  • Tested E2E with --test-document flag for both approaches

  • Confirmed that the output was as intended


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Add Pydantic Model for TimeSeries settings, update TimeSeries Partitioner, update Mapping SDG to use timeseries data, and add params to basic generators in Mapping SDG

Pylint fixes

Signed-off-by: Ian Hoang <[email protected]>
Member

@OVI3D0 OVI3D0 left a comment

LGTM just left a couple q's

@IanHoang IanHoang requested a review from gkamat November 4, 2025 22:54
@IanHoang
Collaborator Author

IanHoang commented Nov 4, 2025

@gkamat Ready for another review

@IanHoang IanHoang merged commit 7512238 into opensearch-project:main Nov 5, 2025
11 of 14 checks passed
3 participants