@sergehuber (Contributor) commented Jan 6, 2025

Dear Apache Unomi developers and users,

After 8 months of development, and to start 2025 on a high note, I am proud to share with you, and submit for review, my contribution adding support for OpenSearch as an alternative to ElasticSearch in Apache Unomi.

A lot of work has gone into this contribution, and I can fully understand that some of you will have questions. I propose to organize a meeting to go over the changes and also to answer any questions you might have.

LIVE DEMONSTRATION ANNOUNCEMENT

I'm pleased to invite you to a live presentation and demonstration of the OpenSearch integration and new development tools.

Date: January 13th, 2025
Time: 17:00 CET
Format: Online presentation with live demonstration
Link: https://calendar.app.google/7ZffzAgwPwhamAAU6

Agenda:

  1. Overview of OpenSearch integration
  2. Demonstrations:
    • Configuring and starting Apache Unomi with OpenSearch
    • New development tools in action
    • Integration testing capabilities
  3. Q&A session

OPENSEARCH CONTRIBUTION RATIONALE

  • Enable true vendor independence by providing a fully open-source search engine alternative
  • Address growing community concerns about ElasticSearch licensing changes and potential future restrictions
  • Reduce total cost of ownership for Apache Unomi deployments, especially in large-scale environments
  • Future-proof Apache Unomi's architecture through a modular persistence layer design
  • Enhance developer experience with flexible backend choices
  • Leverage OpenSearch's strong commitment to the Apache License 2.0
  • Enable seamless scaling without licensing constraints
  • Benefit from OpenSearch's active development and innovation in the search engine space
  • Strengthen Apache Unomi's position as a truly free and open-source CDP platform
  • Support the latest and greatest version of OpenSearch
  • Don't break existing ElasticSearch support for existing users
  • Use existing integration tests to validate contribution for both backend implementations

WHERE TO FIND IT
——————————

https://github.com/apache/unomi/tree/opensearch-persistence

WHAT’S INCLUDED
—————————

  • 100% feature parity between ElasticSearch and OpenSearch
  • 100% integration tests passing on both ElasticSearch and OpenSearch
  • Updates to in-project documentation (website updates will be done once these changes are reviewed)

WHAT IS NOT INCLUDED
————————————

DETAILED CHANGES SUMMARY

I'd like to summarize the significant improvements and changes implemented in the opensearch-persistence branch. This work represents a major enhancement to Apache Unomi's persistence layer and development tooling.

Key Features and Improvements:

  1. OpenSearch Integration
    • Full support for OpenSearch 3.0.0 as an alternative to ElasticSearch
    • Complete implementation of persistence layer for OpenSearch
    • Support for authenticated and encrypted OpenSearch configurations
    • Implementation of roll-over policy for OpenSearch
    • Backend-independent implementation of GeoDistance and DateMath utilities
    • Integration with Health Check servlet
  2. Testing and CI/CD Improvements
    • Integration tests now fully compatible with both OpenSearch and ElasticSearch
    • Docker-based OpenSearch testing environment
    • Prepared GitHub Workflows matrix
    • Support for Karaf Debugging in Docker environments
  3. Architecture and Configuration
    • Feature-based persistence implementation selection
    • Configurable startup process via org.apache.unomi.start.cfg
    • Clean separation between OpenSearch and ElasticSearch deployments
    • Split Unomi features into smaller, more manageable chunks
    • Enhanced configuration options for both persistence implementations
  4. Command Line Interface Improvements
    • Enhanced unomi:start command with persistence selection
  5. Code Quality and Maintenance
    • Improved logging consistency
    • Enhanced error handling
    • Better resource cleanup
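
To illustrate what a backend-independent GeoDistance utility can look like, here is a minimal sketch of a great-circle distance computation with a small unit enum. The class name `GeoDistanceSketch`, the `Unit` enum, and the choice of the haversine formula are assumptions made for illustration only; the clean-room implementation in the branch may use different names and formulas.

```java
/**
 * Illustrative sketch only (hypothetical class, not the code from the branch):
 * a great-circle distance that depends on neither ElasticSearch nor OpenSearch.
 */
final class GeoDistanceSketch {

    /** Mean Earth radius in meters (assumed constant for this sketch). */
    static final double EARTH_MEAN_RADIUS_METERS = 6_371_008.8;

    /** Hypothetical unit enum; each constant stores its length in meters. */
    enum Unit {
        METERS(1.0), KILOMETERS(1000.0), MILES(1609.344);

        private final double meters;

        Unit(double meters) {
            this.meters = meters;
        }

        double fromMeters(double value) {
            return value / meters;
        }
    }

    private GeoDistanceSketch() {
    }

    /** Haversine distance between two points given in decimal degrees. */
    static double haversine(double lat1, double lon1, double lat2, double lon2, Unit unit) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double halfDeltaPhi = (phi2 - phi1) / 2.0;
        double halfDeltaLambda = Math.toRadians(lon2 - lon1) / 2.0;

        double a = Math.sin(halfDeltaPhi) * Math.sin(halfDeltaPhi)
                + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(halfDeltaLambda) * Math.sin(halfDeltaLambda);
        // Clamp against floating-point drift so that sqrt(1 - a) stays defined.
        a = Math.min(1.0, Math.max(0.0, a));

        double centralAngle = 2.0 * Math.atan2(Math.sqrt(a), Math.sqrt(1.0 - a));
        return unit.fromMeters(EARTH_MEAN_RADIUS_METERS * centralAngle);
    }
}
```

Keeping the computation in plain Java like this is what allows the same unit tests to run against either persistence implementation.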

Test Performance Insights:
An analysis of the slowest tests reveals areas for potential optimization, with some tests taking up to 100 seconds to complete. This information will be valuable for future performance improvements. In addition, an intermediate progress report is now printed during the run, showing how many tests have succeeded or failed, an estimate of the time remaining, and a progress bar.
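
As a rough sketch of how such a progress line can be computed, the example below renders a bar, the pass/fail counts, and a linear time estimate. The class name `TestProgressSketch`, the output format, and the linear extrapolation are assumptions for illustration; the reporter used in the integration tests may work differently.

```java
/**
 * Illustrative sketch only (hypothetical class): renders one progress line
 * with a bar, pass/fail counts and a linear estimate of the remaining time.
 */
final class TestProgressSketch {

    private TestProgressSketch() {
    }

    static String render(int succeeded, int failed, int total, long elapsedMillis, int barWidth) {
        int done = succeeded + failed;
        if (succeeded < 0 || failed < 0 || total <= 0 || done > total || barWidth <= 0) {
            throw new IllegalArgumentException("inconsistent progress values");
        }

        // Fill the bar proportionally to the number of finished tests.
        int filled = (int) ((long) done * barWidth / total);
        String bar = "#".repeat(filled) + "-".repeat(barWidth - filled);

        // Assume the remaining tests take as long on average as the finished ones.
        String eta = done == 0
                ? "unknown"
                : (elapsedMillis * (total - done) / done / 1000) + "s";

        return String.format("[%s] %d/%d tests (%d passed, %d failed), ETA %s",
                bar, done, total, succeeded, failed, eta);
    }
}
```

A linear estimate is crude when a few tests take up to 100 seconds each, so a real reporter might weight the estimate by historical test durations instead.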

Next Steps:

  1. Code review and merge of the branch
  2. Complete testing of the GitHub Workflows matrix
  3. Documentation updates to reflect new features and configurations
  4. Release new version
  5. Update website with latest OpenSearch support information

The changes represent a significant step forward in making Apache Unomi more flexible and maintainable, while providing developers with better tools for development and debugging.


Please follow this checklist to help us incorporate your contribution quickly and easily:

  • Make sure there is a JIRA issue filed
    for the change (usually before you start working on it). Trivial changes like typos do not
    require a JIRA issue. Your pull request should address just this issue, without pulling in other changes.
  • Format the pull request title like [UNOMI-XXX] - Title of the pull request
  • Provide integration tests for your changes, especially if you are changing the behavior of existing code or adding
    significant new parts of code.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
    Copy the description to the related JIRA issue
  • Run mvn clean install -P integration-tests to make sure basic checks pass. A more thorough check will be
    performed on your pull request automatically.

Trivial changes like typos do not require a JIRA issue (javadoc, project build changes, small doc changes, comments...).

If this is your first contribution, you have to read the Contribution Guidelines.

If your pull request is about ~20 lines of code, you don't need to sign an Individual Contributor License Agreement.
If you are unsure, please ask on the developers list.

To make clear that you license your contribution under the Apache License Version 2.0, January 2004,
you have to acknowledge this by using the following check-box.

…king, work on automated tests is still in progress.
# Conflicts:
#	graphql/graphql-playground/yarn.lock
#	itests/src/test/java/org/apache/unomi/itests/BaseIT.java
#	manual/src/main/asciidoc/graphql.adoc
#	persistence-elasticsearch/core/src/test/java/org/apache/unomi/persistence/elasticsearch/ElasticsearchPersistenceTest.java
#	persistence-spi/src/main/java/org/apache/unomi/persistence/spi/conditions/ConditionContextHelper.java
#	persistence-spi/src/main/java/org/apache/unomi/persistence/spi/conditions/ConditionEvaluatorDispatcherImpl.java
#	tools/shell-commands/src/main/java/org/apache/unomi/shell/services/internal/UnomiManagementServiceImpl.java
- Changed Unomi startup to use features instead of bundles
- Added new build script that integrates all the functionality of the other build scripts
- Fix IPv6 address parsing
- Merge latest changes from master branch
- Add option to new build script to be able to use opensearch with integration tests
- Replace the hardcoded past event ES query builder in pasteventconditionbuilder with a generic interface
- Replace the GeoDistance and DistanceUnit classes used directly from ElasticSearch with clean-room implementations that are validated through unit tests. Also added unit tests on the ElasticSearch implementation so that any differences can be detected
- Also added a clean-room implementation of the DateMathParser so that it can be used with ElasticSearch or OpenSearch
- Modified Unomi startup mechanism to use features instead of bundles. Unfortunately, due to complex interdependencies between bundles, the features could not be split as wanted, so there is some duplication in the list of bundles.
- Removed the last inter-dependencies in the base plugin and the persistence-spi to try to resolve bundle inter-dependencies, but that wasn't enough.
- ElasticSearch integration tests now execute without any errors!
- Removed elasticsearch-core from bundle watch requirements
- Fix issues with date parsing due to case sensitivity
- Improved unit tests for date parsing and date math handling
- Modified HealthChecks to provide an OpenSearch check provider (not yet fully working)
- Deactivate 1.x to 2.x migration integration test for OpenSearch (No OpenSearch users will be coming from 1.x)
- Update OpenSearch past event query builder to latest changes done in ElasticSearch past event query builder
- Various fixes in the integration tests to make them compatible with OpenSearch (removed hardcoded elasticsearch configuration and references)
- Added new shell script in itests directory to make it easier to handle the dynamically generated Pax Exam Karaf test container directory. Documentation is also included in the README file inside the itests directory.
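
To show what a backend-independent date math parser involves, and why the case-sensitivity fix mentioned above matters, here is a minimal sketch that resolves expressions such as `now-1d+2h` using only `java.time`. The class name `DateMathSketch` is hypothetical, and the supported grammar (`now` followed by signed steps, without rounding such as `/d`) is a deliberately reduced assumption, not the clean-room DateMathParser from these commits.

```java
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Illustrative sketch only (hypothetical class): resolves a reduced subset
 * of date math, "now" followed by steps such as "-1d" or "+2h".
 */
final class DateMathSketch {

    // One step: sign, amount, unit. Units are case sensitive: 'M' is months, 'm' is minutes.
    private static final Pattern STEP = Pattern.compile("([+-])(\\d+)([yMwdhHms])");

    private DateMathSketch() {
    }

    static ZonedDateTime resolve(String expression, ZonedDateTime now) {
        if (expression == null || !expression.startsWith("now")) {
            throw new IllegalArgumentException("expression must start with 'now': " + expression);
        }
        String steps = expression.substring("now".length());
        Matcher matcher = STEP.matcher(steps);
        ZonedDateTime result = now;
        int position = 0;
        while (position < steps.length()) {
            // Each step must start exactly where the previous one ended.
            matcher.region(position, steps.length());
            if (!matcher.lookingAt()) {
                throw new IllegalArgumentException("unsupported date math: " + expression);
            }
            long amount = Long.parseLong(matcher.group(2));
            if ("-".equals(matcher.group(1))) {
                amount = -amount;
            }
            result = result.plus(amount, unitFor(matcher.group(3).charAt(0)));
            position = matcher.end();
        }
        return result;
    }

    private static ChronoUnit unitFor(char unit) {
        switch (unit) {
            case 'y': return ChronoUnit.YEARS;
            case 'M': return ChronoUnit.MONTHS;
            case 'w': return ChronoUnit.WEEKS;
            case 'd': return ChronoUnit.DAYS;
            case 'h':
            case 'H': return ChronoUnit.HOURS;
            case 'm': return ChronoUnit.MINUTES;
            case 's': return ChronoUnit.SECONDS;
            default: throw new IllegalArgumentException("unknown unit: " + unit);
        }
    }
}
```

Passing the reference time in as a parameter keeps the parser deterministic, which makes it straightforward to compare its results against each backend's own date math in unit tests.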
…ess status in the integration tests

- Make sure the Unomi Management Service is started in IT tests before starting the unomi:start command
- Add support for minimal cluster state to allow starting an OpenSearch cluster with yellow status in IT tests
- Fix OpenSearch configuration prefix
- Modify HealthCheck providers to only be available depending on the availability of the persistence implementation.
- Fix integration tests to work properly with OpenSearch.
- Fix OpenSearch persistence initial startup
- Restructure startFeatures configuration to use arrays instead of complex parsing
- Modify OpenSearch custom object mapping to serialize map entries that have null values (which is the default for the ElasticSearch implementation).
- Make sure the OpenSearch docker container used for the IT tests is replaced when tests are restarted.
- Fix the handling of the OffsetDateTime in the OpenSearch Property condition query builder
- Fix the rule service IT to generate rules with proper conditions and actions
…sts output to be CSV compatible

- Added a known issue in the itests README to reference the log issue on OpenSearch 2.18.
- Add docker compose support for OpenSearch
- Fix startup issues with updates to UnomiManagementService
- Documentation updates to add OpenSearch information (still to be completed)
…r both OpenSearch and ElasticSearch

- Update Health check README to explain how it now works with both ElasticSearch and OpenSearch engines
# Conflicts:
#	itests/src/test/java/org/apache/unomi/itests/migration/Migrate16xTo220IT.java
@sergehuber sergehuber requested a review from jsinovassin January 6, 2025 10:18
@sergehuber sergehuber self-assigned this Jan 6, 2025
…tween the ElasticSearch and OpenSearch integration tests

- Add documentation on how to migrate from ElasticSearch to OpenSearch (not tested yet)
@sergehuber sergehuber requested a review from fpapon January 15, 2025 13:26
…parison, aligning with analyzer configuration behavior
…tion files and update related test cases. Rename test method for clarity and adjust logging messages to reflect changes in index template creation.
…viceImpl to use logging instead of printStackTrace.

Update exception messages in IdsConditionESQueryBuilder and IdsConditionOSQueryBuilder to include maximum IDs query count for better clarity.
Other general naming cleanup.
…cSearch

- Remove hover event query builder that is replaced with a condition definition with a parent condition
- Add a JSON schema for the hover event type
- Add missing JavaDocs
- Added missing Javadoc comments
- Minor whitespace cleanups
…h integration test instructions in documentation
…in Elasticsearch and OpenSearch. Add integration tests for legacy query builder functionality, including new condition definitions and JSON files for legacy conditions. Update documentation to reflect changes in query builder ID conventions and migration steps from previous versions.
…rDispatcher to utilize ConditionQueryBuilderDispatcherSupport for legacy ID resolution and contextualization. Remove hardcoded legacy ID mappings and improve documentation regarding legacy query builder handling.
@sergehuber (Contributor, Author)

Changes since February on OpenSearch contribution
After a LOT of work, I have updated this PR. Hopefully, it will address all the feedback from the reviews and can be merged soon to avoid having to track changes on the master branch again.
Here is a summary of the changes:

  • Merged with the latest changes from the master branch, including ES 9 support and Karaf 4.4.8 (tricky merge!)
  • Upgraded OpenSearch support to version 3 (was on 2.18 previously)
  • Removed all the non-OpenSearch related changes
  • Added missing Javadocs
  • Addressed all the points raised by the reviewers that were relevant
  • Renamed all the queryBuilderIds and removed the *ESQueryBuilder postfix in the queryBuilderIds, replacing them with *QueryBuilder. For example, we had queryBuilderId=booleanConditionESQueryBuilder for both OpenSearch and ElasticSearch, so I simply renamed this to booleanConditionQueryBuilder. In order to avoid introducing a breaking change, I added a system to map old queryBuilderIds to new ones that issues a warning if a legacy ID is being used but still allows the old IDs to work transparently. Also documented the system in the migration guide.
  • Developed integration tests for the queryBuilderId mapping system
  • Updated the Healthcheck provider to now only report the persistence implementation that is being used
  • Documented the new start configuration system and explained how it can be used for custom deployments or distributions
  • Removed the Hover event query builder implementation and simply replaced it with a condition type with a parent condition. This way it can work on both ElasticSearch and OpenSearch. Also added a hover event JSON schema built into the plugin.
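
As an illustration of the queryBuilderId compatibility idea described above, here is a minimal sketch that rewrites a legacy `*ESQueryBuilder` ID to its `*QueryBuilder` form and logs a warning the first time each legacy ID is seen. The class name `LegacyQueryBuilderIds`, the suffix-based rewrite, and the use of `java.util.logging` are assumptions chosen to keep the example self-contained; the mapping system in this PR may be structured differently.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.logging.Logger;

/**
 * Illustrative sketch only (hypothetical class, not the one from this PR):
 * maps legacy "*ESQueryBuilder" IDs to the renamed "*QueryBuilder" IDs.
 */
final class LegacyQueryBuilderIds {

    private static final Logger LOGGER = Logger.getLogger(LegacyQueryBuilderIds.class.getName());
    private static final String LEGACY_SUFFIX = "ESQueryBuilder";
    private static final String NEW_SUFFIX = "QueryBuilder";

    // Legacy IDs already reported, so that each one is only warned about once.
    private static final Set<String> WARNED = ConcurrentHashMap.newKeySet();

    private LegacyQueryBuilderIds() {
    }

    /** Returns the new-style ID; IDs that are not legacy IDs are returned unchanged. */
    static String resolve(String queryBuilderId) {
        if (queryBuilderId == null || !queryBuilderId.endsWith(LEGACY_SUFFIX)) {
            return queryBuilderId;
        }
        String prefix = queryBuilderId.substring(0, queryBuilderId.length() - LEGACY_SUFFIX.length());
        String resolved = prefix + NEW_SUFFIX;
        if (WARNED.add(queryBuilderId)) {
            LOGGER.warning("Legacy queryBuilderId '" + queryBuilderId
                    + "' is deprecated, please use '" + resolved + "' instead");
        }
        return resolved;
    }
}
```

With this sketch, `LegacyQueryBuilderIds.resolve("booleanConditionESQueryBuilder")` yields `booleanConditionQueryBuilder`, while an ID that already uses the new convention passes through untouched, so existing condition definitions keep working without changes.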

All the tests are green on GitHub, and the code is ready for review (again)!

@sergehuber (Contributor, Author)

Following the release, I need to update the version numbers. I'll do this ASAP.

@sergehuber (Contributor, Author)

OK, checks are passing again.

…ss. Update legacy query builder references to use the new centralized mapping and logging mechanism. This change enhances code maintainability and prepares for future improvements.
@sergehuber sergehuber merged commit 078ebce into master Nov 28, 2025
3 checks passed
@sergehuber sergehuber deleted the opensearch-persistence branch November 28, 2025 13:04
5 participants