Description
Is your feature request related to a problem? Please describe.
ZStd was introduced as an experimental compression option for Lucene indexes in #3577. That change added the implementation as a module (installed by default) under the `sandbox` directory. However, the feature would remain an "expert" option, as it would only be installed if users built the distribution themselves and passed `sandbox.enabled=true` at JVM startup.
Unfortunately, this code was prematurely released as GA when it was migrated to a top-level module a short time later in #7908 without any feature flags. This has now led to users reporting memory leaks along with several other bugs in the zstd-jni library. Additionally, Lucene has a long-running discussion on the pros and cons of Zstd as an index compression option, including the reasons it is not provided as a core capability. One of those reasons is the hard dependency on natively compiled code, which often leads to portability issues due to glibc differences. These issues have surfaced several times, both in the OpenSearch bundle (e.g., see the KNN issue where users can't compile on M1 Mac) and in the legacy codebase.
For these reasons, we need to address the premature promotion of the zstd library as a GA / LTS feature in core, and (at minimum) release a patched bundle that fixes the critical performance issues and bugs.
Describe the solution you'd like
Quickly build and release a 2.9.1 bundle distribution with five patches.
- Zstd-jni memory leak fix in: Close Zstd Dictionary after execution to avoid any memory leak. #9403
- Add a new `ZSTD_COMPRESSION_EXPERIMENTAL = "opensearch.experimental.feature.compression.zstd.enabled"` feature flag that is set to `false` by default (forcing users to opt in).
- Apply the `ZSTD_COMPRESSION_EXPERIMENTAL` feature flag both to the CodecService constructor and to `CompressionProvider.getCompressors()` (used for BlobStoreRepository compression).
- Add a `DeprecationLogger` message that the zstd feature will be moved to a plugin in the next release.
- Bump the zstd library dependency from `1.5.5-3` to `1.5.5-5`: Bump zstd version to 1.5.5-5 #9431 (NEEDS TO BE BACKPORTED)
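To make the opt-in behavior of the proposed flag concrete, here is a minimal, self-contained sketch. It is not the actual CodecService or feature-flag code in OpenSearch: the class `ZstdFeatureFlagSketch`, its methods, the use of `java.util.Properties` as a stand-in for node settings, and the codec name list are all assumptions made for illustration. Only the flag key comes from the proposal above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

/** Hypothetical sketch of flag gating; not the OpenSearch CodecService or FeatureFlags API. */
public class ZstdFeatureFlagSketch {

    // Flag key as proposed in this issue.
    public static final String ZSTD_COMPRESSION_EXPERIMENTAL =
        "opensearch.experimental.feature.compression.zstd.enabled";

    /** True only when the flag is explicitly set to "true"; an absent flag means false. */
    public static boolean isZstdEnabled(Properties settings) {
        return Boolean.parseBoolean(settings.getProperty(ZSTD_COMPRESSION_EXPERIMENTAL, "false"));
    }

    /** Codec names to register (illustrative); zstd entries are added only on opt-in. */
    public static List<String> availableCodecs(Properties settings) {
        List<String> codecs = new ArrayList<>(List.of("default", "best_compression"));
        if (isZstdEnabled(settings)) {
            codecs.add("zstd");
            codecs.add("zstd_no_dict");
        }
        return Collections.unmodifiableList(codecs);
    }

    public static void main(String[] args) {
        // Without the flag, no zstd codecs are offered.
        Properties defaults = new Properties();
        System.out.println(availableCodecs(defaults));

        // With the flag set to true, the zstd codecs become available.
        Properties optedIn = new Properties();
        optedIn.setProperty(ZSTD_COMPRESSION_EXPERIMENTAL, "true");
        System.out.println(availableCodecs(optedIn));
    }
}
```

The same check would presumably guard both call sites named above, so that neither index codecs nor repository compressors expose zstd unless the user has opted in.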
In 2.10.0 (or later even) we should decide the following:
- Move the `ZstdCodec` and `ZstdNoDictCodec` out from being a default module into an optional location (e.g., either an optional plugin or library; note that the BlobStoreRepository compression is already an optional library, but it's packaged and included by default, so we still need to figure out how to make that optional).
- Whether to switch to direct memory or introduce an expert setting that gives users the option to use direct or heap memory when using ZStd compression.
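As a rough illustration of what an expert setting for direct versus heap memory could look like, the sketch below picks a buffer type from a setting. The setting key `example.zstd.buffer_type`, the class name, and the `heap` default are all hypothetical and chosen only for this example; the actual setting name and default would be decided as part of the work above.

```java
import java.nio.ByteBuffer;
import java.util.Properties;

/** Hypothetical sketch of an expert setting that selects direct or heap buffers. */
public class ZstdBufferSettingSketch {

    // Invented setting key, for illustration only.
    public static final String BUFFER_TYPE_SETTING = "example.zstd.buffer_type";

    /** Allocates a buffer of the requested capacity; "heap" is an assumed default. */
    public static ByteBuffer allocate(Properties settings, int capacity) {
        String type = settings.getProperty(BUFFER_TYPE_SETTING, "heap");
        switch (type) {
            case "direct":
                // Off-heap memory, outside the Java heap.
                return ByteBuffer.allocateDirect(capacity);
            case "heap":
                // Backed by a byte[] on the Java heap.
                return ByteBuffer.allocate(capacity);
            default:
                throw new IllegalArgumentException("Unknown buffer type: " + type);
        }
    }

    public static void main(String[] args) {
        Properties settings = new Properties();
        settings.setProperty(BUFFER_TYPE_SETTING, "direct");
        System.out.println(allocate(settings, 1024).isDirect());
    }
}
```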
Describe alternatives you've considered
- Move ZStd codec and BlobStore compression to a module in a patch release.
- Revert "Moving zstd out of sandbox" #7908 in a 2.9.1 patch release.