Skip to content

Feat/hunspell ref path#20792

Closed
shayush622 wants to merge 11 commits intoopensearch-project:mainfrom
shayush622:feat/hunspell-ref-path
Closed

Feat/hunspell ref path#20792
shayush622 wants to merge 11 commits intoopensearch-project:mainfrom
shayush622:feat/hunspell-ref-path

Conversation

@shayush622
Copy link
Contributor

  • Add INDEX_REF_PATH_SETTING for package-based hunspell dictionaries
  • Add RestHunspellCacheInvalidateAction for cache invalidation endpoint
  • Update HunspellService with cache management methods
  • Add ref_path validation in MetadataCreateIndexService

Description

This PR adds support for loading Hunspell dictionaries from package-based directories using a new ref_path parameter, enabling multi-tenant dictionary isolation and hot-reload capabilities.

Related Issues

Resolves #[20712]
Link to #20712 (comment).

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

shayush622 and others added 7 commits February 27, 2026 03:29
- Add INDEX_REF_PATH_SETTING for package-based hunspell dictionaries
- Add RestHunspellCacheInvalidateAction for cache invalidation endpoint
- Update HunspellService with cache management methods
- Add ref_path validation in MetadataCreateIndexService

Signed-off-by: shayush622 <ayush5267@gmail.com>
- Add ref_path parameter to hunspell token filter for package-based dictionaries
- Load dictionaries from config/packages/{packageId}/hunspell/{locale}/
- Add cache invalidation REST API endpoints:
  - GET /_hunspell/cache (view cached dictionaries)
  - POST /_hunspell/cache/_invalidate (invalidate by package_id, locale, or cache_key)
  - POST /_hunspell/cache/_invalidate_all (clear all cached dictionaries)
- Add TransportAction with cluster:admin/hunspell/cache/clear permission
- Add path traversal security validation for packageId and locale
- Add updateable flag support for hot-reload via _reload_search_analyzers
- Add comprehensive test coverage for HunspellTokenFilterFactory and REST endpoint
- Remove unused index-level ref_path setting

Addresses PR feedback:
- Removed mise.toml from commit (local dev config)
- Added TransportAction authorization pattern
- Added 21 unit tests for REST endpoint
- Fixed security validation for path traversal attacks
- Documented race condition in invalidateAllDictionaries

Signed-off-by: shayush622 <ayush5267@gmail.com>
- Add ref_path parameter for package-based dictionary loading
- Load from config/packages/{packageId}/hunspell/{locale}/
- Add cache invalidation REST API (GET/POST /_hunspell/cache/_invalidate)
- Add TransportAction with cluster:admin permission
- Add comprehensive security validation (path traversal, separators, cache-key injection)
- Add updateable flag for hot-reload via _reload_search_analyzers
- Add comprehensive test coverage

PR feedback addressed:
- Stricter validate() to reject conflicting params
- Path traversal checks now use config/packages/ as base
- ref_path/locale validation rejects ., .., /, \, : characters

Signed-off-by: shayush622 <ayush5267@gmail.com>
Signed-off-by: shayush622 <ayush5267@gmail.com>
- Add ref_path parameter for package-based dictionary loading
- Load from config/packages/{packageId}/hunspell/{locale}/
- Add cache info API: GET /_hunspell/cache (cluster:monitor/hunspell/cache)
- Add cache invalidation API: POST /_hunspell/cache/_invalidate (cluster:admin/hunspell/cache/invalidate)
- Support invalidation by package_id, locale, cache_key, or invalidate_all
- Add security validation (path traversal, separator injection, null bytes)
- Add updateable flag for hot-reload via _reload_search_analyzers
- Use Strings.hasText() and Strings.isNullOrEmpty() for validation consistency
- Consistent response schema with all fields always present
- Add unit tests, REST handler tests, and integration tests

Signed-off-by: shayush622 <ayush5267@gmail.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 3917262.

'Diff too large, requires skip by maintainers after manual review'


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

@cwperks cwperks added the skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis. label Mar 6, 2026
@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Reviewer Guide 🔍

(Review updated until commit 385793b)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 Security concerns

Path traversal mitigation:
The loadDictionaryFromPackage() method includes path traversal checks, but the locale parameter is also resolved as a path (packageHunspellDir.resolve(locale)) and only checked against the hunspell directory after resolution. While there is a startsWith check, the validatePackageIdentifier() in HunspellTokenFilterFactory (which rejects '/', '\', '..') is not called from within HunspellService itself. If getDictionaryFromPackage() is called directly (not through HunspellTokenFilterFactory), the locale parameter could potentially contain path traversal sequences like "../../../etc" that might pass the startsWith check depending on normalization behavior. It is recommended to add input validation directly in getDictionaryFromPackage() and loadDictionaryFromPackage() rather than relying solely on callers to validate.

✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add ref_path support for package-based hunspell dictionary loading

Relevant files:

  • server/src/main/java/org/opensearch/indices/analysis/HunspellService.java
  • server/src/main/java/org/opensearch/index/analysis/HunspellTokenFilterFactory.java
  • server/src/test/java/org/opensearch/indices/analyze/HunspellServiceTests.java
  • server/src/test/java/org/opensearch/index/analysis/HunspellTokenFilterFactoryTests.java
  • server/src/test/resources/indices/analyze/conf_dir/packages/test-pkg/hunspell/en_US/en_US.aff

Sub-PR theme: Add Hunspell cache invalidation REST API and transport actions

Relevant files:

  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheInvalidateAction.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheInvalidateRequest.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheInvalidateResponse.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheInfoAction.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheInfoRequest.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheInfoResponse.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/TransportHunspellCacheInvalidateAction.java
  • server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/TransportHunspellCacheInfoAction.java
  • server/src/main/java/org/opensearch/rest/action/admin/indices/RestHunspellCacheInvalidateAction.java
  • server/src/main/java/org/opensearch/action/ActionModule.java
  • server/src/main/java/org/opensearch/node/Node.java
  • server/src/test/java/org/opensearch/rest/action/admin/indices/RestHunspellCacheInvalidateActionTests.java
  • server/src/internalClusterTest/java/org/opensearch/action/admin/indices/cache/hunspell/HunspellCacheIT.java

⚡ Recommended focus areas for review

Null HunspellService
In ActionModule.configure(), HunspellService is only bound if non-null. However, TransportHunspellCacheInvalidateAction and TransportHunspellCacheInfoAction both require HunspellService via @Inject. If hunspellService is null, these transport actions will fail at runtime with an injection error rather than a clear error message.

Race Condition

In invalidateDictionariesByPackage(), the size is captured before removeIf, but ConcurrentHashMap.removeIf is not atomic with respect to size(). The count calculation (sizeBefore - dictionaries.size()) may be inaccurate in concurrent scenarios. While the comment acknowledges approximate counts for logging, this could mislead operators about actual invalidation counts returned via the API.

String prefix = packageId + CACHE_KEY_SEPARATOR;  // Match keys like "pkg-1234:en_US"
int sizeBefore = dictionaries.size();
dictionaries.keySet().removeIf(key -> key.startsWith(prefix));
int count = sizeBefore - dictionaries.size();
Resource Leak

In loadDictionaryFromPackage(), if an exception occurs after opening dicStreams but before or during the try-with-resources block for the NIOFSDirectory, the affixStream may not be closed if it was opened before the exception. The finally block closes both, but if affixStream assignment itself throws, dicStreams may not be closed.

InputStream affixStream = null;
List<InputStream> dicStreams = new ArrayList<>(dicFiles.length);

try {
    for (Path dicFile : dicFiles) {
        dicStreams.add(Files.newInputStream(dicFile));
    }
    affixStream = Files.newInputStream(affixFiles[0]);

    try (Directory tmp = new NIOFSDirectory(env.tmpDir())) {
        return new Dictionary(tmp, "hunspell", affixStream, dicStreams, ignoreCase);
    }
} catch (Exception e) {
    logger.error(
        () -> new ParameterizedMessage("Could not load hunspell dictionary from package [{}] locale [{}]", packageId, locale),
        e
    );
    throw e;
} finally {
    IOUtils.close(affixStream);
    IOUtils.close(dicStreams);
}
Path-based Routing

The invalidate_all detection relies on request.path().endsWith("/_invalidate_all"), which is fragile. If the path has trailing slashes or query parameters affect path resolution, this check could fail silently and result in no invalidation action being taken (empty request with no parameters set, which would fail validation).

if (request.path().endsWith("/_invalidate_all")) {
    invalidateRequest.setInvalidateAll(true);
} else {
    String packageId = request.param("package_id");
    String locale = request.param("locale");
    String cacheKey = request.param("cache_key");

    invalidateRequest.setPackageId(packageId);
    invalidateRequest.setLocale(locale);
    invalidateRequest.setCacheKey(cacheKey);
}
Cache Key Collision

The isPackageCacheKey() method uses colon (':') as the separator to distinguish package-based keys from traditional locale keys. However, if a traditional locale name somehow contains a colon (unlikely but possible in edge cases), it would be misidentified as a package-based key. The validatePackageIdentifier() method in HunspellTokenFilterFactory rejects colons, but getDictionary() (traditional path) does not call this validation.

public static boolean isPackageCacheKey(String cacheKey) {
    return cacheKey != null && cacheKey.contains(CACHE_KEY_SEPARATOR);
}

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

PR Code Suggestions ✨

Latest suggestions up to 385793b

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix non-atomic removal count in concurrent map

Using sizeBefore - dictionaries.size() to count removed entries is not atomic on a
ConcurrentHashMap and can yield incorrect counts if concurrent modifications occur
between the two size() calls. Instead, count the removals directly inside the
removeIf predicate using an AtomicInteger.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [509-511]

-int sizeBefore = dictionaries.size();
-dictionaries.keySet().removeIf(key -> key.startsWith(prefix));
-int count = sizeBefore - dictionaries.size();
+java.util.concurrent.atomic.AtomicInteger count = new java.util.concurrent.atomic.AtomicInteger(0);
+dictionaries.keySet().removeIf(key -> {
+    if (key.startsWith(prefix)) {
+        count.incrementAndGet();
+        return true;
+    }
+    return false;
+});
+int removedCount = count.get();
Suggestion importance[1-10]: 6

__

Why: The non-atomic size difference calculation between two ConcurrentHashMap.size() calls can yield incorrect counts under concurrent access. Using an AtomicInteger inside removeIf is a valid and more accurate approach for counting actual removals.

Low
Prevent silent injection failure for null service

If hunspellService is null, TransportHunspellCacheInvalidateAction and
TransportHunspellCacheInfoAction will fail at injection time with a cryptic error.
The binding should always be present, or the transport actions should handle a
null/optional service gracefully. Consider throwing an explicit error or always
providing a non-null HunspellService.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [1129-1131]

-if (hunspellService != null) {
-    bind(HunspellService.class).toInstance(hunspellService);
+if (hunspellService == null) {
+    throw new IllegalStateException("HunspellService must not be null when registering hunspell cache actions");
 }
+bind(HunspellService.class).toInstance(hunspellService);
Suggestion importance[1-10]: 5

__

Why: The suggestion points to code in ActionModule.java but the existing_code references HunspellService.java. The actual code is in ActionModule.java at the configure() method. The concern is valid — a null hunspellService would cause injection failures — but the suggestion's file reference is wrong. Additionally, in tests null is explicitly passed, so always throwing would break tests.

Low
Avoid double-closing streams after successful dictionary load

If Files.newInputStream(dicFile) succeeds for some files but then throws for a later
file, the already-opened streams are added to dicStreams but affixStream is still
null. The finally block will close the partial dicStreams correctly, but if an
exception occurs inside the Dictionary constructor after streams are opened, the
streams are closed twice (once by Dictionary and once by finally). More critically,
if opening any dicFile stream fails mid-loop, the partially-filled dicStreams list
is still closed in finally, which is correct—but affixStream is opened after all dic
streams, so a failure there leaves all dic streams open until finally. This ordering
is fine, but the Dictionary constructor may close the streams itself on success,
leading to double-close in finally. Consider using a flag or nulling streams after
handing them to Dictionary.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [315-336]

 InputStream affixStream = null;
 List<InputStream> dicStreams = new ArrayList<>(dicFiles.length);
+boolean success = false;
 
 try {
     for (Path dicFile : dicFiles) {
         dicStreams.add(Files.newInputStream(dicFile));
     }
     affixStream = Files.newInputStream(affixFiles[0]);
 
     try (Directory tmp = new NIOFSDirectory(env.tmpDir())) {
-        return new Dictionary(tmp, "hunspell", affixStream, dicStreams, ignoreCase);
+        Dictionary dict = new Dictionary(tmp, "hunspell", affixStream, dicStreams, ignoreCase);
+        success = true;
+        return dict;
     }
 } catch (Exception e) {
-    ...
+    logger.error(
+        () -> new ParameterizedMessage("Could not load hunspell dictionary from package [{}] locale [{}]", packageId, locale),
+        e
+    );
+    throw e;
 } finally {
-    IOUtils.close(affixStream);
-    IOUtils.close(dicStreams);
+    if (!success) {
+        IOUtils.close(affixStream);
+        IOUtils.close(dicStreams);
+    }
 }
Suggestion importance[1-10]: 5

__

Why: The concern about potential double-closing of streams is valid if Dictionary closes streams internally on success. Using a success flag to conditionally close streams in finally is a reasonable defensive pattern, though the actual impact depends on Dictionary's implementation.

Low
Security
Strengthen path traversal detection to cover embedded dots

The check value.startsWith(".") already covers value.equals(".") and
value.equals(".."), and value.contains("./") / value.contains("../") are subsumed by
the path separator check below. More importantly, the regex-like strings "\." and
"\.\." are checking for literal backslash-dot sequences, not regex escapes—this
is correct for String.contains, but the backslash check is already handled by the
path separator rejection. The redundant conditions add confusion without extra
safety. More critically, a value like "foo..bar" (double dot embedded) is not caught
by any of these checks but could still be problematic depending on OS path
resolution. Consider normalizing and checking the resolved path instead.

server/src/main/java/org/opensearch/index/analysis/HunspellTokenFilterFactory.java [160-167]

-if (value.equals(".")
-    || value.equals("..")
-    || value.contains("./")
-    || value.contains("../")
-    || value.contains("\\.")
-    || value.contains("\\..")
-    || value.startsWith(".")
-    || value.endsWith(".")) {
+if (value.startsWith(".") || value.endsWith(".") || value.contains("..")) {
+    throw new IllegalArgumentException(
+        String.format(Locale.ROOT, "Invalid %s: [%s]. Path traversal sequences (., ..) are not allowed.", paramName, value)
+    );
+}
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that value.contains("..") would catch embedded double-dot sequences like "foo..bar" that the current checks miss. The simplified logic is cleaner and more comprehensive for path traversal prevention, which is a security concern.

Low
General
Remove duplicate changelog entries

There are two identical changelog entries with different PR numbers. These appear to
be duplicate entries describing the same feature. Consider consolidating them into a
single entry or differentiating the descriptions to accurately reflect what each PR
contributes.

CHANGELOG.md [42-43]

-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
+- Add ref_path support for package-based hunspell dictionary loading ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
+- Add cache invalidation API for package-based hunspell dictionary with ref_path support ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
Suggestion importance[1-10]: 6

__

Why: The two changelog entries have identical descriptions but different PR numbers, which is confusing. Differentiating them to accurately reflect each PR's contribution would improve clarity, though this is a minor documentation issue.

Low

Previous suggestions

Suggestions up to commit 190bfd1
CategorySuggestion                                                                                                                                    Impact
General
Fix inaccurate concurrent removal count

In a concurrent environment, sizeBefore - dictionaries.size() can yield a negative
or incorrect count because other threads may add or remove entries between the two
size() calls. Use an atomic counter incremented inside the removeIf predicate to get
an accurate count.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [509-511]

-int sizeBefore = dictionaries.size();
-dictionaries.keySet().removeIf(key -> key.startsWith(prefix));
-int count = sizeBefore - dictionaries.size();
+java.util.concurrent.atomic.AtomicInteger count = new java.util.concurrent.atomic.AtomicInteger(0);
+dictionaries.keySet().removeIf(key -> {
+    if (key.startsWith(prefix)) {
+        count.incrementAndGet();
+        return true;
+    }
+    return false;
+});
+int removedCount = count.get();
Suggestion importance[1-10]: 7

__

Why: The sizeBefore - dictionaries.size() approach is genuinely racy in a ConcurrentHashMap and can yield incorrect (even negative) counts. Using an AtomicInteger inside the removeIf predicate is the correct fix and directly addresses a real concurrency bug.

Medium
Deduplicate identical changelog entries

There are two identical changelog entries with different PR numbers. These should
have distinct descriptions to differentiate what each PR contributes. Having
duplicate descriptions makes the changelog misleading and harder to understand.

CHANGELOG.md [42-43]

-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
+- Add ref_path support for package-based hunspell dictionary loading ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
+- Add cache invalidation API support for package-based hunspell dictionary with ref_path ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
Suggestion importance[1-10]: 6

__

Why: The two changelog entries have identical descriptions but different PR numbers, which is misleading. Differentiating the descriptions would improve clarity, though the exact correct descriptions depend on what each PR actually implements.

Low
Fix repeated disk reads on failed dictionary load

ConcurrentHashMap.computeIfAbsent does not allow the mapping function to throw
checked exceptions, but it also caches failed computations differently across JVM
versions. If loadDictionaryFromPackage throws, the IllegalStateException is
propagated but the key may remain absent, causing repeated expensive disk reads on
every subsequent call. Consider caching a sentinel or using a separate loading
mechanism to avoid repeated failures.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [180-190]

-return dictionaries.computeIfAbsent(cacheKey, (key) -> {
-    try {
-        return loadDictionaryFromPackage(packageId, locale);
-    } catch (Exception e) {
+Dictionary cached = dictionaries.get(cacheKey);
+if (cached != null) {
+    return cached;
+}
+try {
+    Dictionary loaded = loadDictionaryFromPackage(packageId, locale);
+    Dictionary existing = dictionaries.putIfAbsent(cacheKey, loaded);
+    return existing != null ? existing : loaded;
+} catch (Exception e) {
+    throw new IllegalStateException(
+        String.format(Locale.ROOT, "Failed to load hunspell dictionary for package [%s] locale [%s]", packageId, locale),
+        e
+    );
+}
 
-        throw new IllegalStateException(
-            String.format(Locale.ROOT, "Failed to load hunspell dictionary for package [%s] locale [%s]", packageId, locale),
-            e
-        );
-    }
-});
-
Suggestion importance[1-10]: 5

__

Why: The concern about repeated disk reads on failure is valid, but ConcurrentHashMap.computeIfAbsent in Java 8+ does not cache failed computations (the key remains absent on exception), so every failed call will retry the disk read. The suggested alternative using get/putIfAbsent is a reasonable improvement for correctness in concurrent scenarios.

Low
Possible issue
Prevent silent null binding for required service

If hunspellService is null, TransportHunspellCacheInvalidateAction and
TransportHunspellCacheInfoAction will fail at runtime with a Guice injection error
since they require HunspellService. The null check silently skips binding, which
will cause an obscure failure rather than a clear error. Either always bind the
service or throw an explicit exception when it's null.

server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/ActionModule.java [1129-1131]

-if (hunspellService != null) {
-    bind(HunspellService.class).toInstance(hunspellService);
+if (hunspellService == null) {
+    throw new IllegalStateException("HunspellService must not be null when registering hunspell cache actions");
 }
+bind(HunspellService.class).toInstance(hunspellService);
Suggestion importance[1-10]: 6

__

Why: The null check silently skips binding HunspellService, which would cause an obscure Guice injection error at runtime when TransportHunspellCacheInvalidateAction or TransportHunspellCacheInfoAction are invoked. However, the test code explicitly passes null for hunspellService, suggesting the null case is intentional for test scenarios, so throwing unconditionally could break tests.

Low
Security
Fix incomplete path traversal detection

The path traversal check is incomplete: it misses cases like foo/.. (ends with ..),
foo/../bar (contains /../), and Windows-style foo.. sequences. Since path
separators are already rejected in the next check, the traversal check is partially
redundant but still incomplete for edge cases like a value that is exactly ... Rely
on the path separator check as the primary defense and simplify the traversal check
to cover the remaining cases.

server/src/main/java/org/opensearch/index/analysis/HunspellTokenFilterFactory.java [160-167]

-if (value.equals(".")
-    || value.equals("..")
-    || value.contains("./")
-    || value.contains("../")
-    || value.contains("\\.")
-    || value.contains("\\..")
-    || value.startsWith(".")
-    || value.endsWith(".")) {
+if (value.equals(".") || value.equals("..") || value.startsWith(".") || value.endsWith(".") || value.contains("..")) {
Suggestion importance[1-10]: 6

__

Why: The existing traversal check misses patterns like foo/.. or foo/../bar, though the subsequent path separator check (/ and \) provides a strong secondary defense. The simplified check value.contains("..") would catch more cases, but since path separators are already rejected, the remaining gap is limited to edge cases like a bare .. value which is already covered.

Low
Suggestions up to commit dbc150a
CategorySuggestion                                                                                                                                    Impact
General
Fix inaccurate concurrent removal count

dictionaries is a ConcurrentHashMap, and removeIf on its keySet() is not atomic with
respect to size(). Between sizeBefore = dictionaries.size() and the removeIf, new
entries matching the prefix could be added or removed by concurrent threads, making
count inaccurate. While the comment acknowledges approximate counts for
invalidateAllDictionaries, this method returns the count as a meaningful result used
by callers. Consider using an atomic counter within the removeIf predicate instead.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [509-511]

-int sizeBefore = dictionaries.size();
-dictionaries.keySet().removeIf(key -> key.startsWith(prefix));
-int count = sizeBefore - dictionaries.size();
+java.util.concurrent.atomic.AtomicInteger count = new java.util.concurrent.atomic.AtomicInteger(0);
+dictionaries.keySet().removeIf(key -> {
+    if (key.startsWith(prefix)) {
+        count.incrementAndGet();
+        return true;
+    }
+    return false;
+});
+int removed = count.get();
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies a real concurrency issue where sizeBefore and removeIf are not atomic, making the returned count potentially inaccurate. Using an AtomicInteger within the predicate is a valid fix that provides a more accurate count of actually removed entries.

Low
Deduplicate identical changelog entries

There are two identical changelog entries describing the same feature but
referencing different PR numbers. These should be deduplicated or differentiated to
accurately describe what each PR contributes. Having duplicate descriptions makes
the changelog misleading and harder to understand.

CHANGELOG.md [40-41]

-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
+- Add ref_path support for package-based hunspell dictionary loading ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
+- Add cache invalidation API for package-based hunspell dictionary with ref_path support ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
Suggestion importance[1-10]: 6

__

Why: The two changelog entries have identical descriptions but reference different PR numbers (#20741 and #20792). This is misleading as it suggests both PRs do exactly the same thing. Differentiating the descriptions would improve clarity and accuracy of the changelog.

Low
Validate REST parameters before building invalidate request

The buildInvalidateRequest method is called for all POST requests, including those
to /_invalidate_all. However, if a client sends a POST to /_invalidate_all with
package_id or cache_key query parameters, those are silently ignored. More
importantly, the request validation in HunspellCacheInvalidateRequest.validate()
would reject a request where invalidateAll=true AND packageId != null, but since the
REST handler ignores params for _invalidate_all, this conflict is masked rather than
properly handled. This is acceptable behavior but should be explicitly documented,
or params should be rejected with a 400 error.

server/src/main/java/org/opensearch/rest/action/admin/indices/RestHunspellCacheInvalidateAction.java [106-116]

 if (request.path().endsWith("/_invalidate_all")) {
     invalidateRequest.setInvalidateAll(true);
+    // Ignore any package_id/cache_key params for _invalidate_all endpoint
 } else {
     String packageId = request.param("package_id");
     String locale = request.param("locale");
     String cacheKey = request.param("cache_key");
+
+    if (packageId == null && locale == null && cacheKey == null) {
+        throw new IllegalArgumentException(
+            "At least one of 'package_id', 'cache_key' must be specified for /_invalidate"
+        );
+    }
 
     invalidateRequest.setPackageId(packageId);
     invalidateRequest.setLocale(locale);
     invalidateRequest.setCacheKey(cacheKey);
 }
Suggestion importance[1-10]: 3

__

Why: The suggestion points out that missing parameters for /_invalidate are silently passed through to validation, but the HunspellCacheInvalidateRequest.validate() already handles this case and returns a proper validation error. Adding early validation in the REST handler is redundant and the improved_code throws an unchecked exception rather than returning a proper 400 response.

Low
Possible issue
Prevent silent injection failure for null service

If hunspellService is null, TransportHunspellCacheInvalidateAction and
TransportHunspellCacheInfoAction will fail at injection time with a cryptic error.
The binding should always be present, or the transport actions should handle a
null/optional service gracefully. Consider throwing an explicit error or ensuring
hunspellService is never null when these actions are registered.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [1129-1131]

-if (hunspellService != null) {
-    bind(HunspellService.class).toInstance(hunspellService);
+if (hunspellService == null) {
+    throw new IllegalStateException("HunspellService must not be null when hunspell cache actions are registered");
 }
+bind(HunspellService.class).toInstance(hunspellService);
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies that a null hunspellService would cause a cryptic injection failure for TransportHunspellCacheInvalidateAction and TransportHunspellCacheInfoAction. However, the existing_code snippet is from ActionModule.java not HunspellService.java as stated in the suggestion's relevant_file. The logic is valid but the impact is moderate since tests already pass null explicitly.

Low
Security
Fix incorrect path traversal detection logic

The path traversal check is incomplete and redundant. The value.startsWith(".")
condition already covers value.equals(".") and value.equals(".."), but sequences
like foo/.. or foo/../bar are not caught here — they are only caught by the path
separator check. However, a value like foo..bar would incorrectly be rejected by
value.endsWith(".") only if it ends with a dot. More critically,
value.contains("\.") uses a regex-style escape but String.contains does literal
matching, so \. matches the literal two-character sequence . rather than a dot —
this is likely a bug. Simplify and fix the check.

server/src/main/java/org/opensearch/index/analysis/HunspellTokenFilterFactory.java [160-167]

 if (value.equals(".")
     || value.equals("..")
+    || value.startsWith(".")
+    || value.endsWith(".")
     || value.contains("./")
     || value.contains("../")
-    || value.contains("\\.")
-    || value.contains("\\..")
-    || value.startsWith(".")
-    || value.endsWith(".")) {
+    || value.contains("\\.") // literal backslash-dot
+    || value.contains("\\..")) { // literal backslash-dot-dot
Suggestion importance[1-10]: 4

__

Why: The suggestion raises a valid point about value.contains("\\\\.") using a literal string match for \\. rather than a backslash followed by a dot, but the improved_code is essentially the same as the existing_code with only comment changes, making it not a meaningful fix. The actual bug identified is real but the proposed fix doesn't address it properly.

Low
Suggestions up to commit 33f4a54
CategorySuggestion                                                                                                                                    Impact
Possible issue
Avoid blocking inside computeIfAbsent on load failure

ConcurrentHashMap.computeIfAbsent must not throw checked or unchecked exceptions
that leave a partial mapping; if the supplier throws, the key is not stored, but a
concurrent caller may already be inside computeIfAbsent for the same key and will
block indefinitely on some JVM versions. Wrap the call outside computeIfAbsent and
use a two-step get-then-compute pattern to avoid this hazard.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [180-190]

-return dictionaries.computeIfAbsent(cacheKey, (key) -> {
-    try {
-        return loadDictionaryFromPackage(packageId, locale);
-    } catch (Exception e) {
+Dictionary existing = dictionaries.get(cacheKey);
+if (existing != null) {
+    return existing;
+}
+Dictionary loaded;
+try {
+    loaded = loadDictionaryFromPackage(packageId, locale);
+} catch (Exception e) {
+    throw new IllegalStateException(
+        String.format(Locale.ROOT, "Failed to load hunspell dictionary for package [%s] locale [%s]", packageId, locale),
+        e
+    );
+}
+Dictionary previous = dictionaries.putIfAbsent(cacheKey, loaded);
+return previous != null ? previous : loaded;
 
-        throw new IllegalStateException(
-            String.format(Locale.ROOT, "Failed to load hunspell dictionary for package [%s] locale [%s]", packageId, locale),
-            e
-        );
-    }
-});
-
Suggestion importance[1-10]: 6

__

Why: This is a valid concern - ConcurrentHashMap.computeIfAbsent can cause issues when the mapping function throws exceptions, potentially causing threads to block. The proposed get-then-putIfAbsent pattern is a safer alternative, though it may load the dictionary twice under high concurrency.

Low
Prevent null HunspellService injection failure

If hunspellService is null, TransportHunspellCacheInvalidateAction and
TransportHunspellCacheInfoAction will fail at injection time with a cryptic error.
Either make hunspellService a required parameter (non-null) or provide a fallback
binding so the injector always has a binding for HunspellService.

server/src/main/java/org/opensearch/action/admin/indices/cache/hunspell/ActionModule.java [1129-1131]

-if (hunspellService != null) {
-    bind(HunspellService.class).toInstance(hunspellService);
+if (hunspellService == null) {
+    throw new IllegalArgumentException("HunspellService must not be null");
 }
+bind(HunspellService.class).toInstance(hunspellService);
Suggestion importance[1-10]: 5

__

Why: The suggestion is valid in principle - if hunspellService is null, the transport actions that depend on it will fail at injection time. However, the test code explicitly passes null for hunspellService in ActionModuleTests, suggesting null is an intentional supported state. The relevant file in the suggestion is wrong (ActionModule.java vs the actual file path), but the code lines match the ActionModule.java content in the diff.

Low
General
Deduplicate identical changelog entries

There are two identical changelog entries describing the same feature but
referencing different PR numbers. These should be deduplicated or differentiated to
accurately describe what each PR contributes. Having duplicate entries with the same
description is misleading and clutters the changelog.

CHANGELOG.md [40-41]

-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
-- Add ref_path support for package-based hunspell dictionary loading and cache invalidation API ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
+- Add ref_path support for package-based hunspell dictionary loading ([#20741](https://github.com/opensearch-project/OpenSearch/pull/20741))
+- Add cache invalidation API for package-based hunspell dictionary with ref_path support ([#20792](https://github.com/opensearch-project/OpenSearch/pull/20792))
Suggestion importance[1-10]: 6

__

Why: The two changelog entries have identical descriptions but different PR numbers, which is misleading. Differentiating them to accurately describe each PR's contribution improves changelog clarity and accuracy.

Low
Fix inaccurate concurrent removal count

ConcurrentHashMap.keySet().removeIf() is not atomic, so sizeBefore and the
post-removal size can be inconsistent under concurrent access, leading to a negative
or incorrect count. Collect the removed keys explicitly to get an accurate count.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [509-511]

-int sizeBefore = dictionaries.size();
-dictionaries.keySet().removeIf(key -> key.startsWith(prefix));
-int count = sizeBefore - dictionaries.size();
+int[] count = {0};
+dictionaries.keySet().removeIf(key -> {
+    if (key.startsWith(prefix)) {
+        count[0]++;
+        return true;
+    }
+    return false;
+});
+int result = count[0];
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a race condition between size() and removeIf() in ConcurrentHashMap. The improved code using an atomic counter inside removeIf is more accurate. However, the code itself already acknowledges approximate counts in similar methods, and the impact is limited to logging accuracy.

Low
Security
Simplify and close path traversal validation gaps

The check value.startsWith(".") already covers value.equals(".") and
value.equals(".."), and value.contains("./") / value.contains("../") are subsumed by
the path-separator check below. More importantly, a value like "foo..bar" passes all
these checks but still resolves to a path traversal on some filesystems. Rely on the
server-side normalize() + startsWith() check already present in
loadDictionaryFromPackage and simplify this to just reject separators, null bytes,
and leading/trailing dots.

server/src/main/java/org/opensearch/index/analysis/HunspellTokenFilterFactory.java [160-171]

-if (value.equals(".")
-    || value.equals("..")
-    || value.contains("./")
-    || value.contains("../")
-    || value.contains("\\.")
-    || value.contains("\\..")
-    || value.startsWith(".")
-    || value.endsWith(".")) {
+if (value.startsWith(".") || value.endsWith(".") || value.contains("..")) {
+    throw new IllegalArgumentException(
+        String.format(Locale.ROOT, "Invalid %s: [%s]. Path traversal sequences (., ..) are not allowed.", paramName, value)
+    );
+}
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly notes redundancy in the path traversal checks (e.g., startsWith(".") covers equals(".") and equals("..")). However, the server-side normalize() + startsWith() check in loadDictionaryFromPackage already provides a strong safety net, making this a minor improvement rather than a critical fix.

Low
Suggestions up to commit 3917262
CategorySuggestion                                                                                                                                    Impact
Possible issue
Prevent silent injection failure for null service

If hunspellService is null, TransportHunspellCacheInvalidateAction and
TransportHunspellCacheInfoAction will fail at injection time with a cryptic error.
The binding should always be present, or the transport actions should handle a
null/optional service gracefully. Consider throwing an explicit error or ensuring
hunspellService is never null when these actions are registered.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [1129-1131]

-if (hunspellService != null) {
-    bind(HunspellService.class).toInstance(hunspellService);
+if (hunspellService == null) {
+    throw new IllegalStateException("HunspellService must not be null when hunspell cache actions are registered");
 }
+bind(HunspellService.class).toInstance(hunspellService);
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that a null hunspellService would cause a cryptic injection failure for TransportHunspellCacheInvalidateAction and TransportHunspellCacheInfoAction. However, the existing code in ActionModule.java (not HunspellService.java as stated in the suggestion) handles this with a null check, and the test code explicitly passes null for hunspellService, suggesting null is an intentional valid state. The suggestion's relevant_file is also incorrect.

Low
Fix non-atomic removal count in concurrent map

Using sizeBefore - dictionaries.size() on a ConcurrentHashMap is not atomic and can
yield incorrect counts in concurrent scenarios. While the comment acknowledges this
for invalidateAllDictionaries, here the count is used as a return value that callers
may rely on for correctness. Consider tracking removals directly inside removeIf
using an AtomicInteger.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [509-511]

-int sizeBefore = dictionaries.size();
-dictionaries.keySet().removeIf(key -> key.startsWith(prefix));
-int count = sizeBefore - dictionaries.size();
+java.util.concurrent.atomic.AtomicInteger count = new java.util.concurrent.atomic.AtomicInteger(0);
+dictionaries.keySet().removeIf(key -> {
+    if (key.startsWith(prefix)) {
+        count.incrementAndGet();
+        return true;
+    }
+    return false;
+});
+int removed = count.get();
+// use removed instead of count
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly identifies a race condition in invalidateDictionariesByPackage where sizeBefore - dictionaries.size() is non-atomic on a ConcurrentHashMap. Using an AtomicInteger inside removeIf is a valid fix for more accurate counting in concurrent scenarios.

Low
Security
Fix incomplete path traversal detection

The path traversal check is incomplete and can be bypassed. For example, a value
like foo/.. or ..foo would not be caught by the current conditions. Since path
separators (/, </code>) are already rejected in a subsequent check, the dot-sequence
checks are partially redundant but still miss edge cases. The most reliable approach
is to normalize the value as a path and verify it has no parent components.

server/src/main/java/org/opensearch/index/analysis/HunspellTokenFilterFactory.java [160-167]

-if (value.equals(".")
-    || value.equals("..")
-    || value.contains("./")
-    || value.contains("../")
-    || value.contains("\\.")
-    || value.contains("\\..")
-    || value.startsWith(".")
-    || value.endsWith(".")) {
+// Reject path traversal attempts - check after rejecting separators
+if (value.equals(".") || value.equals("..") || value.contains("..")) {
+    throw new IllegalArgumentException(
+        String.format(Locale.ROOT, "Invalid %s: [%s]. Path traversal sequences (., ..) are not allowed.", paramName, value)
+    );
+}
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that the path traversal check has gaps (e.g., ..foo or foo/.. patterns). The improved code simplifies the check to value.contains("..") which is more comprehensive, and since path separators are already rejected separately, this is a valid security improvement.

Low
General
Remove misleading empty comment in exception handler

ConcurrentHashMap.computeIfAbsent must not throw checked exceptions, but wrapping in
IllegalStateException is correct. However, if the loading throws an unchecked
exception, the key will NOT be stored in the map, but the partially-executed lambda
may leave resources open. More critically, if loadDictionaryFromPackage throws, the
exception propagates out of computeIfAbsent without caching, so every subsequent
call retries the load — this is acceptable, but the empty catch block comment should
be removed to avoid confusion.

server/src/main/java/org/opensearch/indices/analysis/HunspellService.java [180-190]

 return dictionaries.computeIfAbsent(cacheKey, (key) -> {
     try {
         return loadDictionaryFromPackage(packageId, locale);
     } catch (Exception e) {
-
         throw new IllegalStateException(
             String.format(Locale.ROOT, "Failed to load hunspell dictionary for package [%s] locale [%s]", packageId, locale),
             e
         );
     }
 });
Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies an empty line/comment between catch (Exception e) { and the throw statement that could be confusing. The improved_code and existing_code are functionally equivalent with only whitespace differences, making this a minor style improvement.

Low
Remove trailing whitespace in affix rules

The trailing whitespace after y and [^ey] in these SFX rules may cause parsing
issues with some Hunspell implementations that are strict about line endings. Remove
the trailing spaces to ensure consistent behavior across different Hunspell parsers.

server/src/test/resources/indices/analyze/conf_dir/packages/test-pkg/hunspell/en_US/en_US.aff [44-45]

-SFX N   y     ication    y 
+SFX N   y     ication    y
 SFX N   0     en         [^ey]
Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies trailing whitespace in the SFX N rules at lines 44-45. While trailing whitespace could theoretically cause parsing issues in strict Hunspell implementations, this is a test resource file and the impact is minimal. The improved_code accurately reflects the change.

Low
Remove trailing whitespace in suffix rule

Similar to the other SFX rules, this line has trailing whitespace after [^e] which
could cause parsing inconsistencies. Remove the trailing space for consistency and
to avoid potential issues.

server/src/test/resources/indices/analyze/conf_dir/packages/test-pkg/hunspell/en_US/en_US.aff [61]

Suggestion importance[1-10]: 3

__

Why: The suggestion correctly identifies trailing whitespace in the SFX G rule at line 61. Similar to suggestion 1, this is a minor style issue in a test resource file with minimal practical impact.

Low

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

Persistent review updated to latest commit 33f4a54

@github-actions
Copy link
Contributor

github-actions bot commented Mar 6, 2026

❌ Gradle check result for 33f4a54: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shayush622 shayush622 force-pushed the feat/hunspell-ref-path branch from 33f4a54 to dbc150a Compare March 9, 2026 08:29
@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2026

Persistent review updated to latest commit dbc150a

@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2026

❌ Gradle check result for dbc150a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@cwperks
Copy link
Member

cwperks commented Mar 9, 2026

Not sure if its related to the changes in this PR, but this is showing backwards compatibility failures:

» ERROR][o.o.g.r.RemoteClusterStateService] [v2.19.5-remote-2] Failure in downloading full cluster state. 
»  org.opensearch.gateway.remote.RemoteStateTransferException: Exception during reading cluster state from remote
»  	at org.opensearch.gateway.remote.RemoteClusterStateService.readClusterStateInParallel(RemoteClusterStateService.java:1391)
»  	at org.opensearch.gateway.remote.RemoteClusterStateService.getClusterStateForManifest(RemoteClusterStateService.java:1497)
»  	at org.opensearch.cluster.coordination.PublicationTransportHandler.handleIncomingRemotePublishRequest(PublicationTransportHandler.java:269)
»  	at org.opensearch.cluster.coordination.PublicationTransportHandler.lambda$new$1(PublicationTransportHandler.java:136)
»  	at org.opensearch.wlm.WorkloadManagementTransportInterceptor$RequestHandler.messageReceived(WorkloadManagementTransportInterceptor.java:63)
»  	at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:108)
»  	at org.opensearch.transport.NativeMessageHandler$RequestHandler.doRun(NativeMessageHandler.java:487)
»  	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1014)
»  	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
»  	at java.****/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
»  	at java.****/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
»  	at java.****/java.lang.Thread.run(Thread.java:1583)
»  	Suppressed: org.opensearch.gateway.remote.RemoteStateTransferException: Download failed for nodes
»  		at org.opensearch.gateway.remote.RemoteClusterStateAttributesManager.lambda$getWrappedReadListener$3(RemoteClusterStateAttributesManager.java:103)
»  		at org.opensearch.core.action.ActionListener$1.onFailure(ActionListener.java:90)
»  		at org.opensearch.common.remote.RemoteWriteableEntityBlobStore.lambda$readAsync$0(RemoteWriteableEntityBlobStore.java:87)
»  		at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:955)
»  		... 3 more
»  	Caused by: java.lang.IllegalStateException: unexpected byte [0x1d]
»  		at org.opensearch.core.common.io.stream.StreamInput.readBoolean(StreamInput.java:596)
»  		at org.opensearch.core.common.io.stream.StreamInput.readBoolean(StreamInput.java:586)
»  		at org.opensearch.cluster.node.DiscoveryNode.<init>(DiscoveryNode.java:344)
»  		at org.opensearch.cluster.node.DiscoveryNodes.readFrom(DiscoveryNodes.java:777)
»  		at org.opensearch.gateway.remote.model.RemoteDiscoveryNodes.lambda$static$0(RemoteDiscoveryNodes.java:37)
»  		at org.opensearch.repositories.blobstore.ChecksumWritableBlobStoreFormat.deserialize(ChecksumWritableBlobStoreFormat.java:105)
»  		at org.opensearch.gateway.remote.model.RemoteDiscoveryNodes.deserialize(RemoteDiscoveryNodes.java:101)
»  		at org.opensearch.gateway.remote.model.RemoteDiscoveryNodes.deserialize(RemoteDiscoveryNodes.java:32)
»  		at org.opensearch.common.remote.RemoteWriteableEntityBlobStore.read(RemoteWriteableEntityBlobStore.java:77)
»  		at org.opensearch.common.remote.RemoteWriteableEntityBlobStore.lambda$readAsync$0(RemoteWriteableEntityBlobStore.java:85)
»  		... 4 more
»   ↑ repeated 44 times ↑

@github-actions
Copy link
Contributor

github-actions bot commented Mar 9, 2026

❌ Gradle check result for dbc150a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

shayush622 and others added 2 commits March 11, 2026 10:15
- Add ref_path parameter for package-based dictionary loading
- Load from config/packages/{packageId}/hunspell/{locale}/
- Add cache info API: GET /_hunspell/cache (cluster:monitor/hunspell/cache)
- Add cache invalidation API: POST /_hunspell/cache/_invalidate (cluster:admin/hunspell/cache/invalidate)
- Support invalidation by package_id, locale, cache_key, or invalidate_all
- Add security validation (path traversal, separator injection, null bytes)
- Add updateable flag for hot-reload via _reload_search_analyzers
- Use Strings.hasText() and Strings.isNullOrEmpty() for validation consistency
- Consistent response schema with all fields always present
- Add unit tests, REST handler tests, and integration tests

Signed-off-by: shayush622 <ayush5267@gmail.com>
@shayush622 shayush622 force-pushed the feat/hunspell-ref-path branch from dbc150a to 190bfd1 Compare March 11, 2026 04:47
@github-actions
Copy link
Contributor

Persistent review updated to latest commit 190bfd1

@github-actions
Copy link
Contributor

❌ Gradle check result for 190bfd1: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

- Add ref_path parameter for package-based dictionary loading
- Load from config/packages/{packageId}/hunspell/{locale}/
- Add cache info API: GET /_hunspell/cache (cluster:monitor/hunspell/cache)
- Add cache invalidation API: POST /_hunspell/cache/_invalidate (cluster:admin/hunspell/cache/invalidate)
- Support invalidation by package_id, locale, cache_key, or invalidate_all
- Add security validation (path traversal, separator injection, null bytes)
- Add updateable flag for hot-reload via _reload_search_analyzers
- Use Strings.hasText() and Strings.isNullOrEmpty() for validation consistency
- Consistent response schema with all fields always present
- Add unit tests, REST handler tests, and integration tests

Signed-off-by: shayush622 <ayush5267@gmail.com>
@shayush622 shayush622 force-pushed the feat/hunspell-ref-path branch from 190bfd1 to 385793b Compare March 11, 2026 11:48
@github-actions
Copy link
Contributor

Persistent review updated to latest commit 385793b

@github-actions
Copy link
Contributor

❌ Gradle check result for 385793b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@shayush622
Copy link
Contributor Author

Breaking down the PR into two parts -> #20840 && #20841; thus closing this one

@shayush622 shayush622 closed this Mar 11, 2026
@github-actions
Copy link
Contributor

✅ Gradle check result for 385793b: SUCCESS

@codecov
Copy link

codecov bot commented Mar 11, 2026

Codecov Report

❌ Patch coverage is 43.60465% with 194 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.27%. Comparing base (8f8f7b5) to head (385793b).
⚠️ Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
...g/opensearch/indices/analysis/HunspellService.java 41.05% 44 Missing and 12 partials ⚠️
...ices/cache/hunspell/HunspellCacheInfoResponse.java 0.00% 33 Missing ⚠️
...rch/index/analysis/HunspellTokenFilterFactory.java 41.46% 10 Missing and 14 partials ⚠️
...nspell/TransportHunspellCacheInvalidateAction.java 11.53% 23 Missing ⚠️
...cache/hunspell/HunspellCacheInvalidateRequest.java 64.15% 13 Missing and 6 partials ⚠️
...che/hunspell/TransportHunspellCacheInfoAction.java 15.00% 17 Missing ⚠️
...min/indices/RestHunspellCacheInvalidateAction.java 37.50% 15 Missing ⚠️
...dices/cache/hunspell/HunspellCacheInfoRequest.java 0.00% 6 Missing ⚠️
.../main/java/org/opensearch/action/ActionModule.java 83.33% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #20792      +/-   ##
============================================
- Coverage     73.31%   73.27%   -0.05%     
- Complexity    72248    72308      +60     
============================================
  Files          5795     5804       +9     
  Lines        330044   330395     +351     
  Branches      47641    47696      +55     
============================================
+ Hits         241975   242092     +117     
- Misses        68609    68853     +244     
+ Partials      19460    19450      -10     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

skip-diff-analyzer Maintainer to skip code-diff-analyzer check, after reviewing issues in AI analysis.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants