OPENNLP-1816: Make ME classes thread-safe by eliminating shared mutable instance state #1003

Draft
krickert wants to merge 1 commit into apache:main from ai-pipestream:feature/thread-safe-me

Conversation

@krickert

Summary

Make ME classes (TokenizerME, SentenceDetectorME, POSTaggerME, LemmatizerME) safe for concurrent use by eliminating shared mutable instance state. This enables reusing ME instances across threads instead of allocating a new instance per call, reducing allocation overhead in high-throughput pipelines.

The old pattern (new TokenizerME(model) per call) continues to work identically — zero regressions in correctness or performance.

Motivation

ME classes were documented as not thread-safe due to mutable instance fields (bestSequence, tokProbs, newTokens, sentProbs) that corrupt under concurrent access. The recommended workaround was either creating a new ME instance per call (expensive for high-throughput pipelines processing thousands of sentences in parallel) or using the ThreadSafe*ME wrappers (which use ThreadLocal and leak in Jakarta EE / long-running thread environments).

The root cause was mutable state at four layers:

  1. ME classes — result fields written on every call
  2. Context generators — per-call caches (contextsCache, wordsKey, buf, collectFeats)
  3. Feature generators — CachedFeatureGenerator with mutable prevTokens and cache
  4. BeamSearch — shared probs[] output buffer and a contextsCache that stored references to the reused buffer (cached values were always stale)

Approach

Move mutable state to method-local variables at every layer. ME instance fields are preserved as volatile for backward-compatible probs() access (last-writer-wins under concurrency). Caches are removed entirely — they were small (size 3 typically), not thread-safe, and in BeamSearch's case, buggy.
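The pattern described above can be sketched as follows; the class and method names here are illustrative stand-ins, not the actual OpenNLP source:

```java
// Sketch of the thread-safety approach: all working state is method-local,
// with a single volatile write at the end for the deprecated probs() accessor.
public class ExampleME {

  // Preserved only for backward-compatible probs() access;
  // last-writer-wins under concurrency, correct single-threaded.
  private volatile double[] lastProbs;

  public String[] process(String[] input) {
    // Method-local buffers: concurrent calls never interfere.
    double[] probs = new double[input.length];
    String[] outcomes = decode(input, probs);

    // Single volatile swap at the end.
    this.lastProbs = probs;
    return outcomes;
  }

  @Deprecated
  public double[] probs() {
    return lastProbs;
  }

  private String[] decode(String[] input, double[] probs) {
    // Stand-in for the real beam-search decoding.
    for (int i = 0; i < input.length; i++) probs[i] = 1.0;
    return input.clone();
  }
}
```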

Files changed (10 source, 5 test)

File Change
BeamSearch.java Removed shared probs[] and buggy contextsCache; added @ThreadSafe
DefaultSDContextGenerator.java buf/collectFeats moved to method-local; collectFeatures() signature updated
SentenceContextGenerator.java (Thai) Updated to match new collectFeatures() signature
DefaultPOSContextGenerator.java Removed contextsCache and wordsKey
ConfigurablePOSContextGenerator.java Removed contextsCache and wordsKey
CachedFeatureGenerator.java Removed prevTokens, contextsCache, counters; delegates directly
TokenizerME.java newTokens/tokProbs volatile; tokenizePos() uses local lists
SentenceDetectorME.java sentProbs volatile; sentPosDetect() uses local list
POSTaggerME.java bestSequence volatile; tag() uses local var; added null guard
LemmatizerME.java bestSequence volatile; predictSES() uses local var

Backward compatibility

  • The old pattern (new ME(model) per call) is unchanged — verified by regression benchmark
  • probs() methods preserved (deprecated behavior under concurrency, correct single-threaded)
  • Constructor signatures preserved (cacheSize params accepted but ignored, marked @Deprecated(since = "3.0.0"))
  • No new dependencies

Test plan

  • All 675 existing tests pass (mvn test on opennlp-runtime)
  • ThreadSafetyBenchmarkTest — JUnit correctness test: shared ME instances produce identical results to single-threaded baseline across all CPU cores
  • RegressionBenchmark — head-to-head stock vs patched, new-instance-per-call only: zero mismatches, zero errors, performance within noise on both builds
  • ThreadSafetyBenchmark — three-way comparison (new-instance-per-call / instance-per-thread / shared-single-instance)
  • CachedFeatureGeneratorTest — updated for removed cache behavior
  • Checkstyle violation count unchanged (9,446 pre-existing on both stock and patched)
  • Full mvn clean install at root (checkstyle must be skipped — 9,446 pre-existing violations on main)

Regression benchmark results (32 threads, new-instance-per-call)

Proves zero regression — stock vs patched, same API pattern:

Component stock avg_ms patched avg_ms Mismatches Errors
Tokenizer 16.09 16.69 0 0
SentenceDetector 9.21 9.01 0 0
POSTagger 105.76 106.58 0 0

Speedup benchmark results (32 threads, three-way comparison)

Approaches

The benchmark compares three strategies for using ME classes in a multi-threaded environment. All three produce identical output for a given input — the difference is how ME instances are allocated and shared.

  • new-instance-per-call — A fresh ME instance is created for every single operation. This is the traditional pattern and the baseline. Safe but expensive: each call pays the full cost of constructing the ME, its BeamSearch, context generators, and feature generator chain. Example: String[] tags = new POSTaggerME(model).tag(tokens);
  • instance-per-thread — One ME instance is created per thread and reused across all operations on that thread. No cross-thread sharing, so no contention. Eliminates per-call constructor overhead while remaining completely safe. Example: POSTaggerME tagger = new POSTaggerME(model); for (String[] t : sentences) tagger.tag(t);
  • shared-single-instance — A single ME instance is shared across all threads. Maximum memory efficiency: only one set of internal structures exists. Works for TokenizerME and SentenceDetectorME; POSTaggerME has known contention in the feature generator chain at high thread counts. Example: POSTaggerME shared = new POSTaggerME(model); // pass shared to all threads
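The instance-per-thread strategy can be sketched as below; the Tagger interface and work-queue setup are illustrative stand-ins for an ME class, not OpenNLP code:

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

public class PerThreadReuse {

  // Stand-in for an ME class whose processing methods are thread-safe.
  interface Tagger { String[] tag(String[] tokens); }

  // instance-per-thread: each worker builds one tagger and reuses it for
  // every sentence it pulls, avoiding per-call constructor cost without
  // any cross-thread sharing.
  static long tagAll(List<String[]> sentences, int threads,
                     Supplier<Tagger> factory) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    Queue<String[]> work = new ConcurrentLinkedQueue<>(sentences);
    AtomicLong tagged = new AtomicLong();
    List<Future<?>> futures = new ArrayList<>();
    for (int i = 0; i < threads; i++) {
      futures.add(pool.submit(() -> {
        Tagger tagger = factory.get();  // one instance per thread, reused
        String[] s;
        while ((s = work.poll()) != null) {
          tagged.addAndGet(tagger.tag(s).length);
        }
        return null;
      }));
    }
    for (Future<?> f : futures) f.get();  // propagate any worker failure
    pool.shutdown();
    return tagged.get();
  }
}
```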

Benchmark results

Component Approach avg_ms Mismatches Speedup
Tokenizer new-instance-per-call 16.63 0 1.0x
Tokenizer instance-per-thread 15.92 0 1.04x
Tokenizer shared-single-instance 16.24 0 1.02x
SentenceDetector new-instance-per-call 9.49 0 1.0x
SentenceDetector instance-per-thread 9.28 0 1.02x
SentenceDetector shared-single-instance 8.93 0 1.06x
POSTagger new-instance-per-call 133.55 0 1.0x
POSTagger instance-per-thread 80.01 0 1.67x

POSTagger sees the largest gain because its constructor is the heaviest — it builds a BeamSearch, a ConfigurablePOSContextGenerator, and a full AdaptiveFeatureGenerator chain on every instantiation. Reusing one instance per thread eliminates that allocation on every call, yielding a 1.67x speedup with zero correctness impact.

Tokenizer and SentenceDetector constructors are lighter, so the per-call overhead is smaller and all three approaches perform similarly.

See opennlp-core/opennlp-runtime/BENCHMARKS.md for full benchmark instructions.


Thank you for contributing to Apache OpenNLP.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?
    https://issues.apache.org/jira/browse/OPENNLP-1816

  • Does your PR title start with OPENNLP-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically main)?

  • Is your initial contribution a single, squashed commit?

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn clean install at the root opennlp folder?
  • Have you written or updated unit tests to verify your changes?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
    • N/A — no new dependencies added
  • If applicable, have you updated the LICENSE file, including the main LICENSE file in opennlp folder?
    • N/A — no license changes required
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found in opennlp folder?
    • N/A — no notice changes required

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions for build issues and submit an update to your PR as soon as possible.

@krickert krickert changed the title from "OPENNLP-1816 Make ME classes thread-safe by eliminating shared mutable instance state" to "OPENNLP-1816: Make ME classes thread-safe by eliminating shared mutable instance state" Mar 30, 2026
@krickert krickert force-pushed the feature/thread-safe-me branch from 2a904dd to 729b9c1 Compare March 30, 2026 16:49
@krickert
Author

There were 3 checkstyle violations - fixed those.

@jzonthemtn
Contributor

@krickert Thanks for the PR!

@krickert
Author

always been a fan of OpenNLP - what I love about finally contributing is that before this patch, I was having to create pools of ME objects or create new ones every time. This gets rid of all that scaffolding.

If you would like me to create any more tests, let me know.

I think the new tests cover the concurrency and recall use cases well. And I think the speed tests show that there's no concern about performance. I was excited to see the > 1.5x speedup with POSTagger... It's the single reason why I decided to work on this.

@rzo1
Contributor

rzo1 commented Mar 30, 2026

Hi,

Thanks for the contribution! Overall, I like the idea of looking into built-in thread safety rather than relying on ThreadLocal-based wrappers, which have known issues in Jakarta EE and other long-lived thread environments.

A few concerns I'd like to discuss before this can move forward (imho):

  1. Benchmarks (no JMH)

The benchmarks are hand-rolled System.nanoTime() + ExecutorService loops. Without JMH, the results are susceptible to JIT warmup, GC pauses, and profile pollution, i.e. there's no fork isolation, no warmup iterations, and no statistical variance reporting. For a change that removes multiple
caching layers and claims "performance within noise," JMH would be preferable. We already have a profile for JMH benchmarks.

  2. Caches removed without replacement

Three layers of caching were removed as a shortcut to thread safety:

  • CachedFeatureGenerator: the class is now a pass-through that caches nothing, despite its name. During beam search, the same token position may be feature-generated up to k times (beam width) with identical inputs. This cache was saving real work.
  • DefaultPOSContextGenerator / ConfigurablePOSContextGenerator: per-sentence context caches removed entirely.
  • BeamSearch.contextsCache: you note this was buggy (stale references from the shared probs[] buffer). That may be valid, but removing it rather than fixing it (e.g., storing copies) conflates a bug fix with the thread-safety change.

The regression benchmark reports "performance within noise," but without JMH-level statistical rigor that's hard to verify. More importantly, the benchmark uses a small set of short sentences: a benchmark against a real-world dataset (e.g., from the eval/test corpora: https://nightlies.apache.org/opennlp/) would be far more convincing, particularly for POS tagging where the feature generation cache had the most impact under larger workloads. A thread-safe alternative would be making the caches method-local rather than removing them entirely.

  3. Thread-safety tests are not robust
  • No contention forcing: there's no CyclicBarrier or CountDownLatch to ensure threads hit the critical section simultaneously. Threads free-run, which reduces the probability of surfacing races.
  • LemmatizerME was patched but has no thread-safety test.
  • probs() under concurrent access is not tested, despite being preserved as volatile for backward compatibility.
  • The test could pass on a 2-core CI machine and fail on a 64-core box. I think that we should at minimum set higher iteration counts with barrier-synchronized thread starts.
  4. Missing coverage
  • Only 4 of 7 ME classes are addressed (ChunkerME, NameFinderME, LanguageDetectorME are untouched). This is fine as a scoped PR, but worth noting, so the existing ThreadSafe*ME wrappers can't be deprecated yet.

@rzo1
Contributor

rzo1 commented Mar 30, 2026

Regardless of my comment, I am going to trigger a Eval build for this: https://ci-builds.apache.org/job/OpenNLP/job/eval-tests-configurable/39/

@krickert
Author

@rzo1 working on addressing all of your concerns right now - it'll be done in a moment. I'm restoring the caches and running tests with and without with proper benchmarks. All great points, and thanks for the feedback.

@krickert krickert force-pushed the feature/thread-safe-me branch from 729b9c1 to 94ca28d Compare March 30, 2026 18:20
@krickert
Author

I'm going to make the caches optional and configurable. This way we can run tests against all scenarios and come up with as many uses cases as needed to measure the impact.

The last commit was premature, I'm still working on this.

@mawiesne
Contributor

I'm going to make the caches optional and configurable. This way we can run tests against all scenarios and come up with as many uses cases as needed to measure the impact.

@krickert Thx Kristian for tackling this complex topic with so much energy! Much appreciated! Happy to review this PR in more depth, especially looking forward to the JMH analyses. Richard has already given deep feedback in the first round; I'll share my 2c later on code stylistic nuances, seeking an optimal result from the devs' perspective.

For the moment, completing the 3.0.0-M2 release process is on my list…

@krickert
Author

@mawiesne no problem... I've been thinking about this for a while now.

@rzo1 you were right about CachedFeatureGenerator. The data shows it clearly, and it helps: that particular cache alone brings a 1.6x boost between the old and new instances, and combined with thread-safe instance reuse we now see over a 2x increase. Thanks for pointing that out. But don't trust what I say; I'll update the tests shortly to show it (I would love to see it on another machine too).

@krickert krickert force-pushed the feature/thread-safe-me branch 2 times, most recently from bfc8fdf to d31aaa6 Compare March 30, 2026 19:31
@krickert
Author

krickert commented Mar 30, 2026

Thanks for the detailed feedback. We've addressed all four points made by @rzo1 . Here's a summary of what changed and the JMH data behind each decision.

1. Benchmarks (JMH)

Replaced all hand-rolled System.nanoTime() benchmarks with proper JMH. Three benchmark classes in src/jmh/java:

  • TokenizerMEBenchmark
  • SentenceDetectorMEBenchmark
  • POSTaggerMEBenchmark (with @Param for cache configuration)

Also fixed the existing JMH profile - the annotation processor wasn't wired into the compiler plugin, so the BenchmarkList was never generated. Added maven-compiler-plugin config with annotationProcessorPaths to the jmh profile.
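The compiler wiring described above looks roughly like the following pom.xml fragment under the jmh profile; the version property name is illustrative, not necessarily what the PR uses:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-compiler-plugin</artifactId>
  <configuration>
    <annotationProcessorPaths>
      <!-- Without this, the JMH annotation processor never runs and
           no BenchmarkList is generated into META-INF. -->
      <path>
        <groupId>org.openjdk.jmh</groupId>
        <artifactId>jmh-generator-annprocess</artifactId>
        <version>${jmh.version}</version>
      </path>
    </annotationProcessorPaths>
  </configuration>
</plugin>
```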

Approaches measured

  • newInstancePerCall — Fresh ME per operation: the traditional pattern and baseline. Each call pays full constructor cost (BeamSearch, context generators, feature generator chain). Example: new POSTaggerME(model).tag(tokens)
  • instancePerThread — One ME per thread, reused across operations. No cross-thread sharing, no contention. Eliminates per-call constructor overhead. Example: POSTaggerME tagger = new POSTaggerME(model); then reuse
  • sharedInstance — Single ME shared by all threads. Maximum memory efficiency. Example: pass one instance to all threads

JMH Results (32 threads, all cores)

Benchmark Mode Cnt Score Error Units
TokenizerMEBenchmark.newInstancePerCall thrpt 5 570469 ± 6885 ops/s
TokenizerMEBenchmark.instancePerThread thrpt 5 576365 ± 25758 ops/s
TokenizerMEBenchmark.sharedInstance thrpt 5 570312 ± 12754 ops/s
SentenceDetectorMEBenchmark.newInstancePerCall thrpt 5 837841 ± 7903 ops/s
SentenceDetectorMEBenchmark.instancePerThread thrpt 5 853319 ± 25920 ops/s
SentenceDetectorMEBenchmark.sharedInstance thrpt 5 849994 ± 31635 ops/s
POSTaggerMEBenchmark.newInstancePerCall thrpt 5 24886 ± 2725 ops/s
POSTaggerMEBenchmark.instancePerThread thrpt 5 62727 ± 2410 ops/s
POSTaggerMEBenchmark.sharedInstance thrpt 5 61666 ± 7119 ops/s

Tokenizer and SentenceDetector: all approaches within error bars (lightweight constructors).
POSTagger: 2.52x speedup for instancePerThread vs newInstancePerCall.

2. Caches

We restored all caches as ThreadLocal (per-thread, not shared). Same behavior as the originals in single-threaded use, safe under concurrency.

We also added a contextCacheSize parameter to POSTaggerME and a DISABLE_CACHE_PROPERTY system property to CachedFeatureGenerator so the cache impact can be measured independently via JMH @Param.
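A per-thread cache of the shape described above can be sketched as follows; this is an illustrative LinkedHashMap-based stand-in, not the OpenNLP Cache class the real code uses:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a per-thread bounded cache: identical hit behavior to a plain
// instance-field cache in single-threaded use, but each thread owns its
// own map, so there is no cross-thread sharing to synchronize.
public class PerThreadCache<K, V> {

  private final ThreadLocal<Map<K, V>> cache;

  public PerThreadCache(int maxSize) {
    this.cache = ThreadLocal.withInitial(() ->
        new LinkedHashMap<K, V>(16, 0.75f, true) {
          @Override
          protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxSize;  // simple LRU bound
          }
        });
  }

  public V computeIfAbsent(K key, Function<K, V> compute) {
    return cache.get().computeIfAbsent(key, compute);
  }
}
```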

JMH Cache Impact Results (POSTagger, 32 threads)

Benchmark (allCaches) Mode Cnt Score Error Units
POSTaggerMEBenchmark.instancePerThread true thrpt 5 64349 ± 3216 ops/s
POSTaggerMEBenchmark.instancePerThread false thrpt 5 39702 ± 870 ops/s
POSTaggerMEBenchmark.newInstancePerCall true thrpt 5 25394 ± 2467 ops/s
POSTaggerMEBenchmark.newInstancePerCall false thrpt 5 23954 ± 2324 ops/s
POSTaggerMEBenchmark.sharedInstance true thrpt 5 64663 ± 2735 ops/s
POSTaggerMEBenchmark.sharedInstance false thrpt 5 39620 ± 1139 ops/s

This told us which caches matter and which don't:

  • CachedFeatureGenerator — restored as ThreadLocal; JMH impact 1.62x (64K vs 39K ops/s). Saves real work: caches outcome-independent features across beam candidates at the same token position.
  • ConfigurablePOSContextGenerator — restored as ThreadLocal; no measurable impact (65K vs 64K, within error). The cache key includes prior tags, which differ per beam candidate, so the hit rate is near zero.
  • BeamSearch.contextsCache — restored as ThreadLocal; impact N/A. Every caller in the codebase passes cacheSize=0, so it was never enabled for any ME class; restored for API backward compatibility.

Regarding the BeamSearch cache specifically

you note this was buggy (stale references from the shared probs[] buffer). That may be valid, but removing it rather than fixing it (e.g., storing copies) conflates a bug fix with the thread-safety change.

We restored it as ThreadLocal with per-thread probs[] buffers, which fixes the stale-reference issue. However, we also checked every new BeamSearch(...) call in the codebase - every single one passes cacheSize=0 (either via the 2-arg constructor or explicitly). The cache has never been enabled by any caller in the project's history. We kept the 3-arg constructor for external API compatibility.
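The per-thread buffer fix can be sketched as below; this is a simplified stand-in, not the actual BeamSearch source:

```java
// Sketch: each thread gets its own reusable probs[] scratch array via
// ThreadLocal, so cached values can no longer alias a buffer that another
// concurrent call is overwriting (the stale-reference bug noted above).
public class PerThreadBuffer {

  private final ThreadLocal<double[]> probsBuf;

  public PerThreadBuffer(int size) {
    this.probsBuf = ThreadLocal.withInitial(() -> new double[size]);
  }

  public double[] scratch() {
    return probsBuf.get();  // same array on every call from one thread
  }
}
```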

3. Thread-safety tests

Addressed all sub-points:

  • Contention forcing: All tests now use CyclicBarrier - threads wait at the barrier before starting, ensuring they hit the critical section simultaneously.
  • LemmatizerME: Added sharedLemmatizerProducesCorrectResults() test.
  • Thread/iteration counts: Math.max(8, availableProcessors()) threads, 200 reps per thread.
  • probs(): Added probsDoesNotThrowUnderConcurrency() test - verifies probs() returns valid data (non-null, non-empty) under concurrent tag() calls without throwing. The returned values are last-writer-wins by design (documented in volatile field comments) - the core processing methods are what we guarantee correct under concurrency.
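The barrier-synchronized setup described in these points looks roughly like the harness below; this is a sketch of the technique, not the actual test code:

```java
import java.util.concurrent.*;

// Sketch of a barrier-synchronized race test: all workers block at the
// barrier, then hit the shared instance at the same instant, maximizing
// the chance of surfacing a race.
public class BarrierHarness {

  public static int run(int threads, int reps, Runnable op) throws Exception {
    CyclicBarrier barrier = new CyclicBarrier(threads);
    CountDownLatch done = new CountDownLatch(threads);
    ConcurrentLinkedQueue<Throwable> errors = new ConcurrentLinkedQueue<>();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (int i = 0; i < threads; i++) {
      pool.submit(() -> {
        try {
          barrier.await();                    // synchronized start
          for (int r = 0; r < reps; r++) op.run();
        } catch (Throwable t) {
          errors.add(t);
        } finally {
          done.countDown();
        }
      });
    }
    done.await();
    pool.shutdown();
    return errors.size();                     // 0 means no failures
  }
}
```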

4. Missing ME classes

All 7 ME classes are now covered:

Class Source change Thread-safety test
TokenizerME volatile + method-local sharedTokenizerProducesCorrectResults()
SentenceDetectorME volatile + method-local sharedSentenceDetectorProducesCorrectResults()
POSTaggerME volatile + method-local + null guard sharedPOSTaggerProducesCorrectResults()
LemmatizerME volatile + method-local sharedLemmatizerProducesCorrectResults()
ChunkerME volatile + method-local + null guard sharedChunkerProducesCorrectResults()
NameFinderME volatile + method-local + null guard sharedNameFinderProducesCorrectResults()
LanguageDetectorME Already thread-safe (stateless) sharedLangDetectorProducesCorrectResults()

All 7 ME classes are annotated @ThreadSafe.

5. ThreadSafe*ME wrappers deprecated

Since the ME classes are now themselves thread-safe, the ThreadSafe*ME wrappers are redundant. We deprecated all 7:

  • ThreadSafeTokenizerME → use TokenizerME directly
  • ThreadSafeSentenceDetectorME → use SentenceDetectorME directly
  • ThreadSafePOSTaggerME → use POSTaggerME directly
  • ThreadSafeLemmatizerME → use LemmatizerME directly
  • ThreadSafeChunkerME → use ChunkerME directly
  • ThreadSafeNameFinderME → use NameFinderME directly
  • ThreadSafeLanguageDetectorME → use LanguageDetectorME directly

We also replaced all internal usages of ThreadSafe*ME with direct ME usage:

  • Muc6NameSampleStreamFactory: ThreadSafeTokenizerME → TokenizerME
  • TwentyNewsgroupSampleStreamFactory: ThreadSafeTokenizerME → TokenizerME
  • POSTaggerMEIT: ThreadSafeTokenizerME / ThreadSafePOSTaggerME → TokenizerME / POSTaggerME

No internal code uses the wrappers anymore.

Open item

a benchmark against a real-world dataset (e.g., from the eval/test corpora) would be far more convincing

Agreed - this would strengthen the perf claims. The JMH benchmarks currently use the project's test data (AnnotatedSentences.txt). We're happy to add an eval-corpus benchmark as a follow-up, or include it in this PR if you'd prefer.

Do you have any real-world dataset tests around that we can run it against quickly? It's the only way I'd feel confident as well.

@krickert
Author

Summary since first review:

Made all 7 ME classes thread-safe by eliminating shared mutable instance state. Deprecated the ThreadSafe*ME wrappers - users can now share ME instances directly.

Motivation

ME classes were documented as not thread-safe due to mutable instance fields that corrupt under concurrent access. The workarounds were creating a new ME instance per call (expensive) or using ThreadSafe*ME wrappers (ThreadLocal-based, leak-prone in Jakarta EE). This PR makes the ME classes themselves thread-safe, yielding a 2.52x throughput improvement for POSTagger (JMH, 32 threads) by enabling instance reuse.

Approach

Mutable state moved to method-local variables or per-thread caches (ThreadLocal) at every layer:

Layer Change
ME classes (all 7) Result fields (bestSequence, tokProbs, etc.) made volatile; processing uses method-local variables with atomic swap at end
BeamSearch probs[] buffer and contextsCache moved to per-thread ThreadLocal state
CachedFeatureGenerator Cache moved to per-thread ThreadLocal (JMH confirms 1.62x benefit)
ConfigurablePOSContextGenerator Cache moved to per-thread ThreadLocal
DefaultSDContextGenerator buf/collectFeats moved to method-local parameters

Files changed (30 total)

Source (13 files): TokenizerME, SentenceDetectorME, POSTaggerME, LemmatizerME, ChunkerME, NameFinderME, LanguageDetectorME, BeamSearch, CachedFeatureGenerator, ConfigurablePOSContextGenerator, DefaultPOSContextGenerator, DefaultSDContextGenerator, SentenceContextGenerator (Thai)

Deprecated (7 files): ThreadSafeTokenizerME, ThreadSafeSentenceDetectorME, ThreadSafePOSTaggerME, ThreadSafeLemmatizerME, ThreadSafeChunkerME, ThreadSafeNameFinderME, ThreadSafeLanguageDetectorME

Internal usage swaps (3 files): Muc6NameSampleStreamFactory, TwentyNewsgroupSampleStreamFactory, POSTaggerMEIT - replaced ThreadSafe*ME with direct ME usage

Tests/benchmarks (5 files): ThreadSafetyBenchmarkTest (8 JUnit tests), 3 JMH benchmarks, CachedFeatureGeneratorTest update

Build (1 file): pom.xml - fixed JMH annotation processor wiring

@krickert krickert force-pushed the feature/thread-safe-me branch from d31aaa6 to b02c2eb Compare March 31, 2026 03:20
@mawiesne mawiesne marked this pull request as draft March 31, 2026 06:50
@krickert
Author

@mawiesne - I pushed again to make the code better match the project style. The problem I hit was that your CI/CD failed linting and forced me to wrap at 80 columns, which makes parts of the code look ugly outside my IDE. Could you ease up on the linting to allow 120 or 140 columns, or is that too much? I don't care either way (it's just an IDE setting), but the codebase already has 3000+ violations, so I don't suspect the rule has really been enforced for a long time.

@rzo1
Contributor

rzo1 commented Mar 31, 2026

Note: You can use the OpenNLP Formatting XML which is provided as download. In addition, you only have a few fixes:

Error:  Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.6.0:check (validate) on project opennlp-runtime: You have 2 Checkstyle violations. -> [Help 1]

@krickert
Author

Note: You can use the OpenNLP Formatting XML which is provided as download. In addition, you only have a few fixes:

Error:  Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.6.0:check (validate) on project opennlp-runtime: You have 2 Checkstyle violations. -> [Help 1]

Oh cool! Thanks. I'll fix those today

…le state

All 7 ME classes (TokenizerME, SentenceDetectorME, POSTaggerME,
LemmatizerME, ChunkerME, NameFinderME, LanguageDetectorME) are now
safe for concurrent use from multiple threads. The ThreadSafe*ME
wrappers are deprecated — use the ME classes directly.

Thread-safety approach:
- ME instance fields (bestSequence, tokProbs, newTokens, sentProbs)
  changed to volatile with method-local processing, atomic swap at end
- BeamSearch: probs[] buffer and contextsCache moved to per-thread
  state via ThreadLocal
- CachedFeatureGenerator: cache moved to per-thread state via
  ThreadLocal (JMH confirms 1.62x benefit from this cache)
- ConfigurablePOSContextGenerator: cache moved to per-thread state
  via ThreadLocal
- DefaultSDContextGenerator: buf/collectFeats moved to method-local

JMH benchmark results (32 threads):
- POSTagger instancePerThread: 2.52x faster than newInstancePerCall
- POSTagger cache on vs off: no measurable difference for context
  generator cache; CachedFeatureGenerator provides 1.62x benefit
- Tokenizer/SentenceDetector: all approaches within error bars

API changes:
- All 7 ME classes annotated @ThreadSafe
- All 7 ThreadSafe*ME wrappers annotated @Deprecated(since = "3.0.0")
- POSTaggerME: added constructor with contextCacheSize parameter
- CachedFeatureGenerator: added DISABLE_CACHE_PROPERTY for benchmarking
- Internal usages of ThreadSafe*ME replaced with direct ME usage

Tests:
- ThreadSafetyBenchmarkTest: 8 JUnit tests with CyclicBarrier
  (all 7 ME classes + probs() concurrency test)
- JMH benchmarks for Tokenizer, SentenceDetector, POSTagger
- Fixed JMH annotation processor config in pom.xml
- All 680 runtime + 352 formats tests pass
@krickert krickert force-pushed the feature/thread-safe-me branch from b02c2eb to 178386f Compare March 31, 2026 23:37
@krickert
Author

Fixed. Let me know if there are more tests you'd like me to run. I think that between the benchmarks, the passing tests, and the harness, it makes a strong case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants