Summary
The 3.x release line of Apache OpenNLP introduces no known breaking changes while significantly modularizing the project to make it easier to use as a library and to extend in the future. The core API remains stable and fully compatible with 2.x, so existing projects can continue using the opennlp-tools artifact without substantial modifications.
Key Highlights:
- Notable Changes:
  - The minimum Java compiler level was raised to 21 (OPENNLP-1735)
  - The artifact 'opennlp-models' was renamed to 'opennlp-model-resolver'. It can be used to detect and load OpenNLP-compliant models from the classpath (OPENNLP-1807)
- New Features:
  - Apache OpenNLP can now detect sentiment in text (OPENNLP-855)
  - The eval corpus format for GermEval2014 is now supported (OPENNLP-976)
  - Document categorization is now possible via a binding to LibSVM (OPENNLP-1808)
- Bug Fixes:
  - The SentenceDetector received three fixes for edge cases in handling abbreviation dictionaries (OPENNLP-1809, OPENNLP-1810, OPENNLP-1811). NOTE: These fixes will also be back-ported to the upcoming OpenNLP release 2.5.8.
- Improvements:
  - Language codes passed in are now validated more strictly for compliance with the ISO 639 standard (OPENNLP-991)
  - The UIMA part of the OpenNLP developer manual (HTML + PDF) was substantially extended (OPENNLP-49)
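To illustrate what stricter language code validation (OPENNLP-991) can mean in practice, here is a minimal sketch that checks a code against the ISO 639 codes known to the running JDK. It is not OpenNLP's implementation and does not use any OpenNLP API. The class name `LanguageCodeCheck` and the method `isKnownIsoCode` are hypothetical, and the set of accepted codes is an assumption limited to what `java.util.Locale` exposes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.Set;

// Hypothetical helper, NOT part of Apache OpenNLP: a JDK-only sketch of
// checking language codes against ISO 639.
public class LanguageCodeCheck {

    // Two-letter ISO 639 codes as reported by the JDK.
    private static final Set<String> TWO_LETTER =
            new HashSet<>(Arrays.asList(Locale.getISOLanguages()));

    // Three-letter codes derived from the two-letter ones via the JDK mapping.
    private static final Set<String> THREE_LETTER = new HashSet<>();

    static {
        for (String code : TWO_LETTER) {
            try {
                String iso3 = Locale.forLanguageTag(code).getISO3Language();
                if (!iso3.isEmpty()) {
                    THREE_LETTER.add(iso3);
                }
            } catch (MissingResourceException e) {
                // The JDK has no three-letter mapping for this code; skip it.
            }
        }
    }

    // Returns true if the code is a lowercase two- or three-letter code
    // known to the JDK. The comparison is case-sensitive; null is rejected.
    public static boolean isKnownIsoCode(String code) {
        if (code == null || code.isEmpty()) {
            return false;
        }
        return TWO_LETTER.contains(code) || THREE_LETTER.contains(code);
    }
}
```

With this sketch, `LanguageCodeCheck.isKnownIsoCode("en")` and `LanguageCodeCheck.isKnownIsoCode("eng")` are accepted, while a free-form value such as `"english"` is rejected. Codes that have no two-letter equivalent are not covered by this JDK-based lookup, so a real validator would need a fuller code table.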
What's Changed
- OPENNLP-1800: Evaluation Build Failure after Docs module packaging was switched to JAR by @rzo1 in #959
- Bump org.apache.maven.plugins:maven-failsafe-plugin from 3.5.4 to 3.5.5 by @dependabot[bot] in #964
- OPENNLP-1802: Update ONNX runtime to 1.24.2 by @dependabot[bot] in #962
- Adjust Badge for Maven Central in README.md by @mawiesne in #960
- OPENNLP-1804: Fix artifact name in doc section on model-loading by @mawiesne in #966
- OPENNLP-1805: Update logcaptor to 2.12.5 by @dependabot[bot] in #967
- OPENNLP-976: Implement GermEval2014 Format by @rzo1 in #971
- OPENNLP-1802: Update ONNX runtime to 1.24.3 by @dependabot[bot] in #972
- OPENNLP-1735: Upgrade minimum Java compiler level to 21 by @mawiesne in #975
- OPENNLP-1806: Update checkstyle plugin to 13.x by @dependabot[bot] in #968
- Bump actions/cache from 5.0.3 to 5.0.4 by @dependabot[bot] in #977
- OPENNLP-1714: Adjust Dev Manual to modularized structure by @rzo1 in #976
- OPENNLP-1807: Rename opennlp-models to opennlp-model-resolver by @rzo1 in #979
- OPENNLP-1803: Fix missing apidocs for all modules aside from opennlp-tools by @mawiesne in #982
- OPENNLP-1801: Extract eval tests into separate opennlp-eval-tests module by @rzo1 in #980
- OPENNLP-1809: SentenceDetector misses multi-letter abbreviations at sentence start by @mawiesne in #983
- OPENNLP-1810: Fix SentenceDetector fails to detect multiple identical abbreviations in the same sentence by @rzo1 in #984
- OPENNLP-1808: Add SVM-based document categorization via zlibsvm by @rzo1 in #981
- OPENNLP-855: Add SentimentDetector to derive sentiment from text by @mawiesne in #579
- OPENNLP-1811: Fix SentenceDetector missing abbreviations at non-first sentence start with useTokenEnd=false by @rzo1 in #985
- OPENNLP-1812: Move opennlp-tools util classes to core components by @mawiesne in #987
- OPENNLP-49: Update documentation for the uima integration by @rzo1 in #988
- OPENNLP-991: Validate all passed in language codes by @rzo1 in #989
- OPENNLP-1805: Update logcaptor to 2.12.6 by @dependabot[bot] in #990
- OPENNLP-1814: Ensure NOTICE file in opennlp-distr is updated via GH action by @mawiesne in #992
- OPENNLP-1815: Adjust maven-release-plugin config to skip tests by default by @mawiesne in #995
Full Changelog: https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12356724&projectId=12311215