feat: Upgrade to Tika 3.2.3, GraalVM 25, Gradle 9.2.0, and Java 25 #69
Summary
Comprehensive upgrade of Extractous to the latest stable versions of all dependencies as of October 2025. This PR addresses the critical issue that Tika 2.9.2 reached End-of-Life in April 2025 and upgrades to current stable versions across the entire stack.
Motivation
Critical: Tika 2.9.2 EOL
Benefits of Latest Stack
Changes
Version Upgrades
New Dependencies
Added for Tika 3.x email parsing support:
API Compatibility Fixes
Tika 3.x Breaking Change Fixed:
BodyContentHandler constructor changed (no longer accepts OutputStream)
Location: ParsingReader.java:80
Fix: OutputStreamWriter, per Tika 3.x API
File:
extractous-core/tika-native/src/main/java/ai/yobix/ParsingReader.java
Module Expansion
Added 2 more parser modules for comprehensive coverage:
Total modules: 19 (up from 17)
Total format coverage: 1,400+ formats
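For reviewers of the BodyContentHandler change listed under API Compatibility Fixes, here is a minimal sketch of the adaptation step, using only the JDK. It assumes the fix amounts to wrapping the existing OutputStream in an OutputStreamWriter before handing it to Tika. The class and method names (`WriterAdapterSketch`, `asUtf8Writer`, `roundTrip`) and the choice of UTF-8 are illustrative assumptions, not code taken from ParsingReader.java.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class WriterAdapterSketch {

    // Adapts a byte-oriented stream to a character-oriented Writer.
    // The charset is set explicitly so output does not depend on the
    // platform default (UTF-8 is an assumption here).
    static Writer asUtf8Writer(OutputStream out) {
        return new OutputStreamWriter(out, StandardCharsets.UTF_8);
    }

    // Writes text through the adapted Writer and reads the bytes back.
    static String roundTrip(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = asUtf8Writer(sink)) {
            // With Tika on the classpath, the writer would be passed to the
            // handler instead, e.g. new BodyContentHandler(writer).
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the writer above flushes buffered characters into the sink.
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("extracted text"));
    }
}
```

OutputStreamWriter buffers encoded bytes internally, so the writer must be flushed or closed before the underlying stream is read. The try-with-resources block handles that here.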
GraalVM Optimizations
Updated native-image build flags for GraalVM 25:
Testing
Build Verification ✅
libtika_native.so (133 MB)
Platform Testing ✅
Format Coverage Testing ✅
Validated extraction for:
Performance ✅
Breaking Changes
None for end users. This is an internal Tika version bump. The Extractous Rust API remains unchanged.
Migration Notes
For Extractous Users
For Contributors
Related Issues
Files Changed
Gradle Build:
extractous-core/tika-native/build.gradle - Version updates, new dependencies
extractous-core/tika-native/gradle/wrapper/gradle-wrapper.properties - Gradle 9.2.0
Java Source:
extractous-core/tika-native/src/main/java/ai/yobix/ParsingReader.java - Tika 3.x API fix
Documentation (NEW):
UPGRADE_NOTES.md - Build and testing instructions
FORK_MAINTENANCE_STRATEGY.md - Maintenance guidance
Checklist
Additional Notes
Why This Matters
Security: Tika 2.9.2 has had no security support since its EOL in April 2025
Stability: Tika 3.2.3 includes important bug fixes
Future-proofing: Prepares for Tika 4.0 in 3 months
Best practices: Always stay on supported versions
Timeline
Tested Environments
This PR brings Extractous to the cutting edge while maintaining full backward compatibility for users.
Ready to merge after review and any additional platform testing desired.