Thank you for your interest in contributing! This project thrives on community involvement.
- Fork the repository and clone your fork.
- Ensure you have JDK 17+ and Maven 3.8+ installed.
- Run `mvn verify` to confirm a clean baseline build.
- Zero runtime dependencies: do not add any `<scope>compile</scope>` dependencies to `pom.xml`. Test-scope dependencies are fine.
- Thread safety: all public classes must be safe for concurrent use.
- Test coverage: new code should maintain the ≥ 80% line coverage enforced by JaCoCo.
- Javadoc: all public methods require Javadoc. Run `mvn javadoc:javadoc` to validate.
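As an illustration of the thread-safety guideline above, here is a minimal sketch of one common approach: keep the class stateless, share only compiled `java.util.regex.Pattern` instances (documented as immutable and safe for concurrent use), and create a `Matcher` per call, since `Matcher` is not thread-safe. The class name `SsnMasker`, its pattern, and the `[SSN]` placeholder are hypothetical and are not part of this project's API.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical example class, not part of this project's API.
 * It holds no mutable state, so it is safe for concurrent use.
 */
final class SsnMasker {
    // Compiled once; Pattern instances are immutable and may be shared across threads.
    private static final Pattern SSN_LIKE = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    private SsnMasker() {
        // Static utility class; not meant to be instantiated.
    }

    /**
     * Replaces every SSN-like token in the input with a fixed placeholder.
     *
     * @param input text to scan; must not be null
     * @return the input with each match replaced by "[SSN]"
     */
    static String mask(String input) {
        // A fresh Matcher is created per call, so no matcher state is shared between threads.
        return SSN_LIKE.matcher(input).replaceAll("[SSN]");
    }

    public static void main(String[] args) {
        System.out.println(mask("id 123-45-6789 ok"));
    }
}
```

A class written this way needs no locking. If a contribution does require mutable shared state, it is worth saying so in the PR description so reviewers can check how that state is protected.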
- ML training corpus: adding labelled examples to `TrainingData.java` improves name/org disambiguation accuracy.
- International PII patterns: national ID patterns for non-US countries (UK NIN, Indian Aadhaar, EU VAT, etc.).
- ReDoS safety: if you find a pattern that exhibits catastrophic backtracking on adversarial input, please open an issue or a PR.
- Benchmark data: real-world throughput measurements on different hardware are always useful.
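For ReDoS reports, a small reproducible probe makes the problem easier to confirm. The sketch below is one possible way to do that, not the project's own tooling: it wraps the input in a `CharSequence` whose `charAt` checks a deadline, so a match that backtracks for too long is aborted instead of hanging. The names `RedosProbe`, `completesWithin`, and `BudgetExceededException` are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical helper for reproducing ReDoS reports; not part of this project's API. */
final class RedosProbe {

    /** Thrown internally when the time budget runs out during matching. */
    private static final class BudgetExceededException extends RuntimeException {
        BudgetExceededException() {
            super("regex time budget exceeded");
        }
    }

    /** CharSequence view that aborts reads once a deadline has passed. */
    private static final class DeadlineCharSequence implements CharSequence {
        private final CharSequence inner;
        private final long deadlineNanos;

        DeadlineCharSequence(CharSequence inner, long deadlineNanos) {
            this.inner = inner;
            this.deadlineNanos = deadlineNanos;
        }

        @Override
        public char charAt(int index) {
            // The regex engine reads input through charAt, so this check also runs while backtracking.
            if (System.nanoTime() - deadlineNanos > 0) {
                throw new BudgetExceededException();
            }
            return inner.charAt(index);
        }

        @Override
        public int length() {
            return inner.length();
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            return new DeadlineCharSequence(inner.subSequence(start, end), deadlineNanos);
        }

        @Override
        public String toString() {
            return inner.toString();
        }
    }

    private RedosProbe() {
        // Static utility class; not meant to be instantiated.
    }

    /**
     * Attempts a full match and reports whether it finished within the budget.
     *
     * @return true if matching completed in time, false if it was aborted
     */
    static boolean completesWithin(Pattern pattern, String input, long budgetMillis) {
        long deadline = System.nanoTime() + budgetMillis * 1_000_000L;
        try {
            pattern.matcher(new DeadlineCharSequence(input, deadline)).matches();
            return true;
        } catch (BudgetExceededException e) {
            return false;
        }
    }
}
```

A report could then include the pattern, the adversarial input, and the budget used, for example `RedosProbe.completesWithin(Pattern.compile("(a+)+b"), "a".repeat(40) + "!", 1000)`. Whether a given pattern actually blows up depends on the JDK version and the input, so please state both in the issue.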
- Open an issue first for large changes so we can discuss the approach.
- Write tests that cover the new behaviour.
- Run `mvn verify` and ensure the build is green.
- Open a PR against `main` with a clear description.
Be respectful and constructive. This project follows the Contributor Covenant.