-
Notifications
You must be signed in to change notification settings - Fork 915
[Feature] add Korean Foreigner Registration Number recognizer #1825
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds support for detecting Korean Foreigner Registration Numbers (FRN), a 13-digit identifier issued to registered foreigners in Korea. The implementation follows best practices by inheriting from the existing KrRrnRecognizer and reusing common validation logic through a well-designed refactoring.
Changes:
- Added new
KrFrnRecognizerclass that detects Korean Foreigner Registration Numbers with pattern matching, context-based detection, and checksum validation - Refactored
KrRrnRecognizerto extract common checksum computation logic into a reusable_compute_checksummethod - Added comprehensive test coverage with 13 test cases covering valid FRNs (with and without checksums), invalid formats, and edge cases
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/korea/kr_frn_recognizer.py |
New recognizer implementation that inherits from KrRrnRecognizer, defines FRN-specific patterns (gender codes 5-8), context words (Korean and English), and FRN-specific checksum validation formula |
presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/korea/kr_rrn_recognizer.py |
Refactored to extract common checksum computation into _compute_checksum method and improved type hints from Union[bool, None] to Optional[bool] |
presidio-analyzer/tests/test_kr_frn_recognizer.py |
Comprehensive test suite with 13 parametrized test cases covering valid FRNs with correct/incorrect checksums and various invalid formats |
presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/korea/__init__.py |
Added KrFrnRecognizer to module exports in alphabetical order |
presidio-analyzer/presidio_analyzer/predefined_recognizers/__init__.py |
Added KrFrnRecognizer import and export to main predefined recognizers module |
presidio-analyzer/presidio_analyzer/conf/default_recognizers.yaml |
Added KrFrnRecognizer configuration with ko/kr language support and enabled: false flag |
docs/supported_entities.md |
Added KR_FRN entity documentation in alphabetically ordered Korea section |
...nalyzer/presidio_analyzer/predefined_recognizers/country_specific/korea/kr_frn_recognizer.py
Show resolved
Hide resolved
...nalyzer/presidio_analyzer/predefined_recognizers/country_specific/korea/kr_rrn_recognizer.py
Show resolved
Hide resolved
...nalyzer/presidio_analyzer/predefined_recognizers/country_specific/korea/kr_frn_recognizer.py
Show resolved
Hide resolved
|
I’ve addressed the issues raised by Copilot. Could you please take a look at the changes now @omri374 ? |
RonShakutai
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Change Description
This PR adds a new recognizer,
KrFrnRecognizer, to detect and validate South Korean Foreigner Registration Numbers.Issue reference
Fixes #XX
Checklist