refactor: use match_taxonomized_value utility and add CONTRIBUTING.md #1834

Open
atheendre130505 wants to merge 6 commits into openfoodfacts:main from atheendre130505:refactor/category-taxonomy

@atheendre130505 commented Jan 24, 2026

What:
Refactored category_taxonomisation in robotoff/prediction/ocr/category.py to use the match_taxonomized_value utility function.
Resolved an internal TODO regarding synonym matching for categories.
Added a missing CONTRIBUTING.md file to the root directory to guide new contributors.

Screenshot
N/A (Backend refactor and documentation)

Fixes bug(s)
#1833

@raphael0202 (Collaborator) commented

Hello @atheendre130505!
Thank you for your contribution. Can you add unit tests to ensure the function works as intended?

@atheendre130505 (Author) commented Jan 27, 2026

Implemented tests/unit/prediction/ocr/test_category.py with test_category_taxonomisation.
The commit passes all checks.
Further changes:
Refactored robotoff/prediction/ocr/category.py to use the match_taxonomized_value utility function, which resolves the internal TODO regarding synonym matching. The utility function handles both exact taxonomy matches and mapped values (synonyms).
@raphael0202
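
For readers following the thread, the two-step lookup described above (exact taxonomy match first, then a synonym mapping) can be sketched as below. This is only an illustration under stated assumptions: `TAXONOMY_IDS`, `SYNONYM_TO_ID`, and `match_value_sketch` are hypothetical names with made-up data, and the code is not the actual implementation or signature of `match_taxonomized_value` in Robotoff.

```python
from typing import Optional

# Hypothetical stand-in data. In Robotoff, the taxonomy and any
# synonym mapping come from the project's own taxonomy files.
TAXONOMY_IDS = {"en:breakfast-cereals", "en:dark-chocolates"}
SYNONYM_TO_ID = {"en:cereals-for-breakfast": "en:breakfast-cereals"}


def match_value_sketch(value_tag: str) -> Optional[str]:
    """Return the canonical ID for value_tag, or None if it is unknown."""
    # Step 1: the tag is already a canonical taxonomy ID.
    if value_tag in TAXONOMY_IDS:
        return value_tag
    # Step 2: the tag is a known synonym that maps to a canonical ID.
    return SYNONYM_TO_ID.get(value_tag)
```

With this shape, a caller gets the same canonical ID whether the OCR text contained the canonical name or one of its synonyms, and `None` when neither lookup succeeds.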

@atheendre130505 (Author) commented

Hi @teolemon @raphael0202,
Please have a look at this PR. Unit tests have been implemented and all checks are passing.

Comment on lines +1 to +7
# Contributing to Robotoff

Thank you for your interest in contributing to Robotoff!

For detailed instructions on how to contribute, please visit our [Contributing Guide](https://openfoodfacts.github.io/robotoff/introduction/contributing/).

You can also find more information about the project in our [documentation](https://openfoodfacts.github.io/robotoff).
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please leave this outside the PR, as we try to keep PRs atomic.

Comment on lines +27 to +44
def test_category_taxonomisation(mocker):
    from robotoff.prediction.ocr.category import category_taxonomisation

    # Mock match_taxonomized_value
    mock_match = mocker.patch("robotoff.prediction.ocr.category.match_taxonomized_value")
    mock_match.return_value = "en:mocked-category"

    # Mock simple regex match object
    mock_re_match = mocker.Mock()
    mock_re_match.group.return_value = " Some Category "

    # Test execution
    result = category_taxonomisation("en:", mock_re_match)

    # Verify normalization and args
    # Expected: "en:" + normalize_tag(" Some Category ") -> "en:some-category"
    mock_match.assert_called_once_with("en:some-category", "category")
    assert result == "en:mocked-category"
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use a real taxonomy to test this function. We can load taxonomies offline, there are some examples of this in the repo (using the offline parameter of the get_taxonomy function).
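
To illustrate the difference in test style, here is a minimal sketch of a mock-free test that runs a real regex match against real lookup data. Everything in it is a hypothetical stand-in: `SMALL_TAXONOMY_IDS`, `SMALL_SYNONYMS`, `CATEGORY_RE`, `normalize_tag_sketch`, and `category_taxonomisation_sketch` are invented for this example, and the named group `category` is an assumption. In the repository, the data would instead come from a taxonomy loaded offline via `get_taxonomy`, whose exact call is not shown in this thread.

```python
import re
from typing import Optional

# Hypothetical stand-ins for a taxonomy loaded offline.
SMALL_TAXONOMY_IDS = {"en:some-category"}
SMALL_SYNONYMS = {"en:some-categories": "en:some-category"}

# Hypothetical pattern; the group name "category" is an assumption.
CATEGORY_RE = re.compile(r"category:(?P<category>[^.]+)")


def normalize_tag_sketch(text: str) -> str:
    """Simplified normalization: trim, lowercase, hyphenate non-alphanumerics."""
    return re.sub(r"[^a-z0-9]+", "-", text.strip().lower()).strip("-")


def category_taxonomisation_sketch(lang_prefix: str, match: re.Match) -> Optional[str]:
    """Build a tag from the matched text and resolve it against the stand-in data."""
    tag = lang_prefix + normalize_tag_sketch(match.group("category"))
    if tag in SMALL_TAXONOMY_IDS:
        return tag
    return SMALL_SYNONYMS.get(tag)


def test_category_taxonomisation_without_mocks() -> None:
    # Canonical name, with surrounding whitespace as in the mocked test.
    match = CATEGORY_RE.search("category: Some Category .")
    assert match is not None
    assert category_taxonomisation_sketch("en:", match) == "en:some-category"

    # A synonym resolves to the same canonical ID.
    match = CATEGORY_RE.search("category: Some Categories.")
    assert match is not None
    assert category_taxonomisation_sketch("en:", match) == "en:some-category"

    # An unknown value yields None.
    match = CATEGORY_RE.search("category: Unknown Thing.")
    assert match is not None
    assert category_taxonomisation_sketch("en:", match) is None
```

Unlike the mocked version, this style fails if normalization or synonym resolution breaks, because the assertions depend on the actual lookup result rather than on a patched return value.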

@github-project-automation bot moved this from Todo to In Progress in 📚 Document Open Food Facts on Mar 9, 2026