Modernize NRCLex package layout, resource loading, and release tooling #22

Merged
DemetersSon83 merged 1 commit into master from codex/modernize-project-layout-and-structure on Feb 27, 2026

@DemetersSon83
Owner

Motivation

  • Move the code from a single fragile module into a proper package so that the import from nrclex import NRCLex keeps working once the package is installed.
  • Eliminate cwd-dependent lexicon loading and ensure the bundled nrc_en.json is used reliably when no explicit path is provided.
  • Modernize packaging (PEP 517/621), remove bogus stdlib dependencies from metadata, and add packaging/test/release automation and guidance.

Description

  • Reorganized repository into a package layout with nrclex/__init__.py (exports NRCLex, sets __version__) and nrclex/core.py (implementation), and moved the lexicon to nrclex/data/nrc_en.json while keeping from nrclex import NRCLex compatible.
  • Implemented robust lexicon resolution in the constructor, covering four cases: (1) None loads the bundled resource via importlib.resources; (2) an explicit path that exists loads that file; (3) a missing default or relative name falls back to the bundled resource; (4) a missing absolute custom path raises FileNotFoundError with a helpful message.
  • Added caching for lexicon loads via functools.lru_cache; refactored helpers into private methods (_build_word_affect, _compute_top_emotions); introduced EMOTION_ORDER; used Counter and itertools.chain for performance; and added type hints and docstrings.
  • Replaced setup.py with pyproject.toml (PEP 517/621), removed stdlib modules from dependencies, set requires-python = ">=3.9", and configured package data so nrclex/data/nrc_en.json is included in sdist/wheel.
  • Added deterministic pytest unit tests under tests/ (using small temporary lexicons) and an optional integration test for load_raw_text that is skipped when TextBlob or its corpora are not available. Also created a GitHub Actions CI workflow (.github/workflows/ci.yml) and added RELEASING.md and COMPATIBILITY.md.
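
The resolution rules and caching described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the function names (resolve_lexicon, _load_json, _load_bundled), the load_bundled parameter, and the constants are assumptions made for the example; only the package name, the data path, and the four cases come from the description.

```python
import json
from functools import lru_cache
from importlib import resources
from pathlib import Path
from typing import Callable, Optional

# Assumed names for illustration; only "nrclex" and the data path
# come from the PR description.
PACKAGE = "nrclex"
BUNDLED = "data/nrc_en.json"


@lru_cache(maxsize=None)
def _load_json(path: str) -> dict:
    """Parse a lexicon file once per distinct path string."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


@lru_cache(maxsize=1)
def _load_bundled() -> dict:
    """Load the lexicon shipped inside the installed package."""
    # importlib.resources.files is available on Python 3.9+.
    text = resources.files(PACKAGE).joinpath(BUNDLED).read_text(encoding="utf-8")
    return json.loads(text)


def resolve_lexicon(
    lexicon_file: Optional[str] = None,
    load_bundled: Callable[[], dict] = _load_bundled,
) -> dict:
    """Return lexicon data following the four cases in the description."""
    if lexicon_file is None:
        # Case 1: no path given, use the bundled resource.
        return load_bundled()
    path = Path(lexicon_file)
    if path.is_file():
        # Case 2: explicit path that exists.
        return _load_json(str(path.resolve()))
    if path.is_absolute():
        # Case 4: missing absolute custom path is an error.
        raise FileNotFoundError(
            f"Lexicon file not found: {path}. Pass an existing path, "
            "or omit the argument to use the bundled lexicon."
        )
    # Case 3: missing default/relative name falls back to the bundled data.
    return load_bundled()
```

One design consequence of caching with lru_cache in a sketch like this: callers share the same dictionary object, so it should be treated as read-only, and edits to a lexicon file on disk are not picked up until the cache is cleared.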

Testing

  • Ran the unit tests with PYTHONPATH=. pytest -q -m "not integration"; all passed.
  • Ran the full test suite with PYTHONPATH=. pytest -q and it passed; the integration test is guarded with pytest.importorskip/skip logic and will be skipped automatically if textblob/corpora are not present in the environment.
  • Executed a local behavioral congruence check (a script comparing outputs for a fixed token list against the legacy logic), which showed identical raw_emotion_scores, affect_frequencies, and top_emotions for the sample tokens.
  • Packaging/installation validation (pip install -e ., python -m build, twine check dist/*) could not be completed in this environment: network/proxy restrictions prevented installing setuptools, build, and twine. These steps are included in CI and RELEASING.md for use in a network-enabled environment.
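
A congruence check of the kind mentioned above might look like the sketch below. The lexicon, token list, and all function names are hypothetical stand-ins; the scoring helpers are simplified and are not claimed to match NRCLex's actual attribute semantics. The point is only the shape of the comparison: a nested-loop "legacy" count versus a Counter plus itertools.chain version, checked for identical results on fixed input.

```python
from collections import Counter
from itertools import chain

# Hypothetical miniature lexicon (word -> emotions) and fixed token list.
LEXICON = {
    "happy": ["joy", "positive"],
    "storm": ["fear", "negative"],
    "win": ["joy", "anticipation", "positive"],
}
TOKENS = ["happy", "storm", "win", "happy", "table"]


def legacy_scores(tokens, lexicon):
    """Nested-loop counting, standing in for the pre-refactor logic."""
    scores = {}
    for token in tokens:
        for emotion in lexicon.get(token, []):
            scores[emotion] = scores.get(emotion, 0) + 1
    return scores


def refactored_scores(tokens, lexicon):
    """Same counts, built with Counter over a flattened emotion stream."""
    emotions = chain.from_iterable(lexicon.get(t, ()) for t in tokens)
    return dict(Counter(emotions))


def frequencies(scores):
    """Share of each emotion among all counted emotions (simplified)."""
    total = sum(scores.values())
    return {e: c / total for e, c in scores.items()} if total else {}


def top(scores):
    """Emotions tied for the highest count, sorted for stable comparison."""
    if not scores:
        return []
    best = max(scores.values())
    return sorted(e for e, c in scores.items() if c == best)


old = legacy_scores(TOKENS, LEXICON)
new = refactored_scores(TOKENS, LEXICON)

# The check fails loudly if the refactor changed behavior on this input.
assert old == new
assert frequencies(old) == frequencies(new)
assert top(old) == top(new)
```

Tokens missing from the lexicon ("table" here) contribute nothing in either version, which is one of the edge cases such a check is useful for pinning down.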

DemetersSon83 merged commit 5e184df into master on Feb 27, 2026
14 checks passed