refactor(display): consume structured rewrite columns in native shape by memadi-nv · Pull Request #144 · NVIDIA-NeMo/Anonymizer

memadi-nv · 2026-05-01T20:33:52Z

Summary

Maintain native shape for structured rewrite columns

Motivation

The rewrite pipeline (LLMJudgeColumnConfig, LLMStructuredColumnConfig) already writes plain Python dicts into the dataframe via pydantic model_dump(). The display layer was defensively re-normalizing these columns from three possible shapes — None, pydantic model, or JSON string — which masked upstream bugs and duplicated work.

This PR removes the defensive coercion and trusts the native dict shape throughout the display path.

Changes

`src/anonymizer/interface/display.py`

Removed speculative type-coercion from _extract_judge_scores: no more None short-circuit, model_dump() fallback, or json.loads attempt. The function now expects a plain dict and returns [] otherwise.
Removed the same coercion ladder from _normalize_disposition; it now expects a plain dict with "sensitivity_disposition" as a list, filtering entries via isinstance(e, dict) instead of calling getattr(e, "model_dump", dict)().
Dropped redundant float(...) wrappers around utility, leakage, and weighted_leakage_rate in _render_scores_section — values are already numeric at this stage.
Added a module-level logger and a warning when COL_JUDGE_EVALUATION is present but yields no scores, so unexpected shapes surface instead of silently rendering nothing.
Updated docstrings on _extract_judge_scores and _normalize_disposition to document the expected native dict shape.

`tests/interface/test_display.py`

Judge-scores test now derives rubric keys from the real PRIVACY_RUBRIC, QUALITY_RUBRIC, and NATURALNESS_RUBRIC configs (imported from final_judge) instead of hardcoded strings, preventing test-vs-runtime drift if a rubric is renamed.
Disposition test fixture is now built via SensitivityDispositionSchema(...).model_dump() with a full EntityDispositionSchema, matching the exact dict that LLMStructuredColumnConfig writes at runtime. Replaces the hand-rolled partial dict.
Disposition test uses the COL_SENSITIVITY_DISPOSITION constant instead of a hardcoded "_sensitivity_disposition" key.
Added imports for COL_SENSITIVITY_DISPOSITION, the three rubric configs, and the two rewrite schemas.

Net effect

display.py: −24 / +18 lines (simpler, one observability signal added).
test_display.py: +35 / −21 lines (fixtures pinned to real schemas/constants).

Test plan

pytest tests/interface/test_display.py passes locally.
Existing e2e rewrite flow renders judge scores and disposition table correctly.
Warning log fires when an unexpected judge-evaluation shape reaches the renderer.

Testing

make test passes locally
make check passes locally (format + lint + typecheck + lock-check)
Added/updated tests for changes

Documentation

If docs changed: make docs-build passes locally

Related Issues

Closes #70

Signed-off-by: memadi <memadi@nvidia.com>

greptile-apps · 2026-05-01T20:35:58Z

Greptile Summary

This PR removes the defensive type-coercion ladder (None-check → model_dump() fallback → json.loads attempt) from _extract_judge_scores and _normalize_disposition, replacing it with a single isinstance(raw, dict) guard that trusts the plain-dict shape the upstream pipeline already writes. A targeted logger.warning fires when a dict is present but yields no scores, surfacing upstream regressions rather than silently rendering nothing.

display.py: Coercion removed from both extraction helpers; warning guard uses isinstance(judge_raw, dict) (correctly handling np.nan rows without false positives); redundant float() wrappers dropped.
test_display.py: Fixtures rebuilt from real schemas (SensitivityDispositionSchema.model_dump()) and real constants (COL_JUDGE_EVALUATION, rubric configs); two new tests verify the NaN-no-warn and malformed-dict-warns contract.

Confidence Score: 5/5

Safe to merge — the display layer now faithfully reflects the dict shape the pipeline already writes, and the warning guard is correctly scoped with isinstance.

The simplification is well-targeted: removing coercion paths that were masking upstream bugs, not removing legitimate defensive guards. The isinstance-based warning correctly distinguishes NaN rows (no warning) from malformed dicts (warning), and both branches are directly covered by new tests. No data loss, no silent breakage for the happy path.

No files require special attention; both changed files are straightforward and covered by the updated tests.

Important Files Changed

Filename	Overview
src/anonymizer/interface/display.py	Removes defensive type-coercion ladders from _extract_judge_scores and _normalize_disposition, trusting the upstream dict shape. Adds a well-scoped warning (using isinstance rather than pd.isna) for malformed judge dicts.
tests/interface/test_display.py	Tests updated to use real constants and real schemas; adds two new focused tests for NaN (no-warn) and malformed-dict (warn) paths that correctly exercise the new isinstance-based warning guard.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[row.get COL_JUDGE_EVALUATION] --> B{isinstance dict?}
    B -- No NaN / None / other --> C[_extract_judge_scores returns empty]
    B -- Yes --> D[_extract_judge_scores iterates keys]
    D --> E{Each value: dict with score?}
    E -- No --> F[skip entry]
    E -- Yes --> G[append name, int score]
    G --> H{judge_scores non-empty?}
    F --> H
    H -- Yes --> I[Render Judge: scores in HTML]
    H -- No --> J[warning: unexpected keys logged]
    J --> K[No Judge section rendered]
    C --> K

_{Reviews (4): Last reviewed commit: "address feedback" | Re-trigger Greptile}

Signed-off-by: memadi <memadi@nvidia.com>

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

lipikaramaswamy

Looks good from an engineering perspective. I reviewed the display refactor and the updated tests; the change is scoped and aligns the renderer with the native dict shape produced by the rewrite pipeline. One small nit, recommend updating to future proof.

I did not run a live model-backed rewrite flow, so approving assuming you did :)

Signed-off-by: memadi <memadi@nvidia.com>

maintain-DD-column-type

ec5115f

Signed-off-by: memadi <memadi@nvidia.com>

memadi-nv requested a review from a team as a code owner May 1, 2026 20:33

greptile-apps Bot reviewed May 1, 2026

View reviewed changes

Comment thread src/anonymizer/interface/display.py Outdated

Comment thread src/anonymizer/interface/display.py Outdated

Comment thread src/anonymizer/interface/display.py

address greptile feedback

f1b94e2

Signed-off-by: memadi <memadi@nvidia.com>

greptile-apps Bot reviewed May 1, 2026

View reviewed changes

Comment thread src/anonymizer/interface/display.py Outdated

Update src/anonymizer/interface/display.py

0ae6cac

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

lipikaramaswamy reviewed May 14, 2026

View reviewed changes

Comment thread tests/interface/test_display.py

lipikaramaswamy approved these changes May 14, 2026

View reviewed changes

address feedback

03b193f

Signed-off-by: memadi <memadi@nvidia.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor(display): consume structured rewrite columns in native shape#144

refactor(display): consume structured rewrite columns in native shape#144
memadi-nv wants to merge 4 commits into
mainfrom
memadi/feature/maintain-native-shape-for-structured-rewrite-columns

memadi-nv commented May 1, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot commented May 1, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lipikaramaswamy left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

memadi-nv commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Maintain native shape for structured rewrite columns

Motivation

Changes

src/anonymizer/interface/display.py

tests/interface/test_display.py

Net effect

Test plan

Testing

Documentation

Related Issues

Uh oh!

greptile-apps Bot commented May 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

lipikaramaswamy left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

memadi-nv commented May 1, 2026 •

edited

Loading

`src/anonymizer/interface/display.py`

`tests/interface/test_display.py`

greptile-apps Bot commented May 1, 2026 •

edited

Loading