[1 of 2] ENG-3157: Platform identity resolution — OSS type changes for PBAC by galvana · Pull Request #7807 · ethyca/fides

galvana · 2026-04-01T00:34:43Z

Description Of Changes

OSS-side changes to support platform identity resolution in Fidesplus PBAC. These changes make the type system cross-platform ready and allow fidesplus to inject platform-specific identity resolvers.

Breaking changes (PBAC is in active development, no backward compat needed):

RawQueryLogEntry.identity is now a plain str (was user_email: str + principal_subject: str | None)
TableRef fields renamed: project → catalog, dataset → schema (standard SQL catalog terminology)
IdentityResolver Protocol signature changed from (user_email, principal_subject) to (identity: str)
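The type changes listed above can be pictured roughly as follows. This is a minimal sketch, not the actual contents of types.py: only `catalog`, `schema`, and `identity` come from the description, while `table`, `query_text`, `referenced_tables`, and `qualified_name` are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class TableRef:
    """Platform-neutral table reference using SQL catalog terminology."""

    catalog: Optional[str]  # was: project
    schema: Optional[str]   # was: dataset
    table: str              # assumed field, not named in the description

    @property
    def qualified_name(self) -> str:
        # Hypothetical helper: join whichever parts are present.
        return ".".join(p for p in (self.catalog, self.schema, self.table) if p)


@dataclass
class RawQueryLogEntry:
    """A query log record keyed by a single opaque identity string."""

    identity: str  # was: user_email + principal_subject
    query_text: str  # assumed field
    referenced_tables: list[TableRef] = field(default_factory=list)  # assumed field


entry = RawQueryLogEntry(
    identity="analyst@example.com",
    query_text="SELECT * FROM sales.orders",
    referenced_tables=[TableRef(catalog="warehouse", schema="sales", table="orders")],
)
```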

Companion PR: Requires fidesplus#3338

Code Changes

types.py — Remove QueryIdentity dataclass, change RawQueryLogEntry.identity to str, rename TableRef fields
identity/interface.py — Update IdentityResolver Protocol to accept str
identity/basic.py — Simplify BasicIdentityResolver.resolve() to work with plain string
identity/resolver.py — Simplify RedisIdentityResolver.resolve(), update DatasetResolver for schema field
service.py — Add optional identity_resolver param to InProcessPBACEvaluationService.__init__
sql_parser.py — Pass identity as plain string
consumers/entities.py — Add connection_config_key field (shared Redis storage with fidesplus)
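As an illustration of the injectable resolver, the sketch below shows one way a service could accept an optional `identity_resolver` and fall back to a default. `EvaluationService`, `NullResolver`, `StaticResolver`, and `consumer_for` are hypothetical stand-ins, and returning a consumer key as a plain string is an assumption made to keep the example short.

```python
from typing import Optional, Protocol


class IdentityResolver(Protocol):
    """Anything with a matching resolve() satisfies this structurally."""

    def resolve(self, identity: str) -> Optional[str]: ...


class NullResolver:
    """Hypothetical default: resolves nothing."""

    def resolve(self, identity: str) -> Optional[str]:
        return None


class StaticResolver:
    """Hypothetical platform-specific resolver backed by a fixed mapping."""

    def __init__(self, mapping: dict[str, str]) -> None:
        self._mapping = mapping

    def resolve(self, identity: str) -> Optional[str]:
        return self._mapping.get(identity)


class EvaluationService:
    """Stand-in for InProcessPBACEvaluationService (names are assumptions)."""

    def __init__(self, identity_resolver: Optional[IdentityResolver] = None) -> None:
        # Fall back to the default only when nothing was injected.
        self._identity_resolver = (
            identity_resolver if identity_resolver is not None else NullResolver()
        )

    def consumer_for(self, identity: str) -> Optional[str]:
        return self._identity_resolver.resolve(identity)


default_service = EvaluationService()
injected_service = EvaluationService(StaticResolver({"svc@example.com": "etl-service"}))
```

Because the Protocol is structural, a downstream package could supply its own resolver without importing or subclassing anything beyond the `resolve(identity: str)` shape.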

Steps to Confirm

Run all PBAC tests in fidesplus container: docker exec fidesplus-slim bash -c "pytest --no-cov tests/ops/service/pbac/ -v" — should be 139 passing
Run the PBAC demo: python demo/pbac_demo.py — should complete with 3 violations detected
Verify BasicIdentityResolver still resolves by email and external_id

Pre-Merge Checklist

Issue requirements met
All CI pipelines succeeded
CHANGELOG.md updated
- Updates unreleased work already in Changelog, no new entry necessary
UX feedback:
- No UX review needed
Followup issues:
- No followup issues
Database migrations:
- No migrations
Documentation:
- Documentation issue created in fidesdocs

🤖 Generated with Claude Code

…jectable resolver

- Replace QueryIdentity dataclass with plain str for RawQueryLogEntry.identity
- Rename TableRef fields: project→catalog, dataset→schema (standard SQL terminology)
- Make InProcessPBACEvaluationService accept injectable identity_resolver parameter
- Update IdentityResolver Protocol signature to accept str
- Update BasicIdentityResolver, RedisIdentityResolver, DatasetResolver for new field names
- Add connection_config_key to fides OSS DataConsumerEntity (shared Redis storage)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vercel · 2026-04-01T00:34:50Z


2 Skipped Deployments

Project	Deployment	Actions	Updated (UTC)
fides-plus-nightly	Ignored	Preview	Apr 9, 2026 8:40pm
fides-privacy-center	Ignored		Apr 9, 2026 8:40pm

codecov · 2026-04-01T19:25:59Z

Codecov Report

❌ Patch coverage is 0% with 87 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.04%. Comparing base (7192e68) to head (64cb289).
⚠️ Report is 8 commits behind head on main.

Files with missing lines	Patch %	Lines
src/fides/service/pbac/evaluate.py	0.00%	19 Missing ⚠️
src/fides/service/pbac/types.py	0.00%	18 Missing ⚠️
src/fides/service/pbac/dataset/resolver.py	0.00%	16 Missing ⚠️
src/fides/service/pbac/identity/basic.py	0.00%	13 Missing ⚠️
src/fides/service/pbac/service.py	0.00%	11 Missing ⚠️
src/fides/service/pbac/identity/resolver.py	0.00%	3 Missing ⚠️
src/fides/service/pbac/consumers/entities.py	0.00%	2 Missing ⚠️
src/fides/service/pbac/consumers/repository.py	0.00%	2 Missing ⚠️
src/fides/service/pbac/dataset/__init__.py	0.00%	2 Missing ⚠️
src/fides/service/pbac/identity/interface.py	0.00%	1 Missing ⚠️

❌ Your patch check has failed because the patch coverage (0.00%) is below the target coverage (100.00%). You can increase the patch coverage or adjust the target coverage.
❌ Your project check has failed because the head coverage (83.04%) is below the target coverage (85.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7807      +/-   ##
==========================================
- Coverage   85.07%   83.04%   -2.04%     
==========================================
  Files         627      627              
  Lines       40780    40763      -17     
  Branches     4742     4736       -6     
==========================================
- Hits        34694    33851     -843     
- Misses       5017     5823     +806     
- Partials     1069     1089      +20


- Add EvaluationGap type to types.py (gap_type, identifier, dataset_key, reason)
- Add gaps field to EvaluationResult
- Update evaluate_access to return EvaluationOutput (result + gaps)
- Unresolved identity → gap (was: violation)
- Unconfigured dataset → gap (was: silently passing)
- Violations are now strictly purpose-mismatch issues

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
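The three outcomes this commit describes (compliant, violation, gap) can be sketched as below. The `EvaluationGap` field names come from the commit message; everything else is an assumption, including the simplified `evaluate` signature, the string gap types, and the rule that any overlap between consumer and dataset purposes counts as compliant.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EvaluationGap:
    gap_type: str
    identifier: str
    dataset_key: Optional[str]
    reason: str


@dataclass
class EvaluationResult:
    is_compliant: bool
    violations: list[str] = field(default_factory=list)
    gaps: list[EvaluationGap] = field(default_factory=list)


def evaluate(
    identity: str,
    consumer_purposes: Optional[set[str]],
    dataset_key: str,
    dataset_purposes: Optional[set[str]],
) -> EvaluationResult:
    """Hypothetical, simplified evaluation of one identity against one dataset."""
    # Unresolved identity: recorded as a gap, not a violation.
    if consumer_purposes is None:
        gap = EvaluationGap("unresolved_identity", identity, None, "no consumer matched")
        return EvaluationResult(is_compliant=True, gaps=[gap])
    # Dataset without purpose metadata: also a gap.
    if not dataset_purposes:
        gap = EvaluationGap(
            "unconfigured_dataset", dataset_key, dataset_key, "no purposes declared"
        )
        return EvaluationResult(is_compliant=True, gaps=[gap])
    # Assumed rule: at least one shared purpose means the access is allowed.
    if consumer_purposes & dataset_purposes:
        return EvaluationResult(is_compliant=True)
    # Only a genuine purpose mismatch produces a violation.
    return EvaluationResult(
        is_compliant=False,
        violations=[f"{identity} has no purpose permitting access to {dataset_key}"],
    )
```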

- Delete ResolvedConsumer; resolvers return DataConsumerEntity directly
- EvaluationResult.consumer is now DataConsumerEntity | None
- Add EvaluationResult.identity field for the unresolved case
- Add GapType and ConsumerType enums (replace magic strings)
- Consolidate AccessGap into EvaluationGap (remove duplicate type)
- evaluate_access uses EvaluationGap with GapType enum directly
- BasicIdentityResolver works with DataConsumerEntity
- Remove _build_resolved_consumer from service (no conversion needed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Move DatasetResolver from identity/resolver.py to dataset/resolver.py
- Make DatasetResolver injectable via InProcessPBACEvaluationService constructor
- Remove build_identity_resolver factory (unused)
- Clean up identity/resolver.py to only contain RedisIdentityResolver

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Covers evaluation flow, three outcomes (compliant/violation/gap), extension points with defaults, type inventory, identity resolution, and package structure. No fidesplus concepts or file listings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…nsumerEntity

Consumers are now identified by `type` + `scope` (a dict of platform-specific identifiers) instead of `external_id` + `connection_config_key`. This enables cross-platform identity resolution where the same consumer (e.g., a Google Group) works across multiple data platform connections without duplication.

The scope dict always includes the full namespace chain (e.g., domain + group_email for Google Groups, domain + project_id + role for GCP IAM roles). Each scope key is individually indexed in Redis for efficient filtering.

Both BasicIdentityResolver and RedisIdentityResolver updated to match on scope email instead of external_id.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
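To make the `type` + `scope` identification concrete, here is a small sketch. `Consumer`, `matches`, and the type strings are hypothetical; the scope keys mirror the examples in the commit message (domain + group_email, domain + project_id + role).

```python
from dataclasses import dataclass, field


@dataclass
class Consumer:
    """Hypothetical stand-in for DataConsumerEntity."""

    name: str
    type: str
    scope: dict[str, str] = field(default_factory=dict)


def matches(consumer: Consumer, type: str, **wanted: str) -> bool:
    """True when the type matches and every requested scope key has the given value."""
    return consumer.type == type and all(
        consumer.scope.get(key) == value for key, value in wanted.items()
    )


google_group = Consumer(
    name="Analytics team",
    type="google_group",
    scope={"domain": "example.com", "group_email": "analytics@example.com"},
)
iam_role = Consumer(
    name="ETL role",
    type="gcp_iam_role",
    scope={"domain": "example.com", "project_id": "warehouse", "role": "roles/etl"},
)
```

Because the scope carries the full namespace chain, the same consumer record can be matched from any connection that reports the same identifiers, with no per-connection copy.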

claude

Code Review — ENG-3157: Platform identity resolution (OSS type changes)

The overall direction is solid: collapsing user_email/principal_subject into a plain identity: str, introducing the scope dict for platform-specific identifiers, and the gaps/violations split are all good design moves that will make the fidesplus extension points cleaner. The TableRef rename to standard SQL catalog terminology (catalog/schema) is also an improvement.

That said, there are a few issues worth addressing before merge:

Issues requiring attention

1. is_compliant=True for unresolved identities (evaluate.py:69)
The most significant behavioral change: an unresolved consumer now returns is_compliant=True with gaps, rather than is_compliant=False with violations. Downstream code checking is_compliant to gate access or fire alerts will now silently pass unresolved users. This needs either a clear contract change (documented and agreed upon) or is_compliant=False should be returned when gaps exist.

2. DatasetResolver silent fallback to table_ref.schema (dataset/resolver.py:35)
The resolver always returns a non-None value, so the if fides_key: guard in service.py is always truthy. Every table gets "resolved" — via a guess — with no logging. This masks configuration errors and makes the ds_purposes is None branch in _check_access unreachable (see also inline comment on service.py:174).

3. Scope index key collision (consumers/repository.py:45)
Using f"{key}={value}" as a Redis index value is ambiguous when values contain = (AWS ARNs, URLs, encoded tokens). This will silently break lookups for affected consumers.

4. Silent loss of external_id (consumers/entities.py:80)
from_consumer() sets scope={}, discarding any existing external_id value on the ORM model. Consumers previously identified by external_id will silently become unresolvable. If the ORM column still exists, bridge it into scope as part of this change.

5. BasicIdentityResolver only matches scope["email"] (identity/basic.py:45)
The scope dict is designed to hold arbitrary platform identifiers, but both the BasicIdentityResolver and RedisIdentityResolver only ever look up scope["email"]. Consumers with any other scope key (group name, IAM role, service account, etc.) cannot be resolved via OSS code, which contradicts the extension-point design in the README.

Minor issues

ConsumerType enum unused (types.py:142-153): defined with platform-specific values that are never set on any entity. Either enforce it on DataConsumerEntity.type or remove it until it is used.
Two distinct dataset gap cases share one GapType (evaluate.py:128 and 141): "not registered" vs "registered but no purposes" are currently distinguishable only by parsing the reason string, which is fragile for callers.
Dead code in _build_dataset_purposes (service.py:174): always produces a DatasetPurposes (with empty keys) for every resolved dataset, making the ds_purposes is None branch in _check_access unreachable from the in-process service path.

claude · 2026-04-09T18:47:01Z

src/fides/service/pbac/evaluate.py

+        return EvaluationOutput(
+            result=ValidationResult(
+                violations=[],
+                is_compliant=True,


Compliance semantics change: unresolved consumer now returns is_compliant=True

When a consumer has no declared purposes (i.e., an unresolved identity), this path returns is_compliant=True with a list of gaps. Previously this resulted in violations.

This is a meaningful behavioral shift: any downstream code that checks result.is_compliant to gate access or trigger alerts will now treat unresolved identities as passing, not failing. If gaps are not separately monitored and alarmed on, this could silently mask unauthorized access by unknown users.

Consider either:

Returning is_compliant=False when there are identity gaps, or

Adding a has_gaps field to EvaluationResult and documenting clearly that is_compliant=True does not mean "access is safe" when gaps exist.
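A rough sketch of the second option, assuming a simplified result type: `has_gaps` is the field suggested above, while `is_verified` is a hypothetical extra name for "compliant and fully evaluated".

```python
from dataclasses import dataclass, field


@dataclass
class EvaluationResult:
    """Simplified stand-in; the real type carries more fields."""

    is_compliant: bool
    violations: list[str] = field(default_factory=list)
    gaps: list[str] = field(default_factory=list)

    @property
    def has_gaps(self) -> bool:
        # True when some part of the access could not be evaluated.
        return bool(self.gaps)

    @property
    def is_verified(self) -> bool:
        # Hypothetical: compliant AND nothing left unevaluated.
        return self.is_compliant and not self.has_gaps


unresolved = EvaluationResult(is_compliant=True, gaps=["unresolved identity: x@example.com"])
clean = EvaluationResult(is_compliant=True)
```

Callers that gate access would then check `is_verified` (or `has_gaps` explicitly) instead of `is_compliant` alone.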

claude · 2026-04-09T18:47:08Z

src/fides/service/pbac/evaluate.py

-
-    # Dataset not registered in Fides — no purpose metadata available
+    # Dataset not registered or has no purposes — record as gap
    if ds_purposes is None:


Two distinct cases mapped to the same UNCONFIGURED_DATASET gap type

Lines 128–136 (dataset not in registry) and lines 141–150 (dataset registered but no purposes) both emit GapType.UNCONFIGURED_DATASET. Anything consuming the gap must compare the reason string to tell them apart, which is brittle.

Consider splitting into UNREGISTERED_DATASET vs UNCONFIGURED_DATASET (or adding a dedicated GapType value), so callers can handle each case programmatically without parsing the reason string.
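One possible shape for the split, as a sketch: the enum member names follow the suggestion above, while the string values, `classify_dataset`, and the dict-based registry are assumptions for illustration.

```python
from enum import Enum
from typing import Optional


class GapType(str, Enum):
    UNRESOLVED_IDENTITY = "unresolved_identity"
    UNREGISTERED_DATASET = "unregistered_dataset"  # not in the registry at all
    UNCONFIGURED_DATASET = "unconfigured_dataset"  # registered, but no purposes


def classify_dataset(dataset_key: str, registry: dict[str, set[str]]) -> Optional[GapType]:
    """Return the gap type for a dataset, or None when it is fully configured."""
    if dataset_key not in registry:
        return GapType.UNREGISTERED_DATASET
    if not registry[dataset_key]:
        return GapType.UNCONFIGURED_DATASET
    return None


registry = {"sales": {"billing"}, "staging": set()}
```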

claude · 2026-04-09T18:47:17Z

src/fides/service/pbac/dataset/resolver.py

+        if table_ref.schema in self._mappings:
+            return self._mappings[table_ref.schema]
+
+        return table_ref.schema


Silent fallback to table_ref.schema will never return None

The return type is str | None, but this function always returns a non-None value: when no explicit mapping is found it silently falls back to the raw schema name as a fides key. This means:

Callers in service.py (if fides_key:) will always treat every table as resolved, never producing an UNCONFIGURED_DATASET gap from the resolver stage.

There is no indication when a guess is being used vs. an intentional mapping.

Consider returning None when no mapping is found and letting the caller decide whether to fall back to table_ref.schema. At minimum, add a logger.debug or logger.warning here to make the implicit fallback visible.
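For illustration, a resolver along these lines might look like the sketch below. It is not the code in dataset/resolver.py: the `fallback_to_schema` flag and the constructor shape are assumptions, and the method takes the schema name directly rather than a TableRef to stay self-contained.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


class DatasetResolver:
    """Maps a schema name to a fides key; returns None when nothing is mapped."""

    def __init__(self, mappings: dict[str, str], fallback_to_schema: bool = False) -> None:
        self._mappings = mappings
        self._fallback_to_schema = fallback_to_schema  # hypothetical opt-in flag

    def resolve(self, schema: Optional[str]) -> Optional[str]:
        if schema is None:
            return None
        if schema in self._mappings:
            return self._mappings[schema]
        if self._fallback_to_schema:
            # Make the guess visible instead of silently treating it as resolved.
            logger.warning("No mapping for schema %r; using it as the fides key", schema)
            return schema
        return None


strict = DatasetResolver({"sales": "sales_dataset"})
lenient = DatasetResolver({"sales": "sales_dataset"}, fallback_to_schema=True)
```

With the strict default, an `if fides_key:` guard in the caller becomes meaningful again, and unmapped tables can surface as gaps.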

claude · 2026-04-09T18:47:27Z

src/fides/service/pbac/consumers/repository.py

-            entries.append(("external_id", entity.external_id))
+        # Index each scope key individually for filtering
+        for key, value in sorted(entity.scope.items()):
+            entries.append(("scope", f"{key}={value}"))


Index key collision when scope values contain =

The index entry is formatted as f"{key}={value}", so a scope dict like {"role": "arn:aws:iam::123456789:role/admin=superuser"} would produce the index value "role=arn:aws:iam::123456789:role/admin=superuser". Any code that recovers the key and value from this entry must split on the first = only; splitting on every = would corrupt the value.

The RedisIdentityResolver currently does get_by_index("scope", f"email={identity}") which is a prefix-aware lookup — but this pattern will silently break for scope values that happen to contain = characters (e.g. URLs, ARNs, JWT subjects).

Consider using a delimiter that cannot appear in realistic values (e.g., \x00 or |), or URL-encoding the value portion before storing.
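A sketch of the URL-encoding option using the standard library. `encode_scope_entry` and `decode_scope_entry` are hypothetical helper names; percent-encoding both parts with `safe=""` guarantees the only literal `=` in the entry is the separator.

```python
from urllib.parse import quote, unquote


def encode_scope_entry(key: str, value: str) -> str:
    """Build an index entry whose only literal '=' is the key/value separator."""
    # safe="" forces '=', '/', ':' and other reserved characters to be escaped.
    return f"{quote(key, safe='')}={quote(value, safe='')}"


def decode_scope_entry(entry: str) -> tuple[str, str]:
    """Invert encode_scope_entry."""
    key, _, value = entry.partition("=")
    return unquote(key), unquote(value)


arn = "arn:aws:iam::123456789:role/admin=superuser"
entry = encode_scope_entry("role", arn)
```

Lookups would need to encode the queried value the same way before calling the index, so reads and writes stay symmetric.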

claude · 2026-04-09T18:47:35Z

src/fides/service/pbac/consumers/entities.py

            description=obj.description,
            type=obj.type,
-            external_id=obj.external_id,
+            scope={},


external_id is silently dropped when building from a DataConsumer ORM object

from_consumer() sets scope={} unconditionally. If the underlying DataConsumer model still has an external_id column, its value is discarded here with no migration or fallback. Any existing consumers identified by external_id (e.g., group names, role IDs) will silently lose their identity mapping after this change, causing future queries from those consumers to produce UNRESOLVED_IDENTITY gaps instead of resolving correctly.

If DataConsumer.external_id still exists on the ORM model, consider seeding scope={"external_id": obj.external_id} (or a more domain-appropriate key) as a migration bridge, at least until the companion database migration removes the column.
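A minimal sketch of such a bridge, assuming the ORM model still exposes `external_id`. `DataConsumerRow` is a hypothetical stand-in for the ORM object, and the entity here has only the fields needed for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataConsumerRow:
    """Hypothetical stand-in for the DataConsumer ORM object."""

    name: str
    type: str
    external_id: Optional[str] = None


@dataclass
class DataConsumerEntity:
    name: str
    type: str
    scope: dict[str, str] = field(default_factory=dict)

    @classmethod
    def from_consumer(cls, obj: DataConsumerRow) -> "DataConsumerEntity":
        # Carry the legacy identifier forward instead of discarding it.
        scope = {"external_id": obj.external_id} if obj.external_id else {}
        return cls(name=obj.name, type=obj.type, scope=scope)


legacy = DataConsumerEntity.from_consumer(
    DataConsumerRow(name="Analytics", type="group", external_id="analytics-team")
)
fresh = DataConsumerEntity.from_consumer(DataConsumerRow(name="New", type="group"))
```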

claude · 2026-04-09T18:47:44Z

src/fides/service/pbac/types.py

+    SNOWFLAKE_DATABASE_ROLE = "snowflake_database_role"
+    SNOWFLAKE_SERVICE_USER = "snowflake_service_user"
+    SYSTEM = "system"
+    UNRESOLVED = "unresolved"


ConsumerType enum is defined but never used by any entity construction code

from_consumer() sets type=obj.type (a raw string from the ORM), and from_system() sets type="system" (a string literal). Neither uses ConsumerType.GROUP, ConsumerType.IAM_ROLE, etc. As a result:

Entity type values are never validated against the enum

ConsumerType.GOOGLE_GROUP, ConsumerType.IAM_ROLE, ConsumerType.SNOWFLAKE_ROLE, etc. are defined but unreachable from existing construction paths

Type checkers won't catch type="iam_role" vs ConsumerType.IAM_ROLE mismatches

Either make DataConsumerEntity.type a ConsumerType field (with appropriate coercion in from_consumer/from_dict), or remove the enum until it is actually enforced. An unused enum in a shared types module creates misleading signals about what values are valid.

claude · 2026-04-09T18:48:04Z

src/fides/service/pbac/identity/basic.py

+                self._by_email[c.contact_email] = c
+            scope_email = c.scope.get("email")
+            if scope_email:
+                self._by_scope_email[scope_email] = c


BasicIdentityResolver only indexes scope["email"] — all other scope keys are ignored

The new scope dict is intended to be a generic map of platform identifiers (group_email, role, project_id, etc.), but the index built here only extracts scope.get("email"). A consumer with scope={"group_email": "analytics@company.com", "domain": "company.com"} will never be resolved by this resolver unless one of the keys happens to be "email".

The same limitation applies to RedisIdentityResolver, which queries get_by_index("scope", f"email={identity}") — it will only match consumers whose scope contains the key "email".

The README's extension points table promises that IdentityResolver implementations resolve any platform identity. The current OSS implementation does not deliver on this for non-email scope keys. At minimum, document the constraint; ideally, iterate over all scope values (or allow configuring which scope key to match on).
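One way to make the matched scope keys configurable is sketched below. `ScopeKeyResolver`, the `match_keys` parameter, and the simplified `Consumer` type are assumptions, not the existing BasicIdentityResolver API.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional


@dataclass
class Consumer:
    """Hypothetical stand-in for DataConsumerEntity."""

    name: str
    scope: dict[str, str] = field(default_factory=dict)


class ScopeKeyResolver:
    """Indexes consumers by the values of a configurable set of scope keys."""

    def __init__(self, consumers: Iterable[Consumer], match_keys: Iterable[str] = ("email",)) -> None:
        self._index: dict[str, Consumer] = {}
        keys = tuple(match_keys)
        for consumer in consumers:
            for key in keys:
                value = consumer.scope.get(key)
                if value:
                    # First consumer registered for a value wins.
                    self._index.setdefault(value, consumer)

    def resolve(self, identity: str) -> Optional[Consumer]:
        return self._index.get(identity)


group = Consumer("Analytics", {"group_email": "analytics@company.com", "domain": "company.com"})
email_only = ScopeKeyResolver([group])
configurable = ScopeKeyResolver([group], match_keys=("email", "group_email"))
```

With the default keys the group consumer stays unresolvable, which reproduces the limitation described above; passing `group_email` as a match key resolves it.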


galvana changed the title ~~Platform identity resolution: OSS type changes for PBAC~~ ENG-3157: Platform identity resolution — OSS type changes for PBAC Apr 1, 2026

Merge branch 'main' into platform-identity-resolution

8e7c94a

Adrian Galvan and others added 4 commits April 1, 2026 12:51

galvana force-pushed the platform-identity-resolution branch from 87f0cd0 to 6ffe02f Compare April 2, 2026 04:56

Adrian Galvan and others added 2 commits April 1, 2026 22:00

Remove unrelated docs/plans files from branch

37315ff

galvana changed the title ~~ENG-3157: Platform identity resolution — OSS type changes for PBAC~~ [1 of 2] ENG-3157: Platform identity resolution — OSS type changes for PBAC Apr 6, 2026

galvana requested a review from thabofletcher April 9, 2026 18:40

Merge branch 'main' into platform-identity-resolution

1def43f

galvana marked this pull request as ready for review April 9, 2026 18:40

galvana requested a review from a team as a code owner April 9, 2026 18:40

vercel bot deployed to Preview – fides-plus-nightly April 9, 2026 18:46 View deployment

claude bot reviewed Apr 9, 2026


Adrian Galvan and others added 4 commits April 9, 2026 12:25

fix: update test to expect gap instead of violation for consumer with…

10a5eb9

… no purposes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge branch 'main' into platform-identity-resolution

6042bf0

chore: remove unreleased PBAC backward-compat re-export shims

0bf8a50

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: remove unused ConsumerType enum from PBAC types

64cb289

Consumer types are handled dynamically by fidesplus's ConsumerTypeDescriptor provider pattern, making this enum dead code. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

galvana wants to merge 13 commits into main from platform-identity-resolution
