Commit bd0a6e5

labrenbe, Maleware, razvan, and siegfriedweber authored
feat(superset): Role mapping from OPA (#979)
* Fixing to 2.0.0
* Fixing typo
* Fixing up requests dependency
* Adding superset Opa manager
* Adding road map
* Adding comment
* fix indentation
* adding wip and deployment setup
* fix opa rule
* start investigation into role usage
* Add error handling and configurability & refactor code
* create patch for superset-opa
* make manager a seperate file to load it only if necessary
* Updating manager, leaving todos
* Manager works the way expected now
* More sophisticated logs
* Adding better check. Only apply default role if user has none
* Adding STACKABLE_OPA_BASE_URL
* add first unit tests
* add more unit test and fix code style
* move opa-authorizer as separate package
* fix gitignore
* add caching to 'get_opa_user_roles'
* fix caching
* remove supersetopa-integration directory
* add dummy changelog entry
* fix linting
* remove opa client
* fix typo
* address feedback on PR
* add readme and remove opa client
* fix changelog
* refactor opa authorizer to cache resolved roles
* poetry install doesn't find python. use sync instead
* do not default to the Public role
* pin poetry version and use heredoc syntax
* do not mutate user roles anymore
* use the correct SQLAlchemy session to update user roles
* docs and silence some checker errors
* fix project dependencies
* change log level to debug
* clarify doc
* cleanup roles before updating
* do not raise exception if role doesn't exist
* update doc
* Set user roles in "update_user_auth_stat" instead of "get_user_roles"
* remove auth_opa_package

---------

Co-authored-by: Maxi Wittich <[email protected]>
Co-authored-by: Maximilian Wittich <[email protected]>
Co-authored-by: Razvan-Daniel Mihai <[email protected]>
Co-authored-by: Siegfried Weber <[email protected]>
1 parent abf9eb1 commit bd0a6e5

12 files changed: 4694 additions and 2 deletions

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -9,6 +9,7 @@ All notable changes to this project will be documented in this file.
 - airflow: Add OPA support to Airflow ([#978]).
 - nifi: Activate `include-hadoop` profile for NiFi version 2.* ([#958]).
 - nifi: Add NiFi hadoop Azure and GCP libraries ([#943]).
+- superset: Add role mapping from OPA ([#979]).
 - base: Add containerdebug tool ([#928], [#959]).
 - tools: Add the package util-linux-core ([#952]).
   util-linux-core contains a basic set of Linux utilities, including the
@@ -61,6 +62,7 @@ All notable changes to this project will be documented in this file.
 [#935]: https://github.com/stackabletech/docker-images/pull/935
 [#962]: https://github.com/stackabletech/docker-images/pull/962
 [#978]: https://github.com/stackabletech/docker-images/pull/978
+[#979]: https://github.com/stackabletech/docker-images/pull/979
 [#980]: https://github.com/stackabletech/docker-images/pull/980
 [#981]: https://github.com/stackabletech/docker-images/pull/981
 [#982]: https://github.com/stackabletech/docker-images/pull/982

superset/Dockerfile

Lines changed: 34 additions & 2 deletions
@@ -3,6 +3,36 @@

 FROM stackable/image/statsd_exporter AS statsd_exporter-builder

+FROM stackable/image/stackable-base AS opa-authorizer-builder
+
+ARG PYTHON
+
+COPY superset/stackable/opa-authorizer /tmp/opa-authorizer
+
+RUN <<EOF
+microdnf update
+microdnf install \
+  gcc \
+  gcc-c++ \
+  python${PYTHON} \
+  python${PYTHON}-devel \
+  python${PYTHON}-pip
+microdnf clean all
+rm -rf /var/cache/yum
+
+pip install \
+  --no-cache-dir \
+  --upgrade \
+  poetry==2.1.1 \
+  pytest==8.3.4
+
+cd /tmp/opa-authorizer
+
+poetry sync
+poetry run pytest
+poetry build
+EOF
+
 FROM stackable/image/vector AS builder

 ARG PRODUCT
@@ -12,6 +42,7 @@ ARG TARGETARCH
 ARG TARGETOS

 COPY superset/constraints-${PRODUCT}.txt /tmp/constraints.txt
+COPY --from=opa-authorizer-builder /tmp/opa-authorizer/dist/opa_authorizer-0.1.0-py3-none-any.whl /tmp/

 RUN microdnf update \
     && microdnf install \
@@ -62,7 +93,7 @@ RUN python3 -m venv /stackable/app \
     # Since https://github.com/stackabletech/superset-operator/pull/530
     # admins can add custom configuration to superset_conf.py.
     Flask_OIDC==2.2.0 \
-    Flask-OpenID==1.3.1\
+    Flask-OpenID==1.3.1 \
     # Redhat has removed `tzdata` from the ubi-minimal images: see https://bugzilla.redhat.com/show_bug.cgi?id=2223028.
     # Superset relies on ZoneInfo (https://docs.python.org/3/library/zoneinfo.html#data-sources) to resolve time zones, and this is done
     # by searching first under `TZPATH` (which is empty due to the point above) or for the tzdata python package.
@@ -80,7 +111,8 @@ RUN python3 -m venv /stackable/app \
     --upgrade \
     python-json-logger \
     cyclonedx-bom \
-    && if [ -n "$AUTHLIB" ]; then pip install Authlib==${AUTHLIB}; fi
+    && if [ -n "$AUTHLIB" ]; then pip install Authlib==${AUTHLIB}; fi && \
+    pip install --no-cache-dir /tmp/opa_authorizer-0.1.0-py3-none-any.whl

 COPY superset/stackable/patches /patches
 RUN /patches/apply_patches.sh ${PRODUCT}
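
Once the image is built, one way to confirm that the wheel ended up in the virtual environment is to query the package metadata from Python. This is a minimal sketch and not part of the commit: the helper name `installed_version` is hypothetical, and the distribution name `opa_authorizer` is inferred from the wheel filename in the Dockerfile above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if the wheel is installed, otherwise None.
    print(installed_version("opa_authorizer"))
```

Run with the interpreter from `/stackable/app`, a result of `None` would suggest that the final `pip install` step did not take effect.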
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+**/.pytest_cache
+dist
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# Superset OPA authorizer
+
+Custom Superset security manager that syncs role mappings from an Open Policy
+Agent instance.
+
+[Poetry](https://python-poetry.org/) is used to build the project:
+
+    poetry build
+
+The unit tests can be run as follows:
+
+    poetry run pytest
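
As an illustration of the kind of check that `poetry run pytest` would collect, the sketch below tests OPA response parsing in isolation. It is not taken from the project's test suite: `parse_roles` is a hypothetical stand-in that assumes a reply of the form `{"result": [...]}`, and the test names are made up.

```python
import json


class OpaError(Exception):
    """Raised when an OPA reply does not have the expected shape."""


def parse_roles(body: str) -> list[str]:
    """Extract role names from a reply of the form {"result": [...]}."""
    payload = json.loads(body)
    result = payload.get("result") if isinstance(payload, dict) else None
    if not isinstance(result, list):
        raise OpaError(f"Invalid OPA response: [{payload}]")
    return result


def test_valid_reply():
    assert parse_roles('{"result": ["Admin", "Gamma"]}') == ["Admin", "Gamma"]


def test_missing_result():
    # A reply without a "result" key must be rejected.
    try:
        parse_roles("{}")
    except OpaError:
        return
    raise AssertionError("expected OpaError")
```

Because the functions follow the `test_*` naming convention, pytest would discover them without further configuration.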

superset/stackable/opa-authorizer/opa_authorizer/__init__.py

Whitespace-only changes.
Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
+"""
+Custom security manager for Superset.
+
+Assigns OPA roles to a user. The roles and their permissions must exist in the
+Superset database.
+"""
+
+import logging
+from dataclasses import dataclass
+
+import requests
+from cachetools import TTLCache, cachedmethod
+from flask import current_app
+from flask_appbuilder import AppBuilder
+from flask_appbuilder.security.sqla.models import Role, User
+from overrides import override
+from sqlalchemy.orm.session import Session
+from superset.security import SupersetSecurityManager
+
+log = logging.getLogger(__name__)
+
+
+class OpaError(Exception):
+    pass
+
+
+class SupersetError(Exception):
+    pass
+
+
+@dataclass
+class OpaResponse:
+    roles: list[str]
+
+
+def opa_response_from_json(json: dict[str, object]) -> OpaResponse:
+    """Converts a JSON object to an OpaResponse object."""
+    if "result" in json:
+        if type(json["result"]) is list:
+            return OpaResponse(roles=json["result"])
+
+    raise OpaError(f"Invalid OPA response: [{json}]")
+
+
+class OpaSupersetSecurityManager(SupersetSecurityManager):
+    """
+    Custom security manager that syncs role mappings from Open Policy Agent to Superset.
+    """
+
+    AUTH_OPA_CACHE_MAXSIZE_DEFAULT: int = 1000
+    AUTH_OPA_CACHE_TTL_IN_SEC_DEFAULT: int = 30
+    AUTH_OPA_REQUEST_URL_DEFAULT: str = "http://opa:8081/"
+    AUTH_OPA_REQUEST_TIMEOUT_DEFAULT: int = 10
+    AUTH_OPA_PACKAGE_DEFAULT: str = "superset"
+    AUTH_OPA_RULE_DEFAULT: str = "user_roles"
+
+    def __init__(self, appbuilder: AppBuilder):
+        super().__init__(appbuilder)
+
+        config = appbuilder.get_app.config
+
+        self.role_cache: TTLCache[str, set[Role]] = TTLCache(
+            maxsize=config.get(
+                "AUTH_OPA_CACHE_MAXSIZE", self.AUTH_OPA_CACHE_MAXSIZE_DEFAULT
+            ),
+            ttl=config.get(
+                "AUTH_OPA_CACHE_TTL_IN_SEC", self.AUTH_OPA_CACHE_TTL_IN_SEC_DEFAULT
+            ),
+        )
+
+        self.auth_opa_url: str = config.get(
+            "AUTH_OPA_REQUEST_URL", self.AUTH_OPA_REQUEST_URL_DEFAULT
+        )
+        self.auth_opa_rule: str = config.get(
+            "AUTH_OPA_RULE", self.AUTH_OPA_RULE_DEFAULT
+        )
+        self.auth_opa_request_timeout: int = current_app.config.get(
+            "AUTH_OPA_REQUEST_TIMEOUT", self.AUTH_OPA_REQUEST_TIMEOUT_DEFAULT
+        )
+
+        self.opa_session: requests.Session = requests.Session()
+
+    @override
+    def update_user_auth_stat(self, user, success=True):
+        """
+        Update user authentication stats upon successful/unsuccessful
+        authentication attempts.
+        Additionally, retrieve the roles of a successfully authenticated
+        user from an Open Policy Agent instance and update the user-role
+        mapping in the database.
+        """
+        if success:
+            resolved_opa_roles = self.roles(user)
+            user.roles = resolved_opa_roles
+
+        super().update_user_auth_stat(user, success)
+
+    @cachedmethod(lambda self: self.role_cache)
+    def roles(self, user: User) -> list[Role]:
+        """
+        Retrieves a user's role names from an Open Policy Agent instance and
+        maps them to existing Role objects in the Superset database.
+        The result is cached.
+        """
+        opa_role_names = self.opa_get_user_roles(user.username)
+        result: list[Role] = self.resolve_user_roles(user, opa_role_names)
+        return result
+
+    def opa_get_user_roles(self, username: str) -> list[str]:
+        """
+        Queries an Open Policy Agent instance for the roles of a given user.
+
+        :returns: A list of role names assigned to the user, or an empty list
+            if the request or the response parsing fails.
+        """
+        input = {"input": {"username": username}}
+        try:
+            req_url = f"{self.auth_opa_url}/{self.auth_opa_rule}"
+            response = self.call_opa(
+                url=req_url,
+                json=input,
+                timeout=self.auth_opa_request_timeout,
+            )
+
+            opa_response: OpaResponse = response.json(
+                object_hook=opa_response_from_json
+            )
+
+            log.info(f"OPA role names for user [{username}]: [{opa_response.roles}]")
+
+            return opa_response.roles
+
+        except Exception as e:
+            log.error("Failed to get OPA role names", exc_info=e)
+            return []
+
+    def call_opa(self, url: str, json: dict, timeout: int) -> requests.Response:
+        return self.opa_session.post(
+            url=url,
+            json=json,
+            timeout=timeout,
+        )
+
+    def resolve_user_roles(self, user: User, roles: list[str]) -> list[Role]:
+        """
+        Given a user object and a list of OPA role names, return the Role objects
+        that must be assigned to this user.
+
+        The user object is only needed to ensure that the Role objects are resolved
+        using the same SQLAlchemy session as the user object.
+
+        The Session object assigned to the SecurityManager is apparently not the same
+        Session as the one used by the FAB login.
+        """
+        result: list[Role] = list()
+        sqla_session = Session.object_session(user)
+        superset_roles = sqla_session.query(Role).all()
+        for role_name in roles:
+            found = False
+
+            for role in superset_roles:
+                if role.name == role_name:
+                    result.append(role)
+                    log.debug(f"Resolved Superset role [{role_name}].")
+                    found = True
+
+            if not found:
+                log.error(f"Superset role [{role_name}] does not exist.")
+        return result
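
The request construction and role resolution in the manager above can be exercised without Superset or a running OPA. The following is a minimal sketch under stated assumptions, not the commit's implementation: `Role` is a stand-in dataclass for the Flask-AppBuilder model, `build_opa_request` and `resolve_roles` are hypothetical helper names, and the username `alice` is illustrative. It deviates from the commit in two places. It strips trailing slashes before joining, because the default URL `http://opa:8081/` joined with `/` and the rule name would otherwise yield `http://opa:8081//user_roles`. It also resolves names through a dictionary instead of a nested loop, which assumes role names are unique.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """Stand-in for the Flask-AppBuilder Role model; only the name is used."""

    name: str


def build_opa_request(base_url: str, rule: str, username: str) -> tuple[str, dict]:
    """Build the URL and JSON body for the role lookup.

    Trailing slashes on the base URL are stripped so the join never
    produces a double slash.
    """
    url = f"{base_url.rstrip('/')}/{rule}"
    body = {"input": {"username": username}}
    return url, body


def resolve_roles(role_names: list[str], known_roles: list[Role]) -> list[Role]:
    """Map role names to Role objects, skipping names that have no match."""
    by_name = {role.name: role for role in known_roles}
    return [by_name[name] for name in role_names if name in by_name]


if __name__ == "__main__":
    url, body = build_opa_request("http://opa:8081/", "user_roles", "alice")
    print(url)
    print(body)

    known = [Role("Admin"), Role("Gamma")]
    # "DoesNotExist" has no counterpart and is dropped from the result.
    print(resolve_roles(["Gamma", "DoesNotExist"], known))
```

As in the commit, unknown role names are skipped rather than raising an exception, so a user whose OPA roles do not exist in Superset simply ends up with fewer roles.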
