Merged
Changes from 45 commits
55 commits
2761706
Fixing to 2.0.0
Maleware Dec 9, 2024
7fe1305
Fixing typo
Maleware Dec 9, 2024
e138382
Fixing up requests dependency
Maleware Dec 9, 2024
5c710d8
Adding superset Opa manager
Maleware Dec 11, 2024
44c042a
Adding road map
Maleware Dec 11, 2024
79af40d
Adding comment
Maleware Dec 11, 2024
8b6c39d
fix indentation
Maleware Dec 11, 2024
d20cf01
adding wip and deployment setup
Maleware Dec 11, 2024
a55d625
fix opa rule
labrenbe Dec 12, 2024
df4e169
start investigation into role usage
labrenbe Dec 12, 2024
95a7233
Add error handling and configurability & refactor code
labrenbe Dec 18, 2024
1a15b1b
create patch for superset-opa
labrenbe Dec 20, 2024
eb18ffe
make manager a seperate file to load it only if necessary
Maleware Dec 23, 2024
2549452
Updating manager, leaving todos
Maleware Dec 27, 2024
0f52ea7
Manager works the way expected now
Maleware Dec 27, 2024
f8084bb
More sophisticated logs
Maleware Dec 27, 2024
b3de046
Adding better check. Only apply default role if user has none
Maleware Dec 30, 2024
d4e985d
Adding STACKABLE_OPA_BASE_URL
Maleware Jan 2, 2025
82f8652
add first unit tests
labrenbe Jan 3, 2025
e1d7583
add more unit test and fix code style
labrenbe Jan 6, 2025
43b03b4
move opa-authorizer as separate package
labrenbe Jan 10, 2025
7954d3a
fix gitignore
labrenbe Jan 10, 2025
6c83ba4
add caching to 'get_opa_user_roles'
labrenbe Jan 15, 2025
a04716f
fix caching
labrenbe Jan 16, 2025
5a94dc9
remove supersetopa-integration directory
labrenbe Jan 21, 2025
2b1f598
Merge remote-tracking branch 'origin/main' into feature/superset-opa-…
labrenbe Jan 21, 2025
9e7f111
add dummy changelog entry
labrenbe Jan 21, 2025
a20d640
fix linting
labrenbe Jan 21, 2025
44637a1
Merge branch 'main' into feature/superset-opa-integration
Maleware Jan 28, 2025
99dfe75
remove opa client
labrenbe Jan 31, 2025
e387df6
fix typo
labrenbe Jan 31, 2025
01e8950
Merge remote-tracking branch 'origin/main' into feature/superset-opa-…
labrenbe Jan 31, 2025
a3dc945
address feedback on PR
labrenbe Feb 6, 2025
7cd8907
add readme and remove opa client
labrenbe Feb 6, 2025
d8ff055
fix changelog
labrenbe Feb 7, 2025
be486c9
Merge remote-tracking branch 'origin/main' into feature/superset-opa-…
labrenbe Feb 7, 2025
fc1b5b5
Merge remote-tracking branch 'origin/main' into feature/superset-opa-…
labrenbe Feb 7, 2025
a4e345f
Merge branch 'main' into feature/superset-opa-integration
razvan Feb 18, 2025
ea94470
refactor opa authorizer to cache resolved roles
razvan Feb 19, 2025
95a92ad
poetry install doesn't find python. use sync instead
razvan Feb 19, 2025
81b783a
do not default to the Public role
razvan Feb 19, 2025
5a79599
pin poetry version and use heredoc syntax
razvan Feb 20, 2025
6874127
do not mutate user roles anymore
razvan Feb 20, 2025
fc2372c
use the correct SQLAlchemy session to update user roles
razvan Feb 22, 2025
31f7b1e
docs and silence some checker errors
razvan Feb 23, 2025
8cc5029
Merge branch 'main' into feature/superset-opa-integration
razvan Feb 24, 2025
c0ef2f8
fix project dependencies
razvan Feb 25, 2025
27d151c
change log level to debug
razvan Feb 25, 2025
37689c2
clarify doc
razvan Feb 25, 2025
c1709c2
cleanup roles before updating
razvan Feb 25, 2025
39035a8
do not raise exception if role doesn't exist
razvan Feb 25, 2025
2fcacf6
update doc
razvan Feb 25, 2025
889d9e4
Set user roles in "update_user_auth_stat" instead of "get_user_roles"
siegfriedweber Feb 25, 2025
4c83924
remove auth_opa_package
razvan Feb 26, 2025
49ef38d
Merge branch 'main' into feature/superset-opa-integration
razvan Feb 26, 2025
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ All notable changes to this project will be documented in this file.
- airflow: Add OPA support to Airflow ([#978]).
- nifi: Activate `include-hadoop` profile for NiFi version 2.* ([#958]).
- nifi: Add NiFi hadoop Azure and GCP libraries ([#943]).
- superset: Add role mapping from OPA ([#979]).
- base: Add containerdebug tool ([#928], [#959]).
- tools: Add the package util-linux-core ([#952]).
util-linux-core contains a basic set of Linux utilities, including the
@@ -59,6 +60,7 @@ All notable changes to this project will be documented in this file.
[#935]: https://github.com/stackabletech/docker-images/pull/935
[#962]: https://github.com/stackabletech/docker-images/pull/962
[#978]: https://github.com/stackabletech/docker-images/pull/978
[#979]: https://github.com/stackabletech/docker-images/pull/979
[#980]: https://github.com/stackabletech/docker-images/pull/980
[#981]: https://github.com/stackabletech/docker-images/pull/981
[#982]: https://github.com/stackabletech/docker-images/pull/982
36 changes: 34 additions & 2 deletions superset/Dockerfile
@@ -3,6 +3,36 @@

FROM stackable/image/statsd_exporter AS statsd_exporter-builder

FROM stackable/image/stackable-base AS opa-authorizer-builder

ARG PYTHON

COPY superset/stackable/opa-authorizer /tmp/opa-authorizer

RUN <<EOF
microdnf update
microdnf install \
gcc \
gcc-c++ \
python${PYTHON} \
python${PYTHON}-devel \
python${PYTHON}-pip
microdnf clean all
rm -rf /var/cache/yum

pip install \
--no-cache-dir \
--upgrade \
poetry==2.1.1 \
pytest==8.3.4

cd /tmp/opa-authorizer

poetry sync
poetry run pytest
poetry build
EOF

FROM stackable/image/vector AS builder

ARG PRODUCT
@@ -12,6 +42,7 @@ ARG TARGETARCH
ARG TARGETOS

COPY superset/constraints-${PRODUCT}.txt /tmp/constraints.txt
COPY --from=opa-authorizer-builder /tmp/opa-authorizer/dist/opa_authorizer-0.1.0-py3-none-any.whl /tmp/

RUN microdnf update \
&& microdnf install \
@@ -62,7 +93,7 @@ RUN python3 -m venv /stackable/app \
# Since https://github.com/stackabletech/superset-operator/pull/530
# admins can add custom configuration to superset_conf.py.
Flask_OIDC==2.2.0 \
-Flask-OpenID==1.3.1\
+Flask-OpenID==1.3.1 \
# Redhat has removed `tzdata` from the ubi-minimal images: see https://bugzilla.redhat.com/show_bug.cgi?id=2223028.
# Superset relies on ZoneInfo (https://docs.python.org/3/library/zoneinfo.html#data-sources) to resolve time zones, and this is done
# by searching first under `TZPATH` (which is empty due to the point above) or for the tzdata python package.
@@ -80,7 +111,8 @@ RUN python3 -m venv /stackable/app \
--upgrade \
python-json-logger \
cyclonedx-bom \
-&& if [ -n "$AUTHLIB" ]; then pip install Authlib==${AUTHLIB}; fi
+&& if [ -n "$AUTHLIB" ]; then pip install Authlib==${AUTHLIB}; fi && \
+pip install --no-cache-dir /tmp/opa_authorizer-0.1.0-py3-none-any.whl

COPY superset/stackable/patches /patches
RUN /patches/apply_patches.sh ${PRODUCT}
2 changes: 2 additions & 0 deletions superset/stackable/opa-authorizer/.gitignore
@@ -0,0 +1,2 @@
**/.pytest_cache
dist
12 changes: 12 additions & 0 deletions superset/stackable/opa-authorizer/README.md
@@ -0,0 +1,12 @@
# Superset OPA authorizer

Custom Superset security manager that syncs role mappings from an Open Policy
Agent instance to Superset.

[Poetry](https://python-poetry.org/) is used to build the project:

    poetry build

The unit tests can be run as follows:

    poetry run pytest
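
To make the configuration surface concrete, here is a minimal sketch of what a `superset_config.py` for this manager might contain. It is not taken from this PR: the setting names are the keys read in `OpaSupersetSecurityManager.__init__`, the values are illustrative, and the import path in the comment is an assumption based on the package layout. `CUSTOM_SECURITY_MANAGER` is Superset's standard hook for replacing the security manager.

```python
# superset_config.py (sketch; values are illustrative, not prescribed by the PR)

# In a real deployment the manager class would be registered like this
# (import path assumed from the package layout):
# from opa_authorizer.opa_manager import OpaSupersetSecurityManager
# CUSTOM_SECURITY_MANAGER = OpaSupersetSecurityManager

# Base URL of the OPA server. No trailing slash is used here because the
# manager appends "/v1/data/<package>/<rule>" itself.
AUTH_OPA_REQUEST_URL = "http://opa:8081"

# Rego package and rule that return the list of role names for a user.
AUTH_OPA_PACKAGE = "superset"
AUTH_OPA_RULE = "user_roles"

# Timeout for the HTTP request to OPA, in seconds.
AUTH_OPA_REQUEST_TIMEOUT = 10

# Size and time-to-live (in seconds) of the cache of resolved roles.
AUTH_OPA_CACHE_MAXSIZE = 1000
AUTH_OPA_CACHE_TTL_IN_SEC = 30
```

Any key that is left out falls back to the corresponding `*_DEFAULT` class attribute of the manager.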
Empty file.
196 changes: 196 additions & 0 deletions superset/stackable/opa-authorizer/opa_authorizer/opa_manager.py
@@ -0,0 +1,196 @@
"""
Custom security manager for Superset.

Assigns OPA roles to a user. The roles and their permissions must exist in the
Superset database.
"""

import logging
from dataclasses import dataclass
from typing import Optional

import requests
from cachetools import TTLCache, cachedmethod
from flask import current_app, g
from flask_appbuilder import AppBuilder
from flask_appbuilder.security.sqla.models import Role, User
from overrides import override
from sqlalchemy.orm.session import Session
from superset.security import SupersetSecurityManager

log = logging.getLogger(__name__)


class OpaError(Exception):
    pass


class SupersetError(Exception):
    pass


@dataclass
class OpaResponse:
    roles: list[str]


def opa_response_from_json(json: dict[str, object]) -> OpaResponse:
    """Converts a JSON object to an OpaResponse object."""
    if "result" in json:
        if type(json["result"]) is list:
            return OpaResponse(roles=json["result"])

    raise OpaError(f"Invalid OPA response: [{json}]")


class OpaSupersetSecurityManager(SupersetSecurityManager):
    """
    Custom security manager that syncs role mappings from Open Policy Agent to Superset.
    """

    AUTH_OPA_CACHE_MAXSIZE_DEFAULT: int = 1000
    AUTH_OPA_CACHE_TTL_IN_SEC_DEFAULT: int = 30
    AUTH_OPA_REQUEST_URL_DEFAULT: str = "http://opa:8081/"
    AUTH_OPA_REQUEST_TIMEOUT_DEFAULT: int = 10
    AUTH_OPA_PACKAGE_DEFAULT: str = "superset"
    AUTH_OPA_RULE_DEFAULT: str = "user_roles"

    def __init__(self, appbuilder: AppBuilder):
        super().__init__(appbuilder)

        config = appbuilder.get_app.config

        self.role_cache: TTLCache[str, set[Role]] = TTLCache(
            maxsize=config.get(
                "AUTH_OPA_CACHE_MAXSIZE", self.AUTH_OPA_CACHE_MAXSIZE_DEFAULT
            ),
            ttl=config.get(
                "AUTH_OPA_CACHE_TTL_IN_SEC", self.AUTH_OPA_CACHE_TTL_IN_SEC_DEFAULT
            ),
        )

        self.auth_opa_url: str = config.get(
            "AUTH_OPA_REQUEST_URL", self.AUTH_OPA_REQUEST_URL_DEFAULT
        )
        self.auth_opa_package: str = config.get(
            "AUTH_OPA_PACKAGE", self.AUTH_OPA_PACKAGE_DEFAULT
        )
        self.auth_opa_rule: str = config.get(
            "AUTH_OPA_RULE", self.AUTH_OPA_RULE_DEFAULT
        )
        self.auth_opa_request_timeout: int = current_app.config.get(
            "AUTH_OPA_REQUEST_TIMEOUT", self.AUTH_OPA_REQUEST_TIMEOUT_DEFAULT
        )

        self.opa_session: requests.Session = requests.Session()

    @override
    def get_user_roles(self, user: Optional[User] = None) -> list[Role]:
        """
        Retrieves a user's roles from an Open Policy Agent instance, updating the
        user-role mapping in Superset's database in the process.

        :returns: A list of roles.
        """
        if not user:
            user = g.user

        if user:
            resolved_opa_roles = self.roles(user)

            self.merge_user_roles(user, resolved_opa_roles)

            return resolved_opa_roles
        else:
            raise Exception("Cannot get roles without a user.")

    @cachedmethod(lambda self: self.role_cache)
    def roles(self, user: User) -> list[Role]:
        """
        Retrieves a user's role names from an Open Policy Agent instance.
        Maps these names to existing Role objects in the Superset database and
        possibly updates the user entity.
        The result is cached.
        """
        opa_role_names = self.opa_get_user_roles(user.username)
        result: list[Role] = self.resolve_user_roles(user, opa_role_names)
        return result

    def merge_user_roles(self, user: User, roles: list[Role]):
        """
        Updates the roles of a user in the Superset database if needed.
        """
        if self.superset_roles_outdated(user.roles, roles):
            user.roles = roles
            # We need to use the same SQLA Session that was used to create the object
            sqla_session = Session.object_session(user)
            sqla_session.merge(user)
            sqla_session.commit()

    def superset_roles_outdated(
        self, superset_roles: list[Role], opa_roles: list[Role]
    ) -> bool:
        superset_role_set: set[str] = set([role.name for role in superset_roles])
        opa_role_set: set[str] = set([role.name for role in opa_roles])
        return superset_role_set != opa_role_set

    def opa_get_user_roles(self, username: str) -> list[str]:
        """
        Queries an Open Policy Agent instance for the roles of a given user.

        :returns: A list of role names assigned to the user or an empty list.
        """
        input = {"input": {"username": username}}
        try:
            # Strip a trailing slash so the default URL does not yield "//v1/...".
            base_url = self.auth_opa_url.rstrip("/")
            req_url = f"{base_url}/v1/data/{self.auth_opa_package}/{self.auth_opa_rule}"
            response = self.call_opa(
                url=req_url,
                json=input,
                timeout=self.auth_opa_request_timeout,
            )

            opa_response: OpaResponse = response.json(
                object_hook=opa_response_from_json
            )

            log.info(f"OPA role names for user [{username}]: [{opa_response.roles}]")

            return opa_response.roles

        except Exception as e:
            log.error("Failed to get OPA role names", exc_info=e)
            return []

    def call_opa(self, url: str, json: dict, timeout: int) -> requests.Response:
        return self.opa_session.post(
            url=url,
            json=json,
            timeout=timeout,
        )

    def resolve_user_roles(self, user: User, roles: list[str]) -> list[Role]:
        """
        Given a user object and a list of OPA role names, return the Role objects
        that must be assigned to this user.

        The user object is only needed to ensure that the Role objects are resolved
        using the same SQLAlchemy session as the user object.

        The Session object assigned to the SecurityManager is apparently not the same
        Session as the one used by the FAB login.
        """
        result: list[Role] = list()
        sqla_session = Session.object_session(user)
        superset_roles = sqla_session.query(Role).all()
        for role_name in roles:
            found = False

            for role in superset_roles:
                if role.name == role_name:
                    result.append(role)
                    log.info(f"Resolved Superset role [{role_name}].")
                    found = True

            if not found:
                raise SupersetError(f"Superset role [{role_name}] does not exist.")
        return result
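
The request and response handling in `opa_get_user_roles` can be tried without Superset or a running OPA server. The following standard-library sketch is an illustration, not code from this PR: `build_opa_url` and `parse_roles` are hypothetical helper names, and the sample response bodies are made up. It shows how the Data API URL is assembled and how the `object_hook` turns a `{"result": [...]}` body into a list of role names, falling back to an empty list on an unexpected shape.

```python
import json
from dataclasses import dataclass


class OpaError(Exception):
    """Raised when the OPA response does not have the expected shape."""


@dataclass
class OpaResponse:
    roles: list[str]


def build_opa_url(base_url: str, package: str, rule: str) -> str:
    """Hypothetical helper: joins the parts of an OPA Data API URL."""
    # Stripping the trailing slash avoids a double slash in the path.
    return f"{base_url.rstrip('/')}/v1/data/{package}/{rule}"


def opa_response_from_json(obj: dict) -> OpaResponse:
    """Mirrors the check in opa_manager.py: 'result' must be a list."""
    if isinstance(obj.get("result"), list):
        return OpaResponse(roles=obj["result"])
    raise OpaError(f"Invalid OPA response: [{obj}]")


def parse_roles(body: str) -> list[str]:
    """Hypothetical helper: decodes a response body, or returns no roles."""
    try:
        # object_hook is called for every decoded JSON object, so this
        # approach assumes the body contains only the top-level object.
        return json.loads(body, object_hook=opa_response_from_json).roles
    except OpaError:
        return []


url = build_opa_url("http://opa:8081/", "superset", "user_roles")
roles = parse_roles('{"result": ["Admin", "Gamma"]}')
no_roles = parse_roles("{}")  # e.g. the rule is undefined for this user
```

Because the hook runs on nested objects as well, a body such as `{"result": [{"name": "Admin"}]}` is rejected by the hook before the top-level object is seen, and the sketch returns an empty list for it.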