26 commits
4d3dc09
remove obsolete errors
maltesander Nov 5, 2025
ca2c9d0
compiling - without tls
maltesander Nov 5, 2025
11dafe1
oparized smoke test working
maltesander Nov 7, 2025
af9f8cb
add hive opa example
maltesander Nov 7, 2025
7321b2c
add opa testing to smoke
maltesander Nov 7, 2025
56aa6cd
Merge branch 'main' into feat/enable-opa-authorization
maltesander Nov 7, 2025
5c9c73c
adapted changelog
maltesander Nov 7, 2025
7c7bedf
enable tls
maltesander Nov 8, 2025
c8c9f86
add opa-use-tls dimension
maltesander Nov 8, 2025
2fe3b8c
remove left over trino references
maltesander Nov 8, 2025
0cf0465
started docs
maltesander Nov 9, 2025
e29a06e
formatting
maltesander Nov 9, 2025
e3cf46a
pre commit
maltesander Nov 9, 2025
815a74b
add opa opeartor to test suite
maltesander Nov 12, 2025
32b401c
Update docs/modules/hive/pages/usage-guide/security.adoc
maltesander Nov 17, 2025
01d7bea
add missing docs link
maltesander Nov 17, 2025
5e8b395
review feedback
maltesander Nov 17, 2025
297ad8c
fix broken crd docs url
maltesander Nov 18, 2025
335cc38
regenerate charts
maltesander Nov 18, 2025
975d0dc
add 4.2.0 to tests and supported versions
maltesander Nov 25, 2025
fbffd67
Merge branch 'main' into feat/enable-opa-authorization
maltesander Nov 25, 2025
1495f07
Merge branch 'main' into feat/enable-opa-authorization
maltesander Dec 1, 2025
59de3cf
Merge remote-tracking branch 'origin/main' into feat/enable-opa-autho…
maltesander Dec 3, 2025
2d65eee
Use "org.apache.derby.iapi.jdbc.AutoloadedDriver" for derby in 4.2.0
maltesander Dec 3, 2025
81e584b
fix pre commit
maltesander Dec 3, 2025
cc7ca9d
add document start
maltesander Dec 3, 2025
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.

## [Unreleased]

### Added

- Add OPA authorization using the operator-rs `OpaConfig` ([#652]).

[#652]: https://github.com/stackabletech/hive-operator/pull/652

## [25.11.0] - 2025-11-07

## [25.11.0-rc1] - 2025-11-06
29 changes: 28 additions & 1 deletion deploy/helm/hive-operator/crds/crds.yaml
@@ -34,7 +34,7 @@ spec:
The settings in the `clusterConfig` are cluster wide settings that do not need to be configurable at role or role group level.
properties:
authentication:
description: Settings related to user [authentication](https://docs.stackable.tech/home/nightly/usage-guide/security).
description: Settings related to user [authentication](https://docs.stackable.tech/home/nightly/hive/usage-guide/security).
nullable: true
properties:
kerberos:
@@ -49,6 +49,33 @@ spec:
required:
- kerberos
type: object
authorization:
description: |-
Authorization options for Hive.
Learn more in the [Hive authorization usage guide](https://docs.stackable.tech/home/nightly/hive/usage-guide/security#authorization).
nullable: true
properties:
opa:
description: |-
Configure the OPA stacklet [discovery ConfigMap](https://docs.stackable.tech/home/nightly/concepts/service_discovery)
and the name of the Rego package containing your authorization rules.
Consult the [OPA authorization documentation](https://docs.stackable.tech/home/nightly/concepts/opa)
to learn how to deploy Rego authorization rules with OPA.
nullable: true
properties:
configMapName:
description: |-
The [discovery ConfigMap](https://docs.stackable.tech/home/nightly/concepts/service_discovery)
for the OPA stacklet that should be used for authorization requests.
type: string
package:
description: The name of the Rego package containing the Rego rules for the product.
nullable: true
type: string
required:
- configMapName
type: object
type: object
database:
description: Database connection specification for the metadata database.
properties:
116 changes: 116 additions & 0 deletions docs/modules/hive/pages/usage-guide/security.adoc
@@ -1,5 +1,6 @@
= Security
:description: Secure Apache Hive with Kerberos authentication in Kubernetes. Configure Kerberos server, SecretClass, and access Hive securely with provided guides.
:opa-rego-docs: https://www.openpolicyagent.org/docs/latest/#rego

== Authentication
Currently, the only supported authentication mechanism is Kerberos, which is disabled by default.
@@ -45,3 +46,118 @@ The `kerberos.secretClass` is used to give Hive the possibility to request keyta
=== 5. Access Hive
In case you want to access Hive it is recommended to start up a client Pod that connects to Hive, rather than shelling into the master.
We have an https://github.com/stackabletech/hive-operator/blob/main/tests/templates/kuttl/kerberos/70-install-access-hive.yaml.j2[integration test] for this exact purpose, where you can see how to connect and get a valid keytab.


== Authorization
The Stackable Operator for Apache Hive supports the following authorization methods.

=== Open Policy Agent (OPA)
The Apache Hive metastore can be configured to delegate authorization decisions to an Open Policy Agent (OPA) instance.
More information on the setup and configuration of OPA can be found in the xref:opa:index.adoc[OPA Operator documentation].
To enable OPA authorization for a Hive cluster, add the following section to the cluster configuration:

[source,yaml]
----
spec:
  clusterConfig:
    authorization:
      opa:
        configMapName: opa # <1>
        package: hms # <2>
----
<1> The name of the discovery ConfigMap of your OPA Stacklet (`opa` in this case).
<2> The Rego rule package to use for policy decisions.
This is optional and defaults to the name of the Hive Stacklet.

==== Defining Rego rules
For a general explanation of how rules are written, please refer to the {opa-rego-docs}[OPA documentation].
Authorization with OPA is done using the https://github.com/boschglobal/hive-metastore-opa-authorizer[hive-metastore-opa-authorizer] plugin.

===== OPA Inputs
With each authorization request, Hive sends OPA a payload with the following structure, which is accessible within the Rego rules:

[source,json]
----
{
  "identity": {
    "username": "<user>",
    "groups": ["<group1>", "<group2>"]
  },
  "resources": {
    "database": null,
    "table": null,
    "partition": null,
    "columns": ["col1", "col2"]
  },
  "privileges": {
    "readRequiredPriv": [],
    "writeRequiredPriv": [],
    "inputs": null,
    "outputs": null
  }
}
----
* `identity`: Contains user information.
** `username`: The name of the user.
** `groups`: A list of groups the user belongs to.
* `resources`: Specifies the resources involved in the request.
** `database`: The database object.
** `table`: The table object.
** `partition`: The partition object.
** `columns`: A list of column names involved in the request.
* `privileges`: Details the privileges required for the request.
** `readRequiredPriv`: A list of required read privileges.
** `writeRequiredPriv`: A list of required write privileges.
** `inputs`: Input tables for the request.
** `outputs`: Output tables for the request.

===== Example OPA Rego Rule
Below is a basic Rego rule that demonstrates how to handle the input dictionary sent from the Hive authorizer to OPA:

[source,rego]
----
package hms

default database_allow = false
default table_allow = false
default column_allow = false
default partition_allow = false
default user_allow = false

database_allow if {
    input.identity.username == "stackable"
    input.resources.database.name == "test_db"
}

table_allow if {
    input.identity.username == "stackable"
    input.resources.table.dbName == "test_db"
    input.resources.table.tableName == "test_table"
    input.privileges.readRequiredPriv[0].priv == "SELECT"
}

table_allow if {
    input.identity.username == "stackable"
    input.resources.table.dbName == "test_db"
    input.privileges.writeRequiredPriv[0].priv == "CREATE"
}
----
* `database_allow` grants access if the user is `stackable` and the database is `test_db`.
* `table_allow` grants access if the user is `stackable`, the database is `test_db` and:
** the table is `test_table` and the required read privilege is `SELECT`.
** the required write privilege is `CREATE` without any table restriction.
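
The decision logic described in the bullets above can be modelled outside of OPA to check your reading of the rule. The following is only an illustrative sketch in Rust, not part of the operator or the authorizer plugin: the `Request` struct and the `table_allow` function are hypothetical names, and the struct carries only the fields that the example rule inspects.

```rust
/// Hypothetical, simplified view of the OPA input payload (assumption:
/// only the fields read by the example `table_allow` rule are modelled).
#[derive(Debug, Clone, Default)]
pub struct Request {
    pub username: String,
    pub table_db_name: Option<String>,
    pub table_name: Option<String>,
    pub read_privs: Vec<String>,
    pub write_privs: Vec<String>,
}

/// Mirrors the two `table_allow` rule bodies: access is granted if either
/// body matches, otherwise the default (`false`) applies.
pub fn table_allow(req: &Request) -> bool {
    let is_user = req.username == "stackable";
    let in_test_db = req.table_db_name.as_deref() == Some("test_db");

    // First body: SELECT on test_db.test_table. Like `readRequiredPriv[0]`
    // in the rule, only the first required read privilege is checked.
    let select_on_test_table = req.table_name.as_deref() == Some("test_table")
        && req.read_privs.first().map(String::as_str) == Some("SELECT");

    // Second body: CREATE in test_db, without any table restriction.
    let create_in_test_db =
        req.write_privs.first().map(String::as_str) == Some("CREATE");

    is_user && in_test_db && (select_on_test_table || create_in_test_db)
}
```

Note that, as in the Rego rule, a request whose first required privilege is something other than `SELECT` or `CREATE` is denied even if a later entry in the list would match.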

==== Configuring policy URLs

The policy URLs for `database_allow`, `table_allow`, `column_allow`, `partition_allow`, and `user_allow` can be changed via xref:usage-guide/overrides.adoc#_configuration_properties[configuration overrides] using the following properties in `hive-site.xml`:

* `com.bosch.bdps.opa.authorization.policy.url.database`
* `com.bosch.bdps.opa.authorization.policy.url.table`
* `com.bosch.bdps.opa.authorization.policy.url.column`
* `com.bosch.bdps.opa.authorization.policy.url.partition`
* `com.bosch.bdps.opa.authorization.policy.url.user`
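
As a rough sketch of where such an override could go, the fragment below sets one of the properties at the `metastore` role level. The placement under `configOverrides` follows the usual Stackable override mechanism; the value is a placeholder, not a real URL, and the format the authorizer plugin expects should be taken from its documentation.

```yaml
spec:
  metastore:
    configOverrides:
      hive-site.xml:
        # Placeholder value (assumption): replace with the policy URL
        # expected by the hive-metastore-opa-authorizer plugin.
        com.bosch.bdps.opa.authorization.policy.url.table: "<table-policy-url>"
```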

==== TLS secured OPA cluster

Stackable OPA clusters secured with TLS are supported without further configuration.
The Stackable Hive operator automatically adds the CA certificate from the SecretClass used to secure the OPA cluster to the Hive truststore.
1 change: 1 addition & 0 deletions docs/modules/hive/partials/supported-versions.adoc
@@ -2,6 +2,7 @@
// This is a separate file, since it is used by both the direct Hive-Operator documentation, and the overarching
// Stackable Platform documentation.

- 4.2.0 (experimental)
- 4.1.0 (experimental)
- 4.0.1 (LTS)
- 4.0.0 (deprecated)
89 changes: 89 additions & 0 deletions examples/hive-opa-cluster.yaml
@@ -0,0 +1,89 @@
# helm install postgresql oci://registry-1.docker.io/bitnamicharts/postgresql \
# --version 16.5.0 \
# --namespace default \
# --set image.repository=bitnamilegacy/postgresql \
# --set volumePermissions.image.repository=bitnamilegacy/os-shell \
# --set metrics.image.repository=bitnamilegacy/postgres-exporter \
# --set global.security.allowInsecureImages=true \
# --set auth.username=hive \
# --set auth.password=hive \
# --set auth.database=hive \
# --set primary.extendedConfiguration="password_encryption=md5" \
# --wait
---
apiVersion: hive.stackable.tech/v1alpha1
kind: HiveCluster
metadata:
  name: hive
spec:
  image:
    productVersion: 4.1.0
    pullPolicy: IfNotPresent
  clusterConfig:
    authorization:
      opa:
        configMapName: opa
        package: hms
    database:
      connString: jdbc:postgresql://postgresql:5432/hive
      credentialsSecret: hive-postgresql-credentials
      dbType: postgres
  metastore:
    roleGroups:
      default:
        replicas: 1
        config:
          resources:
            cpu:
              min: 300m
              max: "2"
            memory:
              limit: 5Gi
---
apiVersion: v1
kind: Secret
metadata:
  name: hive-postgresql-credentials
type: Opaque
stringData:
  username: hive
  password: hive
---
apiVersion: opa.stackable.tech/v1alpha1
kind: OpaCluster
metadata:
  name: opa
spec:
  image:
    productVersion: 1.8.0
  servers:
    config:
      logging:
        enableVectorAgent: false
        containers:
          opa:
            console:
              level: INFO
            file:
              level: INFO
            loggers:
              decision:
                level: INFO
    roleGroups:
      default: {}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: hive-opa-bundle
  labels:
    opa.stackable.tech/bundle: "hms"
data:
  hive.rego: |
    package hms

    database_allow = true
    table_allow = true
    column_allow = true
    partition_allow = true
    user_allow = true
22 changes: 17 additions & 5 deletions rust/operator-binary/src/command.rs
@@ -1,16 +1,20 @@
 use stackable_operator::crd::s3;
 
-use crate::crd::{
-    DB_PASSWORD_ENV, DB_PASSWORD_PLACEHOLDER, DB_USERNAME_ENV, DB_USERNAME_PLACEHOLDER,
-    HIVE_METASTORE_LOG4J2_PROPERTIES, HIVE_SITE_XML, STACKABLE_CONFIG_DIR,
-    STACKABLE_CONFIG_MOUNT_DIR, STACKABLE_LOG_CONFIG_MOUNT_DIR, STACKABLE_TRUST_STORE,
-    STACKABLE_TRUST_STORE_PASSWORD, v1alpha1,
+use crate::{
+    config::opa::HiveOpaConfig,
+    crd::{
+        DB_PASSWORD_ENV, DB_PASSWORD_PLACEHOLDER, DB_USERNAME_ENV, DB_USERNAME_PLACEHOLDER,
+        HIVE_METASTORE_LOG4J2_PROPERTIES, HIVE_SITE_XML, STACKABLE_CONFIG_DIR,
+        STACKABLE_CONFIG_MOUNT_DIR, STACKABLE_LOG_CONFIG_MOUNT_DIR, STACKABLE_TRUST_STORE,
+        STACKABLE_TRUST_STORE_PASSWORD, v1alpha1,
+    },
 };
 
 pub fn build_container_command_args(
     hive: &v1alpha1::HiveCluster,
     start_command: String,
     s3_connection_spec: Option<&s3::v1alpha1::ConnectionSpec>,
+    hive_opa_config: Option<&HiveOpaConfig>,
 ) -> Vec<String> {
     let mut args = vec![
         // copy config files to a writeable empty folder in order to set s3 access and secret keys
@@ -51,6 +55,14 @@ pub fn build_container_command_args(
         }
     }
 
+    if let Some(opa) = hive_opa_config {
+        if let Some(ca_cert_dir) = opa.tls_ca_cert_mount_path() {
+            args.push(format!(
+                "cert-tools generate-pkcs12-truststore --pkcs12 {STACKABLE_TRUST_STORE}:{STACKABLE_TRUST_STORE_PASSWORD} --pem {ca_cert_dir}/ca.crt --out {STACKABLE_TRUST_STORE} --out-password {STACKABLE_TRUST_STORE_PASSWORD}"
+            ));
+        }
+    }
+
     // db credentials
     args.extend([
         format!("echo replacing {DB_USERNAME_PLACEHOLDER} and {DB_PASSWORD_PLACEHOLDER} with secret values."),
1 change: 1 addition & 0 deletions rust/operator-binary/src/config/mod.rs
@@ -1 +1,2 @@
pub mod jvm;
pub mod opa;