Skip to content

Conversation

@mauricioharley
Copy link

@mauricioharley mauricioharley commented Sep 19, 2025

Implement a comprehensive adoption framework for migrating Barbican service from OpenStack 17.1 to RHOSO 18 while preserving Proteccio Hardware Security Module (HSM) integration.

This PR introduces extends the Barbican adoption role and supporting infrastructure for environments that use Proteccio HSM with Barbican key management service. The standard data-plane-adoption framework does not support HSM backends, making this specialized approach necessary to preserve HSM integration and access to existing secrets.

Fixes: OSPRH-18981

@openshift-ci
Copy link

openshift-ci bot commented Sep 19, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign holser for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/92735661b1e04327a88d672ca5d67c0d

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph RETRY_LIMIT in 17m 33s
adoption-standalone-to-crc-no-ceph RETRY_LIMIT in 18m 49s
✔️ adoption-docs-preview SUCCESS in 1m 25s

@mauricioharley
Copy link
Author

recheck

@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/a829108aa7ae4509b3ae71edc05ea1ac

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 3h 05m 12s
✔️ adoption-standalone-to-crc-no-ceph SUCCESS in 3h 11m 10s
✔️ adoption-docs-preview SUCCESS in 1m 31s

@mauricioharley
Copy link
Author

recheck

@mauricioharley mauricioharley requested review from jistr and xek September 23, 2025 14:26
@mauricioharley mauricioharley force-pushed the adopt_proteccio_exe branch 6 times, most recently from ce29245 to 7288985 Compare September 29, 2025 13:41
@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/8c672d0616ff4a80917a0f85cd7b96fd

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 45m 15s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 46m 13s
✔️ adoption-docs-preview SUCCESS in 1m 33s

@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/cd1667bf1e184e3bbdbfd2d9b934642a

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 50m 11s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 49m 17s
✔️ adoption-docs-preview SUCCESS in 1m 22s

# Check for HSM configuration
HSM_CONFIG=$($CONTROLLER1_SSH "sudo grep -A 20 '\[p11_crypto_plugin\]' /var/lib/config-data/puppet-generated/barbican/etc/barbican/barbican.conf || echo 'NO_HSM_CONFIG'")
# Determine if HSM is configured
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not clear to me that the config file would not contain pkcs11 config even if we're not using pkcs11.
Correctly identifying exactly what the config is - is tricky - simply because there are two ways to configure this - one for a single secret store and one for multiple secret stores. I think the multiple secret store way is probably the default at this point.

Regardless, we don't need a general parsing mechanism because we can specify exactly what we want the config to be - whatever is what tripleo configures. I don't remember what it does - but you should be able to see and code accordingly.

The approach I would take (for multiple secret stores) is -

name: check if pkcs11 plugin is enabled

  • look at the config value: secretstore: stores_lookup_suffix
  • this value should be one of "luna, atos, ncipher" (or something like that)
  • you can register this variable as the backend_type

name: patch osp-secret with kek

  • just keep the old task here (it works and has been tested)

block:
if: backend_type = "simple_crypto"

name: patch osp-secret with kek
- just keep the old task here (it works and has been tested)

name:  patch osp-secret with kek
  • same task as before

block:
if : backend_type != simple_crypto"

name: look for hmac_label
get from config entry : p11_crypto_plugin:hmac_label

name: look for mkek_label
get from config as above

name: look for pk11 pin(login)
get from config as above

name: create vendor specific config file

(as you've done - passing in the pin, hmac and mkek labels)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can even be more prescriptive and have simpler code.

Assume that if the person setting up the job knows what parameters to set. They will then need to set
hsm_type: "luna, proteccio,.." or none. You can also assume the entries for the login and labels are correct as specified.

block:
if hsm_type == None
name: patch osp-secret with kek
- just keep the old task here (it works and has been tested)

name: patch osp-secret with kek
-- same as before

block:
if hsm_type != None
name: patch with hsm specific template
-- as you have done

- HSM Detected in Source: {{ barbican_detected_hsm }}
- Adoption Type: {{ 'HSM-aware' if (barbican_hsm_enabled and barbican_detected_hsm) else 'Standard' }}
- name: warn if HSM detected but not enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a lot of validation logic here to determine if the hsm is enabled and if it is detected and so on.
This is - I suppose - nice to have - but you should recall that we're setting up the test environment and the test job completely.

I'd rather have a clean initial implementation that avoids lots of extra variables and extraneous tests so we can focus on what is working.

fi
echo "HSM prerequisites validated successfully"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this whole section is proteccio specific testing. But its also stuff that needed to be done as preparation before the adoption started. So I'm not sure what testing here would do.

- name: use vendor override if provided
ansible.builtin.set_fact:
detected_hsm_vendor: "{{ hsm_vendor_override }}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what this is - lets keep it simple.

- hsm_vendor_override != ""
- barbican_hsm_enabled and barbican_detected_hsm

- name: verify HSM functionality when enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, there is nothing specific here to HSM testing. I thought that we already tested storing and retrieving a secret somewhere else in the adoption tasks. @xek please confirm.

We do need to confirm that we can store and retrieve secrets - but this is not specific to HSM. From a user's point of view, the backend is transparent.

database: BarbicanDatabasePassword
service: BarbicanPassword
simplecryptokek: BarbicanSimpleCryptoKEK
{% if barbican_hsm_enabled %}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - if we're using hsm_specific templates for the patches , why do we need this here?

@@ -0,0 +1,84 @@
---
# Proteccio HSM Infrastructure Setup (Fully Parameterized)
- name: Setup Proteccio HSM Infrastructure
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan to run this as a pre-script?

An alternative would be to have this as a playbook under barbican_adoption/tasks and then have it be called from there before you patch the control plane.

- barbican_dest_image_tag

- name: Validate Proteccio prerequisites exist
ansible.builtin.stat:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this file going to come from? How is it going to get onto the test machine?
I think what you need here is a curl to retrieve this file from a specific URL - if the proteccio role doesn't already have code to retrieve this file.

@@ -0,0 +1,54 @@
---
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the way the adoption code is invoked. see test-minimal to see how the code is invoked.

Copy link
Contributor

@vakwetu vakwetu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are lots and lots of variables here. To be honest, that makes this difficult to review.

What we want as a first pass is a simple patch that does what is needed. All the code that detects the hsm_type and then validates that it matches the defined type that is passed in etc. is not needed.

Try to reduce to the smallest amount of required parameters. That will make this easier to incorporate into our tests.

@mauricioharley mauricioharley force-pushed the adopt_proteccio_exe branch 2 times, most recently from bbe1849 to d3b3d73 Compare October 2, 2025 02:04
Extend the existing barbican_adoption role with minimal HSM support for Proteccio integration.

Fixes: OSPRH-18981

Signed-off-by: Mauricio Harley <[email protected]>
@softwarefactory-project-zuul
Copy link

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/444d98ceb23b4e17a0468f14ee9e8354

✔️ noop SUCCESS in 0s
✔️ adoption-standalone-to-crc-ceph SUCCESS in 3h 07m 47s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 34m 51s

@mauricioharley
Copy link
Author

recheck

Copy link
Contributor

@vakwetu vakwetu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Waiting to see how all this gets put together. Where is the code that generates the new barbican images being called?

@github-actions
Copy link

This PR is stale because it has been for over 15 days with no activity.
Remove stale label or comment or this PR will be closed in 7 days.

@openshift-merge-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@github-actions github-actions bot removed the Stale label Oct 23, 2025
@github-actions
Copy link

github-actions bot commented Nov 7, 2025

This PR is stale because it has been for over 15 days with no activity.
Remove stale label or comment or this PR will be closed in 7 days.

@softwarefactory-project-zuul
Copy link

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/data-plane-adoption for 1059,365cbaec3f08852dab6a8b11b50b337f7399bc6b

@openshift-ci
Copy link

openshift-ci bot commented Nov 14, 2025

@mauricioharley: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/images 365cbae link true /test images
ci/prow/unit 365cbae link true /test unit

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants