Skip to content

Conversation

YCShen1010
Copy link
Contributor

@YCShen1010 YCShen1010 commented Aug 14, 2025

What this PR does / why we need it:
Adds automatic IDP URL updating functionality to the common-services database restore script. After restoring the account_iam database, the it now automatically updates IDP configuration URLs to match the current cluster domain.

Which issue(s) this PR fixes:
Fixes # https://github.ibm.com/IBMPrivateCloud/roadmap/issues/67335

How to test:

  1. oc login into a cluster and install UM operator

  2. Manually change the idp in database to other URL in common-service-db terminal by using command

    psql -U postgres
     \l
     \c account_iam
     \dn
     \dt accountiam.*;
     SELECT * FROM accountiam.idp_config;
    
    UPDATE accountiam.idp_config
    SET idp = 'https://test-idp-url.example.com/idprovider/v1/auth'
    
  3. Create a simple test script for the new functions and execute.

  4. Check if the idp URL changes back

Test result:

➜  common-service-db git:(update_idp) ✗ bash test-idp-update.sh mcsp  
Getting primary PostgreSQL pod...
Primary pod: common-service-db-1
Checking if account_iam database exists...
Current IDP configuration BEFORE update:
                 uid                  |    realm     |                                  idp                                   |          modified_ts          
--------------------------------------+--------------+------------------------------------------------------------------------+-------------------------------
 0a928673-f2a7-4804-b781-ffa9f54e7985 | PrimaryRealm | https://test-idp-url.example.com/idprovider/v1/auth | 2025-08-14 21:38:05.730994+00
 ed890909-2896-4a7a-951c-954f524892ad | SAML         | https://test-idp-url.example.com/idprovider/v1/auth | 2025-08-14 21:38:05.730994+00
 2d13f58e-19bb-4601-9d29-e8d9c295cf71 | OIDC         | https://test-idp-url.example.com/idprovider/v1/auth | 2025-08-14 21:38:05.730994+00
 d26bb99a-ed17-4b4c-8918-04ef966c52ee | PrimaryRealm | https://test-idp-url.example.com/idprovider/v1/auth | 2025-08-14 21:38:05.730994+00
(4 rows)

Getting cluster domain...
Detected cluster domain: apps.cutie1.cp.fyre.ibm.com
New IDP URL will be: https://cp-console.mcsp.apps.cutie1.cp.fyre.ibm.com/idprovider/v1/auth
Updating IDP configuration...
UPDATE 4
IDP configuration AFTER update:
                 uid                  |    realm     |                                  idp                                   |          modified_ts          
--------------------------------------+--------------+------------------------------------------------------------------------+-------------------------------
 0a928673-f2a7-4804-b781-ffa9f54e7985 | PrimaryRealm | https://cp-console.mcsp.apps.cutie1.cp.fyre.ibm.com/idprovider/v1/auth | 2025-08-14 21:55:25.937126+00
 ed890909-2896-4a7a-951c-954f524892ad | SAML         | https://cp-console.mcsp.apps.cutie1.cp.fyre.ibm.com/idprovider/v1/auth | 2025-08-14 21:55:25.937126+00
 2d13f58e-19bb-4601-9d29-e8d9c295cf71 | OIDC         | https://cp-console.mcsp.apps.cutie1.cp.fyre.ibm.com/idprovider/v1/auth | 2025-08-14 21:55:25.937126+00
 d26bb99a-ed17-4b4c-8918-04ef966c52ee | PrimaryRealm | https://cp-console.mcsp.apps.cutie1.cp.fyre.ibm.com/idprovider/v1/auth | 2025-08-14 21:55:25.937126+00
(4 rows)

Test completed!

@YCShen1010 YCShen1010 requested a review from bluzarraga August 14, 2025 22:00
@ibm-ci-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: YCShen1010

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bluzarraga
Copy link
Member

@YCShen1010 can you share the test script you used with me?

Copy link
Member

@bluzarraga bluzarraga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my main issue is that we don't want to be adding new Roles/Rolebindings in new namespaces to obtain the domain info. Preferably we want to get this value in a way that requires less permissions. One place to check for the cluster domain value is the configmap ibmcloud-cluster-info in the services namespace. We do not BR this configmap, it is generated by the IM operator so it should always contain the correct value. We may just have to make sure the configmap exists.

The logic you have in the pr looks like it would work (I haven't gotten a chance to try it yet) but to know if this PR would work, we want to see it run as a restore hook. The script you have to test is a good proof of concept to make sure the logic is doing what you want it to but somethings we will only see when testing more along the lines of how they will be used. For example, the test script would not have picked up the permissions problem but running the restore would have.

info "Updating IDP configuration with actual cluster domain..."

# Get the cluster domain from the management ingress
CLUSTER_DOMAIN=$(oc get route console -n openshift-console -o jsonpath='{.spec.host}' | sed 's/^console-openshift-console\.//')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add permissions to the BR role to enable this? Right now the br resources are limited to the services namespace. Can we determine this value from somewhere else?

# Check if account_iam database exists
ACCOUNT_IAM_EXISTS=$(oc -n $CSDB_NAMESPACE exec -t $CNPG_PRIMARY_POD -c postgres -- psql -U postgres -c "\list" | grep "account_iam" || echo False)

if [[ $ACCOUNT_IAM_EXISTS != "False" ]]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should include the call to check the cluster domain value inside this if. That way we only look for this value when we need it


success "IDP configuration updated successfully."
else
warning "account_iam database not found, skipping IDP configuration update."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick but I think this should be info as right now it is more likely to not be installed given current adopters' usage. I think we could change the message from "skipping IDP configuration update" to something less ominous. If you are reading these logs and don't know what you are skipping, you may think that the backup and restore process did not work properly because the IDP config wasn't updated when what really happened was the MCSP specific URL wasn't updated. Paired with the warning it could send the wrong message

Copy link
Member

@bluzarraga bluzarraga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting this error when I try to run the BR hook:

pg_dump: error: connection to server on socket "/controller/run/.s.PGSQL.5432" failed: FATAL:  database "cloudpak" does not exist
command terminated with exit code 1

I think I remember something about removing the cloudpak database from the cs db setup since it was unused. Can we go ahead and remove it from this script as well? Removing it in my cluster moved the backup forward but eventually my restore failed because of use of xargs


if [[ $ACCOUNT_IAM_EXISTS != "False" ]]; then
# Check current IDP configuration
CURRENT_IDP=$(oc -n $CSDB_NAMESPACE exec -t $CNPG_PRIMARY_POD -c postgres -- psql -U postgres -d account_iam -t -c "SELECT DISTINCT idp FROM accountiam.idp_config WHERE idp LIKE '%/idprovider/v1/%' LIMIT 1;" | xargs || echo "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just finished a full BR with these changes and realized that you are using xargs here. Unfortunately, we cannot use xargs because it has a number of security vulnerabilities and we have been unable to clear it with Tod for inclusion in the cpfs-utils image. We'll need to update this command to not rely on xargs. Very annoying I know but unfortunately not very flexible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used another approach excluding xargs to get the same IDP and have pushed the commit now, thanks for pointing this out.

@bluzarraga
Copy link
Member

it looks like this pr was supposed to have removed the cloud pak database https://github.com/IBM/ibm-common-service-operator/pull/2532/files so I am not sure why it is still present here

@YCShen1010
Copy link
Contributor Author

YCShen1010 commented Aug 26, 2025

Hi @bluzarraga, I have done the rebase, the cloudpak database is not in the script anymore.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants