Add Troubleshooting Guide and Improve Installer Script Variable Validation #234
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
taliandre49
wants to merge
10
commits into
ocp-power-automation:devel
from
taliandre49:update_readme_fix
+180
−1
Changes from 6 commits
Commits
10 commits
f870e01
adding logic to give more details on version of rhocs
0f85b69
addinng a troubleshooting document and updating readme accordingly
f1e3688
updating comments for clarity
7afea67
updating for reccomened suggestions from PR comments (i.e blank spaces)
ea9e92b
updating release versions to 4.19
06665e3
updating to account for signature check
d04dcc4
updating troubleShooting guide for uniformity and clarity
838f5dc
incoporating changes after review
b5f41b5
updating script from 4.16 to 4.20
e80bef9
updating script to resolve PR comment, removing redundant checking ad…
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,183 @@ | ||
|
|
||
| # OpenShift on IBM PowerVS: Common Issues and Resolutions | ||
|
|
||
| This document lists common issues encountered when deploying OpenShift on IBM PowerVS using the `openshift-install-powervs` wrapper, along with their causes and resolutions. | ||
|
|
||
| --- | ||
|
|
||
| ## Terraform Stored Resource IDs | ||
|
|
||
| **Error:** | ||
|
|
||
| Error: cannot find resource with id <resource-id> | ||
|
|
||
| **Cause:** | ||
| Terraform retains deleted PowerVS resource IDs in its state or backup files. This often occurs after a Terraform rerun when instances or resources have changed in PowerVS. | ||
|
|
||
|
|
||
| **Resolution:** | ||
|
|
||
| Search for the stale ID in Terraform state or backup files: | ||
|
|
||
| ```bash | ||
| grep -R "<resource-id>" . | ||
| ``` | ||
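A recursive `grep` over the whole working directory can also match logs and notes. As a minimal sketch, assuming the state lives in local files with the usual `*.tfstate` and `*.tfstate.backup` names, the search can be narrowed to those files. The function name `find_state_files_with_id` is hypothetical and not part of the wrapper.

```bash
# Print the Terraform state and backup files under DIR (default: current
# directory) that mention the given resource ID as a literal string.
find_state_files_with_id() {
  local id="$1" dir="${2:-.}"
  grep -rlF --include='*.tfstate' --include='*.tfstate.backup' -- "$id" "$dir"
}

# Example (not run here):
# find_state_files_with_id "<resource-id>" .
```

If the ID is still tracked in the active state, `terraform state list -id=<resource-id>` should print the matching resource address, which is the value `terraform state rm` expects.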
|
|
||
| Remove stale state entries: | ||
|
|
||
| ```bash | ||
|
|
||
| terraform state rm <resource-name> | ||
| ``` | ||
|
|
||
| Re-run the apply: | ||
|
|
||
| ```bash | ||
| terraform apply | ||
| ``` | ||
|
|
||
| To rebuild specific worker or master nodes: | ||
|
|
||
| ```bash | ||
| terraform taint module.nodes.ibm_pi_instance.worker[0] | ||
| terraform apply | ||
| ``` | ||
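On Terraform v0.15.2 and later, `terraform taint` is deprecated in favor of `terraform apply -replace=<address>`, which plans and applies the replacement in one step. The sketch below assumes the same resource addresses as above; the helper name `build_replace_args` is hypothetical.

```bash
# Turn one or more resource addresses into -replace flags, one per line,
# so several nodes can be rebuilt in a single `terraform apply`.
build_replace_args() {
  local addr
  for addr in "$@"; do
    printf -- '-replace=%s\n' "$addr"
  done
}

# Example (not run here). Quote the addresses so the shell does not treat
# the square brackets as a glob pattern.
# mapfile -t flags < <(build_replace_args 'module.nodes.ibm_pi_instance.worker[0]')
# terraform apply "${flags[@]}"
```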
|
|
||
| ## Bastion Node OS Compatibility | ||
|
|
||
| If you hit missing-package errors or an incorrect storage type while using CentOS 10 on the bastion node, switch to CentOS Stream 9. | ||
|
|
||
| Common Issues and Fixes | ||
|
|
||
| ### Missing Required Packages (e.g. Ansible) | ||
|
|
||
| **Error**: | ||
| Missing ansible or dependency packages during setup. | ||
|
|
||
| **Resolution**: | ||
| SSH into the bastion node using the generated key: | ||
| ssh -i id_rsa root@<bastion-external-ip> | ||
| sudo dnf install ansible | ||
|
|
||
| - Note: if the above does not work, you can also install Ansible using Python and pip. | ||
|
|
||
| ### Incorrect Storage Type (e.g. "nfs" not recognized) | ||
|
| **Error**: | ||
|
|
||
| Error: "pi_volume_type" must contain a value from ["ssd", "standard", "tier1", "tier3"], got "" | ||
|
||
|
|
||
|
|
||
| **Resolution**: | ||
| Edit your variables.tf or corresponding .tfvars file: | ||
| bastion_storage_type = "tier3" | ||
|
|
||
| - If needed, change the default `bastion_storage_type` in variables.tf to the storage type you want. | ||
| - Note: you can easily find it by pressing CTRL + W in your editor and searching for `bastion_storage_type`. | ||
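The allowed values in the error message can be checked before Terraform runs. This is a minimal sketch that takes the list `ssd`, `standard`, `tier1`, `tier3` from the error text above; the function name `is_valid_volume_type` is hypothetical, and the list may differ for other provider versions.

```bash
# Succeed only if the argument is one of the volume types named in the
# "pi_volume_type" error message; an empty value fails, as in the error.
is_valid_volume_type() {
  case "$1" in
    ssd|standard|tier1|tier3) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (not run here):
# is_valid_volume_type "tier3" || echo "unsupported bastion_storage_type" >&2
```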
|
|
||
|
|
||
| ## Re-installation / Network Name Conflict | ||
|
|
||
| **Error:** | ||
|
|
||
| Error: Network with name "ocp-net" already exists. | ||
|
|
||
|
|
||
| **Cause:** | ||
| On a subsequent UPI install attempt, Terraform tries to create a network with the same name that already exists. | ||
| PowerVS does not allow duplicate network names—even if the old network is inactive. | ||
|
|
||
| **Resolution:** | ||
|
|
||
| - Log into your PowerVS workspace. | ||
|
|
||
| - Delete or rename the existing ocp-net network or subnet. | ||
|
|
||
| - Re-run the installer: | ||
| ```bash | ||
| ./openshift-install-powervs create | ||
| ``` | ||
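Before re-running, it can help to confirm that the name is really free. The sketch below assumes you can produce a plain list of the workspace's network names, one per line, by whatever means you normally use (console export or CLI). The function name `network_name_taken` is hypothetical, and no specific listing command is assumed.

```bash
# Read network names from stdin, one per line, and succeed if NAME is
# present as an exact, whole-line match.
network_name_taken() {
  local name="$1"
  grep -qxF -- "$name"
}

# Example (not run here); replace the placeholder with your own listing step:
# <command-that-prints-network-names> | network_name_taken "ocp-net" \
#   && echo "ocp-net still exists: delete or rename it first" >&2
```

The exact match matters here: a leftover `ocp-net-old` should not block a new `ocp-net`.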
|
|
||
| ## Remote-Exec Provisioning Errors | ||
|
|
||
| **Error:** | ||
|
|
||
| Terraform remote-exec provisioner failures | ||
|
|
||
|
|
||
| **Cause:** | ||
| These are transient SSH or remote-execution issues that occur during provisioning. | ||
|
|
||
| **Resolution:** | ||
| Re-run Terraform: | ||
|
|
||
| terraform apply | ||
|
|
||
|
|
||
| This typically resolves the issue automatically. | ||
| See the [ocp4-upi-powervs known issues](https://github.com/ocp-power-automation/ocp4-upi-powervs/blob/release-4.6/docs/known_issues.md) for more details. | ||
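Because these failures are transient, the re-run can be wrapped in a small retry loop. This is a sketch rather than part of the wrapper: the function name `retry` and the attempt and delay values are assumptions, and `-auto-approve` skips the interactive confirmation, so use it only where that is acceptable.

```bash
# Run a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example (not run here): up to 3 applies, 30 seconds apart.
# retry 3 30 terraform apply -auto-approve
```

If the same remote-exec step fails on every attempt, the problem is probably not transient and the known issues page is the better next stop.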
|
|
||
| ## LPAR in WARNING State | ||
|
|
||
| **Error:** | ||
|
|
||
| Error: the operation cannot be performed when the lpar health in the WARNING State | ||
|
|
||
|
|
||
| **Cause:** | ||
| Terraform cannot modify instances whose PowerVS LPAR health is in WARNING state. | ||
| This often occurs after partial provisioning, failed networking setup, or API timeouts. | ||
|
|
||
| **Resolution:** | ||
|
|
||
| Check instance health: | ||
| ```bash | ||
| ibmcloud pi instance get <INSTANCE_ID> | ||
| ``` | ||
| **Note**: Because the RSCT daemon is not available for RHCOS, RHCOS instances can show a "Warning" status in the dashboard. This is expected and can be ignored. | ||
|
|
||
| In the console, reboot the affected instance by performing an OS shutdown and then restarting it. | ||
|
|
||
| To rebuild only specific nodes: | ||
| ```bash | ||
|
|
||
| terraform taint module.nodes.ibm_pi_instance.master[1] | ||
| terraform taint module.nodes.ibm_pi_instance.worker[0] | ||
| terraform apply | ||
| ``` | ||
|
|
||
| ## Missing or Outdated Images (RHEL / RHCOS) | ||
|
|
||
| **Error:** | ||
|
|
||
| Error: failed to perform Get Image Operation for image rhcos-4.15 | ||
| [pcloudCloudinstancesImagesGetNotFound] Image does not exist. ID: rhcos-4.15 | ||
|
|
||
| **Cause:** | ||
| Terraform and the PowerVS provider reference image names (e.g. rhcos-4.15, rhel-8.3) that may not exist in your workspace. | ||
| The wrapper may also use the RHEL version for RHCOS images by mistake. | ||
|
|
||
| **Resolution:** | ||
|
|
||
| Option 1 — Import Pre-built Images | ||
|
|
||
| Use pre-built RHCOS and RHEL OVA images from IBM’s public repository. | ||
| See [Christy Norman’s blog](https://community.ibm.com/community/user/blogs/christy-norman/2024/08/06/import-pre-built-red-hat-coreos-ovas-into-powervs) for steps. | ||
|
|
||
| Option 2 — Update variables.tf | ||
|
|
||
| Set available image names manually: | ||
| ```hcl | ||
|
|
||
| variable "rhel_image_name" { | ||
| default = "rhel-9.6" | ||
| } | ||
|
|
||
| variable "rhcos_image_name" { | ||
| default = "rhcos-4.19" | ||
| } | ||
| ``` | ||
| Option 3 — Export Versions Before Running | ||
| `export RELEASE_VER=4.19` | ||
|
|
||
| Ensure RHEL and RHCOS versions are aligned and available. | ||