| 1 | +--- |
| 2 | +title: "Domain Secret Mismatch" |
| 3 | +date: 2020-03-02T08:08:19-04:00 |
| 4 | +draft: false |
| 5 | +weight: 21 |
| 6 | +--- |
| 7 | + |
| 8 | +> One or more WebLogic Server instances in my domain will not start, and the domain resource `status` or the pod log reports errors like this: |
| 9 | +> |
| 10 | +> ***Domain secret mismatch. The domain secret in 'DOMAIN_HOME/security/SerializedSystemIni.dat' where DOMAIN_HOME='$DOMAIN_HOME' does not match the domain secret found by the introspector job. WebLogic requires that all WebLogic servers in the same domain share the same domain secret.*** |
| 11 | + |
| 12 | +When you see these kinds of errors, it means that the WebLogic domain directory's security configuration files have changed in an incompatible way between when the operator scanned |
| 13 | +the domain directory, which occurs during the "introspection" phase, and when the server instance attempted to start. |
| 14 | + |
| 15 | +To understand the "incompatible domain security configuration" type of failure, it's important to review the contents of the |
| 16 | +[WebLogic domain directory](https://docs.oracle.com/middleware/12213/wls/DOMCF/config_files.htm#DOMCF140). Each WebLogic |
| 17 | +domain directory contains a "security" subdirectory that contains a file called "SerializedSystemIni.dat". This file contains |
| 18 | +security data to bootstrap the WebLogic domain, including a domain-specific encryption key. |
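
Because this file carries the domain-specific encryption key, one way to check whether two copies of a domain directory still share the same domain secret is to compare the file's bytes. The following is a minimal sketch of that idea, not the operator's own check: the function names are hypothetical, and the choice of SHA-256 is an assumption made for illustration.

```python
import hashlib
from pathlib import Path

# Location of the secret file relative to DOMAIN_HOME, as described above.
SECRET_FILE = Path("security") / "SerializedSystemIni.dat"


def domain_secret_fingerprint(domain_home: str) -> str:
    """Return a SHA-256 hex digest of the domain's SerializedSystemIni.dat.

    Hypothetical helper for illustration; raises FileNotFoundError if the
    file is missing from the given domain directory.
    """
    data = (Path(domain_home) / SECRET_FILE).read_bytes()
    return hashlib.sha256(data).hexdigest()


def same_domain_secret(domain_home_a: str, domain_home_b: str) -> bool:
    """Return True when both domain directories hold byte-identical secret files."""
    return domain_secret_fingerprint(domain_home_a) == domain_secret_fingerprint(domain_home_b)
```

Running `same_domain_secret` against the domain directory from the original image and the one from a candidate image, for example, would show before a roll whether the two are likely to trigger this error.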
| 19 | + |
| 20 | +During introspection, the operator generates a Kubernetes job that runs a pod in the domain's Kubernetes namespace and with the |
| 21 | +same Kubernetes service account that will be used later to run the Administration Server. This pod has access to the Kubernetes |
| 22 | +secret referenced by `weblogicCredentialsSecret` and encrypts the secret's values with the domain-specific encryption key so that the |
| 23 | +encrypted values can be injected into the "boot.properties" files when starting server instances. |
| 24 | + |
| 25 | +When the domain directory is changed such that the domain-specific encryption key is different, the "boot.properties" entries |
| 26 | +generated during introspection will now be invalid. |
| 27 | + |
| 28 | +This can happen in a variety of ways, depending on the [model selected](https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/choosing-a-model/): |
| 29 | + |
| 30 | +### Domain in a Docker image |
| 31 | + |
| 32 | +#### 1. Rolling to an image containing a new or unrelated domain directory |
| 33 | + |
| 34 | +The error occurs while pods are being rolled to use containers based on a new Docker image that contains an entirely new or unrelated domain directory. |
| 35 | + |
| 36 | +The problem is that WebLogic requires all server instances in the same WebLogic domain to share the same |
| 37 | +domain-specific encryption key. Additionally, operator introspection |
| 38 | +currently happens only when starting servers following a total shutdown. Therefore, the "boot.properties" files generated from |
| 39 | +introspecting the image containing the original domain directory will be invalid when used with a container started from |
| 40 | +the updated Docker image containing the new or unrelated domain directory. |
| 41 | + |
| 42 | +The solution is either to follow the recommended [CI/CD guidelines](https://oracle.github.io/weblogic-kubernetes-operator/userguide/cicd/) so that the original and new Docker images contain domain directories |
| 43 | +with consistent domain-specific encryption keys and bootstrapping security details, or to [perform a total shutdown](https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/domain-lifecycle/startup/#starting-and-stopping-servers) of the domain so |
| 44 | +that introspection runs again as servers are restarted. |
| 45 | + |
| 46 | +#### 2. Full domain shutdown and restart |
| 47 | + |
| 48 | +The error occurs while starting servers after a full domain shutdown. |
| 49 | + |
| 50 | +If your development model generates new Docker images |
| 51 | +with new and unrelated domain directories and then tags those images with the same tag, then different Kubernetes worker nodes |
| 52 | +may have different images under the same tag in their individual, local Docker image caches. Pods scheduled to different nodes can then start from domain directories with different domain-specific encryption keys. |
| 53 | + |
| 54 | +The simplest solution is to set `imagePullPolicy` to `Always`; however, the better solution would be to design your development |
| 55 | +pipeline to generate new Docker image tags on every build and to never reuse an existing tag. |
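
As an illustration of the "never reuse a tag" approach, a build pipeline could derive the tag from values that differ on every build. This is only a sketch under stated assumptions: `unique_image_tag` is a hypothetical helper, and the timestamp-plus-random-suffix scheme is one possible convention, not something the operator prescribes.

```python
import uuid
from datetime import datetime, timezone


def unique_image_tag(repository: str, version: str) -> str:
    """Return an image reference whose tag is different for every build.

    Hypothetical helper: combines a UTC timestamp with a short random
    suffix so that two builds never share a tag, even within one second.
    """
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    suffix = uuid.uuid4().hex[:8]
    return f"{repository}:{version}-{timestamp}-{suffix}"
```

A pipeline would call this once per build, for instance `unique_image_tag("example.com/demo/wls-domain", "1.0")` with a placeholder repository name, and use the result both when tagging the image and when updating the `image` value in the domain resource.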
| 56 | + |
| 57 | +### Domain on a persistent volume |
| 58 | + |
| 59 | +#### 1. Completely replacing the domain directory |
| 60 | + |
| 61 | +The error occurs while starting servers when the domain directory change was made while other servers were still running. |
| 62 | + |
| 63 | +Before completely replacing the domain directory, you must [stop all running servers](https://oracle.github.io/weblogic-kubernetes-operator/userguide/managing-domains/domain-lifecycle/startup/#starting-and-stopping-servers). |
| 64 | + |
| 65 | +Because all servers will already be stopped, there is no requirement that the new contents of the domain directory be related to |
| 66 | +the previous contents of the domain directory. When starting servers again, the operator will perform its introspection |
| 67 | +of the domain directory. However, you may want to preserve the domain directory security configuration, including the domain-specific |
| 68 | +encryption key; in that case, follow a pattern similar to the one described in the [CI/CD guidelines](https://oracle.github.io/weblogic-kubernetes-operator/userguide/cicd/) for the domain |
| 69 | +in a Docker image model to preserve the original security-related domain directory files. |
| 70 | + |