Commit 68384bc

Merge pull request #243 from ody/recover_missing_psql_docs
(SOLARCH-434) Procedure for recovering PSQL
2 parents f46ac61 + a4e3f35 commit 68384bc

2 files changed: +41 -2 lines changed

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# Recovery procedures
+
+These instructions provide automated procedures for recovering from select failures of PE components which are managed by PEADM.
+
+Additional manual procedures are documented in [recovery.md](recovery.md)
+
+## Replace failed PE-PostgreSQL server (A or B side)
+
+The procedure for replacing a failed PE-PostgreSQL server is the same regardless of which PE-PostgreSQL server failed and whether the replacement uses the same name as the failed server or a different one. This procedure uses the following placeholder references.
+
+* _\<replacement-postgres-server-fqdn\>_ - The FQDN and certname of the new server being brought in to replace the failed PE-PostgreSQL server
+* _\<working-postgres-server-fqdn\>_ - The FQDN and certname of the still-working PE-PostgreSQL server
+* _\<failed-postgres-server-fqdn\>_ - The FQDN and certname of the failed PE-PostgreSQL server
+* _\<primary-server-fqdn\>_ - The FQDN and certname of the primary Puppet server
+* _\<replica-server-fqdn\>_ - The FQDN and certname of the replica Puppet server
+
+Procedure:
+
+1. Stop `puppet.service` on Puppet server primary and replica
+
+        bolt task run service name=puppet.service action=stop --targets <primary-server-fqdn>,<replica-server-fqdn>
+
+2. Temporarily set both primary and replica server nodes so that they use the remaining healthy PE-PostgreSQL server
+
+        bolt plan run peadm::util::update_db_setting --targets <primary-server-fqdn>,<replica-server-fqdn> primary_postgresql_host=<working-postgres-server-fqdn>
+
+3. Restart `pe-puppetdb.service` on Puppet server primary and replica
+
+        bolt task run service name=pe-puppetdb.service action=restart --targets <primary-server-fqdn>,<replica-server-fqdn>
+
+4. Purge failed PE-PostgreSQL node from PuppetDB
+
+        bolt command run "/opt/puppetlabs/bin/puppet node purge <failed-postgres-server-fqdn>" --targets <primary-server-fqdn>
+
+5. Run `peadm::add_database` plan to deploy replacement PE-PostgreSQL server
+
+        bolt plan run peadm::add_database -t <replacement-postgres-server-fqdn> primary_host=<primary-server-fqdn>
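
The five steps in this procedure can be collected into a small dry-run helper, sketched below. This is only an illustration, not part of the commit: the variable names (`PRIMARY`, `REPLICA`, `WORKING_PG`, `FAILED_PG`, `REPLACEMENT_PG`), the function name `recovery_commands`, and the `example.com` hostnames are all hypothetical. The script prints the commands in order with the placeholders filled in and does not invoke `bolt` itself.

```shell
#!/bin/sh
# Dry-run sketch: prints the five recovery commands in order; it never runs bolt.
# All hostnames are hypothetical defaults; override them through the environment.
set -eu

PRIMARY="${PRIMARY:-primary.example.com}"
REPLICA="${REPLICA:-replica.example.com}"
WORKING_PG="${WORKING_PG:-pg-a.example.com}"
FAILED_PG="${FAILED_PG:-pg-b.example.com}"
REPLACEMENT_PG="${REPLACEMENT_PG:-pg-b-new.example.com}"

recovery_commands() {
  # Steps 1-3 target both Puppet servers; steps 4-5 use the primary only.
  servers="${PRIMARY},${REPLICA}"
  cat <<EOF
bolt task run service name=puppet.service action=stop --targets ${servers}
bolt plan run peadm::util::update_db_setting --targets ${servers} primary_postgresql_host=${WORKING_PG}
bolt task run service name=pe-puppetdb.service action=restart --targets ${servers}
bolt command run "/opt/puppetlabs/bin/puppet node purge ${FAILED_PG}" --targets ${PRIMARY}
bolt plan run peadm::add_database -t ${REPLACEMENT_PG} primary_host=${PRIMARY}
EOF
}

recovery_commands
```

Printing rather than executing leaves room to review each command, and to confirm that the previous step succeeded, before running the next one by hand.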

plans/util/update_classification.pp

Lines changed: 4 additions & 2 deletions
@@ -42,15 +42,17 @@
     $overridden_replica_postgresql_target = $replica_postgresql_target
   }
 
-  $new = merge($current, {
+  $filtered = {
     'primary_host' => $primary_target.peadm::certname(),
     'replica_host' => $replica_target.peadm::certname(),
     'primary_postgresql_host' => $primary_postgresql_target.peadm::certname(),
     'replica_postgresql_host' => $overridden_replica_postgresql_target.peadm::certname(),
     'compiler_pool_address' => $compiler_pool_address,
     'internal_compiler_a_pool_address' => $internal_compiler_a_pool_address,
     'internal_compiler_b_pool_address' => $internal_compiler_b_pool_address
-  })
+  }.filter |$parameter| { $parameter[1] }
+
+  $new = merge($current, $filtered)
 
   out::message('Classification to be updated using the following hash...')
   out::message($new)
