Description
When attempting to initialize a pg_auto_failover node using pg_autoctl create postgres, the command consistently ignores the provided --hostname, --name, and --listen parameters. Furthermore, it fails to generate the pg_autoctl.conf file in the specified PGDATA directory, and listen_addresses remains at its default localhost in postgresql.conf. The node registers itself with the monitor using an incorrect and unconfigured IP address (10.126.80.191) and name (node_3) that are not present on the node's actual network interfaces or DNS configuration.
Version Information
pg_autoctl version v2.2
pg_autoctl extension version 2.2
compiled with PostgreSQL 17.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-26), 64-bit
compatible with Postgres 13, 14, 15, 16 and 17
Environment
Node Hostname: vl8dt360dods11
Node Desired IP: 10.126.80.192
Monitor Host IP: 10.126.80.185
Operating System: Red Hat Enterprise Linux (inferred from the GCC build string in the version output)
Steps to Reproduce
On the Monitor Host (10.126.80.185): Ensure any previously registered node for the problematic IP/name (e.g., node_3, 10.126.80.191) is deleted from the pg_auto_failover monitor database.
psql 'postgresql://[email protected]:1521/pg_auto_failover?sslmode=require'
DELETE FROM pgautofailover.node WHERE nodename = 'node_3' AND nodehost = '10.126.80.191' AND formationid = 'default';
\q
On the Node Host (vl8dt360dods11 - 10.126.80.192): Perform a full cleanup of PGDATA and pg_autoctl's local state.
pg_autoctl stop # Ensure it's stopped if running
sudo rm -rf /apps/t360/pgsql_data/autodb/*
sudo rm -rf /apps/t360/pgsql_data/autodb/.??*
rm -rf /home/postgres/.local/share/pg_autoctl/apps/t360/pgsql_data/autodb
On the Node Host (vl8dt360dods11 - 10.126.80.192): Execute the pg_autoctl create postgres command with explicit parameters:
/apps/t360/pgsql-17/bin/pg_autoctl create postgres \
--pgdata /apps/t360/pgsql_data/autodb \
--hostname 10.126.80.192 \
--pgport 1521 \
--name node_2 \
--auth trust \
--ssl-self-signed \
--listen 10.126.80.192 \
--monitor 'postgres://[email protected]:1521/pg_auto_failover?sslmode=require'
Observe the output of the create command and subsequent checks.
Expected Behavior
The node vl8dt360dods11 should register with the monitor as node_2 with the IP address 10.126.80.192.
A pg_autoctl.conf file should be created within /apps/t360/pgsql_data/autodb/ reflecting the configured name, hostname, and monitor URL.
The listen_addresses parameter in /apps/t360/pgsql_data/autodb/postgresql.conf should be set to 10.126.80.192.
The SSL certificate's Common Name (CN) should be 10.126.80.192.
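The listen_addresses expectation above can be checked mechanically. The following is a minimal sketch, not part of pg_autoctl: the helper name active_listen_addresses is hypothetical, and it only inspects the text it is given. Settings supplied elsewhere, such as included configuration files or the postgres command line (the log in this report shows postgres started with -h *), are not reflected in its result.

```shell
# Hypothetical helper: print the active (uncommented) listen_addresses
# value from postgresql.conf-style text on stdin.
# Prints nothing when the setting is absent or only present as a comment.
active_listen_addresses() {
    sed -n "s/^[[:space:]]*listen_addresses[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" | tail -n 1
}

# Usage on the node (path taken from this report):
#   active_listen_addresses < /apps/t360/pgsql_data/autodb/postgresql.conf
```

An empty result means the file itself does not set the parameter, which matches the commented-out line reported under Actual Behavior.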
Actual Behavior
The pg_autoctl create postgres command completes, but the node registers with the monitor as node_3 (Node ID 37) with the IP address 10.126.80.191.
The SSL certificate generated uses /CN=10.126.80.191.
The file /apps/t360/pgsql_data/autodb/pg_autoctl.conf is not created.
The listen_addresses parameter in /apps/t360/pgsql_data/autodb/postgresql.conf remains commented out, effectively defaulting to localhost.
The IP address 10.126.80.191 is not configured on any network interface of vl8dt360dods11.
DNS/reverse DNS lookups from the monitor for vl8dt360dods11 and 10.126.80.192 are correct.
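To make the mismatch between requested and registered identity easy to re-check on each attempt, the registration line from the create output can be parsed and compared. This is a rough sketch under assumptions: the function names registered_identity and check_registration are hypothetical, and the pattern is derived only from the "Registered node" log line format shown in this report, so it may not match other pg_autoctl versions.

```shell
# Hypothetical helper: extract "name host:port" from a pg_autoctl
# "Registered node" log line read on stdin.
registered_identity() {
    sed -n 's/.*Registered node [0-9][0-9]* "\([^"]*\)" (\([^)]*\)).*/\1 \2/p'
}

# Hypothetical helper: compare the registered identity (log on stdin)
# with the expected "name host:port" given as the first argument.
check_registration() {
    expected="$1"
    actual=$(registered_identity)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $actual"
    else
        echo "MISMATCH: expected '$expected', got '$actual'"
        return 1
    fi
}

# Usage, assuming the create output was saved to a file named create.log:
#   check_registration "node_2 10.126.80.192:1521" < create.log
```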
Diagnostic Information
1. pg_autoctl create postgres command output:
02:38:20 2624228 INFO Using default --ssl-mode "require"
02:38:20 2624228 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
02:38:20 2624228 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
02:38:20 2624228 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
02:38:20 2624228 INFO Started pg_autoctl postgres service with pid 2624230
02:38:20 2624228 INFO Started pg_autoctl node-init service with pid 2624231
02:38:20 2624230 INFO /apps/t360/pgsql-17/bin/pg_autoctl do service postgres --pgdata /apps/t360/pgsql_data/autodb -v
02:38:20 2624231 INFO Registered node 37 "node_3" (10.126.80.191:1521) in formation "default", group 0, state "single"
02:38:20 2624231 INFO Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/apps/t360/pgsql_data/autodb/pg_autoctl.state"
02:38:20 2624231 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/apps/t360/pgsql_data/autodb/pg_autoctl.init"
02:38:20 2624231 INFO Successfully registered as "single" to the monitor.
02:38:20 2624231 INFO FSM transition from "init" to "single": Start as a single node
02:38:20 2624231 INFO Initialising postgres as a primary
02:38:20 2624231 INFO Initialising a PostgreSQL cluster at "/apps/t360/pgsql_data/autodb"
02:38:20 2624231 INFO /apps/t360/pgsql-17/bin/pg_ctl initdb -s -D /apps/t360/pgsql_data/autodb --option '--auth=trust'
02:38:21 2624231 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /apps/t360/pgsql_data/autodb/server.crt -keyout /apps/t360/pgsql_data/autodb/server.key -subj "/CN=10.126.80.191"
02:38:21 2624252 INFO /apps/t360/pgsql-17/bin/postgres -D /apps/t360/pgsql_data/autodb -p 1521 -h *
02:38:21 2624231 INFO The user "postgres" already exists, skipping.
02:38:21 2624231 INFO CREATE USER postgres
02:38:21 2624231 INFO CREATE DATABASE postgres;
02:38:21 2624231 INFO The database "postgres" already exists, skipping.
02:38:21 2624231 INFO CREATE EXTENSION pg_stat_statements;
02:38:21 2624230 INFO Postgres is now serving PGDATA "/apps/t360/pgsql_data/autodb" on port 1521 with pid 2624252
02:38:21 2624231 INFO Disabling synchronous replication
02:38:21 2624231 INFO Reloading Postgres configuration and HBA rules
02:38:21 2624231 INFO Reloading Postgres configuration and HBA rules
02:38:21 2624231 INFO Transition complete: current state is now "single"
02:38:21 2624231 INFO keeper has been successfully initialized.
02:38:22 2624228 WARN pg_autoctl service node-init exited with exit status 0
02:38:22 2624230 INFO Postgres controller service received signal SIGTERM, terminating
02:38:22 2624230 INFO Stopping pg_autoctl postgres service
02:38:22 2624230 INFO /apps/t360/pgsql-17/bin/pg_ctl --pgdata /apps/t360/pgsql_data/autodb --wait stop --mode fast
02:38:22 2624228 INFO Stop pg_autoctl
2. pg_autoctl show state output (from node):
Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
-------+-------+--------------------+----------------+--------------+---------------------+--------------------
node_3 | 37 | 10.126.80.191:1521 | 1: 0/1545B90 | read-write ! | single | single
3. cat /apps/t360/pgsql_data/autodb/pg_autoctl.conf output (from node):
cat: /apps/t360/pgsql_data/autodb/pg_autoctl.conf: No such file or directory
4. grep listen_addresses /apps/t360/pgsql_data/autodb/postgresql.conf output (from node):
#listen_addresses = 'localhost' # what IP address(es) to listen on;
5. ip a output (from node vl8dt360dods11):
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:50:56:be:23:3c brd ff:ff:ff:ff:ff:ff
altname enp3s0
inet 10.126.80.192/22 brd 10.126.83.255 scope global noprefixroute ens160
valid_lft forever preferred_lft forever
inet6 fe80::250:56ff:febe:233c/64 scope link
valid_lft forever preferred_lft forever
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:43:bf:01 brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
6. cat /etc/hosts output (from node vl8dt360dods11):
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.126.80.192 vl8dt360dods11.ams.com vl8dt360dods11
10.126.80.61 lin-dco-satellite.ams.com lin-dco-satellite
7. DNS/Reverse DNS from Monitor (vl8dt360dods13) for the node's IP/hostname:
[postgres@vl8dt360dods13 log]$ host 10.126.80.192
192.80.126.10.in-addr.arpa domain name pointer vl8dt360dods11.ams.com.
[postgres@vl8dt360dods13 log]$ host vl8dt360dods11
vl8dt360dods11.ams.com has address 10.126.80.192
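The interface check from the ip a output can also be scripted, so the claim that 10.126.80.191 is not configured locally is reproducible. A minimal sketch follows; has_ipv4 is a hypothetical helper name, and it assumes the "inet ADDRESS/PREFIX ..." line layout seen in the ip a output in this report.

```shell
# Hypothetical helper: succeed (exit 0) if the given IPv4 address appears
# as an "inet" entry in `ip a`-style output read from stdin.
has_ipv4() {
    awk -v ip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == ip) found = 1 }
        END { exit !found }
    '
}

# Usage on the node:
#   ip a | has_ipv4 10.126.80.192 && echo "10.126.80.192 is configured"
#   ip a | has_ipv4 10.126.80.191 || echo "10.126.80.191 is not configured"
```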
Troubleshooting Performed
Thorough cleanup of PGDATA and pg_autoctl local state before each create postgres attempt.
Verified no conflicting IP (10.126.80.191) is present on node's network interfaces (ip a).
Verified node's /etc/hosts file correctly maps its hostname to 10.126.80.192.
Verified monitor's DNS/reverse DNS resolves node's IP/hostname correctly to 10.126.80.192.
Attempted to use --auto-create-config (discovered it's not an option for create postgres in v2.2).
Attempted to use --listen 10.126.80.192 explicitly to set listen_addresses.
All evidence points to pg_autoctl v2.2 not properly adhering to explicit --hostname, --name, and --listen parameters during create postgres, and failing to generate its primary configuration file in PGDATA.