PostgreSQL Switchover / Failover
Charmed PostgreSQL continuously monitors the cluster status and performs an automated failover if the Primary unit is lost. Sometimes a manual switchover is necessary, for example for hardware maintenance. Check the difference between failover and switchover here.
A manual switchover is performed using the Juju action promote-to-primary.
Important: Charmed PostgreSQL is designed to provide maximum guarantees of data survival in all corner cases; therefore, the allowed actions depend on the current state of the Juju unit.
Switchover
To switch over the PostgreSQL Primary (write endpoint) to a new Juju unit, run the Juju action promote-to-primary on the unit x which will be promoted as the new Primary:
juju run postgresql/x promote-to-primary scope=unit
Note: A manual switchover is only possible on a healthy ‘Sync Standby’ unit. Otherwise it will be rejected by Patroni with an explanation of the reason.
Note: It is normal for the Juju leader unit and the PostgreSQL Primary unit to be different Juju units. Juju leader failover is fully automated and should only be forced for educational purposes. Do NOT trigger a Juju leader election to move the Primary.
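Before (and after) the switchover, you can check which unit currently acts as the Primary. The Primary is reported in the Message column of juju status (see the example below); alternatively, assuming your charm revision provides the get-primary action, you can ask the charm directly:
juju run postgresql/leader get-primary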
Failover
Charmed PostgreSQL does not provide a manual failover action due to the lack of data safety guarantees. Advanced users can still perform it using patronictl and the Patroni REST API. At the same time, Charmed PostgreSQL supports cluster recovery through a full PostgreSQL/Patroni/Raft cluster re-initialization.
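For reference, a minimal sketch of such a manual failover with patronictl, run from inside one of the surviving units. The cluster name postgresql and the candidate member postgresql-1 are taken from the example below; adjust them to your deployment, and only proceed if you accept the possible loss of the latest transactions:
juju ssh postgresql/1
sudo -u snap_daemon patronictl -c /var/snap/charmed-postgresql/current/etc/patroni/patroni.yaml failover postgresql --candidate postgresql-1 --force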
Raft re-initialization
Warning: this is the worst-case recovery scenario, where the Primary and ALL Sync Standby units are lost simultaneously and their data cannot be recovered from disk. In this case Patroni cannot perform an automatic failover to the only available Replica unit(s), but it still provides read-only access to the data.
The manual failover procedure cannot guarantee that the latest SQL transactions are available on the Replica unit(s) (due to replication lag behind the Primary)! Also, Raft consensus is not possible when only one unit is left in a three-unit cluster.
The command to re-initialize the Raft cluster should be executed only when the charm is ready for it:
- only one (the last) Juju unit is left in the Juju application
- that last unit has detected the Raft majority loss and reports the status:
Raft majority loss, run: promote-to-primary
To re-initialize Raft and recover the Patroni/PostgreSQL cluster (when requested):
juju run postgresql/x promote-to-primary scope=unit force=true
Example of Raft re-initialization
Deploy PostgreSQL with 3 units:
> juju deploy postgresql -n 3 --config synchronous_node_count=1
> juju status
Model Controller Cloud/Region Version SLA Timestamp
postgresql lxd localhost/localhost 3.6.5 unsupported 14:50:19+02:00
App Version Status Scale Charm Channel Rev Exposed Message
postgresql 14.17 active 3 postgresql 14/edge 615 no
Unit Workload Agent Machine Public address Ports Message
postgresql/0* active idle 0 10.189.210.53 5432/tcp
postgresql/1 active idle 1 10.189.210.166 5432/tcp
postgresql/2 active idle 2 10.189.210.188 5432/tcp Primary
Machine State Address Inst id Base AZ Message
0 started 10.189.210.53 juju-422c1a-0 [email protected] Running
1 started 10.189.210.166 juju-422c1a-1 [email protected] Running
2 started 10.189.210.188 juju-422c1a-2 [email protected] Running
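The synchronous_node_count=1 option used at deploy time tells Patroni to maintain a single Sync Standby alongside the Primary (visible in the patronictl output below); the applied value can be confirmed with:
> juju config postgresql synchronous_node_count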
Find the current Primary/Standby/Replica:
> juju ssh postgresql/0
ubuntu@juju-422c1a-0:~$ sudo -u snap_daemon patronictl -c /var/snap/charmed-postgresql/current/etc/patroni/patroni.yaml list
+ Cluster: postgresql (7499430436963402504) ---+-----------+----+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------------+----------------+--------------+-----------+----+-----------+
| postgresql-0 | 10.189.210.53 | Sync Standby | streaming | 3 | 0 |
| postgresql-1 | 10.189.210.166 | Replica | streaming | 3 | 0 |
| postgresql-2 | 10.189.210.188 | Leader | running | 3 | |
+--------------+----------------+--------------+-----------+----+-----------+
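Advanced users can also query the same topology via the Patroni REST API mentioned above. A quick sketch, assuming the API is reachable on Patroni's default port 8008 and is not protected by TLS or authentication in your charm revision (otherwise keep using patronictl):
ubuntu@juju-422c1a-0:~$ curl -s http://10.189.210.53:8008/cluster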
Kill the Leader and Sync Standby machines:
> lxc stop --force juju-422c1a-0 && lxc stop --force juju-422c1a-2
> juju status
Model Controller Cloud/Region Version SLA Timestamp
postgresql lxd localhost/localhost 3.6.5 unsupported 14:54:40+02:00
App Version Status Scale Charm Channel Rev Exposed Message
postgresql 14.17 active 1/3 postgresql 14/edge 615 no
Unit Workload Agent Machine Public address Ports Message
postgresql/0 unknown lost 0 10.189.210.53 5432/tcp agent lost, see 'juju show-status-log postgresql/0'
postgresql/1*               active       idle       1        10.189.210.166         5432/tcp        <<<<<<<<< only the Replica unit is left
postgresql/2 unknown lost 2 10.189.210.188 5432/tcp agent lost, see 'juju show-status-log postgresql/2'
Machine State Address Inst id Base AZ Message
0 down 10.189.210.53 juju-422c1a-0 [email protected] Running
1 started 10.189.210.166 juju-422c1a-1 [email protected] Running
2 down 10.189.210.188 juju-422c1a-2 [email protected] Running
At this stage it is recommended to restore the lost nodes if possible; they will rejoin the cluster automatically once Juju detects their availability.
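In this LXD test environment that would simply mean starting the stopped machines again, for example:
> lxc start juju-422c1a-0 && lxc start juju-422c1a-2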
To start the Raft re-initialization instead, remove the DEAD machines as a signal to the charm that they cannot be restored/started and that there is no risk of split-brain:
> juju remove-machine --force 0
WARNING This command will perform the following actions:
will remove machine 0
- will remove unit postgresql/0
- will remove storage pgdata/0
Continue [y/N]? y
> juju remove-machine --force 2
WARNING This command will perform the following actions:
will remove machine 2
- will remove unit postgresql/2
- will remove storage pgdata/2
Continue [y/N]? y
Check the status to confirm the Raft majority loss:
> juju status
...
Unit Workload Agent Machine Public address Ports Message
postgresql/1* blocked executing 1 10.189.210.166 5432/tcp Raft majority loss, run: promote-to-primary
...
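Even in this blocked state, the surviving unit still serves read-only traffic (see the warning above). A minimal check, assuming psql is available on your client and using the charm's get-password action to obtain the operator password (psql will prompt for it):
> juju run postgresql/1 get-password username=operator
> psql -h 10.189.210.166 -U operator -d postgres -c 'SELECT pg_is_in_recovery();'
pg_is_in_recovery() returning t confirms the unit is still a read-only standby.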
Start Raft re-initialization:
> juju run postgresql/1 promote-to-primary scope=unit force=true
Wait for the re-initialization to complete:
> juju status
...
Unit Workload Agent Machine Public address Ports Message
postgresql/1*  maintenance  executing  1        10.189.210.166  5432/tcp  (promote-to-primary) Reinitialising raft
...
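The action progress and the recent unit states can also be inspected via the status history:
> juju show-status-log postgresql/1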
At the end, the Primary unit is back:
> juju status
Model Controller Cloud/Region Version SLA Timestamp
postgresql lxd localhost/localhost 3.6.5 unsupported 15:03:12+02:00
App Version Status Scale Charm Channel Rev Exposed Message
postgresql 14.17 active 1 postgresql 14/edge 615 no
Unit Workload Agent Machine Public address Ports Message
postgresql/1* active idle 1 10.189.210.166 5432/tcp Primary
Machine State Address Inst id Base AZ Message
1 started 10.189.210.166 juju-422c1a-1 [email protected] Running
Scale the application back to 3+ units to complete the HA recovery:
> juju add-unit postgresql -n 2
The cluster reports a healthy status again:
> juju status
Model Controller Cloud/Region Version SLA Timestamp
postgresql lxd localhost/localhost 3.6.5 unsupported 15:09:56+02:00
App Version Status Scale Charm Channel Rev Exposed Message
postgresql 14.17 active 3 postgresql 14/edge 615 no
Unit Workload Agent Machine Public address Ports Message
postgresql/1* active idle 1 10.189.210.166 5432/tcp Primary
postgresql/3 active idle 3 10.189.210.124 5432/tcp
postgresql/4 active idle 4 10.189.210.178 5432/tcp
Machine State Address Inst id Base AZ Message
1 started 10.189.210.166 juju-422c1a-1 [email protected] Running
3 started 10.189.210.124 juju-422c1a-3 [email protected] Running
4 started 10.189.210.178 juju-422c1a-4 [email protected] Running
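As a final check, run the same patronictl command as before, this time on the new Primary unit, to confirm that the new replicas are streaming from it:
> juju ssh postgresql/1
ubuntu@juju-422c1a-1:~$ sudo -u snap_daemon patronictl -c /var/snap/charmed-postgresql/current/etc/patroni/patroni.yaml list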