[Slony1-general] failover problem

Thu Jun 25 04:24:29 PDT 2009

Hello all,

Platform: CentOS Linux 4.7
PostgreSQL: 8.3.6
Slony: 1.2.15
Has anyone run into problems recovering a failed node after a 
failover when the replication set includes large tables?
Our situation is as follows:
We have two nodes running in a master-slave configuration in "hot" 
backup mode. When the master node fails, the slave should become active.
The problem is that we may have large tables in the set, and it would 
take too long to subscribe the failed node again from scratch.
Are there any solutions for this kind of situation?
Maybe it is possible to replace the node subscription procedure (a COPY 
of all the data in the set) with a custom procedure that copies only 
the most needed (recent) data first and then copies the rest in the 
background, with us taking responsibility for the risks of node sync?
The system is a fairly critical application, which needs a backup node 
ready to go.
Also, is there any way to prevent the OS from rebooting or shutting down 
until the master-slave switchover process has completed?
It is very inconvenient that when we accidentally reboot the master 
machine, our software first receives a soft termination signal and tries 
to do a switchover, but there is a good chance that it won't be 
completed before all processes (including slony and postgres) receive 
a kill signal and are aborted.
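
To make the shutdown race concrete, here is a minimal sketch (in Python) 
of the kind of handler described above: on the soft termination signal it 
runs a switchover command and gives up if the command does not finish 
within a deadline. The script path and the 20-second grace period are 
assumptions for illustration only; they are not our real script and not 
part of Slony.

```python
import signal
import subprocess
import sys

# Hypothetical path: stands in for whatever script performs the switchover.
SWITCHOVER_CMD = ["/usr/local/bin/switchover.sh"]
# Assumed grace period between the soft termination signal and the kill signal.
DEADLINE_SECONDS = 20


def run_switchover(cmd=SWITCHOVER_CMD, deadline=DEADLINE_SECONDS):
    """Run the switchover command; return True only if it exits with 0 in time."""
    try:
        result = subprocess.run(cmd, timeout=deadline)
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run when the deadline passes.
        return False
    except OSError:
        # The command could not be started at all (e.g. missing script).
        return False
    return result.returncode == 0


def on_sigterm(signum, frame):
    """Attempt the switchover on SIGTERM, then exit with a matching status."""
    ok = run_switchover()
    sys.exit(0 if ok else 1)


def install_handler():
    """Register the SIGTERM handler; call once at program start-up."""
    signal.signal(signal.SIGTERM, on_sigterm)
```

Even with a handler like this, the switchover only succeeds if it fits 
inside whatever grace period the OS allows before the kill signal, which 
is exactly the part we cannot control at the moment.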

Best regards, Nick.