Thu Jun 18 11:18:23 PDT 2009
- Previous message: [Slony1-general] Issue with configuring Slony-I on Windows
- Next message: [Slony1-general] Subscription errors don't automatically recover
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

I've been trying to get failover to work in 2.0.2, but it seems to hang. I have a 3-node architecture, and have tried the instructions at http://www.slony.info/documentation/failover.html#COMPLEXFAILOVER

Here's how I do it (node 1 is provider, and node 2 is failover node):

-- subscribe node 3 to node 2
-- execute FAILOVER
-- slonik hangs

If I go into node 2 and look at sl_subscribe, there is only one row with provider=2, subscriber=3 (which is correct and expected). Looking at sl_status, everything appears to be running just fine (sl_event_lag and sl_time_lag go up and down, as if there's activity). HOWEVER, if I do an update on node 2, the update never makes it to node 3. (Node 1 still says provider=1, subscriber=2 AND provider=2, subscriber=3.)

slonik is still running/hanging during all this. If I strace the slonik process, I find the following:

======BEGIN STRACE======
rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
sendto(3, "Q\0\0\0\30begin transaction; \0"..., 25, 0, NULL, 0) = 25
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "C\0\0\0\nBEGIN\0Z\0\0\0\5T"..., 16384, 0, NULL, NULL) = 17
rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
sendto(3, "Q\0\0\0Wselect nl_backendpid from \"_sltest \".sl_nodelock where nl_backendpid <> 28927; \0"..., 88, 0, NULL, 0) = 88
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "T\0\0\0&\0\1nl_backendpid\0\304\27Dn \0\3\0\0\0\27\0\4\377\377\377\377\0\0D\0\0\0\17\0\1\0\0\0\00529006D \0\0\0\17\0\1\0\0\0\00529011D\0\0\0\17\0\1\0\0\0\00529012C \0\0\0\vSELECT\0Z\0\0\0\5T"..., 16384, 0, NULL, NULL) = 105
rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
sendto(3, "Q\0\0\0\32rollback transaction;\0"..., 27, 0, NULL, 0) = 27
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=3, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=3, revents=POLLIN}])
recvfrom(3, "C\0\0\0\rROLLBACK\0Z\0\0\0\5I"..., 16384, 0, NULL, NULL) = 20
rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
sendto(4, "Q\0\0\0\30begin transaction; \0"..., 25, 0, NULL, 0) = 25
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "C\0\0\0\nBEGIN\0Z\0\0\0\5T"..., 16384, 0, NULL, NULL) = 17
rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
sendto(4, "Q\0\0\0Wselect nl_backendpid from \"_sltest \".sl_nodelock where nl_backendpid <> 16155; \0"..., 88, 0, NULL, 0) = 88
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "T\0\0\0&\0\1nl_backendpid \0\0\1\"\203\0\3\0\0\0\27\0\4\377\377\377\377\0\0D \0\0\0\17\0\1\0\0\0\00517510D\0\0\0\17\0\1\0\0\0\00517511C \0\0\0\vSELECT\0Z\0\0\0\5T"..., 16384, 0, NULL, NULL) = 89
rt_sigprocmask(SIG_BLOCK, [PIPE], [], 8) = 0
sendto(4, "Q\0\0\0\32rollback transaction;\0"..., 27, 0, NULL, 0) = 27
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
poll([{fd=4, events=POLLIN|POLLERR}], 1, -1) = 1 ([{fd=4, revents=POLLIN}])
recvfrom(4, "C\0\0\0\rROLLBACK\0Z\0\0\0\5I"..., 16384, 0, NULL, NULL) = 20
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
nanosleep({1, 0}, {1, 0}) = 0
======END STRACE======

This repeats over and over again in the log (infinite loop?): on each of the two connections (fd 3 and fd 4), slonik queries sl_nodelock for backend PIDs other than its own, rolls back, sleeps for one second, and starts again.

I also tried another time with the script provided by slony-ctl, but no luck. (It DOES, however, work when there are only 2 nodes.)

Are there any known issues with 3+ node failover in 2.0.2? Would anyone be able to walk me through this, in case I'm doing something wrong?

Thanks!
--Richard