bugzilla-daemon at main.slony.info
Wed Aug 14 11:39:29 PDT 2013
http://www.slony.info/bugzilla/show_bug.cgi?id=310

           Summary: slon loops restarting on a FAILOVER
           Product: Slony-I
           Version: devel
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: low
         Component: slon
        AssignedTo: slony1-bugs at lists.slony.info
        ReportedBy: ssinger at ca.afilias.info
                CC: slony1-bugs at lists.slony.info
   Estimated Hours: 0.0


A tester has reported a case during a 3-node FAILOVER in 2.2.0 b5 where the
slon keeps restarting and reprocessing the FAILED_NODE event.


What appears to be happening is that the remote_worker processes the
FAILOVER_NODE event.

1. It calls the failedNode(...) stored function
2. The failedNode(...) stored function issues a Restart notification
3. The transaction commits

The local listener then picks up the Restart request and restarts the slon.
The slon then repeats steps 1-3 because the FAILOVER_EVENT has not yet been
added to sl_event and sl_confirm.
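
The loop described above can be illustrated with a toy model. This is only a
sketch, not slon's actual code: ToyNode, its fields, and run_slon are
hypothetical names invented for this illustration, and the "confirmed" flag
merely stands in for the event having been recorded in sl_event and
sl_confirm.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    """Hypothetical stand-in for one node's state; not Slony's real schema."""
    confirmed: bool = False        # models the event being in sl_event/sl_confirm
    restart_pending: bool = False  # models the Restart notification
    failed_node_calls: int = 0     # how many times step 1 ran


def remote_worker_handle_failover(node: ToyNode) -> None:
    # Step 1: the remote_worker calls failedNode(...)
    node.failed_node_calls += 1
    # Step 2: failedNode(...) requests a restart
    node.restart_pending = True
    # Step 3: the transaction commits; note that `confirmed` is NOT set here


def run_slon(node: ToyNode, max_restarts: int) -> int:
    """Run the toy slon; return how many restarts happened (capped)."""
    restarts = 0
    while restarts < max_restarts:
        if not node.confirmed:
            # The event still looks unprocessed, so steps 1-3 run again.
            remote_worker_handle_failover(node)
        if node.restart_pending:
            # The local listener sees the Restart request and restarts the slon.
            node.restart_pending = False
            restarts += 1
            continue
        break
    return restarts
```

In this model run_slon(ToyNode(), 5) uses up all five allowed restarts,
because nothing ever marks the event as confirmed before the restart happens.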

The FAILOVER_EVENT isn't being marked as processed because the steps that
still need to happen are:

4. The slon needs to wait until some events with ev_origin=failed_node arrive
from one of the remaining nodes.
5. Then it can finish the FAILOVER_EVENT processing by calling
failoverSet_int(...)
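
Steps 4 and 5 can be sketched as follows. This is an illustration under
assumptions, not the real implementation: events are simplified to
(ev_origin, ev_seqno) pairs, finish_failover is a hypothetical helper, and
failover_set_int is a placeholder callback standing in for
failoverSet_int(...), whose real arguments are not given here.

```python
from typing import Callable, Iterable, Optional, Tuple

# Simplified stand-in for an event row: (ev_origin, ev_seqno)
Event = Tuple[int, int]


def finish_failover(
    events: Iterable[Event],
    failed_node: int,
    failover_set_int: Callable[[int, int], None],
) -> Optional[int]:
    """Finish failover once an event from the failed node has arrived.

    Returns the ev_seqno that triggered completion, or None if no event
    with ev_origin == failed_node has been seen yet (step 4 not satisfied).
    """
    for ev_origin, ev_seqno in events:
        if ev_origin == failed_node:
            # Step 5: placeholder for calling failoverSet_int(...)
            failover_set_int(failed_node, ev_seqno)
            return ev_seqno
    return None
```

A caller would keep polling with newly arrived events until the function
returns something other than None.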

We need to commit + restart at step 3 so the slon will listen for
ev_origin=failed_node events arriving from the remaining nodes.
