Thu Feb 3 07:19:39 PST 2011
- Previous message: [Slony1-hackers] automatic WAIT FOR proposal
- Next message: [Slony1-hackers] automatic WAIT FOR proposal
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On 11-02-03 09:44 AM, Jan Wieck wrote:
> On 2/2/2011 11:42 AM, Steve Singer wrote:
>> On 10-12-22 04:30 PM, Steve Singer wrote:
>>
>>
>> Since I haven't had much response on this maybe a plain language example
>> would be useful.
>>
>> Consider a cluster with paths where node 1 is a provider+origin to all
>> other nodes
>>
>> 4--1----2
>> |   \  /
>> |--- 3
>>
>> EXECUTE SCRIPT( FILE=file1.sql, EVENT NODE=1);
>> wait for event(origin=1, confirmed=2, wait on=1);
>> EXECUTE SCRIPT(file=file2.sql, EVENT NODE=2);
>>
>> Take node 3. Does node 3 perform the SQL in file1.sql first or
>> file2.sql first? Today this is non-deterministic; either could win.
>>
>> The two solutions I see are
>> a) Require all nodes to be caught up before going to the next event
>> node. As discussed this seems somewhat limiting
>> b) Make slon wait for the event with origin=1 to be applied on node 3
>> before applying the event from node 2 (because the event from node 1 had
>> already been processed on node 2 by the time the node 2 event was
>> generated).
>>
>> b) is what I am proposing to implement here.
>>
>> I can create this type of race condition with other event types as well;
>> it isn't specific to execute script.
>
> What you are basically asking for is a guaranteed total order in which
> events from multiple nodes are processed. Very much like the total order
> guarantees provided by group communication systems.

I'm not going as far as a total order over all events, just an ordering that covers events that had already been processed by the event origin when it generated its own event.
For example, if remote events are processed as follows

node 1:    node 2:
2,1233     1,1233

(node 1 has seen 2,1233 and node 2 has seen 1,1233) and then they each do a sync, generating events

1,1234
2,1234

In the scheme I propose, node 3 can process events either in this order

1,1233
2,1233
1,1234
2,1234

OR

1,1233
2,1233
2,1234
1,1234

i.e. I am not requiring any ordering constraints between the two events 1,1234 and 2,1234 other than that they must come after 1,1233 and 2,1233.

What I describe requires no additional communication between nodes beyond what we are already doing.

The issue I describe isn't specific to two execute scripts. For example, I have a 3 node cluster with two sets (set 1 origin is node 1, set 2 origin is node 2).

subscribe set(set id=1,provider=1,receiver=2)
subscribe set(set id=2,provider=2,receiver=1)
wait for event(origin=1,confirmed=2,wait on=1)
wait for event(origin=2,confirmed=1,wait on=2)
subscribe set(set id=1,provider=1,receiver=3)
subscribe set(set id=2,provider=2,receiver=3)
#
# subscribing node 3 takes a LONG time
# because it is in a remote data centre
#
# while it is subscribing I discover
# I need to make an emergency schema change
# via EXECUTE SCRIPT such that I can't wait
# for node 3 to finish subscribing before
# making the change on node 1 and 2.

If I use node 1 or node 2 as the event node, the script might get applied on node 3 before the subscription to the set from the other node finishes.

---------

Here is an example that doesn't involve execute script (assume the same cluster config as in my last example).

create set(id=1, origin=1)
set add table(set id=1, origin=1, fully qualified table='public.foo');
#commands execute, dba notices a mistake
drop set(set id=1,event node=1);
wait for event(origin=1,confirmed=3,wait on=1);
create set(id=2, origin=2)
set add table(set id=2,origin=3);
set add table(set id=2, origin=2, fully qualified table='public.foo');

Node 3 might process the add table from node 2 BEFORE it processes the drop set from node 1.
The above example probably happens in the real world quite a bit: a DBA creates a set, then notices they are hosting it on the wrong node and wants to fix things.

>
> While the example above seems to be possible, I don't know why someone
> would actually attempt such. If node 1 is the origin of everything, it
> doesn't even make sense to use node 2 as the event node unless node 2
> also is the ONLY node to execute it.
>
> The design of EXECUTE SCRIPT expects the event node to be the origin of
> the objects modified, so that the SQL statements inside the script are
> executed at the same data SYNC point on all nodes. Since it is
> impractical to perform sanity checks against the script to ensure that
> the user is actually doing that, all we can and should do is to make
> this requirement clearer in the documentation.
>
>
> Jan
>
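
To make the ordering rule in b) concrete, here is a minimal Python sketch. It is not slon code. It assumes each event records the highest event number from every other origin that its own origin had already processed when the event was generated; the Event class, its seen field and valid_orders are hypothetical names used only for illustration. A receiving node may apply an event only once it has applied everything listed in seen, and it applies events from any one origin in sequence order.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Event:
    origin: int
    seqno: int
    # Hypothetical field: (origin, seqno) pairs that the event's origin
    # had already processed when it generated this event.
    seen: tuple = ()


def is_valid_order(order):
    """Return True if a node may apply the events in the given order."""
    applied = {}  # origin -> highest seqno applied so far on this node
    for ev in order:
        # Events from a single origin must be applied in sequence order.
        in_origin_order = ev.seqno > applied.get(ev.origin, 0)
        # Everything the origin had processed must already be applied here.
        deps_met = all(applied.get(o, 0) >= s for o, s in ev.seen)
        if not (in_origin_order and deps_met):
            return False
        applied[ev.origin] = ev.seqno
    return True


def valid_orders(events):
    """Enumerate every permitted application order (fine for tiny inputs)."""
    return [p for p in permutations(events) if is_valid_order(p)]


# The four events from the example: each 1234 event was generated after
# its origin had processed the other node's 1233 event.
EVENTS = (
    Event(1, 1233),
    Event(2, 1233),
    Event(1, 1234, seen=((2, 1233),)),
    Event(2, 1234, seen=((1, 1233),)),
)

if __name__ == "__main__":
    for order in valid_orders(EVENTS):
        print(" ".join(f"{e.origin},{e.seqno}" for e in order))
```

Both orders listed in the example are accepted by this check, and 1,1234 and 2,1234 stay unordered relative to each other. Because this sketch puts no constraint between 1,1233 and 2,1233, it also accepts the two variants that start with 2,1233. Any order that applies a 1234 event before both 1233 events is rejected.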