Maxime Henrion mux at oxado.com
Wed Dec 5 06:23:21 PST 2007
Hello all,

I have code that generates slonik scripts to automatically add new
tables to existing sets.  So I run the CREATE SET / SET ADD TABLE /
SUBSCRIBE SET / MERGE SET commands, interleaved with WAIT FOR EVENT
calls, with the commands wrapped in try blocks so as to catch any problem.
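
For illustration only, a generator of this kind could look like the
following Python sketch.  It is not the actual code in use: the function
name build_add_tables_script and all of its parameters are hypothetical,
and the node conninfo lines are left out.  It only emits the same command
sequence as the generated script quoted later in this message.

```python
def build_add_tables_script(cluster, new_set, target_set, origin, receiver, tables):
    """Return slonik text that creates a temporary set, populates it,
    subscribes one receiver to it and merges it into an existing set.

    `tables` is a list of (table_id, fully_qualified_name) pairs.
    All names here are hypothetical; this is a sketch, not a real tool.
    """
    lines = [
        f"cluster name = {cluster};",
        "# node conninfo lines would go here (omitted in this sketch)",
        "",
    ]

    def guarded(commands, message, cleanup=()):
        # Wrap commands in a slonik try block that reports and exits on error.
        lines.append("try {")
        lines.extend(f"        {c}" for c in commands)
        lines.append("} on error {")
        lines.append(f"        echo '{message}';")
        lines.extend(f"        {c}" for c in cleanup)
        lines.append("        exit 1;")
        lines.append("}")

    def wait(event_origin, confirmed):
        # Wait until `confirmed` has confirmed the events from `event_origin`.
        lines.append(
            f"wait for event (origin = {event_origin}, "
            f"confirmed = {confirmed}, wait on = {event_origin});"
        )

    guarded([f"create set (id = {new_set}, origin = {origin});"],
            f"Could not create set id {new_set}!")
    wait(origin, receiver)

    guarded([f"set add table (set id = {new_set}, origin = {origin}, id = {tid}, "
             f"full qualified name = '{name}');" for tid, name in tables],
            f"Could not populate set id {new_set}!",
            cleanup=[f"drop set (id = {new_set}, origin = {origin});"])

    guarded([f"subscribe set (id = {new_set}, provider = {origin}, "
             f"receiver = {receiver}, forward = yes);"],
            f"Could not subscribe set id {new_set} to node {receiver}!")
    wait(receiver, origin)

    # SYNC on the origin, then wait for the receiver to confirm it.
    lines.append(f"sync (id = {origin});")
    wait(origin, receiver)

    guarded([f"merge set (id = {target_set}, add id = {new_set}, origin = {origin});"],
            f"Could not merge set id {new_set} in set id {target_set}!")
    wait(origin, "ALL")

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_add_tables_script(
        "foo", 13858, 5, 3, 4,
        [(2732, "public.foo_13858"), (2733, "public.bar_13858")],
    ))
```

Called with the values above, the sketch prints the same sequence of
commands as the script shown further down, with the WAIT FOR EVENT after
the SYNC being the call in question.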

My problem is that I very often get failures with timeouts on the WAIT
FOR EVENT call that is right after the SYNC command, issued so as to be
sure that SUBSCRIBE SET is finished, as recommended in the
documentation.

I'd like to double-check that everything I'm doing is correct because
the documentation is very confusing on this.  In the documentation of
the SYNC command, there is this example slonik code:

SUBSCRIBE SET (ID = 10, PROVIDER = 1, RECEIVER = 2);
WAIT FOR EVENT (ORIGIN = 2, CONFIRMED = 1);
SYNC (ID = 1);
WAIT FOR EVENT (ORIGIN = 1, CONFIRMED = 2);

This is what I am doing in my own script.  However, the documentation
for the MERGE SET command gives this other example:

# Assuming that set 1 has direct subscribers 2 and 3
SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 2);
WAIT FOR EVENT (ORIGIN = 2, CONFIRMED = 1);
SYNC (ID=1);
SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 3);
WAIT FOR EVENT (ORIGIN = 3, CONFIRMED = 1);
SYNC (ID=1);
MERGE SET ( ID = 1, ADD ID = 999, ORIGIN = 1 );

As you can see, there is no WAIT FOR EVENT call after the SYNC calls,
and the script proceeds directly to the MERGE SET command!  So I'm
thinking this might be why my script keeps failing with a timeout; maybe
I shouldn't have a WAIT FOR EVENT call there, and maybe the example in
the documentation of the SYNC command should be fixed?

Here is the generated slonik script I have been talking about:

cluster name = foo;
<node conninfos skipped>

try {
        create set (id = 13858, origin = 3);
} on error {
        echo 'Could not create set id 13858!';
        exit 1;
}
wait for event (origin = 3, confirmed = 4, wait on = 3);
try {
        set add table (set id = 13858, origin = 3, id = 2732,
          full qualified name = 'public.foo_13858');
        set add table (set id = 13858, origin = 3, id = 2733,
          full qualified name = 'public.bar_13858');

} on error {
        echo 'Could not populate set id 13858!';
        drop set (id = 13858, origin = 3);
        exit 1;
}
try {
        subscribe set (id = 13858, provider = 3, receiver = 4,
          forward = yes);
} on error {
        echo 'Could not subscribe set id 13858 to node 4!';
        exit 1;
}
wait for event (origin = 4, confirmed = 3, wait on = 4);

sync (id = 3);
wait for event (origin = 3, confirmed = 4, wait on = 3);

try {
        merge set (id = 5, add id = 13858, origin = 3);
} on error {
        echo 'Could not merge set id 13858 in set id 5!';
        exit 1;
}
wait for event (origin = 3, confirmed = ALL, wait on = 3);

As I said, the timeout occurs on the WAIT FOR EVENT call just after the
SYNC call.

Any help would be greatly appreciated.

Thanks,
Maxime