Steve Singer ssinger at ca.afilias.info
Wed Jan 30 08:26:56 PST 2013
On 13-01-30 10:38 AM, Jan Wieck wrote:
> On 1/30/2013 9:55 AM, bugzilla-daemon at main.slony.info wrote:
>
>> The UPDATE to sl_setsync does not have ssy_origin as part of the where clause,
>> because we are running in READ COMMITTED mode the DELETE+INSERT of the row on
>> sl_setsync becomes visible to the UPDATE part of sync_event() even though the
>> sync_event() started before the ACCEPT_SET was processed.
>
> This sounds very plausible.
>
> I wonder if Slony in general is using too many concurrent threads.
> Unfortunately changing that won't be easy.
>
>
> Jan
>

I suspect the solution is to add ssy_origin to the where clause of
that update, so that we only change sl_setsync rows for the current
remote worker.

I am testing
*************** sync_event(SlonNode *node, SlonConn *loc
*** 4693,4701 ****
                         "update %s.sl_setsync set "
                         "    ssy_seqno = '%s', ssy_snapshot = '%s', "
                         "    ssy_action_list = '' "
!                       "where ssy_setid in (",
                         rtcfg_namespace,
!                       seqbuf, event->ev_snapshot_c);
     i = 0;
     for (provider = wd->provider_head; provider; provider = provider->next)
     {
--- 4692,4700 ----
                         "update %s.sl_setsync set "
                         "    ssy_seqno = '%s', ssy_snapshot = '%s', "
                         "    ssy_action_list = '' "
!                       "where ssy_origin=%d and  ssy_setid in (",
                         rtcfg_namespace,
!                       seqbuf, event->ev_snapshot_c, node->no_id);
     i = 0;
     for (provider = wd->provider_head; provider; provider = provider->next)
     {


To see if that makes the problem go away.  Because the race condition is
somewhat rare, it will be a few days before I have an idea whether that helps.
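
As a rough standalone illustration of what the patched statement ends up
looking like, here is a minimal sketch in C.  It is not the slon code:
build_setsync_update() and its parameters are hypothetical names made up
for this example, and the list of set ids is passed in as a plain array
instead of being collected from the provider list.  The only point is that
the origin id is formatted into the where clause ahead of the set id list.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of slon: format the sl_setsync UPDATE
 * with the extra ssy_origin filter from the patch above.
 * Returns the number of characters written, or -1 if the buffer is
 * too small or no set ids were given.
 */
static int
build_setsync_update(char *buf, size_t buflen,
                     const char *nspace, const char *seqno,
                     const char *snapshot, int origin_id,
                     const int *set_ids, int n_sets)
{
    size_t      used;
    int         n;
    int         i;

    if (n_sets <= 0)
        return -1;

    /* Fixed part of the statement, including the origin filter. */
    n = snprintf(buf, buflen,
                 "update %s.sl_setsync set "
                 "    ssy_seqno = '%s', ssy_snapshot = '%s', "
                 "    ssy_action_list = '' "
                 "where ssy_origin = %d and ssy_setid in (",
                 nspace, seqno, snapshot, origin_id);
    if (n < 0 || (size_t) n >= buflen)
        return -1;
    used = (size_t) n;

    /* Comma-separated list of set ids. */
    for (i = 0; i < n_sets; i++)
    {
        n = snprintf(buf + used, buflen - used, "%s%d",
                     (i == 0) ? "" : ",", set_ids[i]);
        if (n < 0 || (size_t) n >= buflen - used)
            return -1;
        used += (size_t) n;
    }

    /* Close the in-list and terminate the statement. */
    n = snprintf(buf + used, buflen - used, ");");
    if (n < 0 || (size_t) n >= buflen - used)
        return -1;
    used += (size_t) n;

    return (int) used;
}
```

With the origin id in the where clause, a sync for one origin should no
longer touch a sl_setsync row that a concurrent ACCEPT_SET has just
re-inserted under a different origin, which is the case described in the
bug report.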



