[Slony1-general] Slony delay to replicate

Thu Sep 21 06:37:30 PDT 2006

 Yes, I am using version 1.1.5. My master and slave are on the local network
and they are virtual machines.

 Today I got the logs below. I think that "timeout for event" is a network
timeout; is this correct?

../slon_log:2006-09-20 22:08:41 BRT ERROR  remoteListenThread_2: timeout for event selection
../slon_log:2006-09-20 23:25:22 BRT ERROR  remoteListenThread_2: timeout for event selection
../slon_log:2006-09-21 00:05:29 BRT ERROR  remoteListenThread_2: timeout for event selection
../slon_log.20060918:2006-09-17 19:05:50 BRT ERROR  remoteListenThread_2: timeout for event selection
../slon_log.20060918:2006-09-18 11:01:26 BRT ERROR  remoteListenThread_2: timeout for event selection
../slon_log_PRINCIPAL_SLAVE:2006-09-20 09:42:22 BRT ERROR  remoteListenThread_1: timeout for event selection
../slon_log_PRINCIPAL_SLAVE:2006-09-21 00:05:29 BRT ERROR  remoteListenThread_1: timeout for event selection
../slon_log_PRINCIPAL_SLAVE:2006-09-21 09:52:40 BRT ERROR  remoteListenThread_1: timeout for event selection
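
 To see whether these errors cluster on particular days or threads, the
grep for ERROR or FATAL could be followed by a small tally. This is only a
sketch: it assumes each log entry sits on one line in the shape shown above
(optional "file:" prefix from grep, then date, time, timezone, level, thread
name, message). The names LINE_RE and summarize_errors are made up for this
example and are not part of Slony.

```python
import re
from collections import Counter

# Assumed line shape, based only on the excerpts in this message:
#   [file:]YYYY-MM-DD HH:MM:SS TZ LEVEL  thread: message
LINE_RE = re.compile(
    r"(?:(?P<file>[^:\s]+):)?"                      # optional grep file prefix
    r"(?P<date>\d{4}-\d{2}-\d{2}) "                 # date
    r"(?P<time>\d{2}:\d{2}:\d{2}) (?P<tz>\S+)\s+"   # time and timezone
    r"(?P<level>ERROR|FATAL)\s+"                    # only ERROR/FATAL entries
    r"(?P<thread>\w+):\s*(?P<msg>.*)"               # slon thread and message
)


def summarize_errors(lines):
    """Count ERROR/FATAL entries per thread and per day.

    Lines that do not match the assumed shape are skipped silently.
    """
    per_thread = Counter()
    per_day = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            per_thread[m.group("thread")] += 1
            per_day[m.group("date")] += 1
    return per_thread, per_day
```

 Passing the lines of a slon log (or of the grep output) to
summarize_errors() gives two counters. The counts only show when and in
which thread the errors occur; they do not say whether the cause is the
network or a heavy load.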

2006/9/21, cbbrowne at ca.afilias.info <cbbrowne at ca.afilias.info>:
>
> >  I am using Slony, but sometimes Slony is slow to replicate. Sometimes
> > it replicates quickly, but sometimes I need to wait 10, 20 or 45 minutes.
> >
> >  The master table and the slave table are on the same Postgres server but
> > in different databases.
> >
> >  What do I need to do?
> >
> >  I found these logs on the master when Slony is delayed:
>
> The logs on an origin node won't have any terribly informative messages,
> as there's not much of interest that the slon for that node has to do.
> All that that slon does is to periodically mark SYNC intervals, which will
> merely log a timestamp once in a while.
>
> The interesting work takes place when a subscriber slon is doing the work
> to figure out what data to draw in.
>
> If things are falling behind by a lot, that is quite likely to lead to
> there being error messages (relating to timeouts and such) in a
> subscriber's logs.
>
> It might be useful to grep the subscriber's logs for ERROR or FATAL.
>
> There are two main reasons I'd expect things to sometimes fall behind:
>
> 1.  The network is flakey, which should be accompanied by error messages
> about timeouts or inability to run queries.
>
> 2.  You periodically have very heavy update loads.  For instance, you
> might have some queries that in one big transaction update 100,000 tuples.
> The overall transaction might take 10 minutes to apply on the origin, and
> might be distributed as 1000 updates affecting 100 tuples apiece, but when
> that commits as one transaction, that will be applied in one SYNC, and be
> something of a "blood clot" passing heavily through the replication
> system.
>
> It should be easy to find error messages indicating 1. in the logs.  In
> version 1.2, I added logging of how many tuples were affected by a SYNC,
> which would make it easier to detect "clots" passing through; you're
> probably running 1.1.5, where it won't be particularly easy to detect
> that.
>
>