Peter Geoghegan peter at
Wed May 25 05:03:24 PDT 2011
On 25 May 2011 12:43, Ger Timmens <Ger.Timmens at> wrote:

>    Alternatively, this might occur because the slon for this node
> has been broken for a long time, and there are an enormous number of
> entries in sl_event on this or other nodes for the node to work
> through, and it is taking more than slon_conf_remote_listen_timeout
> seconds to run the query. In older versions of Slony-I, that
> configuration parameter did not exist; the timeout was fixed at 300
> seconds. In newer versions, you might increase that timeout in the
> slon config file to a larger value so that it can continue to
> completion. And then investigate why nobody was monitoring things
> such that replication broke for such a long time...

If this is the case, then you can raise the remote listen timeout in the
slon config file to a value in the hundreds of seconds.
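
As a rough sketch only, the entry in the slon config file might look like
the fragment below. I am assuming the parameter is spelled
remote_listen_timeout in your version, and the value 600 is purely
illustrative. Check the slon runtime configuration documentation for your
release for the exact name, units and permitted range before relying on it.

```
# slon config file (fragment)
# Assumption: parameter name and units as documented for your Slony-I
# version; 600 is an illustrative value, not a recommendation.
remote_listen_timeout=600
```

The slon process would need to be restarted to pick up the new value.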

> Replication seems to continue fine after this error.
> Is it safe to continue ?
> Or should we start from scratch ?
> If so what do we have to do to prevent this error from happening again ?

In general, Slony will not allow slaves to enter an inconsistent state.

Look at the "test_slony_state" Perl script, which examines various
parts of the configuration and verifies that things are running
properly.

This should form part of your monitoring setup. It is common to
automatically run the script at regular intervals.
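
If you do not already have a scheduler such as cron driving this, a minimal
Python sketch of a periodic runner could look like the following. The
function names run_check and monitor are hypothetical, the command to run is
passed in by the caller rather than assumed, and treating a non-zero exit
status as "something is wrong" is my assumption: check how your copy of the
script actually reports problems.

```python
import subprocess
import sys
import time


def run_check(command, timeout_seconds=300):
    """Run one check command and return (ok, combined output).

    Assumption: a zero exit status means the check passed.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, f"check timed out after {timeout_seconds} seconds"
    except OSError as exc:
        # Raised, for example, when the script path does not exist.
        return False, f"could not start check: {exc}"
    return result.returncode == 0, result.stdout + result.stderr


def monitor(command, interval_seconds, rounds=None):
    """Run the check repeatedly and return the number of failed runs.

    With rounds=None the loop never ends, which suits a long-lived process.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        ok, output = run_check(command)
        if not ok:
            failures += 1
            # Report the failure; replace with mail or paging as needed.
            print(output, file=sys.stderr)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)
    return failures
```

You would call monitor() with whatever command line you normally use to
invoke the state test for your cluster, plus an interval in seconds.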

Peter Geoghegan
PostgreSQL Development, 24x7 Support, Training and Services
