[Slony1-general] Catch-up and sl_status, and yum repo

Thu Dec 1 09:19:57 PST 2011

We recently had an issue with a long Slony (2.0) catch-up after a 10-day
disconnect between a master and a slave. It may have been caused by either
bug 167 or bug 222. To test this behavior we set up a test cluster on our
LAN and performed the following steps:

1. Initialize a two-node Slony cluster with two identical copies of the
database.
2. Allow initial subscription to catch up.
3. On the slave: drop the network connection to the master.
4. On the master: run ~4 million update operations.
5. On the slave: restore the connection to the master.
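
For anyone who wants to reproduce step 4, here is a minimal sketch of how the
update load could be generated in batches. It only builds SQL strings; the
table name (test_table) and its id and counter columns are hypothetical
placeholders, not our actual schema, and running the statements against the
master is left to whatever client you use.

```python
from typing import Iterator


def batched_updates(total_rows: int, batch_size: int,
                    table: str = "test_table") -> Iterator[str]:
    """Yield one UPDATE statement per contiguous range of ids.

    Assumes a hypothetical table with an integer primary key `id`
    running from 1 to total_rows and an integer column `counter`.
    """
    if total_rows < 0 or batch_size < 1:
        raise ValueError("total_rows must be >= 0 and batch_size >= 1")
    for start in range(1, total_rows + 1, batch_size):
        # Clamp the final batch so it never runs past total_rows.
        end = min(start + batch_size - 1, total_rows)
        yield (f"UPDATE {table} SET counter = counter + 1 "
               f"WHERE id BETWEEN {start} AND {end};")


if __name__ == "__main__":
    # Roughly 4 million row updates, split into batches of 10,000 rows.
    statements = list(batched_updates(4_000_000, 10_000))
    print(len(statements), "statements; first:", statements[0])
```

Batching keeps each transaction on the master small; every updated row is
still a separate change that has to be replicated.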

We did this, and I could see sync events being submitted and received in
the logs. However, in sl_status, st_lag_num_events and st_lag_time kept
increasing, and the backlogged changes had not been propagated after a
couple of hours. The LAN link between the two nodes is fast, and neither
node is lagging due to server, I/O, or network load. Why is this happening,
and what did I do wrong?
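
To make "kept going up" concrete, here is a rough sketch of how the lag
could be tracked between polls of sl_status. The cluster name in the query
(mycluster) is a placeholder, and the function just compares the first and
last sample taken; it is an illustration of the check, not a Slony tool.

```python
from datetime import timedelta
from typing import List, Tuple

# Query to run on the master at a fixed interval. "_mycluster" stands in
# for the real cluster schema (an underscore followed by the cluster name).
STATUS_QUERY = (
    "SELECT st_received, st_lag_num_events, st_lag_time "
    'FROM "_mycluster".sl_status;'
)

# One sample is (st_lag_num_events, st_lag_time) for a single subscriber.
Sample = Tuple[int, timedelta]


def lag_trend(samples: List[Sample]) -> str:
    """Classify lag as 'growing', 'shrinking', 'flat' or 'unknown'.

    Samples must be ordered oldest first. The event count is compared
    first; the lag time is only used when the event count is unchanged.
    """
    if len(samples) < 2:
        return "unknown"
    first_events, first_time = samples[0]
    last_events, last_time = samples[-1]
    if last_events != first_events:
        return "growing" if last_events > first_events else "shrinking"
    if last_time != first_time:
        return "growing" if last_time > first_time else "shrinking"
    return "flat"


if __name__ == "__main__":
    # Made-up sample values, only to show the call.
    polls = [(100, timedelta(minutes=5)), (250, timedelta(minutes=20))]
    print(lag_trend(polls))
```

In our test, every pair of polls taken this way would have come back as
"growing" for the whole couple of hours.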

Also, on what schedule are Slony RPMs published to the public Yum
repositories? (That is, when should we expect to see an RPM of Slony 2.1?)

Thanks!