Bill Moran wmoran at potentialtech.com
Fri Nov 12 11:05:34 PST 2010
In response to Aleksey Tsalolikhin <atsaloli.tech at gmail.com>:

> Any pointers for troubleshooting slony replication lag?  We're running
> Slony-I 1.2.20, and replicating a 23 GB database across a WAN (between
> nearby cities).
> 
> Yesterday afternoon, slony lag started growing, and it has been slowly
> but steadily inching up... the lag is now 2 hours 48 minutes...
> 
> I don't see any "troubleshooting" documentation on the slony web site
> and hope someone can help me out...
> 
> I've tried restarting the slony slave, and the slony master and slave,
> and even rebooting the slave server...  the lag is still growing.
> I've confirmed the slave can get to the master's DB, and the master
> can get to the slave's DB.
> 
> I've checked latency between the two sites in our network monitoring
> system, and things are humming along.
> 
> What else do I look at?
> 
> Oh, yeah, and I initiated a slony log switch yesterday; that didn't help...
> 
> What else should I try besides dropping the replication set and re-creating it?

Are you monitoring the PostgreSQL and Slony logs on both servers?

Usually when we see lag that isn't catching up, it's the result of someone
making a schema change and not applying it properly to all the servers.
That usually produces some fairly obvious errors in the logs.
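
If you aren't watching the logs yet, a quick first pass could look something
like the sketch below. It is only an illustration: the find_errors helper is
a hypothetical name, and it assumes the PostgreSQL and slon logs are plain
text files whose problem lines carry an ERROR, FATAL or PANIC severity label.
Adjust the pattern to whatever your log configuration actually writes.

```python
import re
import sys

# Assumption: problem lines in the PostgreSQL and slon logs contain one of
# these severity labels as a whole word. Adjust to match your log format.
SEVERITY = re.compile(r"\b(ERROR|FATAL|PANIC)\b")


def find_errors(lines):
    """Return (line_number, text) pairs for lines with an error severity."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SEVERITY.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Pass the log files to scan as command-line arguments.
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            for number, text in find_errors(handle):
                print(f"{path}:{number}: {text}")
```

Run it against the PostgreSQL and slon log files on both the master and the
slave, then compare what turns up around the time the lag started growing.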

-- 
Bill Moran
http://www.potentialtech.com
http://people.collaborativefusion.com/~wmoran/


More information about the Slony1-general mailing list