John Moran johnfrederickmoran at gmail.com
Wed Nov 11 15:03:18 PST 2009
Hi Chris,

> What you *particularly* want to make sure of is that you're generating
> SYNCs against the "master" any time it's up and running.
>
> The case that turns out notably badly is where the "master" is accepting
> changes while slon processes are down, and when slons finally return, it
> generates one giant weekend-long SYNC that draws a barrel of changes in
> as one big increment.

All of my slons run on the master (this is generally easier to
manage), which I guess is why I've never experienced this.

I've noticed that when connectivity is eventually restored (by which
time the event lag in sl_status has perhaps reached thousands or tens
of thousands of events), it tends to take 2 or 3 syncs to clear the
backlog (each one clears several thousand events). The lag grows
during the outage, even though there usually isn't any actual activity
that would have added to the backlog of events to be replicated (I
guess this is normal). Suppose little or no actual activity occurs:
does the size of the event lag matter at all, or is the expense of
replicating those missed events the same, regardless of the event
lag?
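
A minimal sketch of how the lag described above might be watched,
assuming the rows have already been fetched from the sl_status view in
the cluster schema. The helper names (StatusRow, status_query,
lagging) and the thresholds are hypothetical and chosen only for
illustration; the column names are the ones sl_status is expected to
expose and should be checked against the installed Slony-I version.

```python
# Sketch only: summarise rows already fetched from Slony-I's sl_status view.
# StatusRow, status_query and lagging are hypothetical helper names.
from datetime import timedelta
from typing import Iterable, List, NamedTuple


class StatusRow(NamedTuple):
    st_origin: int           # node that generates the events
    st_received: int         # node that confirms them
    st_lag_num_events: int   # events not yet confirmed by st_received
    st_lag_time: timedelta   # age of the oldest unconfirmed event


def status_query(cluster: str) -> str:
    """Build the SELECT; Slony-I keeps its objects in a schema named _<cluster>."""
    if not cluster.replace("_", "").isalnum():
        raise ValueError("unexpected cluster name")
    return (
        "SELECT st_origin, st_received, st_lag_num_events, st_lag_time "
        f'FROM "_{cluster}".sl_status'
    )


def lagging(
    rows: Iterable[StatusRow],
    max_events: int = 1000,                        # assumed threshold
    max_time: timedelta = timedelta(minutes=10),   # assumed threshold
) -> List[StatusRow]:
    """Return the rows whose lag exceeds either threshold."""
    return [
        r for r in rows
        if r.st_lag_num_events > max_events or r.st_lag_time > max_time
    ]
```

Running status_query's output on the origin node and passing the rows
to lagging() would flag a subscriber that is tens of thousands of
events behind. It reports only the event count and age, not how many
row changes those events carry, which is the distinction the question
above turns on.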

> So, if parts of the system are out for a few hours, it may take a number
> of SYNCs to get up to date; things play more nicely if those SYNCs are
> small, as opposed to being giant mudballs of changes that take a long
> time to apply.

Care to comment on how sensible you think this is in general?

Thanks a lot,

John Moran


More information about the Slony1-general mailing list