Christopher Browne cbbrowne at ca.afilias.info
Thu Jan 22 14:12:38 PST 2009
Michael Weber <dr.michi at gmail.com> writes:
> After a version upgrade (1.2.11 -> 1.2.15)
> I now have all 3 servers at slony 1.2.15 (postgres version is 8.3 on
> master1/slave2 and 8.2 on slave3).
>
> slave3 ran out of disk space, but when I restarted slony it started
> catching up. But now slave3 is in sync with node2, but both are behind
> the master. Is there a way to get everything back to sync again?

The information in the origin's log tends not to be terribly
interesting, as the only work it does is to run SYNC events every so
often.  The slon for that node doesn't do any real replication work.

The question I always ask, at this point, is "what was the output of
test_slony_state???"

It is a pretty longstanding "best practice" to run that fairly
frequently (I ask that our DBAs run it against all our clusters on an
hourly basis), as it represents a very good "early warning" test for a
number of sorts of misconfiguration that have historically caused
people problems.

There are a number of ways in which nodes 2 and 3 might be behind, and
I haven't read anything to distinguish what the cause might be.

- Supposing the disk space outage caused the slons not to run (e.g.,
  because all slons were running on the same host as slave3), then the
  subscribers could be working their way through one Really Giant SYNC.

  There is a way to avoid this, namely to run generate_syncs.sh
  reasonably regularly against the origin.

- Perhaps some configuration problem is causing nodes 2/3 to fail to
  pull data from node 1.

- Supposing the arrangement is 1 --> 2 --> 3, that is,
    node 2 subscribes to 1, and node 3 subscribes to 2,
  then there *might* be some benefit to resubscribing node 3 directly
  to #1.

- It is not evident whether the problem is that:

  a) nodes 2 and 3 are doing work, but just not catching up quickly, or

  b) nodes 2 and 3 are "stuck" somewhere, and aren't progressing.

- I would anticipate the most interesting logs to be those for node
  #2.

  Particularly interesting would be any error messages.  Grep for
  "ERROR" :-).
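
If plain grep isn't handy, the same search can be sketched in a few
lines of Python.  This is only a minimal illustration: the function
name find_error_lines is made up for this example, and it assumes
nothing about the slon log format beyond the word "ERROR" appearing
somewhere on the lines of interest.

```python
import re
from typing import Iterable, List, Tuple

# Matches the whole word ERROR anywhere in a line (case-sensitive).
ERROR_PATTERN = re.compile(r"\bERROR\b")


def find_error_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for every line containing ERROR.

    Hypothetical helper for illustration; line numbers start at 1.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open the slon log for node #2 (wherever your setup writes
it) and pass the file object to find_error_lines(); the line numbers
it returns make it easier to go back and read the context around each
error.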
-- 
"cbbrowne","@","linuxfinances.info"
http://cbbrowne.com/info/slony.html
"X is like pavement:  once you figure out how to lay  it on the ground
and  paint yellow  lines on  it, there  isn't much  left to  say about
it. The exciting new developments  are happening in things that run ON
TOP of  the pavement, like  cars, bicycles, trucks,  and motorcycles."
-- Eugene O'Neil <eugene at cs.umb.edu>

