bugzilla-daemon at main.slony.info
Thu Aug 25 12:52:10 PDT 2011
http://www.slony.info/bugzilla/show_bug.cgi?id=235

--- Comment #9 from Christopher Browne <cbbrowne at ca.afilias.info> 2011-08-25 12:52:09 PDT ---
Created an attachment (id=125)
 --> (http://www.slony.info/bugzilla/attachment.cgi?id=125)
State diagram for revised SYNC grouping logic

Steve Singer and I whiteboarded up a state diagram describing a proposed
revised logic for handling SYNC grouping.  I have turned that into a TCM
diagram, attached.

We propose revising the logic as follows:

- At start time, the initial "max" is set to 1.

- If SYNCs are processed properly, the grouping doubles (e.g. - 1, then 2, 4,
8, 16, ...) until it reaches either the number of SYNCs outstanding or the
configured maximum.

- Any time processing fails, we halve the max (e.g. - a max of 32 becomes a
max of 16).

- Once we catch up, we're typically processing one or just a few SYNCs at a
time; if things fall behind, the doubling will kick in, but normally we only
*have* a few to process at a time.
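
To make the rules above concrete, here is a minimal sketch in Python.  It is
not the slon code, just one reading of the proposal.  The names
next_group_max, group_size and configured_max are hypothetical, and the floor
of 1 after a failure is an assumption on my part (the proposal only says we
fall back by half).

```python
def next_group_max(current_max, succeeded, configured_max):
    """Return the grouping "max" to use for the next round (hypothetical helper)."""
    if succeeded:
        # Double after a successful group, capped at the configured maximum.
        return min(current_max * 2, configured_max)
    # Halve after a failure; assumed never to drop below a single SYNC.
    return max(current_max // 2, 1)


def group_size(current_max, outstanding):
    """SYNCs to group right now: we cannot group more than are outstanding."""
    return min(current_max, outstanding)


# Start at 1 and double on each success, with an illustrative configured max of 20.
cap = 1
caps = [cap]
for _ in range(6):
    cap = next_group_max(cap, succeeded=True, configured_max=20)
    caps.append(cap)
print(caps)
```

With a configured max of 20, the "max" reaches its ceiling after five
successful rounds, so starting at 1 costs only a handful of small groups
before full-size grouping is in effect.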

We discussed the possibility of starting with the initial "max" as large as
possible (e.g. - the configured max value), and using the "halving upon
failure" to scale back as needed, but this seems to have several disadvantages
compared with starting with 1 and doubling:

1.  By starting with just a few SYNCs, there's the hope that we *immediately*
get some SYNCs replicated, and the subscriber visibly starts to catch up.

Starting with a huge grouping means there's no such quick feedback, which could
be disconcerting to users.

1a.  More on the "disconcerting" bit...  If the administrator is disconcerted
by seeing no evident progress, they may decide to kill the slon and retry,
which would waste any work that has been done.  And the retry will behave
identically; it'll just look like replication is broken.

Better to let some SYNCs through quickly, if possible.

2.  In 2.1, we expect a fundamentally better behaviour due to the fix of bug
#167.  A small grouping shouldn't be reverting to SEQ SCAN; we can hope for
rather better behaviour of that query.
