[Slony1-general] Proposal: using COPY to pull sl_log_? data to subscribers

Sun Jan 27 18:53:06 PST 2008

On Tue, Jan 08, 2008 at 09:55:49AM -0500, Andrew Sullivan wrote:
> On Mon, Jan 07, 2008 at 12:43:53PM -0500, Christopher Browne wrote:
> > I have been looking at this idea for a while, and I *think* it has
> > enough merit to consider implementing it.
> 
> [&c.] 
> 
> This proposal seems like a good idea to me, but I think particular
> community input is needed on a couple of areas.
> 
> >   1.  Some processing load gets taken off the provider
> 
> [. . .]
> 
> >   2.  Processing load is moved from slon to subscriber DBMS
> 
> What this means, of course, is that load goes down on the origin and
> up on every subscriber.  This seems to me to be obviously desirable,
> but I wonder if there's anyone who has designed specifically around
> the load profile of Slony today, and who will have problems with
> this new load profile.  Anyone?

While absence of evidence isn't evidence of absence, I'd have figured
that anybody negatively affected by this would be yelling loudly by
now.

> >       An important question: Will that loop lead to grossly
> >       excessive backend memory usage in cases where Large Tuples
> >       are processed?  (e.g. - where the INSERT statement is
> >       inserting a tuple consisting of 40MB of data)
> 
> This is obviously an empirical matter, but I suggest people need to
> start thinking very carefully about possible test cases --
> particularly unusual and expensive ones -- now.  The worst effect of
> this change would be if we traded load on the origin for lighter
> load that only works 95% of the time.  Reliability is, I think, the
> cardinal value in this system, and everything else needs to be
> sacrificed to that.

Agreed.

> >       There is a downside: with this approach, we now have no
> >       option for a subscriber node to NOT be configured to be a
> >       provider; all nodes now load data into the sl_log_? tables.
> 
> Does anyone care about this?  It strikes me that the subscribe-only
> option is a frill that could be traded away, but if people are
> dedicated to it, now would be a good time to speak up.
> 
> I like the proposal, but it'd be nice to hear widespread agreement
> on a change that is a fairly deep architectural one before we go
> ahead with this strategy.

Just anecdotally, the second-biggest gripe I keep hearing about Slony
is its behavior on long-running transactions, especially initial
syncs.  See Ow Mun Heng's recent post on planetpostgresql.org for a
sample ("the dreaded fetch 100 from log").

The biggest gripe, by the way, is the amount of fiddling you need to
do for common operations like, "create a two-node cluster with a
replication set that includes every table and sequence" or "add this
table to this replication set."

Cheers,
David.
-- 
David Fetter <david at fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter at gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate