Marc Munro marc
Thu May 18 09:06:01 PDT 2006
I've investigated further and come to some conclusions which I'll share.
This is no longer a big problem for me as I have work-arounds but I
think it does illustrate a potential issue in the handling of ddl.

First, I am not doing huge amounts of ddl.  My partition rolling scheme
will create new partitions for a small number of large tables on a
monthly basis.  This does not seem like an unreasonable thing to want to
automate, and it is easier for me to do it from plpgsql than from the OS
via shell, perl or whatever.
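
To give a feel for how little DDL is involved, here is a rough sketch of the text one monthly roll might generate. It is written in Python rather than plpgsql purely for illustration. It assumes inheritance-based partitioning with a CHECK constraint on a date column; the table name, column name and the parent_YYYYMM naming pattern are all made up and are not my real schema.

```python
from datetime import date


def next_month(d: date) -> date:
    """Return the first day of the month following d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def partition_ddl(parent: str, column: str, month_start: date) -> str:
    """Build the DDL text for one monthly child table of `parent`.

    Assumption: partitions are child tables (INHERITS) constrained by a
    CHECK on a date column. The child name parent_YYYYMM is hypothetical.
    """
    start = month_start.replace(day=1)
    end = next_month(start)
    child = f"{parent}_{start:%Y%m}"
    return (
        f"CREATE TABLE {child} (\n"
        f"    CHECK ({column} >= DATE '{start.isoformat()}'\n"
        f"       AND {column} <  DATE '{end.isoformat()}')\n"
        f") INHERITS ({parent});"
    )


# Hypothetical table and column names, for illustration only.
ddl = partition_ddl("measurements", "logged_on", date(2006, 6, 1))
print(ddl)
```

The point is only that each roll amounts to a handful of statements like this per table, once a month.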

It seems that there is a fundamental conflict between sync events and
ddl events.  Before each ddl event, a sync is issued, meaning that our
triggering transaction (which contains the ddl) now spans 2 syncs.

It seems that DML issued before the ddlscript call can be run twice at
the subscriber, once for each sync.  This is not always the case: it
seems to depend on how long the ddl takes to run.

DML issued after the ddlscript has always, for me, been lost.  I assume
that this again is a race condition and that there may be conditions
where this is not so.

I suspect that I could overcome this by creating a copy of ddlscript
that does not issue a sync but I am reluctant to experiment at this
level without feedback from the slony developers.

My workaround, btw, is simply to perform the DML through ddlscript as
well, and to separate out the logging of progress, errors, etc. into a
subsequent transaction.  Although this is painful (and loses transaction
integrity), there are relatively few DML statements that need to be
performed in those state transitions that perform DDL.
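
The shape of that workaround can be sketched roughly as follows. This is again Python for illustration, not my actual plpgsql. It assumes the DDL and DML of one state transition can be concatenated into a single script text; the function name and all table names are hypothetical. How the script is then handed to ddlscript is deliberately left out.

```python
def plan_transition(ddl, dml, log_entries):
    """Split one state transition into two separate units of work.

    Unit 1: the DDL and DML joined into one script text, intended to be
            passed to the DDL-script mechanism so both travel together.
    Unit 2: the logging statements, to be run in a later transaction.
    """
    statements = [*ddl, *dml]
    # Normalise so every statement ends in exactly one semicolon.
    script = "\n".join(s.rstrip().rstrip(";") + ";" for s in statements)
    return script, list(log_entries)


# Hypothetical statements, for illustration only.
script, logs = plan_transition(
    ddl=["CREATE TABLE measurements_200606 () INHERITS (measurements)"],
    dml=["UPDATE partition_state SET current_part = 'measurements_200606';"],
    log_entries=["INSERT INTO roll_log (msg) VALUES ('rolled 200606')"],
)
print(script)
print(logs)
```

Because the log entries run in their own transaction, a failure between the two units leaves the roll done but unlogged, which is the loss of transaction integrity mentioned above.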

I'm happy to help if anyone wants to follow up on this.

__
Marc


On Wed, 2006-05-17 at 18:38 +0100,
slony1-general-request at gborg.postgresql.org wrote:
> Date: Tue, 16 May 2006 07:00:22 -0400
> From: Andrew Sullivan <ajs at crankycanuck.ca>
> Subject: Re: [Slony1-general] logtrigger not firing
> To: slony1-general at gborg.postgresql.org
> Message-ID: <20060516110022.GA22940 at phlogiston.dyndns.org>
> Content-Type: text/plain; charset=us-ascii
> 
> On Mon, May 15, 2006 at 05:12:32PM -0700, Marc Munro wrote:
> > I guess the unusual thing I am doing from slony's perspective is
> > combining a bunch of ddlscript commands with normal slony-logged dml
> > operations.  The time that I can guarantee to get the error is after a
> > fairly long-running (several seconds) series of ddl.  Interestingly, the
> > duplicate log record has already appeared (been committed) at the
> > subscriber while the dml still appears to be running.
> > 
> > Is this an absolute no-no, have I stumbled upon a bug, or am I just
> > being dense and missing something?
> 
> I suspect you're right that there's a race here: the entire approach
> to DDL is really intended to be used in the context of EXECUTE
> SCRIPT, which is a pretty big hammer.  It appears you're trying to
> make it lighter by calling it in the context of your transaction, and
> I suspect that just won't work.  In particular, if you have things
> that need to be propagated from your DDL before replication resumes,
> the usual thing to add is a WAIT FOR EVENT.  That's probably why
> you're having trouble.  Slony is built on the assumption that DDL
> changes are infrequent and part of maintenance, whereas you have an
> application where they're fairly regular.
> 
> A