[Slony1-general] DDL replication ...

Fri Feb 24 14:08:50 PST 2006

On Feb 24, 2006, at 9:30 AM, Jan Wieck wrote:

> On 2/23/2006 4:00 AM, Gavin Sherry wrote:
>> On Thu, 23 Feb 2006, Jim C. Nasby wrote:
>>
>>> On Thu, Feb 23, 2006 at 03:02:14PM +1100, Gavin Sherry wrote:
>>>> On Wed, 22 Feb 2006, Darcy Buskermolen wrote:
>>>>
>>>>> On Wednesday 22 February 2006 08:48, Marc G. Fournier wrote:
>>>>>> Is there any work being done on Slony-I replicating DDL's?  And, of
>>>>>> course, setting up replication on any TABLES created in the process?
>>>>>
>>>>> No there is not any work on this.. in order to support this it would
>>>>> require hacks to the PG backend..  This could probably be done with
>>>>> Gavin's system table trigger patches.  The patches were never accepted
>>>>> for inclusion by core.
>>>>
>>>> The issue, for me, is that there are two ways we could go about
>>>> implementing this and both have problems.
>>>>
>>>> 1) Create triggers on individual system tables.
>>>>
>>>> If you wanted to know when a new table was created, when a table was
>>>> modified or dropped, you would create a trigger on pg_class, I suppose.
>>>> The problem, though, is that you might want to see what columns a new
>>>> table has but are they visible yet? It's a bit of a can of worms.
>>>
>>> Visible?? You mean you're worried about seeing stuff that hasn't
>>> committed yet?
>>
>> I was implying that it was a timing problem. In the code we create the
>> pg_class entry and then populate pg_attribute from memory. We need to be
>> careful about where we invoke a trigger on pg_class to make sure all its
>> ramifications are visible -- i.e., that we've issued
>> CommandCounterIncrement().
>
> Plus having created things like the PK index for the table and other
> constraints.
>
> Which leads to yet another problem. What does Slony do if the admin
> creates a table without a primary or even without any possible candidate
> key? Don't replicate the table? Abort the CREATE TABLE?
>
> And how exactly do I tell Slony (after it got all that super smart to
> master DDL all by itself) that I want a different set of indexes on
> "this particular replica", because that's my search engine server?

It seems to me that there are two cases to be handled:

1) Hot standby.  Just duplicate the whole thing exactly
2) Anything else you might want to do.

Slony handles both of these cases wonderfully because it is designed
to be completely flexible, which in my opinion is the way to go.  At
the same time, once a stable, flexible solution is in place, it makes
sense to me to automate certain common cases.

I see this as analogous to, say, autovacuum.  It's nice to just turn it
on and let it take care of you.  If you have tables, though, that are
really big or really small or really wide and are updated often, then
just turning on autovacuum is not going to do it for you.  You are
going to need to do some analysis and hand tune the per-table
parameters to make sure everything is taken care of.

So why not have an option to just replicate an entire database?  If
that option is selected then you just replicate every table.  Put it
all in one set.  If there is a primary key, use it.  If not, then look
for a NOT NULL unique column.  If that doesn't exist, add a serial
column just as if the user had requested it explicitly in the Slony
config.
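
To make the fallback order concrete, here is a rough sketch of the
rule I have in mind.  This is not Slony code; the Table and
UniqueIndex classes, the strategy names and the "_replication_id"
column name are all made up for illustration, and a real version
would read this information from the system catalogs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class UniqueIndex:
    """Hypothetical description of one unique index on a table."""
    name: str
    columns: Tuple[str, ...]
    all_not_null: bool  # True if every indexed column is NOT NULL


@dataclass
class Table:
    """Hypothetical description of a table to be replicated."""
    name: str
    primary_key: Optional[Tuple[str, ...]] = None
    unique_indexes: List[UniqueIndex] = field(default_factory=list)


def choose_replication_key(table: Table) -> Tuple[str, Tuple[str, ...]]:
    """Pick a key for a table using the fallback order described above.

    Returns (strategy, key_columns), where strategy is one of
    "primary_key", "unique_not_null" or "add_serial".
    """
    # 1. If there is a primary key, use it.
    if table.primary_key:
        return ("primary_key", table.primary_key)

    # 2. Otherwise look for a unique index whose columns are all NOT NULL.
    for index in table.unique_indexes:
        if index.all_not_null:
            return ("unique_not_null", index.columns)

    # 3. Otherwise fall back to adding a serial column.
    #    The column name here is an assumption, not a Slony convention.
    return ("add_serial", ("_replication_id",))


if __name__ == "__main__":
    tables = [
        Table("orders", primary_key=("order_id",)),
        Table("emails", unique_indexes=[UniqueIndex("emails_addr", ("addr",), True)]),
        Table("log_lines"),
    ]
    for t in tables:
        print(t.name, choose_replication_key(t))
```

The point is only that the decision is mechanical, so a "replicate
everything" mode would not need to ask the admin anything for the
common case.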

Don't get me wrong, I think that the most important thing is that
Slony provides a way to do everything you might need to do.  The
automation of common tasks has to come second to that.  But Slony is
now a stable, reliable replication system.  Why not add some extra
support for the case of "just give me an exact replica" and make it
totally turnkey?

Yes, this will not work for everyone, and large or more complex
setups will still require some hand tuning and configuring to get
good performance for what they want to do.  If there are technical
issues that can't be overcome, that is understandable, but it sounds
like you are simply trying to maintain the flexibility of Slony for
advanced configurations.  I think it could be done such that advanced
configuration was still possible, but a turnkey "just duplicate this
database to there" option would be helpful for a lot of
installations that don't need that flexibility.  I'm sure that there
are people out there who could use it but don't want to take the
time to set everything up.  They don't NEED it, but it would be nice to
have if it's not too much trouble.  Getting those people to use it
would increase the installed base, which means more users testing it,
and that could be useful to the project (or Postgres in general)
some day.

And it would give Postgres the reputation of having an "it just
works" replication solution.  MySQL has gotten this reputation by
implementing a half-baked, kludged replication solution that appears
to "just work" but in reality has limitations and is probably not
very failsafe.  I think that Slony could be a solid, flexible, "done
right" replication solution that can also be set up in 15 minutes
just by installing it, turning it on, and telling it what databases to
replicate.

Now that Slony is working and stable, I think that it really could be
the best of both worlds.  But I obviously had no hand in developing
it, so maybe there are technical hurdles that make this impossible.
There was resistance to the Windows port for a long time too, and I
think it is good that it was finally added.  Maybe it's not the best
for big mission-critical production sites, but there are many cases
where doing this could, I think, improve the adoption of Postgres.

just my $0.02

Rick