[Slony1-general] Table Selection

Thu Jul 22 14:34:58 PDT 2004

Thomas F. O'Connell <tfo at sitening.com> writes:
> On Jul 18, 2004, at 10:57 PM, Christopher Browne wrote:
>
>> I have a set of Perl scripts for controlling a cluster that I'll be
>> checking in Real Soon Now.
>>
>> Unlike the monolithic script that is in the distribution now that
>> asks all sorts of questions, my toolset grabs cluster config from a
>> config file, and then has a script for each of a variety of major
>> actions you'd want to invoke on the cluster.  It doesn't yet
>> include scripts for shifting the origin; that will presumably come
>> soon enough...
>
> So this doesn't fully answer my question about the status of
> slonik. Has the concept of a set of Perl scripts met with wider
> developer and community approval? Does it make the most sense for
> propagation of Slony in terms of it becoming the generally accepted
> (or one of the generally accepted) replication solutions for
> postgres?  I'm not trying to be antagonistic; I'm just trying to
> understand the development process and rationale and play a bit of
> devil's advocate.

My intent is to throw another choice up against the wall, so we can
see what's going to stick better.

>> One of the assumptions is that there will be, available, a list of
>> all of the relevant relations that are to be replicated.
>
> Available to the script or made available by the script?

Available TO the script.

>> I don't see any way to generalize that in any "vast" way.
>
> Just based on my limited exposure to and use of slony thus far, I
> would think it wouldn't be too nonsensical to have slonik commands
> that correspond to your sets as outlined below:
>
>> My assumption
>> is that the lists exist as the following sets:
>>
>>  - A list of tables that have primary key candidates;
>>
>>  - A list of tables for which Slony-I will need to add a primary key
>> candidate;
>>
>>  - A list of sequences that are to be replicated.
>
> For instance, why not create slonik commands like
>
> SET ADD UNIQUELY CONSTRAINED TABLES
> SET ADD UNCONSTRAINED TABLES
> SET ADD ALL TABLES
> SET ADD ALL SEQUENCES

Because there may be some tables that SHOULDN'T be replicated.

In our applications, for instance, there are all the tables that are
"officially" part of the server application, and then there are some
utility tables that the DBAs periodically create when looking to
repair some problem that has occurred.

If we said "replicate all tables," that would include cruddy ones that
perhaps should be thrown away.

A better approach would be to request all tables in a particular
namespace, as it is somewhat more likely that a namespace will be kept
clean.
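
As a rough sketch of the namespace approach (Python rather than Perl,
purely for illustration), one could list the tables in one namespace
from the system catalogs and sort them by whether they already have a
primary key. The schema name "app", the table names, and the exclude
list are all hypothetical, and the query assumes a DB-API style driver
that is not shown here.

```python
# Catalog query listing ordinary tables in one namespace, flagging the
# ones that already have a primary key. pg_class, pg_namespace and
# pg_index are standard PostgreSQL catalogs; the %s placeholder is meant
# for a DB-API driver (assumed, not shown here).
TABLES_IN_NAMESPACE = """
SELECT n.nspname, c.relname,
       EXISTS (SELECT 1 FROM pg_catalog.pg_index i
                WHERE i.indrelid = c.oid AND i.indisprimary) AS has_pk
  FROM pg_catalog.pg_class c
  JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relkind = 'r' AND n.nspname = %s
 ORDER BY c.relname
"""


def split_by_key(rows, exclude=()):
    """Split (schema, table, has_pk) rows into keyed and unkeyed lists.

    Names listed in `exclude` are dropped, so a human still has the
    final say over what gets replicated.
    """
    keyed, unkeyed = [], []
    for schema, table, has_pk in rows:
        name = f"{schema}.{table}"
        if name in exclude:
            continue
        (keyed if has_pk else unkeyed).append(name)
    return keyed, unkeyed


# Hypothetical rows, shaped like the result set of the query above.
sample_rows = [
    ("app", "accounts", True),
    ("app", "audit_log", False),
    ("app", "tmp_fixup_2004", False),
]
keyed, unkeyed = split_by_key(sample_rows, exclude={"app.tmp_fixup_2004"})
print(keyed)    # ['app.accounts']
print(unkeyed)  # ['app.audit_log']
```

The two lists can then be written to the config file and reviewed by
hand before anything is fed to slonik.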

>> Stick those three things in a config file, and it's dead easy to
>> generate a suitable "create set" that first adds PK candidates and
>> then builds the set of all the tables.

> Again, my point would be why not (plan to) have slonik handle this
> work for you?

Because it may take human planning to get all of this to take place at
an appropriate time.

For instance, suppose the database is rather large, and you have to
very carefully schedule when the "unique key" creation will take place
on some critical table, because that will lead to an application
outage.  In that case, having slonik "magically handle this" would be
downright undesirable.
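
One way to keep that control, sketched here in Python under some
assumptions, is to generate the key-adding step and the set-creation
step as two separate slonik fragments, so the first can be run in a
scheduled window. The slonik statements follow my understanding of the
Slony-I 1.0 syntax (TABLE ADD KEY, CREATE SET, SET ADD TABLE, SET ADD
SEQUENCE) and should be checked against the slonik reference; the
cluster preamble is omitted, and the function names, IDs, and table
names are made up for illustration.

```python
def key_script(unkeyed, node_id=1):
    """Slonik fragment that adds a key candidate to each unkeyed table.

    Kept separate so a DBA can schedule it for a maintenance window.
    """
    return "\n".join(
        f"table add key (node id = {node_id}, "
        f"fully qualified name = '{name}');"
        for name in unkeyed
    )


def set_script(keyed, unkeyed, sequences, set_id=1, origin=1):
    """Slonik fragment that creates the set and adds every relation."""
    lines = [
        f"create set (id = {set_id}, origin = {origin}, "
        f"comment = 'Set {set_id}');"
    ]
    prefix = f"set id = {set_id}, origin = {origin}"
    obj_id = 0
    for name in keyed:
        obj_id += 1
        lines.append(
            f"set add table ({prefix}, id = {obj_id}, "
            f"fully qualified name = '{name}', comment = 'Table {name}');"
        )
    for name in unkeyed:
        obj_id += 1
        # key = serial refers to the column added by "table add key"
        lines.append(
            f"set add table ({prefix}, id = {obj_id}, "
            f"fully qualified name = '{name}', key = serial, "
            f"comment = 'Table {name}');"
        )
    for seq_id, name in enumerate(sequences, start=1):
        lines.append(
            f"set add sequence ({prefix}, id = {seq_id}, "
            f"fully qualified name = '{name}', comment = 'Sequence {name}');"
        )
    return "\n".join(lines)


# Hypothetical lists, as they might be read from the config file.
step_one = key_script(["app.audit_log"])
step_two = set_script(["app.accounts"], ["app.audit_log"],
                      ["app.accounts_id_seq"])
print(step_one)
print(step_two)
```

Nothing runs until someone hands each fragment to slonik, which is
where the scheduling decision belongs.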

>> If you have thousands of relations, you'll probably need to build
>> queries to look for them.  At that point, throwing them into a file
>> should be no big deal.
>
> Yeah, that's what I'm currently working on.
>
>> I would be reluctant to add functionality to try to do that
>> automagically; the more sophisticated scripts get, the more their
>> actions resemble magick, and the more difficult it is to convince a
>> DBA to trust them :-(.
>
> Well, isn't programming in general just scripting writ a bit larger?
> I mean, what's the difference between a slonik command that grabs
> all user relations and postgres itself in general? If a DBA trusts
> the DBMS, then the DBA ought to be able to have some trust in the
> extended development community that organizes around the DBMS.

That only works if we're looking at:
 - Development in the small, where everyone knows everyone, and
 - Development under a tightly controlled set of DB management
   doctrines.

If there's any reason to need to be _careful_ about what data gets
replicated, then magick becomes a major misfeature.
-- 
let name="cbbrowne" and tld="ca.afilias.info" in String.concat "@" [name;tld];;
<http://dev6.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)