[Slony1-general] partitioning with replication

Fri Jul 23 21:48:45 PDT 2004

Interesting. That's a very helpful analysis. Thanks for taking the time!

The structure of the data to be replicated in our case is actually
pretty simple, so I don't *think* that the referential integrity issues
you raise are going to be a problem. Of course, if it were going to be
implemented in slony it would have to take into account the general
case, which would of course potentially involve referential integrity.
Now, if we introduced a general schema mapping capability (based on XML,
of course)...... ;-0 (just my sick sense of humor)

Time to read up on rules, I guess....

Thanks again.

- DAP 

-----Original Message-----
From: Christopher Browne [mailto:cbbrowne at ca.afilias.info] 
Sent: Friday, July 23, 2004 5:35 PM
To: David Parker
Cc: slony1-general at gborg.postgresql.org
Subject: Re: [Slony1-general] partitioning with replication

"David Parker" <dparker at tazznetworks.com> writes:
> In one area of our application, we would like to be able to capture an

> update to a master table, and select the target replication table 
> based on some key value in the update data, e.g. "if key_field == A, 
> replicate to table X on host A; if key_field == B, replicate to table 
> X on host B", etc.
>
> I realize that slony doesn't do this currently, but I'm wondering if 
> anybody else would find a feature like this desirable. A new table 
> could be added, with an fk to sl_table, to capture the info about the 
> partitioning key/value->target table/host mapping, though supporting 
> multi-columm partition keys would make this more complicated. Then the

> code in remote_worker would need to know about this information. Of 
> course, you'd have to be able to express these relationships in 
> slonik, as well, so I know the whole thing isn't that simple.
>
> Are there any obvious show-stoppers that make it impossible to 
> implement something like this?

The way that I'd approach that would be via having the "combined"
table being an updatable view, where a rule chooses which table to put
the data in based on the contents of key_field.  The update rule could
be pretty nasty, based on the fact that a particular entry might shift
from one table to another if you do an update to key_field...

In that case, table X and Y would exist on the "master" host, and there
would be separate subscription sets where host A would subscribe to set
123, containing table X, and host B would subscribe to set 246,
containing table Y.

That leaves the "smarts" of it out of Slony's hands, and also makes it
abundantly clear which data is flowing in which directions without
having to "crud" Slony up with a lot of extra logic.

Let me point at the nastier implication...  Supposing there was a
related table, which also was partitioned, it is entirely possible that
changing key_field in the the one table would mandate moving data
between _other_ tables.

Thus, let's suppose the notion is that customers are split across three
offices, and the invoices associated with the respective offices'
customers are also so split.  The rule to update "customers"
would have to be clever enough to know that if it's about to migrate
customer 123 from sales office A to sales office B, it also has to move
all of the corresponding invoice data to B's set of tables.
Otherwise foreign key constraints would break as would lots of other
stuff.

Assigning Slony with responsibility for managing how to do that seems to
me like a recipe for making it WAY more complex, whilst PostgreSQL
"rules" were created to do that very sort of thing.
--
(reverse (concatenate 'string "ofni.sailifa.ac" "@" "enworbbc"))
<http://dev6.int.libertyrms.com/> Christopher Browne
(416) 673-4124 (land)