Christopher Browne cbbrowne at ca.afilias.info
Wed Jul 11 11:58:02 PDT 2007
[Aside... You may want to head over to lists.slony.info and subscribe 
there; you sent this to gBorg, which isn't terribly active these days...]

Rod Taylor wrote:
> Hi Christopher,
>
> Recently I've been thinking about the reason why initial data 
> transfers cannot be programmed to occur one structure at a time. As 
> you know, syncing up 500 structures individually and merging into a 
> master set as you go instead of in a single SET transfer (single large 
> transaction) is far better for the master node.
>
> The reason it cannot be done currently is that a single SET cannot be 
> transferred into components without causing issues.
>
>
> What about adding a command that is the opposite of a MERGE?
>
> When subscribing to a set, slony could do the following:
>
> The big catch is the set number has to change.
>
> Externally we issue something like:
>
> SUBSCRIBE SET (ID = 10, PROVIDER = 1, RECEIVER = 2, TRANSFERRED SET = 
> 11);
>
>
> Slony creates an empty set 11 which is immediately subscribed to 
> receiver 2.
>
> Then it does the following:
>
> foreach $table (@tables_in_set_10) {
>
>     # Split set copies the state of set 10 to a newly built set 999;
>     # it moves the table from being part of 10 to 999.
>     SPLIT SET (ID = 10, MOVE TABLE = $table, NEW SET = 999);
>     WAIT FOR EVENT (ORIGIN = 1, CONFIRMED = 2);
>     SYNC (ID = 1);
>
>     SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 2);
>     WAIT FOR EVENT (ORIGIN = 2, CONFIRMED = 1);
>     SYNC (ID = 1);
>
>     MERGE SET (ID = 11, ADD ID = 999, ORIGIN = 1);
>     WAIT FOR EVENT (ORIGIN = 2, CONFIRMED = 1);
>     SYNC (ID = 1);
> }
>
> # All tables are now in set 11. Heck, could even rename 11 to 10
> REMOVE SET (ID = 10);
>
>
>
> Sure, it's a little fragile, but it would be a very welcome addition. A 
> built-in SPLIT SET function would allow us to write the wrapper on our 
> own.
There's something interesting there, though simultaneously something 
very clumsy about having to ask to do the split this way...

There's a more "literal" way to handle this, by the way, as there is a 
"SET MOVE TABLE" command. Let me redo the loop:

### Move all tables from set #10 to #998
create set (id=998, origin=1, comment='Temporary set');
foreach $table (@tables_in_set_10) {
    set move table (origin=1, id=$table, new set=998);
}
### Get set #10 (now empty) subscribed
subscribe set (id=10, provider=1, receiver=2, forward=yes);
### Subscribe tables, one by one, and merge into set #10
foreach $table (@tables_in_set_998) {
    create set (id=999, origin=1, comment='Temporary pseudoset');
    set move table (origin=1, id=$table, new set=999);
    subscribe set (id=999, provider=1, receiver=2);
    wait for event (origin=2, confirmed=1);
    sync (id=1);
    merge set (id=10, add id=999, origin=1);
}

All the tables are, at the end, back in set #10, where they belong.
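As a rough illustration, the loop above could be generated by a small wrapper script. This is only a sketch, not part of Slony-I: the function name, the table IDs, and the node/set numbers are assumptions, and the emitted text follows the slonik commands shown above.

```python
# Sketch: emit a slonik script that drains set 10 into a holding set,
# subscribes the (now empty) real set, then moves tables through a
# temporary set one at a time, merging each back into the real set.
# All identifiers here (per_table_subscribe_script, table IDs, node
# numbers) are hypothetical.

def per_table_subscribe_script(table_ids, origin=1, receiver=2,
                               real_set=10, holding_set=998, temp_set=999):
    lines = []
    # Move every table out of the real set into the holding set.
    lines.append(f"create set (id={holding_set}, origin={origin}, "
                 f"comment='Temporary set');")
    for t in table_ids:
        lines.append(f"set move table (origin={origin}, id={t}, "
                     f"new set={holding_set});")
    # Subscribe the emptied real set.
    lines.append(f"subscribe set (id={real_set}, provider={origin}, "
                 f"receiver={receiver}, forward=yes);")
    # Subscribe tables one by one and merge each into the real set.
    for t in table_ids:
        lines.append(f"create set (id={temp_set}, origin={origin}, "
                     f"comment='Temporary pseudoset');")
        lines.append(f"set move table (origin={origin}, id={t}, "
                     f"new set={temp_set});")
        lines.append(f"subscribe set (id={temp_set}, provider={origin}, "
                     f"receiver={receiver});")
        lines.append(f"wait for event (origin={receiver}, "
                     f"confirmed={origin});")
        lines.append(f"sync (id={origin});")
        lines.append(f"merge set (id={real_set}, add id={temp_set}, "
                     f"origin={origin});")
    return "\n".join(lines)

print(per_table_subscribe_script([101, 102]))
```

The point of wrapping it this way is that the fragile, repetitive part lives in one generator rather than being typed out per table.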

This could be extended to cope with a *second* subscription, which is 
worth outlining because it gives you pretty much "full generality": it 
should allow coping with as many cascaded subscribers as needed.

### Move all tables from set #10 to #998
create set (id=998, origin=1, comment='Temporary set');
[subscribe #998 identically to set #10]
foreach $table (@tables_in_set_10) {
    set move table (origin=1, id=$table, new set=998);
}
### Get set #10 (now emptied of tables) subscribed to the new node
subscribe set (id=10, provider=1, receiver=3, forward=yes);
### Subscribe tables, one by one, and merge into set #10
foreach $table (@tables_in_set_998) {
    create set (id=999, origin=1, comment='Temporary pseudoset');
    [subscribe #999 identically to set #998]
    set move table (origin=1, id=$table, new set=999);
    subscribe set (id=999, provider=1, receiver=3);
    wait for event (origin=3, confirmed=1);
    sync (id=1);
    merge set (id=10, add id=999, origin=1);
}

This is do-able, but it feels mighty clumsy, and even more fragile.

It seems really tempting to think about having a "virtual replication 
set" for this, and putting that into Slony-I proper.

The notion of the "virtual replication set" still seems pretty clumsy.

Here's an outline of a very different approach that Jan and I have been 
talking about that we call "COPY pipelining." It ought to help by 
parallelizing the data load. It's in the TODO...

======================================================
COPY pipelining

- the notion here is to try to parallelize the data load at
SUBSCRIBE time. Suppose we decide we can process 4 tables at a
time; we set up 4 threads. We then iterate thus:

For each table:
- acquire a thread (waiting as needed)
- submit COPY TO stdout on the provider, feeding the stream into
COPY FROM stdin on the subscriber
- submit the REINDEX request on the subscriber

Even with a fairly small number of threads, we should be able to
process the whole subscription in as long as it takes to process
the single largest table.

This introduces a risk of locking problems that does not exist at
present: currently, the subscription process is able to demand
exclusive locks on all tables up front (alas), which is no longer
possible if the subscription is split across multiple transactions.
In addition, the updates will COMMIT across some period of time on
the subscriber rather than appearing at one instant in time.

The timing improvement is probably still worthwhile.

http://lists.slony.info/pipermail/slony1-hackers/2007-April/000000.html
======================================================
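A minimal sketch of the pipelining idea, using a thread pool to simulate the per-table work (the table names, their sizes-as-seconds, and the copy_table helper are all assumptions standing in for the real COPY-and-REINDEX step, not Slony-I code):

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Simulated per-table copy: in the real scheme this would stream
# COPY TO stdout from the provider into COPY FROM stdin on the
# subscriber, then submit the REINDEX.
def copy_table(name, seconds):
    time.sleep(seconds)  # stand-in for the COPY + REINDEX work
    return name

# Hypothetical tables, expressed as seconds of copy time each.
tables = {"big": 0.4, "mid": 0.2, "small1": 0.1,
          "small2": 0.1, "small3": 0.1}

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:  # "4 tables at a time"
    done = list(pool.map(copy_table, tables, tables.values()))
elapsed = time.monotonic() - start

# With enough threads, the whole load takes roughly as long as the
# single largest table (0.4s here) rather than the sum of all
# tables (0.9s).
print(f"copied {len(done)} tables in {elapsed:.2f}s")
```

This also makes the trade-off above concrete: each `copy_table` call would be its own transaction, so the up-front-lock and single-instant-COMMIT properties are lost in exchange for the shorter wall-clock time.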



More information about the Slony1-general mailing list