Christopher Browne cbbrowne
Thu Mar 9 15:34:50 PST 2006
Hannu Krosing <hannu at skype.net> writes:

> On a fine day, Mon, 2006-03-06 at 18:08, Christopher Browne wrote:
>> There is a list of Works In Progress...
>> 
>> http://slony-wiki.dbitech.ca/index.php/Works_In_Progress
>> 
>> Most things are addressed, at this point.
>> 
>> We should see about making sure that outstanding items that should be
>> done for 1.2 are assigned.
>> 
>> Once they are done, we should see about scheduling a release, which will
>> need to include a goodly amount of testing, as there is a LOT of new
>> stuff, including:
>> 
>> - Windows support
>> - Major revision of memory management (so that big tuples don't blow
>> memory out) which should limit memory usage pretty incredibly
>> - DDL scripts are broken into individual statements
>> - Log switching (between sl_log_1 and sl_log_2)
>> - Subscribe set aggressively locks tables on the subscriber to avoid
>> failures
>> - A lot of fixes to build environment (this needs to be tested on lots
>> of platforms)
>> - pg_listener is used *way* less; slon uses polling, if things are
>> running busily
>> - slon "lag interval" option
>> - slon "stop after event" option
>> 
>> I'd like to improve some of the scripts, probably via redoing watchdogs
>> in plain shell; it's not clear if that'll happen for 1.2...
>> 
>> At any rate, if this list of enhancements is incomplete, I'd appreciate
>> hearing about what may be missing.  And if there are things that Just
>> Must Go In, we'd be best to know that ASAP...
>
> I haven't checked the latest versions, but I fear that there are still some
> footguns lurking in how subscribe is done, especially in what is checked
> and what is not.
>
> A) Last time I checked, it was still possible to redirect a subscriber to
> a node where the subscribed table might not even exist.
>
> example:
>    I have nodes 1, 2 and 3
>    set1 has its master on node 1 and a subscriber on node 2
>    it was possible to change the subscription of set1 on node 2 to use
> node 3 as master, even though there was no set1 there - no error, no
> warning, nothing.

<http://gborg.postgresql.org/project/slony1/bugs/bugupdate.php?1362>

See the relevant code in CVS HEAD, in function subscribeSet()...

	-- ---
	-- Verify that the provider is either the origin or an active subscriber
	-- Bug report #1362
	-- ---
	if v_set_origin <> p_sub_provider then
		if not exists (select 1 from @NAMESPACE@.sl_subscribe
			where sub_set = p_sub_set and
			      sub_receiver = p_sub_provider and
			      sub_forward and sub_active) then
			raise exception ''Slony-I: provider % is not an active forwarding node for replication set %'', p_sub_provider, p_sub_set;
		end if;
	end if;

There is every reason to believe this case to be handled in 1.2.
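
To make the rule concrete, here is a toy model of that check in C. It is
only a sketch of the logic, not Slony-I code: the subscription struct and
provider_is_valid() are hypothetical names, and the array stands in for
the rows of sl_subscribe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical in-memory stand-in for one row of sl_subscribe. */
typedef struct {
	int  sub_set;       /* replication set id */
	int  sub_receiver;  /* node that receives the set */
	bool sub_forward;   /* node keeps log data for downstream nodes */
	bool sub_active;    /* subscription is active */
} subscription;

/*
 * Returns true when `provider` may feed `set` to another node: it is
 * either the origin of the set, or an active, forwarding subscriber.
 */
bool provider_is_valid(const subscription *subs, size_t n,
		       int set, int set_origin, int provider)
{
	size_t i;

	assert(subs != NULL || n == 0);

	if (provider == set_origin)
		return true;

	for (i = 0; i < n; i++)
	{
		if (subs[i].sub_set == set &&
		    subs[i].sub_receiver == provider &&
		    subs[i].sub_forward && subs[i].sub_active)
			return true;
	}
	return false;
}
```

With a single row saying that node 2 is an active, forwarding subscriber
of set 1 (origin on node 1), the function accepts nodes 1 and 2 as
providers and rejects node 3, which is the case from the example in A).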

> B) Another thing:
>   It seems that if there are no sub-subscribers on a subscriber, then even
> with "subscribe, forward=yes", nothing is stored in sl_log_1. When
> changing a subscription from one node to another, it is then possible to
> lose some data.
>
> example:
>    nodes 1,2,3,4
>    subscribe set1 1 -> 2 -> 4 , and 1 -> 3 with forward
>    now change 4 to use 3 as its source - some data is lost

That appears inconsistent with the code.

Immediately before applying the replication request (e.g. - an
insert/update/delete), in remote_worker.c, the following code is
invoked.

	/*
	 * If we are forwarding this set, add the insert
	 * into sl_log_?
	 */
	if (wd->tab_forward[log_tableid])
	{
		slon_appendquery(&(line->log),
				"insert into %s.sl_log_%d "
				"    (log_origin, log_xid, log_tableid, "
				"     log_actionseq, log_cmdtype, "
				"     log_cmddata) values "
                                "    ('%s', '%s', '%d', '%s', '%q', '%q');\n",
				rtcfg_namespace, wd->active_log_table,
				log_origin, log_xid, log_tableid,
				log_actionseq, log_cmdtype, log_cmddata);
		largemem *= 2;
	}

It is also inconsistent with an examination that I just did of a set
of nodes where subscribers have forwarding set to true, but where none
of those subscribers actually are providers for other nodes.

I'd need to see more evidence that this "not bothering to populate
sl_log_1 or sl_log_2" actually takes place.
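
Reduced to a toy model, the behaviour I read out of that excerpt looks
like the following. This is a sketch under my own assumptions, not Slony-I
code: toy_subscriber and toy_apply_change() are made-up names, and the
counters stand in for rows written to the replicated tables and to
sl_log_1 / sl_log_2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a subscriber node; not a Slony-I structure. */
typedef struct {
	bool forward;          /* subscription was made with forward = yes */
	int  downstream_nodes; /* nodes using this one as their provider */
	int  applied_rows;     /* changes applied to the replicated tables */
	int  logged_rows;      /* changes also written to sl_log_? */
} toy_subscriber;

/* Apply one replicated change, logging it only if forwarding is on. */
void toy_apply_change(toy_subscriber *node)
{
	assert(node != NULL);

	/* Only the forward flag is tested; downstream_nodes is never read. */
	if (node->forward)
		node->logged_rows++;

	node->applied_rows++;
}
```

In this model, a node with forward set and zero downstream nodes still
gets one log row per applied change, which is what the excerpt above
suggests and what I saw on the nodes I examined.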

> C) And yet another maybe even more fundamental thing
>
> paths are calculated based on nodes, but the subscription unit is a set,
> so subscribing some sets from 1 to 4 (above) via node 2 and some via
> node 3 does not work, as events just do not propagate. It is most
> annoying if you have several sets subscribed 1 -> 2 -> 4 and want to move
> the subscription to use 1 -> 3 -> 4.  Doing it by changing the
> subscription of the sets on node 4 to use node 3 instead of node 2 either
> fails completely or just loses data.

> Another confusing thing about this is that subscribing a new set
> works through the COPY phase and then fails when events from sl_log_1
> should start moving.

I think I saw something like that occur once, and I think Jan was
aware of it...

> A) and possibly B) should be easy to fix. I don't know how hard C) is.

A) is already fixed; it has been in CVS HEAD for a long time now.

B) appears inconsistent with the code, and with observed behaviour; I
need to see more evidence before I believe that B) can happen.

C) I'll leave to Jan to comment on...
-- 
output = reverse("ofni.sailifa.ac" "@" "enworbbc")
<http://dba2.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)


