Florian G. Pflug fgp
Thu Mar 2 16:45:13 PST 2006
Christopher Browne wrote:
> Florian G. Pflug wrote
>>>By the way, the algorithm that you propose seems only to differ in one
>>>material way from the existing one:  The existing one will prefer more
>>>direct listen paths based on sl_subscribe to those of "lesser"
>>>status. That is what led to the 2 row difference above.  That difference
>>>shouldn't cause the problem you observed, in that all it does is to
>>>eliminate some paths that wouldn't have been useful anyways.
>>
>>No, I don't think so. My suggested algorith was the following:
>>
>>"Telling node X via sl_listen to ask neighbour-nodes
>>(Those for which a sl_path entry exists) for events from all other nodes,
>>apart from those for which the events must have travelled via node X to
>>reach the neighbour of X in question. aths that wouldn't have been
>>useful anyways."
>>
>>A more formal description, assume that the slony cluster forms a
>>directed graph
>>where each node represents a vertex, and each sl_path entries
>>represents a (directred)
>>edge from pa_server to pa_client.
>>1) Generate all possible traversals of the graph, in the following form:
>>   [origin, node1, node2, ... provider, receiver]
>>   (There must be edges origin->node1, node2->node2, ...
>>nodeN->provider, provider->receiver
>>   in the graph).
>>2) Remove all traversals that include circles, or in other words where a
>>   node is traversed more than once (e.g a->b->c->b->d).
>>3) Group those traversals by (origin, provider, receiver)
>>
>>According to someone here on this list, this will have to be refined a
>>bit,
>>because the events from a set-origin to the subscribers must travel by
>>the same
>>way as the data does. This amounts to adding the following step to the
>>algorithm
>>stated above
>>
>>4) Remove all traversals (origin,provider,receiver) where receiver
>>subscribes a set
>>   originating from origin, but where provider is not the set-provider
>>for receiver.
>>
>>The code I wrote, and that is included in the bugreport, was an
>>attempt to implement
>>that algorithm. The major problem is that the only "easy" way to do
>>this, given the current
>>structure of slonys system-tables, is an iterative approch, where the
>>_already_grouped_
>>traversals are build by extending them by one node in every step. This
>>already-grouped
>>representation, however, lacks the information needed to correctly
>>satisfy the "circle-freeness"
>>requirement. (Because you don't know exactly which nodes an event
>>traverses when travelling
>>from origin via provider to receiver).
>>
>>Maybe I working compromise would be to remove this "circle-freeness"
>>requirement - I don't know
>>how much of an performance impact a few unnecessary sl_listen entries
>>generate.
>>
>>Another solution would be to add a table sl_listen_raw, with the
>>fields (origin, way, receiver),
>>and _no_ primary key. Using this table, coding the algorithm above
>>would be trivial, and sl_listen
>>could afterwards be created from sl_listen_raw in just one step (Using
>>a group operation)
>>
>>I'd prefer the second solution, and would be willing to code it - but
>>I guess adding a table
>>to a stable release is quite a showstopper for you...
> 
> Well, one approach, for that, would be to create it as a temporary
> table, which means no need for schema change.  That approach seems quite
> attractive, regardless...
Well, temptables together with plpgsql are not so well behaved in postgresql -
you'd need to used the "execute" command to execute each query, because
otherwise postgres caches the queryplan, which, for a temptable, leads to
problems if the temptable is gone at the next function execution...

> This change is targeted for version 1.2, and there are schema changes in
> 1.2, already, so it's pretty fair game to add a table, if need be :-).
> 
> I whiteboarded a couple of attempts at algorithms for this, this
> afternoon.  It needs a bit more thought, I think, to come up with
> something final...
> 
> In any case, if you have some code in mind, it certainly is "fair game"
> to create an extra table or two for this.  I'm inclined to do an
> iteration or two on this via email discussion before trying to commit
> anything to CVS.
I'll try to implement my algorithm in plpgsql, using an additional table -
we'll see if there are any objections ;-)

> I think I need to come up with a query that detects broken
> configuration, e.g. - that the results listed above for receiver #10 is
> *wrong*...
Hm.. I believe I had something like that, but I'm not quite sure..
I'll grep by harddrive ;-)

greetings, Florian Pflug




More information about the Slony1-general mailing list