Thu Sep 27 12:58:00 PDT 2012
Follow up: I executed this on the master:

mydatabase=# select * from _slony.sl_event where ev_origin not in (select no_id from _slony.sl_node);
 ev_origin |  ev_seqno  |         ev_timestamp          |    ev_snapshot     | ev_type | ev_data1 | ev_data2 | ev_data3 | ev_data4 | ev_data5 | ev_data6 | ev_data7 | ev_data8
-----------+------------+-------------------------------+--------------------+---------+----------+----------+----------+----------+----------+----------+----------+----------
         3 | 5000290161 | 2012-09-27 09:48:03.749424-04 | 40580084:40580084: | SYNC    |          |          |          |          |          |          |          |
(1 row)

There is a row in sl_event that shouldn't be there, because it references a
node that no longer exists. I need to add this node back to replication, but
I don't want to run into the same issue as before. I ran a
cleanupEvent('10 minute') and it did nothing (I even ran it with 0 minutes).

Will this row eventually go away? Will it cause issues if we attempt to add a
new node to replication with node = 3? How can I safely clean this up?

thanks,
- Brian F

On 09/27/2012 01:28 PM, Brian Fehrle wrote:
> On 09/27/2012 01:26 PM, Jan Wieck wrote:
>> On 9/27/2012 2:34 PM, Brian Fehrle wrote:
>>> Hi all,
>>>
>>> PostgreSQL v 9.1.5 - 9.1.6
>>> Slony version 2.1.0
>>>
>>> I'm having an issue that's occurred twice now. I have a 4-node Slony
>>> cluster, and one of the operations is to drop a node from replication,
>>> do maintenance on it, then add it back to replication.
>>>
>>> Node 1 = master
>>> Node 2 = slave
>>> Node 3 = slave -> dropped then re-added
>>> Node 4 = slave
>>
>> First, why is the node actually dropped and re-added so fast, instead
>> of just doing the maintenance while it falls behind, then letting it
>> catch up?
>>
> We have several cases where it makes sense, such as re-installing the OS
> or, in today's case, replacing the physical machine with a new one.
>
>> You apparently have a full-blown path network from everyone to
>> everyone. This is not good under normal circumstances, since the
>> automatic listen generation will cause every node to listen on every
>> other node for events from non-origins. Way too many useless database
>> connections.
>
> From my understanding, without this set-up all events must be relayed
> through the master node. So with master node = 1 and slaves = 2 and 3,
> node 3 must communicate with node 2, and without direct access it will
> relay through the master. Is this understanding wrong?
>
>> What seems to happen here are some race conditions. The node is
>> dropped, and when it is added back again, some third node still hasn't
>> processed the DROP NODE, so when node 4 looks for events from node 3,
>> it finds old ones somewhere else (like on 1 or 2). When node 3 then
>> comes around to use those event IDs again, you get the dupkey error.
>>
>> What you could do, if you really need to drop/re-add it, is use an
>> explicit WAIT FOR EVENT for the DROP NODE to make sure all traces of
>> that node are gone from the whole cluster.
>>
> Ok, I'll look into implementing that. Another thought was to issue a
> cleanupEvent() on each of the nodes still attached to replication after
> I do the dump.
>
> Thanks
> - Brian F
>> Jan
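A minimal slonik sketch of the drop-with-explicit-wait approach Jan describes
above. The cluster name, conninfo strings, and admin node choices below are
placeholders for illustration, not taken from the thread; the preamble would
normally also list admin conninfo lines for the other remaining nodes.

    cluster name = mycluster;
    node 1 admin conninfo = 'dbname=mydatabase host=node1 user=slony';

    # Drop node 3, issuing the event on node 1 (the event node must not be
    # the node being dropped).
    drop node (id = 3, event node = 1);

    # Block until every remaining node has confirmed node 1's events,
    # including the DROP NODE, so no stale node-3 events are left anywhere
    # in the cluster before node 3 is stored and subscribed again.
    wait for event (origin = 1, confirmed = all, wait on = 1);

Only after the wait returns would the STORE NODE / STORE PATH / SUBSCRIBE SET
for the rebuilt node 3 be run, which should avoid the race on reused event IDs.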