Fri Apr 16 11:01:53 PDT 2010
- Previous message: [Slony1-general] recreating a cluster when the master dies
- Next message: [Slony1-general] recreating a cluster when the master dies
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
albert wrote:
> Greetings all,
>
> I have a master-slave setup and am trying to automate a recovery
> situation where the master fails and it is recreated from scratch based
> on a dump from the slave's database.

You don't tell us which version of Slony you're using (this can be useful to know).

> Here's the flow of events I am using to test the transition:
>
> 1. the cluster is registered, the master and slave are in sync, all good.
> 2. the master dies. the master database is recreated from scratch using
> a dump from the slave's database

When you take the dump of the slave database, it still has Slony installed on it. Once you've restored this on the master, your master has the slave's Slony configuration on it. It is probably a good idea not to start any slons until after your uninstall node is finished (or not to restore the _my_cluster schema), though I don't think this is your problem.

> 3. the master-slave replication cluster is deleted using the following
> code snippet:
>
> TODO: ********** remoteWorkerThread: node 1 - EVENT 1,27 STORE_NODE -
> unknown event type

This is very strange: the error is saying that the big if/else block in remote_worker.c isn't matching the event, even though the event name as printed in the above message looks okay. If you have the ability, I'd be curious to have you attach a debugger to the slon process when it gets to this state and see what event->ev_type looks like at line 715 (in the 1.2.21 source, or the equivalent line in whatever version you're on). The strcmp against "STORE_NODE" should be matching, and it should be going into that if block instead of falling through to the last else, where it prints the above error message.
> 2010-04-16 11:39:42 AST CONFIG storeListen: li_origin=1 li_receiver=2
> li_provider=1
> TODO: ********** remoteWorkerThread: node 1 - EVENT 1,28 ENABLE_NODE -
> unknown event type
> 2010-04-16 11:39:42 AST CONFIG storeListen: li_origin=1 li_receiver=2
> li_provider=1
> 2010-04-16 11:39:42 AST CONFIG storeListen: li_origin=1 li_receiver=2
> li_provider=1
> 2010-04-16 11:39:42 AST CONFIG remoteWorkerThread_1: update provider
> configuration
>
> These log events are the same when the cluster is working flawlessly
> (although more events are logged after these, of course).
> It looks as though the replication silently stops working with no
> apparent reason.

I would not expect to see those 'TODO: ********** ... unknown event type' lines when the cluster is working flawlessly. Are you saying that you always get them?

> Could anyone please help me understand what might be going wrong?
>
> Thanks
> Albert
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at lists.slony.info
> http://lists.slony.info/mailman/listinfo/slony1-general

--
Steve Singer
Afilias Canada
Data Services Developer
416-673-1142