Mon Jan 28 14:05:31 PST 2008
- Previous message: [Slony1-general] syslog output?
- Next message: [Slony1-general] Re: Issues troubleshooting failing replication - (Master - Slave)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi All: I have spent most of the day parsing through the documentation, but have been unsuccessful in solving the problem. Environment setup: 1 master (host=3Dpgmaster, db=3Dssm) 1 slave (host=3Dpgslave1, db=3Dssm_slave) Commands run on the master: Note: I will not include the db/user/pgplsql commands 1. run the schema script [root at caotta000381 scripts]# psql -U ssm -d ssm -f ~/sessionmanager- postgresql-create.sql 2. initialize the cluster (run on master node) [root at caotta000381 ~]# slonik_init_cluster # INIT CLUSTER cluster name =3D ssm_cluster; node 1 admin conninfo=3D'host=3Dpgmaster dbname=3Dssm user=3Dpostgres port= =3D5432'; node 2 admin conninfo=3D'host=3Dpgslave1 dbname=3Dssm_slave user=3Dpostgres port=3D5432'; init cluster (id =3D 1, comment =3D 'Node 1 - ssm at pgmaster'); # STORE NODE store node (id =3D 2, event node =3D 1, comment =3D 'Node 2 - ssm_slave at p= gslave1 '); echo 'Set up replication nodes'; # STORE PATH echo 'Next: configure paths for each node/origin'; store path (server =3D 1, client =3D 2, conninfo =3D 'host=3Dpgmaster dbn= ame=3Dssm user=3Dpostgres port=3D5432'); store path (server =3D 2, client =3D 1, conninfo =3D 'host=3Dpgslave1 dbname=3Dssm_slave user=3Dpostgres port=3D5432'); echo 'Replication nodes prepared'; echo 'Please start a slon replication daemon for each node'; [root at caotta000381 ~]# slonik_init_cluster | slonik <stdin>:10: Set up replication nodes <stdin>:13: Next: configure paths for each node/origin <stdin>:16: Replication nodes prepared <stdin>:17: Please start a slon replication daemon for each node 3. Start slon process [root at caotta000381 ~]# slon_start 1 Invoke slon for node 1 - /usr/bin/slon -s 1000 -d2 ssm_cluster 'host=3Dpgmaster dbname=3Dssm user=3Dpostgres port=3D5432' 2>&1 > /var/log/slony/slony1/node1/ssm-2008-01-28_13:47:29.log & Slon successfully started for cluster ssm_cluster, node node1 PID [12806] Start the watchdog process as well... [root at caotta000381 ~]# slon_start 2 Invoke slon for node 2 - /usr/bin/slon -s 1000 -d2 ssm_cluster 'host=3Dpgslave1 dbname=3Dssm_slave user=3Dpostgres port=3D5432' 2>&1 > /var/log/slony/slony1/node2/ssm_slave-2008-01-28_13:47:38.log & Slon successfully started for cluster ssm_cluster, node node2 PID [12849] Start the watchdog process as well... [root at caotta000381 ~]# 4. build skeleton for slon_tools.conf build the skeleton script for slon_tools.conf slonik_build_env -node pgmaster:ssm:postgres -node pgslave1:ssm_slave:postgres [root at caotta000381 pgsql]# slonik_build_env -node pgmaster:ssm:postgres -node pgslave1:ssm_slave:postgres &add_node(host =3D> 'pgmaster', dbname =3D> 'ssm', port =3D>5432, user=3D>'postgres', password=3D>'', node=3D>1 ); &add_node(host =3D> 'pgslave1', dbname =3D> 'ssm_slave', port =3D>5432, user=3D>'postgres', password=3D>'', node=3D>2 , parent=3D>1); @KEYEDTABLES=3D( "public.managednumberrouteset", "public.mediationstrategy", "public.nonmanagednumbestfitrouteset", "public.normalization", "public.normalizationtype", "public.numberforwarding", "public.outboundrouteorder", "public.outboundrouteset", "public.route", "public.routetype", "public.routetype_normalization", ); @SEQUENCES=3D( "public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq", "public.normalization_normalizationid_seq", ); 5. Add that data to the slon_tools.conf 6. Create the replication sets [root at caotta000381 etc]# slonik_create_set 1 cluster name =3D ssm_cluster; node 1 admin conninfo=3D'host=3Dpgmaster dbname=3Dssm user=3Dpostgres port= =3D5432'; node 2 admin conninfo=3D'host=3Dpgslave1 dbname=3Dssm_slave user=3Dpostgres port=3D5432'; # TABLE ADD KEY # CREATE SET try { create set (id =3D 1, origin =3D 1, comment =3D 'Set 1 for ssm_cluster'= ); } on error { echo 'Could not create subscription set 1 for ssm_cluster!'; exit -1; } # SET ADD TABLE echo 'Subscription set 1 created'; echo 'Adding tables to the subscription set'; set add table (set id =3D 1, origin =3D 1, id =3D 1, full qualified name =3D 'public.nonmanagednumbestfitroutes= et', key=3D'public.normalization', comment =3D 'Table public.nonmanagednumbestfitrouteset with candidate primary key public.normalization'); echo 'Add candidate primary keyed table public.nonmanagednumbestfitrouteset'; set add table (set id =3D 1, origin =3D 1, id =3D 2, full qualified name =3D 'public.outboundrouteorder', key= =3D' public.outboundrouteset', comment =3D 'Table public.outboundrouteorder with candidate primary key public.outboundrouteset'); echo 'Add candidate primary keyed table public.outboundrouteorder'; set add table (set id =3D 1, origin =3D 1, id =3D 3, full qualified name =3D 'public.normalizationtype', key=3D' public.numberforwarding', comment =3D 'Table public.normalizationtype with candidate primary key public.numberforwarding'); echo 'Add candidate primary keyed table public.normalizationtype'; set add table (set id =3D 1, origin =3D 1, id =3D 4, full qualified name =3D 'public.route', key=3D'public.rout= etype ', comment =3D 'Table public.route with candidate primary key public.routetype'); echo 'Add candidate primary keyed table public.route'; set add table (set id =3D 1, origin =3D 1, id =3D 5, full qualified name =3D 'public.managednumberrouteset', ke= y=3D' public.mediationstrategy', comment =3D 'Table public.managednumberrouteset with candidate primary key public.mediationstrategy'); echo 'Add candidate primary keyed table public.managednumberrouteset'; set add table (set id =3D 1, origin =3D 1, id =3D 6, full qualified name =3D 'public.routetype_normalization', key=3D'', comment =3D 'Table public.routetype_normalization with candidate primary key '); echo 'Add candidate primary keyed table public.routetype_normalization'; # SET ADD SEQUENCE echo 'Adding sequences to the subscription set'; set add sequence (set id =3D 1, origin =3D 1, id =3D 1, full qualified name =3D ' public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq', comment =3D 'Sequence public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq'); echo 'Add sequence public.nonmanagednumbestfitrouteset_outboundroutelogicid_seq'; set add sequence (set id =3D 1, origin =3D 1, id =3D 2, full qualified name =3D ' public.normalization_normalizationid_seq', comment =3D 'Sequence public.normalization_normalizationid_seq'); echo 'Add sequence public.normalization_normalizationid_seq'; echo 'All tables added'; [root at caotta000381 etc]# slonik_create_set 1 | slonik <stdin>:16: Subscription set 1 created <stdin>:17: Adding tables to the subscription set <stdin>:18: PGRES_FATAL_ERROR select "_ssm_cluster".determineIdxnameUnique(' public.nonmanagednumbestfitrouteset', 'public.normalization'); - ERROR: Slony-I: table "public"."nonmanagednumbestfitrouteset" has no unique index public.normalization Note: I ignored the error and moved on. Perhaps this was a bad thing? 7. Subscribed the nodes to set [root at caotta000381 etc]# slonik_subscribe_set 1 2 cluster name =3D ssm_cluster; node 1 admin conninfo=3D'host=3Dpgmaster dbname=3Dssm user=3Dpostgres port= =3D5432'; node 2 admin conninfo=3D'host=3Dpgslave1 dbname=3Dssm_slave user=3Dpostgres port=3D5432'; try { subscribe set (id =3D 1, provider =3D 1, receiver =3D 2, forward =3D ye= s); } on error { exit 1; } echo 'Subscribed nodes to set 1'; [root at caotta000381 etc]# slonik_subscribe_set 1 2 | slonik <stdin>:10: Subscribed nodes to set 1 8. When using the altperl scripts to setup the cluster, it isn't clear what tasks need to be performed on the slave node. From what I read, the only thing that needs to be done on the slave nodes it what I read here: http://linuxfinances.info/info/addthings.html section: (12.4. Adding A Node To Replication). 9. Since I was a little confused I ran the following on the slave node assuming that I was doing the right thing (in accordance with the above reference): I ran this on the slave node: cluster name =3D ssm_cluster; node 1 admin conninfo=3D'host=3Dpgmaster dbname=3Dssm user=3Dpostgres port= =3D5432'; node 2 admin conninfo=3D'host=3Dpgslave1 dbname=3Dssmslave user=3Dpostgres port=3D5432'; try { subscribe set (id =3D 1, provider =3D 1, receiver =3D 2, forward =3D ye= s); } on error { exit 1; } echo 'Subscribed nodes to set 1'; 10. I inserted some data into the master db. 11. Checked the slon logs and didn't see any inserts getting replicated, so I knew something was wrong. 12. I ran the check_slony_cluster.sh on both nodes: Slave: -bash-3.00$ ./check_slony_cluster.sh ssm_cluster ssm_slave pgslave1 ERROR - 1 of 1 nodes not in sync Master: -bash-3.00$ ./check_slony_cluster.sh ssm_cluster ssm pgmaster OK - 1 nodes in sync Running yet another replication test script: -bash-3.00$ ./psql_replication_check.pl -h pgslave1 -d ssm_slave -p 5432 -U postgres ERROR: relation "replication_status" does not exist -bash-3.00$ ./psql_replication_check.pl -h pgmaster -d ssm -p 5432 -U ssm ERROR: relation "replication_status" does not exist At this point I have no idea were the problem lies. I can guess my way through based on the commands that I have run. But at this point I am not seeing any replication nor any sign after manipulating the data that any events were triggering replication. Here is a snippet of the slon log for the slave node: <snippet> 2008-01-28 16:57:05 EST DEBUG2 remoteWorkerThread_2: forward confirm 1,1140 received by 2 2008-01-28 16:57:13 EST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 1141 2008-01-28 16:57:13 EST DEBUG2 localListenThread: Received event 1,1141 SYNC 2008-01-28 16:57:16 EST DEBUG2 remoteListenThread_2: queue event 2,457 SYNC 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: Received event 2,457 SYNC 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: SYNC 457 processing 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: no sets need syncing for this event 2008-01-28 16:57:16 EST DEBUG2 remoteWorkerThread_2: forward confirm 1,1141 received by 2 2008-01-28 16:57:23 EST DEBUG2 syncThread: new sl_action_seq 1 - SYNC 1142 2008-01-28 16:57:23 EST DEBUG2 localListenThread: Received event 1,1142 SYNC 2008-01-28 16:57:27 EST DEBUG2 remoteListenThread_2: queue event 2,458 SYNC 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: Received event 2,458 SYNC 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: SYNC 458 processing 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: no sets need syncing for this event 2008-01-28 16:57:27 EST DEBUG2 remoteWorkerThread_2: forward confirm 1,1142 received by 2 2008-01-28 16:57:31 EST DEBUG2 remoteListenThread_2: queue event 2,459 SYNC 2008-01-28 16:57:31 EST DEBUG2 remoteWorkerThread_2: Received event 2,459 SYNC 2008-01-28 16:57:31 EST DEBUG2 remoteWorkerThread_2: SYNC 459 processing 2008-01-28 16:57:31 EST DEBUG2 remoteWorkerThread_2: no sets need syncing for this event </snippet> I also did a simple query on the table that I had added data to on both nodes: master: =3D=3D=3D=3D=3D=3D=3D=3D=3D ssm=3D# select * from outboundrouteset; outboundroutesetid | comments ------------------------+------------------ ATP_CMA_ROUTESET_ID_1 | ATP_TEST_COMMENT ATP_CMB_ROUTESET_ID_1 | ATP_TEST_COMMENT ATP_PSTN_ROUTESET_ID_1 | ATP_TEST_COMMENT (3 rows) slave: =3D=3D=3D=3D=3D=3D=3D=3D=3D ssm_slave=3D# select * from outboundrouteset; outboundroutesetid | comments --------------------+---------- (0 rows) >From what I can tell, no sign of any replication occurring, which I would expect to look something like this: <snippet> 2007-10-31 06:57:04 EDT DEBUG2 remoteHelperThread_1_1: inserts=3D530 updates=3D530 deletes=3D0 2007-10-31 06:57:04 EDT DEBUG2 remoteHelperThread_1_1: inserts=3D211 updates=3D211 deletes=3D0 2007-10-31 06:57:05 EDT DEBUG2 remoteHelperThread_1_1: inserts=3D197 updates=3D197 deletes=3D0 2007-10-31 06:57:06 EDT DEBUG2 remoteHelperThread_1_1: inserts=3D255 updates=3D255 deletes=3D0 2007-10-31 06:57:07 EDT DEBUG2 remoteHelperThread_1_1: inserts=3D116 updates=3D116 deletes=3D0 </snippet> Any help in diagnosing replication failures would be greatly appreciated. Kind Rgds, Ouray -- = Ouray Viney https://www.viney.ca -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.slony.info/pipermail/slony1-general/attachments/20080128/= 160aecba/attachment-0001.htm
- Previous message: [Slony1-general] syslog output?
- Next message: [Slony1-general] Re: Issues troubleshooting failing replication - (Master - Slave)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list