Guy Helmer ghelmer at palisadesystems.com
Thu Jul 29 12:22:55 PDT 2010
I have an interesting failure that I can't seem to resolve with slony1 1.2.20.  I have set up replication between two nodes, and that seems to be working fine.  I wanted to add a new table to replication, so I followed the usual steps: create a new set containing the new table, subscribe the second (slave) node to that set, and then merge the new set into the existing set.

I'm using the slonik_create_set / slonik_subscribe_set / slonik_merge_sets scripts.  Initially I made the mistake of leaving the "table_id" for the new set, "set2", set to 1 when I tried to create the set containing the new table.  I then changed the "table_id" for the new set so that it no longer conflicted with the table_id values used by the existing set, successfully created the set, and subscribed my second node, #2, to it.  However, when I tried to merge set 2 into the existing set 1, the command failed:

sm1# slonik_merge_sets --config slon_tools_db-new.conf 1 1 2 | slonik
<stdin>:4: PGRES_FATAL_ERROR select "_replication".mergeSet(1, 2);  - ERROR:  Slony-I: set 2 has subscriptions in progress - cannot merge
<stdin>:6: Failure to merge sets 1 and 2 with origin 1

FWIW, at this time I had also made the mistake of not creating the new table in the slave's database.

I tried unsubscribing node 2 from the set, dropping the set, then recreating the set and resubscribing node 2 to it, but the merge still failed.  Looking at the log file, I see this in node 2's log:

2010-07-29 14:02:39 CDT DEBUG2 localListenThread: Received event 2,68315 SYNC
2010-07-29 14:02:42 CDT DEBUG2 remoteListenThread_1: queue event 1,69285 SYNC
2010-07-29 14:02:43 CDT DEBUG1 copy_set 2
2010-07-29 14:02:43 CDT ERROR  remoteWorkerThread_1: node -1 not found in runtime configuration
2010-07-29 14:02:43 CDT WARN   remoteWorkerThread_1: data copy for set 2 failed 211 times - sleep 60 seconds
2010-07-29 14:02:46 CDT DEBUG2 remoteListenThread_1: queue event 1,69286 SYNC
2010-07-29 14:02:49 CDT DEBUG2 syncThread: new sl_action_seq 1 - SYNC 68316
2010-07-29 14:02:49 CDT DEBUG2 localListenThread: Received event 2,68316 SYNC
2010-07-29 14:02:57 CDT DEBUG2 remoteListenThread_1: queue event 1,69287 SYNC
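
For anyone trying to spot the same pattern in their own slon logs, here is a minimal Python sketch that pulls out the two interesting lines from the excerpt above.  It is not part of Slony-I: the function name and both regular expressions are assumptions based only on the log lines shown here, and other slon versions may word these messages differently.

```python
import re

# Assumed patterns, derived only from the log excerpt in this post.
NODE_ERR = re.compile(
    r"ERROR\s+(\S+): node (-?\d+) not found in runtime configuration"
)
COPY_FAIL = re.compile(
    r"WARN\s+(\S+): data copy for set (\d+) failed (\d+) times"
)


def summarize_copy_failures(lines):
    """Return ({set_id: latest failure count}, [node ids reported missing])."""
    failures = {}
    missing_nodes = []
    for line in lines:
        m = COPY_FAIL.search(line)
        if m:
            # Keep the most recent retry count seen for each set.
            failures[int(m.group(2))] = int(m.group(3))
            continue
        m = NODE_ERR.search(line)
        if m:
            missing_nodes.append(int(m.group(2)))
    return failures, missing_nodes


# Sample input: lines copied from the node 2 log above.
sample = [
    "2010-07-29 14:02:43 CDT DEBUG1 copy_set 2",
    "2010-07-29 14:02:43 CDT ERROR  remoteWorkerThread_1: node -1 not found in runtime configuration",
    "2010-07-29 14:02:43 CDT WARN   remoteWorkerThread_1: data copy for set 2 failed 211 times - sleep 60 seconds",
    "2010-07-29 14:02:46 CDT DEBUG2 remoteListenThread_1: queue event 1,69286 SYNC",
]

failures, missing_nodes = summarize_copy_failures(sample)
print(failures, missing_nodes)
```

Run against a full log file (for example by passing an open file object instead of the sample list), this only reports which set keeps failing to copy and which node id slon could not find; it does not diagnose why.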

I've tried various combinations of restarting the slon daemons and of dropping and recreating the subscription and set, but nothing has helped.

I've found a couple of other threads about "node -1 not found in runtime configuration" errors, but I haven't seen any solutions.  Does it look like I'll have to completely rebuild my slony configuration?  If so, do I have to truncate the tables in the slave database before I restart the replication?

Thanks in advance for any help,
Guy
