Fri Apr 7 11:16:36 PDT 2006
- Previous message: [Slony1-general] Slave server dies after a few days of replication
- Next message: [Slony1-general] Slave server dies after a few days of replication
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Aaron Randall <aaron.randall at visionoss.com> writes:
> Hi all!
>
> I am seeing a problem occurring after a few days of replication between
> two of my servers - they replicate fine and then suddenly the slon
> process stops on the slave.

Does the slon start back up happily after this?

> The log file gives good information...I
> just need help in understanding it. Here is the point in the slave logs
> where the slon process shuts down:
>
> "2006-03-31 12:47:40 GMT DEBUG2 remoteHelperThread_1_1: 0.007 seconds
> until close cursor
> 2006-03-31 12:47:40 GMT DEBUG2 remoteWorkerThread_1: new sl_rowid_seq
> value: 1000000000000000
> 2006-03-31 12:47:40 GMT DEBUG2 remoteWorkerThread_1: SYNC 244391 done in
> 0.034 seconds
> 2006-03-31 12:47:47 GMT DEBUG2 syncThread: new sl_action_seq 1 - SYNC 230540
> 2006-03-31 12:47:47 GMT DEBUG2 localListenThread: Received event
> 2,230540 SYNC
> 2006-03-31 12:47:47 GMT DEBUG2 remoteWorkerThread_1: forward confirm
> 2,230540 received by 1
> 2006-03-31 12:47:50 GMT DEBUG2 remoteListenThread_1: queue event
> 1,244392 SYNC
> 2006-03-31 12:47:50 GMT DEBUG2 remoteWorkerThread_1: Received event
> 1,244392 SYNC
> 2006-03-31 12:47:50 GMT DEBUG2 remoteWorkerThread_1: SYNC 244392 processing
> 2006-03-31 12:47:50 GMT DEBUG2 remoteWorkerThread_1: syncing set 1 with
> 250 table(s) from mytable 1
> 2006-03-31 12:47:50 GMT DEBUG2 remoteHelperThread_1_1: 0.006 seconds
> delay for first row
> 2006-03-31 12:47:50 GMT DEBUG2 remoteHelperThread_1_1: 0.007 seconds
> until close cursor
> 2006-03-31 12:47:50 GMT DEBUG2 remoteWorkerThread_1: new sl_rowid_seq
> value: 1000000000000000
> 2006-03-31 12:47:50 GMT DEBUG2 remoteWorkerThread_1: SYNC 244392 done in
> 0.032 seconds
> 2006-03-31 12:47:56 GMT FATAL syncThread: "start transaction;set
> transaction isolation level serializable;select last_value from
> "_my_replication".sl_action_seq;" - FATAL: terminating connection due
> to administrator command
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> 2006-03-31 12:47:56 GMT DEBUG1 slon: shutdown requested
> 2006-03-31 12:47:56 GMT DEBUG2 slon: notify worker process to shutdown
> 2006-03-31 12:47:56 GMT DEBUG2 slon: wait for worker process to shutdown
> 2006-03-31 12:47:56 GMT INFO remoteListenThread_1: disconnecting from
> 'host=1.1.1.2 dbname=mydb user=slonyuser port=5432'
> 2006-03-31 12:47:56 GMT DEBUG1 remoteListenThread_1: thread done
> 2006-03-31 12:47:56 GMT DEBUG1 localListenThread: thread done
> 2006-03-31 12:47:56 GMT DEBUG1 cleanupThread: thread done
> 2006-03-31 12:47:56 GMT DEBUG1 main: scheduler mainloop returned
> 2006-03-31 12:47:56 GMT DEBUG2 main: wait for remote threads
> 2006-03-31 12:47:56 GMT DEBUG2 sched_wakeup_node(): no_id=1 (0 threads +
> worker signaled)
> 2006-03-31 12:47:56 GMT DEBUG1 remoteWorkerThread_1: helper thread for
> provider 1 terminated
> 2006-03-31 12:47:56 GMT DEBUG1 remoteWorkerThread_1: disconnecting from
> data provider 1
> 2006-03-31 12:47:56 GMT DEBUG1 remoteWorkerThread_1: thread done
> 2006-03-31 12:47:56 GMT DEBUG2 main: notify parent that worker is done
> 2006-03-31 12:47:56 GMT DEBUG1 main: done
> 2006-03-31 12:47:56 GMT DEBUG2 slon: worker process shutdown ok
> 2006-03-31 12:47:56 GMT DEBUG2 slon: exit(-1)
> "

Something sent a SIGTERM signal to the backend supporting the
syncThread, which, if memory serves, could mean that *any* of the
backends that slon is listening to were terminated.

You should figure out why something is sending SIGTERM signals to your
databases; this isn't a Slony-I issue per se.

Out of memory problems have historically caused this; you should check
database logs to see what's up.

Slony-I won't fix your database problems; it is simply vulnerable to
them :-(.
--
let name="cbbrowne" and tld="ca.afilias.info" in String.concat "@" [name;tld];;
<http://dba2.int.libertyrms.com/>
Christopher Browne
(416) 673-4124 (land)
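
As a rough illustration of the log check suggested in the reply, here is a minimal Python sketch that pulls the FATAL entries out of a slon log so that their timestamps can be compared against the PostgreSQL server log for the same moment. The function name `find_fatal_events` and the regular expression are hypothetical; they are inferred only from the line layout visible in the quoted log and are not part of Slony-I. The sketch also assumes each log entry sits on a single line, as slon writes it, rather than wrapped as in the email.

```python
import re

# Hypothetical pattern inferred from the quoted log lines, e.g.
# "2006-03-31 12:47:56 GMT FATAL syncThread: ..."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<tz>\S+) "
    r"(?P<level>[A-Z]+\d?) +(?P<thread>[^:]+): (?P<msg>.*)$"
)


def find_fatal_events(lines):
    """Return (timestamp, thread, message) for every FATAL slon log line."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Lines that do not fit the assumed layout are skipped silently.
        if match and match.group("level") == "FATAL":
            events.append(
                (match.group("ts"), match.group("thread"), match.group("msg"))
            )
    return events


if __name__ == "__main__":
    sample = [
        "2006-03-31 12:47:50 GMT DEBUG2 remoteWorkerThread_1: SYNC 244392 done in 0.032 seconds",
        '2006-03-31 12:47:56 GMT FATAL syncThread: "start transaction;" - FATAL: terminating connection due to administrator command',
        "2006-03-31 12:47:56 GMT DEBUG1 slon: shutdown requested",
    ]
    for ts, thread, msg in find_fatal_events(sample):
        print(ts, thread, msg)
```

For the log quoted above, the single FATAL entry is stamped 2006-03-31 12:47:56 GMT, so the database log around that time is the place to look for whatever terminated the backend.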