Christopher Browne cbbrowne
Fri Mar 31 09:26:07 PST 2006
Gavin Hamill wrote:
> Oh dear, I appear to be in a bit over my head, and I hope someone can 
> help :)
>
> The situation is a simple 2-node setup, replicating from 194.24.250.137 
> -> 194.24.250.143 - configuration of 'set1' and submission via slonik 
> went smoothly, and the initial set COPY took 2 hours, which is about 
> right for 3G of data...
>
> My problem is that I have not seen any new data on the slave since the 
> initial copy. The master (node 1) has text logs in /var/log/slony as 
> specified with
>
> $ /usr/lib/postgresql/8.1/bin/slon -a /var/log/slony replication 
> "host=194.24.250.137 dbname=laterooms user=XXX port=5432 password=XXXXXX"
>
>   
I see a first problem here...

It's useless to use the "-a" option against the "master" node.  The
master node is not subscribing to any sets, so nothing beyond empty
SYNC boilerplate will ever land in the log shipping archives written
there.
> the most recent slon console output on the master is:
>
> 2006-03-31 11:06:47 BST CONFIG main: slon version 1.1.5 starting up
> 2006-03-31 11:06:47 BST CONFIG main: local node id = 1
> 2006-03-31 11:06:47 BST CONFIG main: launching sched_start_mainloop
> 2006-03-31 11:06:47 BST CONFIG main: loading current cluster configuration
> 2006-03-31 11:06:47 BST CONFIG storeNode: no_id=2 no_comment='Node 2 - 
> Jalapeno'
> 2006-03-31 11:06:47 BST CONFIG storePath: pa_server=2 pa_client=1 
> pa_conninfo="host=194.24.250.143 dbname=laterooms user=XXX port=5432 
> password=XXXXX" pa_connretry=10
> 2006-03-31 11:06:47 BST CONFIG storeListen: li_origin=2 li_receiver=1 
> li_provider=2
> 2006-03-31 11:06:47 BST CONFIG storeSet: set_id=1 set_origin=1 
> set_comment='All laterooms tables and sequences'
> 2006-03-31 11:06:47 BST CONFIG main: configuration complete - starting 
> threads
> 2006-03-31 11:06:47 BST DEBUG1 localListenThread: thread starts
> NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=17670
> NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=20424
> NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=20425
> NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=21028
> NOTICE:  Slony-I: cleanup stale sl_nodelock entry for pid=21030
> 2006-03-31 11:06:47 BST CONFIG enableNode: no_id=2
> 2006-03-31 11:06:47 BST DEBUG1 main: running scheduler mainloop
> 2006-03-31 11:06:47 BST DEBUG1 remoteWorkerThread_2: thread starts
> 2006-03-31 11:06:47 BST DEBUG1 cleanupThread: thread starts
> 2006-03-31 11:06:47 BST DEBUG1 remoteListenThread_2: thread starts
> 2006-03-31 11:06:47 BST DEBUG1 syncThread: thread starts
> 2006-03-31 11:06:47 BST DEBUG1 remoteListenThread_2: connected to 
> 'host=194.24.250.143 dbname=laterooms user=XXX port=5432 password=XXXXXX'
>
> There is a slony1_log_2_00000000000000XXXX.sql in /var/log/slony on the 
> master every 10 seconds, as I'd expect, but each and every one of these 
> files is 288 bytes long with a boilerplate:
>
> =================================
> -- Slony-I log shipping archive
> -- Node 2, Event 2435
> start transaction;
>
> ------------------------------------------------------------------
> -- End Of Archive Log
> ------------------------------------------------------------------
> commit;
> vacuum analyze "_replication".sl_setsync_offline;
> =================================
>   
That seems pretty normal.  Node 1 is the master; it gets empty SYNCs
every so often from node 2.
> Finally, there's been an 'idle in transaction' postgres process on it 
> for hours.
>
> There is IP traffic between the two - PG packets containing stuff like
>
> select ev_origin, ev_seqno, ev_timestamp,        ev_minxid, ev_maxxid, 
> ev_xip,        ev_type,        ev_data1, ev_data2,        ev_data3, 
> ev_data4,        ev_data5, ev_data6,        ev_data7, ev_data8 from 
> "_replication".sl_event e where (e.ev_origin = '1' and e.ev_seqno > '23359')
>
>
> The slave is more interesting, however. First, the 'slon' process is 
> consuming 100% CPU time (on a single CPU, of course)... second, there has 
> been an 'idle in transaction' postgres process on it for hours...
>
> in the slave's 'slon log' dir, there's 2GB worth of logfiles...
>
> many "slony1_log_1_000000000000000XXXXX.sql" containing incremental 
> updates to the DB, I can see INSERT/UPDATE/DELETEs in these files and 
> the queries are definitely coming from the master db.
>
> Worrying is the 1GB slony1_log_2_00000000000000000261.sql which is the 
> entirety of the initial dump - should this still be here?
>   
Yeah, that all seems pretty normal.
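
(As an aside: if you intend to feed an offline, log shipped node from
those archives, the usual drill, as best I recall, is simply to apply
them with psql, one file at a time in sequence, starting with that big
initial one - "offlinedb" below is just a placeholder for whatever the
target database would be:

  $ psql -d offlinedb -f slony1_log_2_00000000000000000261.sql
  $ psql -d offlinedb -f slony1_log_1_000000000000000XXXXX.sql
  ... and each subsequent archive, in order ...

If you aren't doing log shipping to a third box at all, you can clean
those files out whenever you like; as far as I know, slon never removes
them itself.)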
> As you can no doubt tell, I'm very green with this, and would just 
> appreciate some reassurance and pointing in the right direction! :(
>   
The part I'm wondering about is this:

I have not seen any new data on the slave since the initial copy.

It sounds as though you are seeing log shipping logs that contain new
data; are you not seeing the corresponding data loaded onto the
subscriber?  That would be quite surprising...
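
A couple of quick checks, assuming 1.1.5 behaves the way I remember:

  -- on the origin (node 1): the sl_status view should show node 2
  -- confirming events; st_lag_num_events staying small is the good sign
  select * from "_replication".sl_status;

  -- or run the same count against one of the replicated tables on both
  -- 194.24.250.137 and 194.24.250.143 ("some_table" is just a placeholder)
  select count(*) from some_table;

If the lag keeps growing, or the counts drift further and further apart,
then something really is stuck, and the slon log on the subscriber side
is the place to look next.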


