James Black jfb
Tue Jan 11 19:48:01 PST 2005
Hello, all,

We're having a very similar problem: our replication is happening in a 
timely fashion, but the load on the machine providing the set is 
uncomfortably high, given the traffic.  A 'top -c' reveals the slony 
process more or less pegging one of the CPUs, running the "FETCH 100 
..." query.

We have a lot of rows in sl_log_1, on the order of 400,000.  We don't 
have anything sticking around in sl_confirm, no extra nodes that aren't 
being serviced, and nothing else particularly suspicious.

The cleanupThread vacuums vary in length from 0.3 to 1.2 seconds; 
however, our "delay for first rows" times have grown from 1.7 seconds 
at daemon startup to around 6.1 seconds.  The slon daemons have been 
running for about 20 hours.

I wonder if we're just running into a limitation of the system: that 
our traffic is high enough that the standard 10-minute cleanup latency 
is too long.  Is this a modifiable value?  Is that even a fruitful 
avenue for investigation?  Our slony daemons are set with -g24 and 
-s1000; are those reasonable values?  Is there something I'm missing?
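
For a sense of scale, here is a rough back-of-envelope sketch of what 
those numbers might imply.  It reads -s1000 as a SYNC check interval of 
1000 ms and -g24 as a maximum SYNC group size of 24, and it ASSUMES 
(this is not something we have measured) that the whole sl_log_1 backlog 
built up within a single 10-minute cleanup window.  The function names 
are made up for illustration.

```python
# Back-of-envelope sketch. The constants are the figures quoted in this
# message; the "one cleanup window" model is an assumption, not a measurement.

CLEANUP_INTERVAL_S = 10 * 60   # the 10-minute cleanup latency
SYNC_INTERVAL_MS = 1000        # slon -s1000
SYNC_GROUP_SIZE = 24           # slon -g24
LOG_ROWS = 400_000             # approximate row count in sl_log_1


def implied_row_rate(log_rows: int, window_s: int) -> float:
    """Rows per second, ASSUMING the backlog accumulated in one window."""
    return log_rows / window_s


def max_syncs_per_window(window_s: int, sync_interval_ms: int) -> int:
    """Upper bound on SYNC checks (and so SYNC events) in one window."""
    return window_s * 1000 // sync_interval_ms


def min_groups_per_window(syncs: int, group_size: int) -> int:
    """Fewest SYNC groups needed if each holds at most group_size SYNCs."""
    return -(-syncs // group_size)  # ceiling division


if __name__ == "__main__":
    rate = implied_row_rate(LOG_ROWS, CLEANUP_INTERVAL_S)
    syncs = max_syncs_per_window(CLEANUP_INTERVAL_S, SYNC_INTERVAL_MS)
    groups = min_groups_per_window(syncs, SYNC_GROUP_SIZE)
    print(f"implied rate: {rate:.0f} rows/s")
    print(f"SYNCs per window: at most {syncs}")
    print(f"SYNC groups per window: at least {groups}")
```

If the backlog actually spans more than one cleanup window, the implied 
row rate would be correspondingly lower.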

Sorry for all the beginner's questions,
jfb

-- 
James Felix Black
Programmer, iParadigms LLC
(510) 287-9720 x 250



More information about the Slony1-general mailing list