[Slony1-general] hanging slony replication

Mon Jun 29 10:18:52 PDT 2009

On Mon, Jun 29, 2009 at 06:30:20PM +0200, Burak Bender wrote:
> Hi Andrew,
> 
> thanks for your replies.
> 
> Unfortunately it seems that the load issue has something to do with
> slony because the queries are simple select statements and the
> database server is quite powerfull. Otherwise there would be the
> same performance issue when querying the slave.

No, that's not true.  The data origin is under read-write load, which
means that there are effects on the origin which are not being seen on
the replicas.  

It's more or less impossible to debug this for you without access to
the machines, and certainly on the basis of the sketchy info you've
provided.  But given that there are differences in performance for the
same queries on two different systems, you have plenty of diagnostic
tools at your disposal:

 - Look at the system tables on both systems.  In recent releases,
 there's all kinds of data about the maintenance of the system as
 well.  You want to look at the information about vacuum performance
 and the data about table organization.  You may also want to see
 whether updates to the tables are being done as efficiently as
 possible.

 - Look to see about cache efficiency.  How well is the cache being
 used.

 - What's the cache set like anyway?  Do you have the right number of
 buffers for the machine?  Are they balanced right?

 - What does iostat and other such system-level monitor tools tell
 you?  Saying "the server is powerful" is not meaningful: I have seen
 1/2 million dollar servers brought to their knees by I/O storms.
 (Sometimes these storms were even known about in advance, but were
 ignored by the manager in charge of acquisition or the system
 administrators who set up the server without understanding its
 purpose).

Contrary to what you seem to believe, SELECT statements do use a
transaction.  In the most recent versions of Postgres, they don't
represent a barrier to vacuum in the way they once did.  I forget
which version you're using, however.  Also, SELECT statements have to
go to the disk if the record they want isn't in cache.  If you're
suffering from cache churn because you're rolling through too much
data at a time, then every record has to be retrieved from disk.  That
can be enough to slow down a system noticably.

A

> The queries seem to hang for some reason, which does not seem to be a load issue on the database server.
> 
> Any idea of other causes?
> 
> Best regards
> Burak
> 
> -------- Original-Nachricht --------
> > Datum: Mon, 29 Jun 2009 09:27:53 -0400
> > Von: Andrew Sullivan <ajs at crankycanuck.ca>
> > An: Burak Bender <Burak.Bender at gmx.net>
> > CC: ajs at crankycanuck.ca, slony1-general at lists.slony.info
> > Betreff: Re: [Slony1-general] hanging slony replication
> 
> > On Mon, Jun 29, 2009 at 10:07:05AM +0200, Burak Bender wrote:
> > 
> > > If the script runs against a slony slave, everything is fine. But
> > > against the master database, the replication as well as other
> > > queries get very slow.
> > 
> > This sounds like a pure load issue -- you just have too many queries
> > on your database.  Slony works by performing queries too, so it will
> > be affected if the origin is overloaded.
> > 
> > A
> > 
> 

-- 
Andrew Sullivan
ajs at crankycanuck.ca