[Slony1-general] Slony on High Load

Tue Apr 2 06:42:44 PDT 2013

On 13-04-02 04:37 AM, Pruteanu Dragos wrote:
> Hi Slony admins,
> I have a problem with a Slony setup on a heavily loaded primary
> database.
> When I try to build the Slony replication, I get an error from time to
> time.
> It may be related to the high load we have. I hope you can help.
>
>
> PGRES_FATAL_ERROR ERROR: stack depth limit exceeded
> HINT:  Increase the configuration parameter "max_stack_depth", after
> ensuring the platform's stack depth limit is adequate.
> The line before this error message in the slony logs has ~11MB worth of
> text consisting mainly of a long concatenation of:
> ... and log_actionseq <> '...'
> This data is also present in the sl_setsync table.
> The problem happens immediately after the slave finishes syncing the
> set, enables the subscription and tries to do the first sync.
> I found a thread about it here:
> http://old.nabble.com/Slave-can%27t-catch-up,-postgres-error-%27stack-depth-limit-exceeded%27-td33182661.html
> We're running on postgres 9.0.10 and slony1 version 2.0.7, and upgrading
> is not an option in the near future (eventually we will upgrade both
> postgres and slony).
> The problem is that we now hit this issue more and more regularly - and
> it is a killer for the slony replication, as it is not possible to
> reliably set it up...
> What I already tried that didn't help:
>   * set max_stack_depth up to ridiculous amounts (10s of GB) - not sure
> if I got the OS side of it right, but I did my best;
>   * decrease the slon daemon's SYNC_CHECK_INTERVAL to 1 second;
> With both of those I still get the error regularly...
> I wonder if this is fixed in newer slony releases, or if there's any
> chance I can get some help/directions on how to fix/patch it in the
> version we use to avoid this problem ?
> Jan Wieck mentions in the thread cited above that a solution would
> be:
> <quote>
> The improvement for a future release would be to have the remote worker
> get the log_actionseq list at the beginning of copy_set. If that list is
> longer than a configurable maximum, it would abort the subscribe and
> retry in a few seconds. It may take a couple of retries, but it should
> eventually hit a moment where a SYNC event was created recently enough
> so that there are only a few hundred log rows to ignore.
> </quote>
> Was this already implemented in a newer release ?
> If not I would like to work on it, including back-patch for the 2.0.7
> version we use...
> I would appreciate any help/hints on how to approach this !
> Cheers,

See bug 264 http://www.slony.info/bugzilla/show_bug.cgi?id=264 and the 
patches referenced.
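
For what it's worth, the retry idea Jan describes in the quote above can
be sketched roughly as below. This is only an illustration in Python, not
the code from the patches on that bug and not how slon is written (slon
is C). The names subscribe_with_retry, fetch_actionseqs, copy_set,
max_actionseqs and retry_delay are all hypothetical, and the default of
500 is just a stand-in for "a few hundred log rows".

```python
import time
from typing import Callable, List


def subscribe_with_retry(
    fetch_actionseqs: Callable[[], List[int]],  # hypothetical: returns the log_actionseq values to ignore
    copy_set: Callable[[List[int]], None],      # hypothetical: performs the actual set copy
    max_actionseqs: int = 500,                  # assumed "configurable maximum" from the quote
    retry_delay: float = 5.0,                   # assumed "retry in a few seconds"
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Start the copy only once the list of log rows to ignore is short.

    Returns the 1-based attempt number on which the copy was started.
    Raises RuntimeError if the list never gets short enough.
    """
    for attempt in range(1, max_attempts + 1):
        # Fetch the list up front, before any copying work is done.
        seqs = fetch_actionseqs()
        if len(seqs) <= max_actionseqs:
            copy_set(seqs)
            return attempt
        # List too long: give up on this attempt and wait for a fresher SYNC.
        if attempt < max_attempts:
            sleep(retry_delay)
    raise RuntimeError(
        f"log_actionseq list still longer than {max_actionseqs} "
        f"after {max_attempts} attempts"
    )
```

The point of checking the list length first is that an aborted attempt
costs almost nothing, whereas discovering the oversized list after the
copy has finished (as in your case) means the first SYNC query is already
too large to parse.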
