Jan Wieck jan at wi3ck.info
Thu Aug 4 08:39:29 PDT 2016
Another problem we recently ran into is that after bulk operations, the
queries selecting the log data on the data provider can degrade. In one
case the workaround was to ALTER the slony user and set work_mem to
512MB. The work_mem of 100MB used by the application was causing the
scans on the sl_log table to become sequential scans.
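For reference, a per-role override of that kind looks roughly like this (assuming the replication role is literally named "slony"; adjust the role name and value to your setup):

```sql
-- Raise work_mem only for the Slony replication role, so the planner
-- can cost index/bitmap scans on sl_log_1/sl_log_2 favorably instead
-- of falling back to sequential scans under the 100MB application-wide
-- setting.
ALTER ROLE slony SET work_mem = '512MB';

-- The override applies to sessions started after the change; existing
-- slon connections must reconnect to pick it up.
```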

The problem is amplified if the tables are spread across many sets. Merging
the sets into fewer, larger ones results in fewer scans.
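As a sketch, folding one set into another is done with the slonik MERGE SET command (the cluster name, conninfo, and set/node IDs below are examples; both sets must have the same origin and identical subscriptions before the merge):

```
cluster name = mycluster;                              -- example cluster name
node 1 admin conninfo = 'dbname=mydb host=origin';     -- example conninfo

-- Fold set 2 into set 1; both originate on node 1 in this example.
merge set ( id = 1, add id = 2, origin = 1 );
```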


Regards, Jan


On Thu, Aug 4, 2016 at 7:52 AM, Glyn Astill <glynastill at yahoo.co.uk> wrote:

>
> > From: Tory M Blue <tmblue at gmail.com>
> >To: Jan Wieck <jan at wi3ck.info>
> >Cc: slony <slony1-general at lists.slony.info>
> >Sent: Wednesday, 3 August 2016, 21:45
> >Subject: Re: [Slony1-general] Slony Tuning, heavy slon backup every AM
> >
> >On Wed, Aug 3, 2016 at 12:30 PM, Jan Wieck <jan at wi3ck.info> wrote:
> >
> >
> >>
> >>
> >>
> >>On Wed, Aug 3, 2016 at 11:14 AM, Tory M Blue <tmblue at gmail.com> wrote:
> >>
> >>Hey folks
> >>>
> >>>Running into some more issues, and I'm not finding a definitive tuning
> guide, nor can I tell why I'm backing up and why it takes so long to catch
> up (so long is relative).
> >>>
> >>>
> >>>Running Slony 2.2.3 (but this has been an issue for a long time).
> >>>
> >>>
> >>>Image to show you where I am this AM, but this is a constant every
> AM.  I have large updates, about 2.5 million, between 12am and 4am; the job
> is done, but it will take Slon a couple of hours to catch up.
> >>
> >>
> >>Are there long running transactions that start around the time the
> sl_log starts
> >>growing? Any long running transaction prevents a log switch from
> finishing.
> >>
> >>Jan
> >
> >
> >Hi Jan,
> >
> >
> >There are really not many long queries during the big import, but things
> start getting longer when the slon logs have 10m+ rows in them.
> >
> >
> >I actually log any transaction that takes over a second, and nothing
> stands out; nothing runs all that long. During this
> period I move my reporting to point to the master because of the delay on
> the standby, so there are some longer report queries, but the longest is 16
> minutes.
> >
>
>
> Do you see any long running copy statements from the slon on the
> subscriber?  If so one avenue to investigate is if there are any replicated
> tables with sub-optimal indexes on them.
>
> I mention it purely because I've been bitten by this myself and seen very
> similar symptoms to what you're reporting here.  There's every chance this
> is way off the mark, but I think it's worth mentioning.
>
> The issue I saw was down to a bunch of pretty ugly indexes created by an
> application; the result was inaccurate selectivity estimates causing
> the planner to choose one of the ugly indexes instead of the primary key
> when applying the changes on the subscriber.  Multivariate planner
> statistics, which PostgreSQL doesn't yet have, might have mitigated this, but
> that's beside the point.
>
>
> For example, I'd see something like 200,000 records getting updated by a
> single update statement on the origin taking about 20 seconds, but when
> applied as 200,000 separate update statements on the subscriber it took
> more like 200 minutes.  An update using an index scan on the ugly index
> would take 62ms vs 0.1ms on the primary key, but outwardly all I saw until
> I enabled auto_explain was a long running copy.
>
> Glyn
>



-- 
Jan Wieck
Senior Postgres Architect
http://pgblog.wi3ck.info

