Junaili Lie junaili
Mon Oct 16 11:55:31 PDT 2006
Problem solved.
although there's no old transaction in pg_stat_activity (no <IDLE> in
transaction), I checked the pg_locks and found the transactionid = 746264948
(ev_minxid).
So, I terminated the process associated with that transactionid, and a big
chunk of sl_log_1 entries got deleted.
:D
I am not too sure if this is a bug related to slony or our application.

J



On 10/16/06, Junaili Lie <junaili at gmail.com> wrote:
>
> Hi Christopher
> I didn't see anything from the cleanup thread. most are pretty fast, with
> the exception of vacuuming.
> Number of records on sl_log_1 has been increasing since yesterday.
> on the sl_event, the ev_minxid (for ev_origin=1) has been the same since
> yesterday also.
>
> MONSOON=# select ev_origin, ev_seqno, ev_minxid from
> "_MONSOONCLUSTER".sl_event where (ev_origin, ev_seqno) in   (select
> ev_origin, min(ev_seqno)  from "_MONSOONCLUSTER".sl_event
>     where ev_type = 'SYNC'    group by ev_origin);
>  ev_origin | ev_seqno | ev_minxid
> -----------+----------+-----------
>          1 |    13952 | MONSOON=# select ev_origin, ev_seqno, ev_minxid
> from "_MONSOONCLUSTER".sl_event where (ev_origin, ev_seqno) in   (select
> ev_origin, min(ev_seqno)  from "_MONSOONCLUSTER".sl_event
>     where ev_type = 'SYNC'    group by ev_origin);
>  ev_origin | ev_seqno | ev_minxid
> -----------+----------+-----------
>          1 |    13952 | 746264948
>          2 |     6881 | 82628418
> (2 rows)
>
> How do I know whether the slon processes aren't 'living' long enough to do
> a good clean up?
>
> this is the cleanup thread from
> - Master
> 2006-10-16 08:23:31 PDT DEBUG1 cleanupThread:    0.230 seconds for
> cleanupEvent()
> 2006-10-16 08:23:32 PDT DEBUG1 cleanupThread:    0.543 seconds for delete
> logs
> 2006-10-16 08:34:04 PDT DEBUG1 cleanupThread:    0.237 seconds for
> cleanupEvent()
> 2006-10-16 08:34:05 PDT DEBUG1 cleanupThread:    0.507 seconds for delete
> logs
> 2006-10-16 08:44:22 PDT DEBUG1 cleanupThread:    0.278 seconds for
> cleanupEvent()
> 2006-10-16 08:44:24 PDT DEBUG1 cleanupThread:    1.357 seconds for delete
> logs
> 2006-10-16 08:45:06 PDT DEBUG2 cleanupThread:   42.636 seconds for
> vacuuming
> 2006-10-16 08:55:44 PDT DEBUG1 cleanupThread:    0.127 seconds for
> cleanupEvent()
> 2006-10-16 08:55:45 PDT DEBUG1 cleanupThread:    0.056 seconds for delete
> logs
> 2006-10-16 09:06:02 PDT DEBUG1 cleanupThread:    0.370 seconds for
> cleanupEvent()
> 2006-10-16 09:06:02 PDT DEBUG1 cleanupThread:    0.473 seconds for delete
> logs
> 2006-10-16 09:16:51 PDT DEBUG1 cleanupThread:    0.692 seconds for
> cleanupEvent()
> 2006-10-16 09:16:53 PDT DEBUG1 cleanupThread:    1.158 seconds for delete
> logs
> 2006-10-16 09:17:43 PDT DEBUG2 cleanupThread:   50.542 seconds for
> vacuuming
> 2006-10-16 09:28:02 PDT DEBUG1 cleanupThread:    0.222 seconds for
> cleanupEvent()
> 2006-10-16 09:28:02 PDT DEBUG1 cleanupThread:    0.103 seconds for delete
> logs
> 2006-10-16 09:38:31 PDT DEBUG1 cleanupThread:    0.069 seconds for
> cleanupEvent()
> 2006-10-16 09:38:32 PDT DEBUG1 cleanupThread:    0.060 seconds for delete
> logs
> 2006-10-16 09:49:14 PDT DEBUG1 cleanupThread:    1.182 seconds for
> cleanupEvent()
> 2006-10-16 09:49:16 PDT DEBUG1 cleanupThread:    2.004 seconds for delete
> logs
> 2006-10-16 09:50:05 PDT DEBUG2 cleanupThread:   49.172 seconds for
> vacuuming
> 2006-10-16 10:00:27 PDT DEBUG1 cleanupThread:    0.490 seconds for
> cleanupEvent()
> 2006-10-16 10:00:27 PDT DEBUG1 cleanupThread:    0.107 seconds for delete
> logs
> 2006-10-16 10:11:03 PDT DEBUG1 cleanupThread:    0.453 seconds for
> cleanupEvent()
> 2006-10-16 10:11:04 PDT DEBUG1 cleanupThread:    0.368 seconds for delete
> logs
> 2006-10-16 10:21:33 PDT DEBUG1 cleanupThread:    0.374 seconds for
> cleanupEvent()
> 2006-10-16 10:21:33 PDT DEBUG1 cleanupThread:    0.126 seconds for delete
> logs
> 2006-10-16 10:22:29 PDT DEBUG2 cleanupThread:   55.574 seconds for
> vacuuming
> 2006-10-16 10:33:21 PDT DEBUG1 cleanupThread:    3.556 seconds for
> cleanupEvent()
> 2006-10-16 10:33:21 PDT DEBUG1 cleanupThread:    0.178 seconds for delete
> logs
> 2006-10-16 10:43:52 PDT DEBUG1 cleanupThread:    0.125 seconds for
> cleanupEvent()
> 2006-10-16 10:43:53 PDT DEBUG1 cleanupThread:    0.868 seconds for delete
> logs
> 2006-10-16 10:54:33 PDT DEBUG1 cleanupThread:    0.134 seconds for
> cleanupEvent()
> 2006-10-16 10:54:34 PDT DEBUG1 cleanupThread:    0.650 seconds for delete
> logs
> 2006-10-16 10:55:38 PDT DEBUG2 cleanupThread:   64.382 seconds for
> vacuuming
> 2006-10-16 11:05:56 PDT DEBUG1 cleanupThread:    0.179 seconds for
> cleanupEvent()
> 2006-10-16 11:05:56 PDT DEBUG1 cleanupThread:    0.124 seconds for delete
> logs
>
> - Slave
> 2006-10-16 08:31:16 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 08:31:16 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 08:42:00 PDT DEBUG1 cleanupThread:    0.001 seconds for
> cleanupEvent()
> 2006-10-16 08:42:00 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 08:52:27 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 08:52:27 PDT DEBUG1 cleanupThread:    0.004 seconds for delete
> logs
> 2006-10-16 08:52:47 PDT DEBUG2 cleanupThread:   20.650 seconds for
> vacuuming
> 2006-10-16 09:03:19 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 09:03:19 PDT DEBUG1 cleanupThread:    0.032 seconds for delete
> logs
> 2006-10-16 09:13:44 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 09:13:44 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 09:24:06 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 09:24:06 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 09:24:34 PDT DEBUG2 cleanupThread:   27.288 seconds for
> vacuuming
> 2006-10-16 09:34:53 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 09:34:53 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 09:45:24 PDT DEBUG1 cleanupThread:    0.001 seconds for
> cleanupEvent()
> 2006-10-16 09:45:24 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 09:55:58 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 09:55:58 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 09:56:52 PDT DEBUG2 cleanupThread:   54.306 seconds for
> vacuuming
> 2006-10-16 10:07:20 PDT DEBUG1 cleanupThread:    0.001 seconds for
> cleanupEvent()
> 2006-10-16 10:07:20 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 10:17:46 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 10:17:46 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 10:28:13 PDT DEBUG1 cleanupThread:    0.001 seconds for
> cleanupEvent()
> 2006-10-16 10:28:13 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 10:29:10 PDT DEBUG2 cleanupThread:   57.198 seconds for
> vacuuming
> 2006-10-16 10:39:55 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 10:39:55 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 10:50:39 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 10:50:39 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 11:01:21 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 11:01:21 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
> 2006-10-16 11:02:18 PDT DEBUG2 cleanupThread:   57.428 seconds for
> vacuuming
> 2006-10-16 11:12:45 PDT DEBUG1 cleanupThread:    0.002 seconds for
> cleanupEvent()
> 2006-10-16 11:12:45 PDT DEBUG1 cleanupThread:    0.003 seconds for delete
> logs
>
> Thanks for your help.
>
> j
>
> On 10/16/06, Christopher Browne <cbbrowne at ca.afilias.info> wrote:
> >
> > Junaili Lie wrote:
> > > This is the result from test_slony_state-dbi.pl
> > >
> > > DSN: dbi:Pg:dbname=MONSOON;host=localhost;port=5432
> > >
> > Pulling out some bits...
> > >
> > >
> > --------------------------------------------------------------------------------
> > > Summary of event info
> > >  Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
> > >
> > ================================================================================
> > >
> > >       2      6872      6879     00:00:00     00:14:00    0
> > >       1     13934     13947     00:00:00     00:13:00    0
> > >
> > >
> > >
> > ---------------------------------------------------------------------------------
> > > Summary of sl_confirm aging
> > >    Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of
> >
> > > eldest SYNC
> > >
> > =================================================================================
> > >
> > >         1          2      13934      13946      00:01:00
> > > 00:13:00    0
> > >         2          1       6872       6879      00:00:00
> > > 00:14:00    0
> > >
> > >
> > >
> > --------------------------------------------------------------------------------
> > > Summary of event info
> > >  Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
> > >
> > ================================================================================
> > >
> > >       2      6872      6879     00:00:00     00:14:00    0
> > >       1     13934     13947     00:00:00     00:13:00    0
> > >
> > >
> > >
> > ---------------------------------------------------------------------------------
> > > Summary of sl_confirm aging
> > >    Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of
> >
> > > eldest SYNC
> > >
> > =================================================================================
> > >
> > >         1          2      13934      13946      00:01:00
> > > 00:13:00    0
> > >         2          1       6872       6879      00:00:00
> > > 00:14:00    0
> > >
> >
> > Well, the fact that you don't have any cases of elderly information in
> > sl_confirm or sl_event indicates that there doesn't appear to be any
> > problem with the flow of events, which represents a major class of
> > "kinds of problems."
> >
> > It doesn't look as though anything is broken with your cluster that
> > would prevent cleaning out sl_log_1.
> >
> > The next suggestion:  Grep for cleanup thread entries in the logs for
> > both slons.
> >
> > Here's a sample excerpt for a not-too-busy set of nodes.  Note the
> > longer times at 16:30/16:31; there was evidently some "blip" of a bunch
> > of data to clean out.
> >
> > 2006-10-16 16:09:39 UTC DEBUG1 cleanupThread:     0.046 seconds for
> > delete logs
> > 2006-10-16 16:20:23 UTC DEBUG1 cleanupThread:    0.115 seconds for
> > cleanupEvent()
> > 2006-10-16 16:20:23 UTC DEBUG1 cleanupThread:    0.221 seconds for
> > delete logs
> > 2006-10-16 16:30:43 UTC DEBUG1 cleanupThread:     0.068 seconds for
> > cleanupEvent()
> > 2006-10-16 16:30:46 UTC DEBUG1 cleanupThread:    3.914 seconds for
> > delete logs
> > 2006-10-16 16:31:08 UTC DEBUG2 cleanupThread:   21.869 seconds for
> > vacuuming
> > 2006-10-16 16:41:34 UTC DEBUG1 cleanupThread:     0.081 seconds for
> > cleanupEvent()
> > 2006-10-16 16:41:34 UTC DEBUG1 cleanupThread:    0.066 seconds for
> > delete logs
> > 2006-10-16 16:52:03 UTC DEBUG1 cleanupThread:    0.069 seconds for
> > cleanupEvent()
> > 2006-10-16 16:52:03 UTC DEBUG1 cleanupThread:     0.059 seconds for
> > delete logs
> > 2006-10-16 17:02:36 UTC DEBUG1 cleanupThread:    0.070 seconds for
> > cleanupEvent()
> > 2006-10-16 17:02:36 UTC DEBUG1 cleanupThread:    0.059 seconds for
> > delete logs
> > 2006-10-16 17:02:37 UTC DEBUG2 cleanupThread:     0.841 seconds for
> > vacuuming
> > 2006-10-16 17:13:02 UTC DEBUG1 cleanupThread:    0.070 seconds for
> > cleanupEvent()
> > 2006-10-16 17:13:02 UTC DEBUG1 cleanupThread:    0.123 seconds for
> > delete logs
> > 2006-10-16 17:23:19 UTC DEBUG1 cleanupThread:     0.079 seconds for
> > cleanupEvent()
> > 2006-10-16 17:23:19 UTC DEBUG1 cleanupThread:    0.081 seconds for
> > delete logs
> > 2006-10-16 17:33:37 UTC DEBUG1 cleanupThread:    0.073 seconds for
> > cleanupEvent()
> > 2006-10-16 17:33:37 UTC DEBUG1 cleanupThread:     0.411 seconds for
> > delete logs
> > 2006-10-16 17:33:45 UTC DEBUG2 cleanupThread:    7.732 seconds for
> > vacuuming
> > 2006-10-16 17:44:29 UTC DEBUG1 cleanupThread:    0.116 seconds for
> > cleanupEvent()
> > 2006-10-16 17:44:29 UTC DEBUG1 cleanupThread:     0.058 seconds for
> > delete logs
> > 2006-10-16 17:55:03 UTC DEBUG1 cleanupThread:    0.071 seconds for
> > cleanupEvent()
> > 2006-10-16 17:55:03 UTC DEBUG1 cleanupThread:    0.059 seconds for
> > delete logs
> > 2006-10-16 18:05:40 UTC DEBUG1 cleanupThread:     0.071 seconds for
> > cleanupEvent()
> > 2006-10-16 18:05:40 UTC DEBUG1 cleanupThread:    0.063 seconds for
> > delete logs
> > 2006-10-16 18:05:51 UTC DEBUG2 cleanupThread:   11.087 seconds for
> > vacuuming
> >
> > A problem could arise if your slon processes aren't "living" long enough
> >
> > to do a good cleanup, if you're not actually seeing these sorts of steps
> > complete.  (Which should happen about every 10 minutes.)
> >
> > It could be that you did a whole lot of updates a little while ago, and
> > there's simply some waiting before the log tables get cleaned out.
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://gborg.postgresql.org/pipermail/slony1-general/attachments/20061016/7b7f9cb5/attachment-0001.html 



More information about the Slony1-general mailing list