Scott G. Miller sgmiller at gmail.com
Thu Jan 8 09:07:31 PST 2009
On Thu, Jan 8, 2009 at 9:38 AM, cbbrowne <cbbrowne at ca.afilias.info> wrote:

> Scott G. Miller wrote:
>
>  On Thu, Jan 8, 2009 at 8:19 AM, Scott G. Miller <sgmiller at gmail.com> wrote:
>>
>>    After enabling replication for a large database, I'm consistently
>>    seeing a state where logs will not switch.  sl_log_1 contains some
>>    decent number of rows (a few thousand), and sl_log_2 is growing.
>>  Executing _replication.logswitch_start() and finish() has no
>>    effect, as both report switch still in progress.  This continues
>>    for days, until eventually there are millions of rows in sl_log_2,
>>    and sync times take 15s or more just to fetch rows to apply to the
>>    slave.
>>
>>    Thoughts?
>>
>>
>> Oh, sorry, this is Slony-I 2.0 running on Postgres 8.3.3, simple
>> master/slave configuration with one database and 5 sets.
>>
> What does test_slony_state.pl (or the DBI version,
> test_slony_state-dbi.pl) report?
>
> The truncate won't happen if events aren't propagating properly everywhere;
> the output of that script may give clues as to what's wrong.
>
> It is a good practice to run that script regularly, as it will notice and
> report on a number of sorts of problems that have caused people to report
> errors, sometimes when they had some configuration problem...
>

After much fiddling to get that script working (perl-pg is not readily
available for RedHat, the script has bugs under perl 5.8, etc.), I get no
errors, only warnings that the log sizes are large and that a node might be
down.  Both nodes are up, and the number of lag events ranges between 0 and
about 12 over about 10 seconds.  Everything else appears to be fine:

--snip--
DSN: dbname=item host=localhost user=slony password=xxxxxxxx
===========================
Rummage for DSNs
=============================
Query:

   select p.pa_server, p.pa_conninfo

   from _replication.sl_path p
--   where exists (select * from _replication.sl_subscribe s where
--                          (s.sub_provider = p.pa_server or s.sub_receiver = p.pa_server) and
--                          sub_active = 't')
   group by pa_server, pa_conninfo;


Tests for node 1 - DSN = host=pgmaster dbname=item user=slony port=5432 password=xxxxxxxx
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
       sl_log_1     12982 637024.000000
       sl_log_2       587 29320.000000
      sl_seqlog         1 27.000000
Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
 Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
================================================================================


---------------------------------------------------------------------------------
Summary of sl_confirm aging
   Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================


------------------------------------------------------------------------------

Listing of old open connections on node 1
       Database             PID            User    Query Age    Query
================================================================================

Tests for node 2 - DSN = host=pgslave dbname=item user=slony port=5432 password=xxxxxxxx
========================================
pg_listener info:
Pages: 0
Tuples: 0

Size Tests
================================================
       sl_log_1     12982 637024.000000
       sl_log_2       587 29320.000000
      sl_seqlog         1 27.000000

Listen Path Analysis
===================================================
No problems found with sl_listen

--------------------------------------------------------------------------------
Summary of event info
 Origin  Min SYNC  Max SYNC Min SYNC Age Max SYNC Age
================================================================================


---------------------------------------------------------------------------------
Summary of sl_confirm aging
   Origin   Receiver   Min SYNC   Max SYNC  Age of latest SYNC  Age of eldest SYNC
=================================================================================

------------------------------------------------------------------------------

Listing of old open connections on node 2
       Database             PID            User    Query Age    Query
================================================================================



Sending message thus - | -s "Slony State Test Warning - Cluster _replication"
Message:


Node: 1 sl_log_1 tuples = 637024 > 200000
================================================
Number of tuples in Slony-I table sl_log_1 is 637024 which
exceeds 200000.

You may wish to investigate whether or not a node is down, or perhaps
if sl_confirm entries have not been propagating properly.


Node: 2 sl_log_1 tuples = 637024 > 200000
================================================
Number of tuples in Slony-I table sl_log_1 is 637024 which
exceeds 200000.

You may wish to investigate whether or not a node is down, or perhaps
if sl_confirm entries have not been propagating properly.
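
For anyone who wants to script the same size check against a saved copy of
the report, here is a minimal Python sketch.  It assumes the three columns
under "Size Tests" are table name, pages, and tuples (the warning text
suggests the last column is the tuple count), and it reuses the 200000
threshold from the warning.  The names oversized_tables and TUPLE_THRESHOLD
are hypothetical and are not part of test_slony_state.pl.

```python
import re

# Threshold taken from the warning text in the report ("> 200000").
TUPLE_THRESHOLD = 200000

# Assumed layout of a "Size Tests" row: table name, pages, tuples.
ROW = re.compile(r"^\s*(sl_\w+)\s+(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def oversized_tables(report_text, threshold=TUPLE_THRESHOLD):
    """Return (table, tuples) pairs whose tuple count exceeds threshold."""
    flagged = []
    for line in report_text.splitlines():
        m = ROW.match(line)
        if m:
            # The report prints tuples as a float such as 637024.000000.
            tuples = int(float(m.group(3)))
            if tuples > threshold:
                flagged.append((m.group(1), tuples))
    return flagged


# Rows copied from the "Size Tests" section of the report.
sample = """
       sl_log_1     12982 637024.000000
       sl_log_2       587 29320.000000
      sl_seqlog         1 27.000000
"""

print(oversized_tables(sample))  # [('sl_log_1', 637024)]
```

With the rows from this report, only sl_log_1 is flagged, which matches the
warning the script mailed out for both nodes.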