JP Fletcher jpfletch at ca.afilias.info
Tue Sep 16 12:21:14 PDT 2008
Hi,

I've just compiled v 1.2.15 with pg 8.1.13, and found that the testcase 
'testmultiplemoves' fails about 50% of the time.

The subscription of node2 to set 2 eventually times out when it doesn't 
get a confirmation from node 3.

./run_test.sh: ERROR: Slonik error see 
/tmp/slony-regress.xbeqaimyk/slonik.log for details
pgdba at ydb1.int.libertyrms.com:/opt/rg/data_dba/src/slony1-1.2.15/tests $ 
cat /tmp/slony-regress.xbeqaimyk/slonik.log
<stdin>:25: timeout exceeded while waiting for event confirmation

slonyregress3=# SELECT * from _slony_regress1.sl_event where ev_type <> 
'SYNC' order by ev_seqno desc limit 1;
 ev_origin | ev_seqno |       ev_timestamp        | ev_minxid | 
ev_maxxid | ev_xip |       ev_type       | ev_data1 | ev_data2 | 
ev_data3 | ev_data4 | ev_data5 | ev_data6 | ev_data7 | ev_data8
-----------+----------+---------------------------+-----------+-----------+--------+---------------------+----------+----------+----------+----------+----------+----------+----------+----------
         1 |       26 | 2008-09-16 19:02:24.31597 | 97730     | 
97731     |        | ENABLE_SUBSCRIPTION | 2        | 1        | 
2        | t        |          |          |          |
(1 row)


This is caused by the presence of an '<IDLE IN TRANSACTION>' connection 
on node 3 that is holding an AccessExclusiveLock on sl_config_lock:

slonyregress3=# SELECT * from pg_locks where not granted;
 locktype | database | relation | page | tuple | transactionid | classid 
| objid | objsubid | transaction |  pid   |        mode         | granted
----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+--------+---------------------+---------
 relation |    29184 |    30112 |      |       |               |         
|       |          |       98030 | 979420 | AccessExclusiveLock | f
(1 row)

slonyregress3=# SELECT relname from pg_class where oid = 30112;
    relname    
----------------
 sl_config_lock
(1 row)

slonyregress3=# SELECT * from pg_stat_activity where procpid = (SELECT 
pid from pg_locks where not granted);
 datid |    datname    | procpid | usesysid | usename 
|                                                 
current_query                                                  
|          query_start          |         backend_start         | 
client_addr | client_port
-------+---------------+---------+----------+---------+----------------------------------------------------------------------------------------------------------------+-------------------------------+-------------------------------+-------------+-------------
 29184 | slonyregress3 |  979420 |       10 | pgdba   | begin 
transaction; set transaction isolation level serializable; lock table 
"_slony_regress1".sl_config_lock;  | 2008-09-16 19:02:37.567212+00 | 
2008-09-16 19:01:23.220162+00 | 127.0.0.1   |       57649
(1 row)

slonyregress3=# SELECT * from pg_locks where relation = 30112 and granted; 
locktype | database | relation | page | tuple | transactionid | classid 
| objid | objsubid | transaction |  pid   |        mode         | granted
----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+--------+---------------------+---------
 relation |    29184 |    30112 |      |       |               |         
|       |          |       98029 | 950632 | AccessExclusiveLock | t
(1 row)

slonyregress3=# SELECT * from pg_stat_activity where procpid = 950632;
 datid |    datname    | procpid | usesysid | usename |     
current_query     |          query_start          |         
backend_start         | client_addr | client_port
-------+---------------+---------+----------+---------+-----------------------+-------------------------------+-------------------------------+-------------+-------------
 29184 | slonyregress3 |  950632 |       10 | pgdba   | <IDLE> in 
transaction | 2008-09-16 19:02:37.568543+00 | 2008-09-16 
19:01:23.201131+00 | 127.0.0.1   |       57644
(1 row)

Killing the '<IDLE IN TRANSACTION>' connection makes the problem go 
away.  What's going on here?


-- 
JP Fletcher
Database Administrator
Afilias Canada
voice: 416.646.3304 ext. 4123
fax: 416.646.3305
mobile: 416.561.4763
jpfletch at ca.afilias.info




More information about the Slony1-general mailing list