Richard Yen dba at richyen.com
Tue Dec 4 14:06:20 PST 2007
On Dec 4, 2007, at 1:39 PM, Christopher Browne wrote:

> Richard Yen <dba at richyen.com> writes:
>> Not sure if this is a slony issue or a postgres issue...I'm posting  
>> on
>> both.
>>
>> I'm running slony on a one master/two subscriber system. One of the
>> subscribers seems to get stuck on a group of queries, and I can't  
>> seem
>> to figure out why.
>>
>> If I do a select on pg_stat_activity, I get the following:
>>
>> datid | datname | procpid | usesysid | usename | current_query |
>> waiting | query_start | backend_start | client_addr | client_port
>> -------+---------+---------+----------+----------
>> +
>> --------------------------------------------------------------------------------------------------------------+
>> ---------+-------------------------------
>> +-------------------------------+--------------+-------------
>> 16384 | tii | 12204 | 16392 | slony | update only
>> "public"."m_report_stats" set date_start='2007-12-03 13:27:05.661155'
>> where objectid='56917411'; | f | 2007-12-04 11:20:23.839088-08 |
>> 2007-12-04 11:20:23.005228-08 | | -1
>> : update only "public"."m_object_paper" set overwriteflag='t' where
>> id='56069688';
>> : insert into "public"."m_search_list" (nodeid,id) values
>> ('0','45844662');
>> : insert into "public"."m_search_list" (nodeid,id) values
>> ('1','45844662');
>> : insert into "public"."m_search_list" (nodeid,id) values
>> ('4','45844662');
>> : update only "public"."m_dg_read" set delete_flag='t' where
>> id='1474821';
>> : insert into "public"."m_search_list" (nodeid,id) values
>> ('5','45844662');
>> : insert into "public"."m_search_list" (nodeid,id) values
>> ('14','45844662');
>> : update only "public"."m_user" set duration='02:52:24.744252' where
>> id='10369924';
>> : insert into "public"."m_search_list" (nodeid,id) values
>> ('32','45844662');
>> :
>
> 1.  Are you certain that it is in fact the same query each time?
>
>    It could be that it's working on a rather large SYNC (or set of
>    grouped SYNCs), and you're seeing a set of queries that *look*
>    similar, but it's just *looking* like it's not progressing.
>
Yes, I'm certain it's the same query every time.  When I first  
noticed, the query_start in pg_stat_activity was approx. 4 hours  
before.  The current attempt is about 3 hours old.

>  I've had DBAs come to me 'asking why slon had stopped,' and what
>    we discovered is that there was a frequently repeating pattern of
>    queries so that it *looked* like it wasn't progressing, but
>    actually was.
>
> 2.  If it is stuck, could it be because of an ungranted lock?
>
No, it's not from an ungranted lock.  The query below:
tii=# select * from pg_locks where not granted;
  locktype | database | relation | page | tuple | transactionid |  
classid | objid | objsubid | transaction | pid | mode | granted
----------+----------+----------+------+-------+--------------- 
+---------+-------+----------+-------------+-----+------+---------
(0 rows)

tii=#


>    I'd look for any ungranted locks requested by process 16392, and
>    see if I can correlate this with something else that has already
>    grabbed some object...

You mean process 12204?  Just making sure...

Also, this process is taking up 100% of CPU, if that gives any clue.   
The frustrating part is that it takes up 100% CPU, but there's no  
activity in the postgres log or under strace... :/

Thanks!
--Richard


More information about the Slony1-general mailing list