Craig A. James cjames at emolecules.com
Mon Feb 25 06:59:29 PST 2008
I sent the attached email ony to Jan by mistake, here it is, I hope Jan or somebody has an idea what's going on.

A quick summary:
  1. Shut of Apache
  2. Restart Postgres and Slony
  3. lock set (id = 1, origin = 1);
     wait for event (origin = 1, confirmed = 2);
     move set (id = 1, old origin = 1, new origin = 2);

When I do this, Slony starts a SELECT that never finishes ("never" means I waited 10 minutes, 35 minutes, and 10 minutes in three different tests, with no sign of progress).

Per Jan's request, I've attached the output of two SQL queries:

    select relname, granted, pid, mode from pg_locks L, pg_class C
        where C.oid = L.relation and locktype = 'relation';

    select * from pg_stat_activity;

In addition, I've attached the output of vmstat, a few snapshots taken over a 30-minute peroid.  It's looks to me like postgres is doing some sort of huge join or sort, based on the changing amounts of CPU and I/O during the 30 minutes I monitored it.  No other processes were active; this was entirely due to the backend Postgres process being used by Slony.

>From these, you can see that a Slon process is doing a "SELECT moveSet(...)".  ps(1) confirms that this process is the one using all the CPU time shown by vmstat.

I sure could use some help on this one, it's become a critical roadblock to our operation.

Thanks,
Craig

------------------------------------

Jan,

Thanks for all your help with this, I really appreciate it.  Your additional instructions regarding pg_stat_activity made more sense. I also had to enable the stats_command_string in the postgresql.conf file.

I'm also appending some output from vmstat to show that the system was busy doing something -- I let it run for about ten minutes, and the vmstat results are for various snapshots during that 10 minutes.

Craig



emol_warehouse_1=# select datid, datname, procpid, usename, current_query, query_start, backend_start from pg_stat_activity;
 datid   |     datname      | procpid | usename  |                   current_query                    |          query_start          |         backend_start         
----------+------------------+---------+----------+----------------------------------------------------+-------------------------------+-------------------------------
  164853 | emol_warehouse_1 |   24747 | postgres | <IDLE>                                             | 2008-02-23 18:44:09.995787-08 | 2008-02-23 18:41:55.334255-08
   18368 | global           |   24741 | postgres | <IDLE>                                             | 2008-02-23 18:47:30.099887-08 | 2008-02-23 18:41:38.458081-08
20209283 | accounting       |   24740 | postgres | <IDLE>                                             | 2008-02-23 18:47:29.540243-08 | 2008-02-23 18:41:38.421729-08
  164853 | emol_warehouse_1 |   24739 | postgres | <IDLE>                                             | 2008-02-23 18:47:29.195255-08 | 2008-02-23 18:41:38.386496-08
  164853 | emol_warehouse_1 |   24738 | postgres | select "_emol_warehouse_1_cluster".moveSet(1, 2);  | 2008-02-23 18:41:35.215066-08 | 2008-02-23 18:41:35.154836-08
   18368 | global           |   24731 | postgres | <IDLE>                                             | 2008-02-23 18:41:26.398179-08 | 2008-02-23 18:41:26.394739-08
   18368 | global           |   24729 | postgres | <IDLE>                                             | 2008-02-23 18:47:36.747641-08 | 2008-02-23 18:41:26.394256-08
   18368 | global           |   24730 | postgres | <IDLE>                                             | 2008-02-23 18:47:35.972451-08 | 2008-02-23 18:41:26.392295-08
20209283 | accounting       |   24724 | postgres | <IDLE>                                             | 2008-02-23 18:41:26.388876-08 | 2008-02-23 18:41:26.385995-08
20209283 | accounting       |   24723 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.797022-08 | 2008-02-23 18:41:26.380974-08
20209283 | accounting       |   24722 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.420454-08 | 2008-02-23 18:41:26.378364-08
  164853 | emol_warehouse_1 |   24717 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.080398-08 | 2008-02-23 18:41:26.356934-08
  164853 | emol_warehouse_1 |   24716 | postgres | <IDLE>                                             | 2008-02-23 18:41:26.360746-08 | 2008-02-23 18:41:26.358003-08
  164853 | emol_warehouse_1 |   24715 | postgres | <IDLE>                                             | 2008-02-23 18:47:36.216511-08 | 2008-02-23 18:41:26.352855-08
20209283 | accounting       |   24710 | postgres | <IDLE>                                             | 2008-02-23 18:47:34.69257-08  | 2008-02-23 18:41:26.319118-08
  164853 | emol_warehouse_1 |   24706 | postgres | <IDLE>                                             | 2008-02-23 18:47:34.692909-08 | 2008-02-23 18:41:26.301898-08
   18368 | global           |   24709 | postgres | <IDLE>                                             | 2008-02-23 18:47:34.728589-08 | 2008-02-23 18:41:26.316133-08
20209283 | accounting       |   24695 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.6937-08   | 2008-02-23 18:41:26.24967-08
  164853 | emol_warehouse_1 |   24694 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.38218-08  | 2008-02-23 18:41:26.24526-08
   18368 | global           |   24693 | postgres | <IDLE>                                             | 2008-02-23 18:47:32.19262-08  | 2008-02-23 18:41:26.241983-08
(20 rows)


emol_warehouse_1=# select relname, granted, pid, mode from pg_locks L, pg_class C
emol_warehouse_1-#         where C.oid = L.relation and locktype = 'relation'; 
                    relname                      | granted |  pid  |        mode         
--------------------------------------------------+---------+-------+---------------------
pg_index                                         | t       | 24738 | AccessShareLock
pg_index                                         | t       | 24738 | RowShareLock
PartInd_emol_warehouse_1_cluster_sl_log_2-node-1 | t       | 24738 | AccessExclusiveLock
db_tables                                        | t       | 24738 | AccessExclusiveLock
db_columns                                       | t       | 24738 | AccessExclusiveLock
sl_table                                         | t       | 24738 | AccessShareLock
sl_table                                         | t       | 24738 | RowShareLock
sl_table                                         | t       | 24738 | RowExclusiveLock
pg_locks                                         | t       | 24747 | AccessShareLock
sl_set                                           | t       | 24738 | AccessShareLock
sl_set                                           | t       | 24738 | RowShareLock
sl_set                                           | t       | 24738 | RowExclusiveLock
logp                                             | t       | 24738 | AccessExclusiveLock
catalogue_issue                                  | t       | 24738 | AccessExclusiveLock
supplier_priority                                | t       | 24738 | AccessExclusiveLock
sl_setsync                                       | t       | 24738 | RowExclusiveLock
pg_indexes                                       | t       | 24738 | AccessShareLock
sl_node                                          | t       | 24738 | AccessShareLock
sl_node                                          | t       | 24738 | RowShareLock
supplier_detail                                  | t       | 24738 | AccessExclusiveLock
chmoogle_thesaurus                               | t       | 24738 | AccessExclusiveLock
parent_properties                                | t       | 24738 | AccessExclusiveLock
db_names                                         | t       | 24738 | AccessExclusiveLock
sl_listen                                        | t       | 24738 | AccessShareLock
sl_listen                                        | t       | 24738 | RowExclusiveLock
pg_class                                         | t       | 24738 | AccessShareLock
pg_class                                         | t       | 24738 | RowShareLock
version_properties                               | t       | 24738 | AccessExclusiveLock
parent                                           | t       | 24738 | AccessExclusiveLock
str_conntab                                      | t       | 24738 | AccessExclusiveLock
sl_log_status                                    | t       | 24738 | AccessShareLock
pg_class                                         | t       | 24747 | AccessShareLock
pg_tablespace                                    | t       | 24738 | AccessShareLock
sl_path                                          | t       | 24738 | AccessShareLock
sl_path                                          | t       | 24738 | RowShareLock
chmoogle_thesaurus_types                         | t       | 24738 | AccessExclusiveLock
catalogue_info                                   | t       | 24738 | AccessExclusiveLock
chmoogle_molkeys                                 | t       | 24738 | AccessExclusiveLock
pg_class_oid_index                               | t       | 24747 | AccessShareLock
version                                          | t       | 24738 | AccessExclusiveLock
sl_event                                         | t       | 24738 | AccessShareLock
sl_log_2                                         | t       | 24738 | AccessShareLock
sl_log_2                                         | t       | 24738 | ShareLock
sl_log_2                                         | t       | 24738 | AccessExclusiveLock
chmoogle_statistics                              | t       | 24738 | AccessExclusiveLock
sl_subscribe                                     | t       | 24738 | AccessShareLock
sl_subscribe                                     | t       | 24738 | RowExclusiveLock
chmoogle_cansmiles                               | t       | 24738 | AccessExclusiveLock
pg_namespace                                     | t       | 24738 | AccessShareLock
pg_namespace                                     | t       | 24738 | RowShareLock
sample                                           | t       | 24738 | AccessExclusiveLock
sl_config_lock                                   | t       | 24738 | AccessExclusiveLock
(52 rows)



$ vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
1  0 138112 138244  20156 3577564    0    1     0     0    0     0  5  1 92  2
1  0 138112 139988  20084 3575816    0    0  1968    12  413   179 24  1 75  0
1  0 138112 140964  19984 3575136    0    0  2232    20  440   362 24  1 75  0
1  0 138112 141756  19888 3574972    0    0  2340    75  527   497 25  0 75  0
1  0 138112 138216  19888 3578612    0    0  1774     4  420   174 24  0 75  0
1  0 138112 141464  19808 3575572    0    0  1436    42  397   321 24  0 75  0
0  1 138112 137388  19808 3579732    0    0  2078     8  455   277 21  1 74  5
1  0 138112 140700  19808 3576612    0    0  1312    20  392   377 20  1 74  5
1  0 138112 137048  19808 3580512    0    0  1980    57  392   539 23  0 72  5
[snip]
2  0 138112 141432  19248 3578472    0    0     0    61  322   533 25  1 74  0
1  0 138112 141448  19248 3578472    0    0     0    27  274   181 25  0 75  0
1  0 138112 141448  19248 3578472    0    0     0    25  259   111 25  0 75  0
1  0 138112 141456  19248 3578472    0    0     0     8  263   290 25  0 75  0
1  0 138112 141456  19248 3578472    0    0     0    20  282   304 25  0 75  0
1  0 138112 141580  19256 3578464    0    0     0    64  318   443 25  0 75  0
1  0 138112 141580  19256 3578464    0    0     0     4  268   296 25  0 75  0
1  0 138112 141584  19256 3578464    0    0     0    15  259   174 25  0 75  0
1  0 138112 141584  19256 3578464    0    0     0     8  263   141 25  0 75  0
1  0 138112 141600  19256 3578464    0    0     0    20  284   317 25  0 75  0
1  0 138112 141600  19256 3578464    0    0     0    85  321   441 25  0 75  0
1  0 138112 141600  19256 3578464    0    0     0    10  269   143 25  0 75  0
1  0 138112 141600  19256 3578464    0    0     0    27  259   220 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0     8  261   185 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0    20  285   210 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0    59  318   440 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0     4  267   141 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0    31  260   103 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0     8  263   237 25  0 75  0
1  0 138112 141476  19256 3578464    0    0     0    33  285   282 25  0 75  0
[snip]
1  0 138112 140856  19304 3578416    0    0     0    27  259   250 25  0 75  0
1  0 138112 140856  19304 3578416    0    0     0     8  262   200 25  0 75  0
1  0 138112 140856  19304 3578416    0    0     0    20  283   236 25  0 75  0
1  0 138112 140856  19304 3578416    0    0     0    63  318   508 25  0 75  0
1  0 138112 140856  19304 3578416    0    0     0    10  265   149 25  0 75  0
1  0 138112 140732  19304 3578416    0    0     0    25  259   108 25  0 75  0
1  0 138112 140732  19304 3578416    0    0     0     8  262   269 25  0 75  0
0  1 138112 137440  18736 3582364    0    0  4184    26  575   815 17  1 75  7
0  2 138112 138184  17360 3584000    0    0  5216    89  755   969  0  1 70 28
0  1 138112 139956  16176 3584144    0    0  4564    12  775   902  1  1 73 25
1  0 138112 138400  15164 3586976    0    0  6368   115  764   642  1  1 75 23
0  1 138112 141136  14620 3585180    0    0  6936     8  731   331  1  1 74 24
0  2 138112 140068  14564 3587056    0    0  6168    32  804   448  1  1 74 24
1  2 138112 139960  14240 3587640    0    0  5364    61  770   741  1  1 73 25
0  1 138112 136492  14120 3591660    0    0  7036    20  872  1511  1  1 73 24
0  1 138112 140772  13412 3588728    0    0  8544    24 1043  1951  1  2 75 23
0  1 138112 136460  13104 3593716    0    0 10052     8  933  1615  1  2 73 24
0  1 138112 139356  12840 3591380    0    0  9156    16  891  1511  1  1 72 26
0  2 138112 137084  12704 3594116    0    0  8928    47  929  1718  1  2 72 26
0  1 138112 140276  12688 3591272    0    0  8836    20  965  1757  1  1 72 26
0  1 138112 138052  12688 3593092    0    0  8800    19  999  1679  1  1 75 23
[snip]
0  2 138112 139288  12056 3595024    0    0  7804    32  722  1207  1  1 73 25
0  1 138112 137544  11840 3597060    0    0  7804    55  662  1096  1  1 73 25
0  1 138112 138468  11784 3596596    0    0  6972    32  625  1031  1  1 72 26
0  1 138112 137520  11476 3597944    0    0  8564    28  767  1291  1  1 75 23
0  1 138112 141640  11120 3594920    0    0  9220    14  655   933  1  1 74 24
0  1 138112 138464  11092 3598328    0    0 29424    12 1340  1471  2  4 75 19
0  2 138112 139628  10572 3597808    0    0 23832    49 1540  2858  2  3 73 21
0  1 138112 138560  10436 3599244    0    0 13844    32 1172  2170  1  2 71 25
0  1 138112 139348  10428 3598732    0    0 12774    35 1010  1781  2  2 75 22
0  1 138112 137676  10276 3600444    0    0 13680    68 1089  2035  1  2 74 23
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
0  2 138112 141024   9952 3597908    0    0 13844    12 1371  2562  1  3 75 21
0  1 138112 139704   9884 3599276    0    0 13460    55 1099  2014  1  2 71 26
1  1 138112 139540   9824 3599872    0    0 13596    32 1185  2430  1  3 73 23
0  1 138112 138348   9696 3601040    0    0 11212   118 1139  2031  1  2 75 22
0  1 138112 137564   9572 3602464    0    0 25148    69  595   949  2  2 73 23
1  1 138112 139960   9496 3600460    0    0 15644    12  660  1117  1  2 73 24
1  1 138112 140912   9460 3599716    0    0  7292   156 1017  1782  1  2 73 24
0  1 138112 139664   9380 3601356    0    0 23516    28 1518  2840  3  3 72 22
0  1 138112 139108   8708 3603068    0    0 40538    16 1205  2320  5  4 75 17
0  1 138112 141668   7420 3602016    0    0 30164    23 1312  2418  3  4 74 19
0  1 138112 138516   6212 3606604    0    0 20012    18 1075  1855  2  3 74 20
0  2 138112 140756   6192 3604804    0    0  4176    49  501   704  0  0 72 27
0  1 138112 140940   6164 3604572    0    0  5300    32  613  1006  1  1 72 26
0  1 138112 141064   6056 3604940    0    0  5136     0  629   913  0  1 75 24
0  1 138112 141972   6044 3604172    0    0  4788    49  705  1216  1  1 73 25
0  1 138112 138004   5780 3608076    0    0  6972    12  633   986  0  1 74 24
0  2 138112 141496   5512 3604964    0    0  6360    39  786   414  1  1 72 25
0  1 138112 141016   5468 3605788    0    0  8316    32  874   878  1  1 71 26
0  1 138112 141832   5332 3605664    0    0 37456     0  553  1051  4  2 75 19
0  1 138112 140940   5332 3606452    0    0 19016    41  985  1674  2  3 74 21
0  1 138112 141720   5332 3605672    0    0 15964    12  619   740  2  2 73 24
0  2 138112 137296   5292 3610132    0    0 29388    53 1069   877  3  3 74 19
0  1 138112 138600   4556 3617372    0    0 29104    32 1214  1237  3  5 70 22



More information about the Slony1-general mailing list