Jan Wieck JanWieck at Yahoo.com
Thu Feb 28 05:33:47 PST 2008
On 2/25/2008 9:59 AM, Craig A. James wrote:
> I sent the attached email ony to Jan by mistake, here it is, I hope Jan or somebody has an idea what's going on.

What I still don't understand is what that stored procedure moveSet() is 
waiting for. Are there any pg_locks entries with granted='f' at all?


Jan


> 
> A quick summary:
>   1. Shut of Apache
>   2. Restart Postgres and Slony
>   3. lock set (id = 1, origin = 1);
>      wait for event (origin = 1, confirmed = 2);
>      move set (id = 1, old origin = 1, new origin = 2);
> 
> When I do this, Slony starts a SELECT that never finishes ("never" means I waited 10 minutes, 35 minutes, and 10 minutes in three different tests, with no sign of progress).
> 
> Per Jan's request, I've attached the output of two SQL queries:
> 
>     select relname, granted, pid, mode from pg_locks L, pg_class C
>         where C.oid = L.relation and locktype = 'relation';
> 
>     select * from pg_stat_activity;
> 
> In addition, I've attached the output of vmstat, a few snapshots taken over a 30-minute peroid.  It's looks to me like postgres is doing some sort of huge join or sort, based on the changing amounts of CPU and I/O during the 30 minutes I monitored it.  No other processes were active; this was entirely due to the backend Postgres process being used by Slony.
> 
>>From these, you can see that a Slon process is doing a "SELECT moveSet(...)".  ps(1) confirms that this process is the one using all the CPU time shown by vmstat.
> 
> I sure could use some help on this one, it's become a critical roadblock to our operation.
> 
> Thanks,
> Craig
> 
> ------------------------------------
> 
> Jan,
> 
> Thanks for all your help with this, I really appreciate it.  Your additional instructions regarding pg_stat_activity made more sense. I also had to enable the stats_command_string in the postgresql.conf file.
> 
> I'm also appending some output from vmstat to show that the system was busy doing something -- I let it run for about ten minutes, and the vmstat results are for various snapshots during that 10 minutes.
> 
> Craig
> 
> 
> 
> emol_warehouse_1=# select datid, datname, procpid, usename, current_query, query_start, backend_start from pg_stat_activity;
>  datid   |     datname      | procpid | usename  |                   current_query                    |          query_start          |         backend_start         
> ----------+------------------+---------+----------+----------------------------------------------------+-------------------------------+-------------------------------
>   164853 | emol_warehouse_1 |   24747 | postgres | <IDLE>                                             | 2008-02-23 18:44:09.995787-08 | 2008-02-23 18:41:55.334255-08
>    18368 | global           |   24741 | postgres | <IDLE>                                             | 2008-02-23 18:47:30.099887-08 | 2008-02-23 18:41:38.458081-08
> 20209283 | accounting       |   24740 | postgres | <IDLE>                                             | 2008-02-23 18:47:29.540243-08 | 2008-02-23 18:41:38.421729-08
>   164853 | emol_warehouse_1 |   24739 | postgres | <IDLE>                                             | 2008-02-23 18:47:29.195255-08 | 2008-02-23 18:41:38.386496-08
>   164853 | emol_warehouse_1 |   24738 | postgres | select "_emol_warehouse_1_cluster".moveSet(1, 2);  | 2008-02-23 18:41:35.215066-08 | 2008-02-23 18:41:35.154836-08
>    18368 | global           |   24731 | postgres | <IDLE>                                             | 2008-02-23 18:41:26.398179-08 | 2008-02-23 18:41:26.394739-08
>    18368 | global           |   24729 | postgres | <IDLE>                                             | 2008-02-23 18:47:36.747641-08 | 2008-02-23 18:41:26.394256-08
>    18368 | global           |   24730 | postgres | <IDLE>                                             | 2008-02-23 18:47:35.972451-08 | 2008-02-23 18:41:26.392295-08
> 20209283 | accounting       |   24724 | postgres | <IDLE>                                             | 2008-02-23 18:41:26.388876-08 | 2008-02-23 18:41:26.385995-08
> 20209283 | accounting       |   24723 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.797022-08 | 2008-02-23 18:41:26.380974-08
> 20209283 | accounting       |   24722 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.420454-08 | 2008-02-23 18:41:26.378364-08
>   164853 | emol_warehouse_1 |   24717 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.080398-08 | 2008-02-23 18:41:26.356934-08
>   164853 | emol_warehouse_1 |   24716 | postgres | <IDLE>                                             | 2008-02-23 18:41:26.360746-08 | 2008-02-23 18:41:26.358003-08
>   164853 | emol_warehouse_1 |   24715 | postgres | <IDLE>                                             | 2008-02-23 18:47:36.216511-08 | 2008-02-23 18:41:26.352855-08
> 20209283 | accounting       |   24710 | postgres | <IDLE>                                             | 2008-02-23 18:47:34.69257-08  | 2008-02-23 18:41:26.319118-08
>   164853 | emol_warehouse_1 |   24706 | postgres | <IDLE>                                             | 2008-02-23 18:47:34.692909-08 | 2008-02-23 18:41:26.301898-08
>    18368 | global           |   24709 | postgres | <IDLE>                                             | 2008-02-23 18:47:34.728589-08 | 2008-02-23 18:41:26.316133-08
> 20209283 | accounting       |   24695 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.6937-08   | 2008-02-23 18:41:26.24967-08
>   164853 | emol_warehouse_1 |   24694 | postgres | <IDLE>                                             | 2008-02-23 18:47:37.38218-08  | 2008-02-23 18:41:26.24526-08
>    18368 | global           |   24693 | postgres | <IDLE>                                             | 2008-02-23 18:47:32.19262-08  | 2008-02-23 18:41:26.241983-08
> (20 rows)
> 
> 
> emol_warehouse_1=# select relname, granted, pid, mode from pg_locks L, pg_class C
> emol_warehouse_1-#         where C.oid = L.relation and locktype = 'relation'; 
>                     relname                      | granted |  pid  |        mode         
> --------------------------------------------------+---------+-------+---------------------
> pg_index                                         | t       | 24738 | AccessShareLock
> pg_index                                         | t       | 24738 | RowShareLock
> PartInd_emol_warehouse_1_cluster_sl_log_2-node-1 | t       | 24738 | AccessExclusiveLock
> db_tables                                        | t       | 24738 | AccessExclusiveLock
> db_columns                                       | t       | 24738 | AccessExclusiveLock
> sl_table                                         | t       | 24738 | AccessShareLock
> sl_table                                         | t       | 24738 | RowShareLock
> sl_table                                         | t       | 24738 | RowExclusiveLock
> pg_locks                                         | t       | 24747 | AccessShareLock
> sl_set                                           | t       | 24738 | AccessShareLock
> sl_set                                           | t       | 24738 | RowShareLock
> sl_set                                           | t       | 24738 | RowExclusiveLock
> logp                                             | t       | 24738 | AccessExclusiveLock
> catalogue_issue                                  | t       | 24738 | AccessExclusiveLock
> supplier_priority                                | t       | 24738 | AccessExclusiveLock
> sl_setsync                                       | t       | 24738 | RowExclusiveLock
> pg_indexes                                       | t       | 24738 | AccessShareLock
> sl_node                                          | t       | 24738 | AccessShareLock
> sl_node                                          | t       | 24738 | RowShareLock
> supplier_detail                                  | t       | 24738 | AccessExclusiveLock
> chmoogle_thesaurus                               | t       | 24738 | AccessExclusiveLock
> parent_properties                                | t       | 24738 | AccessExclusiveLock
> db_names                                         | t       | 24738 | AccessExclusiveLock
> sl_listen                                        | t       | 24738 | AccessShareLock
> sl_listen                                        | t       | 24738 | RowExclusiveLock
> pg_class                                         | t       | 24738 | AccessShareLock
> pg_class                                         | t       | 24738 | RowShareLock
> version_properties                               | t       | 24738 | AccessExclusiveLock
> parent                                           | t       | 24738 | AccessExclusiveLock
> str_conntab                                      | t       | 24738 | AccessExclusiveLock
> sl_log_status                                    | t       | 24738 | AccessShareLock
> pg_class                                         | t       | 24747 | AccessShareLock
> pg_tablespace                                    | t       | 24738 | AccessShareLock
> sl_path                                          | t       | 24738 | AccessShareLock
> sl_path                                          | t       | 24738 | RowShareLock
> chmoogle_thesaurus_types                         | t       | 24738 | AccessExclusiveLock
> catalogue_info                                   | t       | 24738 | AccessExclusiveLock
> chmoogle_molkeys                                 | t       | 24738 | AccessExclusiveLock
> pg_class_oid_index                               | t       | 24747 | AccessShareLock
> version                                          | t       | 24738 | AccessExclusiveLock
> sl_event                                         | t       | 24738 | AccessShareLock
> sl_log_2                                         | t       | 24738 | AccessShareLock
> sl_log_2                                         | t       | 24738 | ShareLock
> sl_log_2                                         | t       | 24738 | AccessExclusiveLock
> chmoogle_statistics                              | t       | 24738 | AccessExclusiveLock
> sl_subscribe                                     | t       | 24738 | AccessShareLock
> sl_subscribe                                     | t       | 24738 | RowExclusiveLock
> chmoogle_cansmiles                               | t       | 24738 | AccessExclusiveLock
> pg_namespace                                     | t       | 24738 | AccessShareLock
> pg_namespace                                     | t       | 24738 | RowShareLock
> sample                                           | t       | 24738 | AccessExclusiveLock
> sl_config_lock                                   | t       | 24738 | AccessExclusiveLock
> (52 rows)
> 
> 
> 
> $ vmstat 2
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
> 1  0 138112 138244  20156 3577564    0    1     0     0    0     0  5  1 92  2
> 1  0 138112 139988  20084 3575816    0    0  1968    12  413   179 24  1 75  0
> 1  0 138112 140964  19984 3575136    0    0  2232    20  440   362 24  1 75  0
> 1  0 138112 141756  19888 3574972    0    0  2340    75  527   497 25  0 75  0
> 1  0 138112 138216  19888 3578612    0    0  1774     4  420   174 24  0 75  0
> 1  0 138112 141464  19808 3575572    0    0  1436    42  397   321 24  0 75  0
> 0  1 138112 137388  19808 3579732    0    0  2078     8  455   277 21  1 74  5
> 1  0 138112 140700  19808 3576612    0    0  1312    20  392   377 20  1 74  5
> 1  0 138112 137048  19808 3580512    0    0  1980    57  392   539 23  0 72  5
> [snip]
> 2  0 138112 141432  19248 3578472    0    0     0    61  322   533 25  1 74  0
> 1  0 138112 141448  19248 3578472    0    0     0    27  274   181 25  0 75  0
> 1  0 138112 141448  19248 3578472    0    0     0    25  259   111 25  0 75  0
> 1  0 138112 141456  19248 3578472    0    0     0     8  263   290 25  0 75  0
> 1  0 138112 141456  19248 3578472    0    0     0    20  282   304 25  0 75  0
> 1  0 138112 141580  19256 3578464    0    0     0    64  318   443 25  0 75  0
> 1  0 138112 141580  19256 3578464    0    0     0     4  268   296 25  0 75  0
> 1  0 138112 141584  19256 3578464    0    0     0    15  259   174 25  0 75  0
> 1  0 138112 141584  19256 3578464    0    0     0     8  263   141 25  0 75  0
> 1  0 138112 141600  19256 3578464    0    0     0    20  284   317 25  0 75  0
> 1  0 138112 141600  19256 3578464    0    0     0    85  321   441 25  0 75  0
> 1  0 138112 141600  19256 3578464    0    0     0    10  269   143 25  0 75  0
> 1  0 138112 141600  19256 3578464    0    0     0    27  259   220 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0     8  261   185 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0    20  285   210 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0    59  318   440 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0     4  267   141 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0    31  260   103 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0     8  263   237 25  0 75  0
> 1  0 138112 141476  19256 3578464    0    0     0    33  285   282 25  0 75  0
> [snip]
> 1  0 138112 140856  19304 3578416    0    0     0    27  259   250 25  0 75  0
> 1  0 138112 140856  19304 3578416    0    0     0     8  262   200 25  0 75  0
> 1  0 138112 140856  19304 3578416    0    0     0    20  283   236 25  0 75  0
> 1  0 138112 140856  19304 3578416    0    0     0    63  318   508 25  0 75  0
> 1  0 138112 140856  19304 3578416    0    0     0    10  265   149 25  0 75  0
> 1  0 138112 140732  19304 3578416    0    0     0    25  259   108 25  0 75  0
> 1  0 138112 140732  19304 3578416    0    0     0     8  262   269 25  0 75  0
> 0  1 138112 137440  18736 3582364    0    0  4184    26  575   815 17  1 75  7
> 0  2 138112 138184  17360 3584000    0    0  5216    89  755   969  0  1 70 28
> 0  1 138112 139956  16176 3584144    0    0  4564    12  775   902  1  1 73 25
> 1  0 138112 138400  15164 3586976    0    0  6368   115  764   642  1  1 75 23
> 0  1 138112 141136  14620 3585180    0    0  6936     8  731   331  1  1 74 24
> 0  2 138112 140068  14564 3587056    0    0  6168    32  804   448  1  1 74 24
> 1  2 138112 139960  14240 3587640    0    0  5364    61  770   741  1  1 73 25
> 0  1 138112 136492  14120 3591660    0    0  7036    20  872  1511  1  1 73 24
> 0  1 138112 140772  13412 3588728    0    0  8544    24 1043  1951  1  2 75 23
> 0  1 138112 136460  13104 3593716    0    0 10052     8  933  1615  1  2 73 24
> 0  1 138112 139356  12840 3591380    0    0  9156    16  891  1511  1  1 72 26
> 0  2 138112 137084  12704 3594116    0    0  8928    47  929  1718  1  2 72 26
> 0  1 138112 140276  12688 3591272    0    0  8836    20  965  1757  1  1 72 26
> 0  1 138112 138052  12688 3593092    0    0  8800    19  999  1679  1  1 75 23
> [snip]
> 0  2 138112 139288  12056 3595024    0    0  7804    32  722  1207  1  1 73 25
> 0  1 138112 137544  11840 3597060    0    0  7804    55  662  1096  1  1 73 25
> 0  1 138112 138468  11784 3596596    0    0  6972    32  625  1031  1  1 72 26
> 0  1 138112 137520  11476 3597944    0    0  8564    28  767  1291  1  1 75 23
> 0  1 138112 141640  11120 3594920    0    0  9220    14  655   933  1  1 74 24
> 0  1 138112 138464  11092 3598328    0    0 29424    12 1340  1471  2  4 75 19
> 0  2 138112 139628  10572 3597808    0    0 23832    49 1540  2858  2  3 73 21
> 0  1 138112 138560  10436 3599244    0    0 13844    32 1172  2170  1  2 71 25
> 0  1 138112 139348  10428 3598732    0    0 12774    35 1010  1781  2  2 75 22
> 0  1 138112 137676  10276 3600444    0    0 13680    68 1089  2035  1  2 74 23
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
> r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
> 0  2 138112 141024   9952 3597908    0    0 13844    12 1371  2562  1  3 75 21
> 0  1 138112 139704   9884 3599276    0    0 13460    55 1099  2014  1  2 71 26
> 1  1 138112 139540   9824 3599872    0    0 13596    32 1185  2430  1  3 73 23
> 0  1 138112 138348   9696 3601040    0    0 11212   118 1139  2031  1  2 75 22
> 0  1 138112 137564   9572 3602464    0    0 25148    69  595   949  2  2 73 23
> 1  1 138112 139960   9496 3600460    0    0 15644    12  660  1117  1  2 73 24
> 1  1 138112 140912   9460 3599716    0    0  7292   156 1017  1782  1  2 73 24
> 0  1 138112 139664   9380 3601356    0    0 23516    28 1518  2840  3  3 72 22
> 0  1 138112 139108   8708 3603068    0    0 40538    16 1205  2320  5  4 75 17
> 0  1 138112 141668   7420 3602016    0    0 30164    23 1312  2418  3  4 74 19
> 0  1 138112 138516   6212 3606604    0    0 20012    18 1075  1855  2  3 74 20
> 0  2 138112 140756   6192 3604804    0    0  4176    49  501   704  0  0 72 27
> 0  1 138112 140940   6164 3604572    0    0  5300    32  613  1006  1  1 72 26
> 0  1 138112 141064   6056 3604940    0    0  5136     0  629   913  0  1 75 24
> 0  1 138112 141972   6044 3604172    0    0  4788    49  705  1216  1  1 73 25
> 0  1 138112 138004   5780 3608076    0    0  6972    12  633   986  0  1 74 24
> 0  2 138112 141496   5512 3604964    0    0  6360    39  786   414  1  1 72 25
> 0  1 138112 141016   5468 3605788    0    0  8316    32  874   878  1  1 71 26
> 0  1 138112 141832   5332 3605664    0    0 37456     0  553  1051  4  2 75 19
> 0  1 138112 140940   5332 3606452    0    0 19016    41  985  1674  2  3 74 21
> 0  1 138112 141720   5332 3605672    0    0 15964    12  619   740  2  2 73 24
> 0  2 138112 137296   5292 3610132    0    0 29388    53 1069   877  3  3 74 19
> 0  1 138112 138600   4556 3617372    0    0 29104    32 1214  1237  3  5 70 22
> 
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at lists.slony.info
> http://lists.slony.info/mailman/listinfo/slony1-general


-- 
Anyone who trades liberty for security deserves neither
liberty nor security. -- Benjamin Franklin



More information about the Slony1-general mailing list