Brian Hirt bhirt
Thu Jan 27 17:02:21 PST 2005
Hi,

I work with a really busy database, and ever since i've install slony, 
it's been next to impossible to make ddl changes.  In the past, I was 
always able to make ddl changes without any problems, even complex 
changes -- without deadlocks.   Sometimes clients would block for a 
minute or two while the transaction completed, but eventually they 
would run without failing.

Now, even something as simple as dropping an unused index fails with a 
deadlock.  I typically have to run a script about 30 or 40 times at 
before it works (at our lowest usage point).  The thing that is really 
annoying is that it's causing other application processes to die from a 
deadlocks when i attempt to run the scripts, causing errors for our 
users, aborted reports, etc.  Our applications don't handle deadlocks 
very gracefully (in fact the first deadlock we've ever had since we 
started using postgres 6.3 in 1998 was when we install slony) because 
they've been written in a way that we don't get deadlocks.

It's gotten so bad, that I've had to resort to dropping both nodes, 
applying ddl to the master and then setting replication back up.   This 
has it's own set of drawbacks because the initial sync takes a long 
time and hurts performance while it's happening and the slave database 
is offline so reports have to be run against the master which further 
hurts performance.  I'm also not protected from failures during this 
period.

I hope I'm just doing something stupid, and there is an easy solution 
to this problem.   We currently run slony 1.0.5 and postgres 7.4.6 on 
linux 2.4 systems.  I look  forward to any advice people might have.  
It seems like part of the problem stems from the fact that all tables 
in the set are locked by slon during this process, even if only one 
table is being updated by the execute script (ref; 
http://cbbrowne.com/info/stmtddlscript.html).

possibly i could "set drop table", apply ddl on both systems, clear 
data on the slave and then "set add table" -- but this seems to  be a 
very very ugly hack which will get very complicated with larger ddl 
scripts and prone to errors.

http://cbbrowne.com/info/stmtsetaddtable.html mentions something about 
the table id and deadlocks, but I don't really understand what i'm 
supposed to do with that.

anyway... Here's an example of a failure:

Here's the sql script c2.slq:
	drop index moby_user_developer_developer_idx;

Here's the slony script (node 1 is master, node2 is slave)
         cluster name = mobycluster;
         node 1 admin conninfo = '...conninfo....';
         node 2 admin conninfo = '...conninfo...';

         execute script (
                 set id = 1,
                 filename='/tmp/c2.sql',
                 event node=1);

Here's the output:
<stdin>:7: PGRES_FATAL_ERROR select "_mobycluster".ddlScript(1, 'drop 
index moby_user_developer_developer_idx;
');  - ERROR:  deadlock detected
DETAIL:  Process 2239 waits for AccessExclusiveLock on relation 20808 
of database 20649; blocked by process 2172.
Process 2172 waits for AccessShareLock on relation 21124 of database 
20649; blocked by process 2239.
CONTEXT:  PL/pgSQL function "altertablerestore" line 47 at execute 
statement
PL/pgSQL function "ddlscript_int" line 39 at perform
PL/pgSQL function "ddlscript" line 30 at perform

--------------------------------------------
MobyGames
http://www.mobygames.com
The world's largest and most comprehensive?
gaming database project
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 3480 bytes
Desc: not available
Url : http://gborg.postgresql.org/pipermail/slony1-general/attachments/20050127/4212d5ad/attachment.bin


More information about the Slony1-general mailing list