Mon Nov 15 03:16:33 PST 2004
- Previous message: [Slony1-general] Re: slon won't start after EXECUTE QUERY
- Next message: [Slony1-general] Re: slon won't start after EXECUTE QUERY
>===== Original Message From Jan Wieck <JanWieck at Yahoo.com> =====
>On 11/13/2004 4:42 PM, David Pitkin wrote:
>> I guess my little problem was a bit vague, but I'm somewhat surprised no-one
>> thought of what now seems like the obvious reason that slon wouldn't start. I
>> was right in that node 2 died on the DDL SCRIPT. Then, every time I tried to
>> restart slon, it re-received the event, and died again. I'm still quite
>> confused as to why the script fails on only this node (especially since the
>> exact same script succeeds when run from psql).
>
>On the same node? Hmmm ... it might have to do with different
>permissions or search paths. What is the exact error message you get in
>the postmaster log ... the one that causes the script to abort the
>transaction?

I thought of that. So I ran the script from psql, logging in as the exact same user that slony logs in as. The script still worked. (Admittedly, I had to add some sql commands before running the script, such as starting a transaction, locking tables I needed to modify, and dropping slony's 'denyaccess' triggers. But that wouldn't explain why the script succeeds in psql, but not in slon.) The error message I get is 'Column field_new does not exist'. This would seem to suggest that the command which creates the temporary column 'field_new' is failing, but that doesn't make any sense.

>
>>
>> Anyways, I need to get this problem fixed, which means one of three things:
>> deleting the DDL SCRIPT event from node 1, adding a fake confirmation from
>> node 2, or changing the script itself (the one saved in sl_event) to something
>> that will definitely succeed.
>
>What about fixing the root of the problem? And without really knowing
>what causes the script to fail, the next question is a bit hard to answer.
>
>
>Jan
>

I'd love to fix the root of the problem, but so far the cause is unclear. I'm about ready to blame PostgreSQL; this isn't entirely farfetched, because node 2 is running 8beta3, whereas node 3 is 8beta2 and node 1 is 7.4.6. But if it IS PostgreSQL, then the only solution is to convince the master node to stop sending the DDL SCRIPT event. Is there a way to do that?

David Pitkin

>>
>> Can anyone tell me which would be the best solution, and more importantly, how
>> to do it safely?
>>
>> David Pitkin
>>
>>
>> -------Original Message-------------------------
>> Hello,
>>
>> I'm brand new to Slony-I. Someone else set it up with a master node (node 1)
>> and two slaves (node 2 and node 3). I needed to change the schema, and have
>> successfully managed to break node 2 in the process (happily this is still in
>> the development stage). Here's what happened. Hopefully someone can tell me
>> what I did wrong:
>>
>> 1. First, I should mention that node 1 and node 2 are on the same machine
>> (Linux), with node 3 on a separate machine. I needed to change the data type
>> of a column, using sql like this:
>> ALTER TABLE table ADD COLUMN field_new;
>> UPDATE table SET field_new = field;
>> ALTER TABLE table DROP COLUMN field;
>> ALTER TABLE table RENAME COLUMN field_new TO field;
>>
>> 2. I ran this script using the EXECUTE QUERY command in slonik. It failed
>> initially, because I forgot that the schema containing the table I needed to
>> modify was not in the search path for the 'slony' user. It failed on node 1,
>> and appeared to be isolated there (i.e. the event did not get sent to the
>> other two nodes). I've checked the Schemadoc, and this seems to be what
>> happens. I also double checked the process list at that point, and verified
>> that two slon processes were still running (for nodes 1 and 2).
>>
>> 3. I fixed the script and ran it a second time. It succeeded on node 1, and on
>> node 3. But node 2 was unchanged, and further investigation showed that the
>> corresponding slon process was dead. I tried restarting it, and it complained
>> a few times about there being no remote worker thread for node 1, and died
>> with an empty error message.
>>
>> 4. I manually fixed the schema on node 2, and started slon again. Slon died in
>> the same way.
>>
>> I checked the Slony-I tables, and it appears that node 2 confirmed the SYNC
>> event sent by node 1 just before the DDL_SCRIPT event (the timestamps of both
>> events match). This suggests that the script killed node 2, and a quick glance
>> at the remote worker thread source code suggests that if a script were to
>> fail, the thread would immediately die. But I can't figure out why the slon
>> process refuses to restart.
>>
>> Does anyone have any thoughts?
>>
>> David Pitkin
>>
>>
>> _______________________________________________
>> Slony1-general mailing list
>> Slony1-general at gborg.postgresql.org
>> http://gborg.postgresql.org/mailman/listinfo/slony1-general
>
>
>--
>#======================================================================#
># It's easier to get forgiveness for being wrong than for being right. #
># Let's break this rule - forgive me.                                  #
>#================================================== JanWieck at Yahoo.com #
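
The restart loop described in this message (the worker dies on the failing script, never confirms the event, and receives the same event again on the next start) can be illustrated with a small toy model. This is only a sketch under assumptions, not Slony-I's actual code: ToySubscriber, restart, and the (seqno, action) event tuples are hypothetical names invented for the illustration, and the only behavior modeled is the one the message itself describes.

```python
# Toy model only: none of these names come from Slony-I itself.

class ScriptFailed(Exception):
    """Raised when a (simulated) DDL script aborts."""


class ToySubscriber:
    """Applies events in order and confirms an event only after it succeeds."""

    def __init__(self):
        self.last_confirmed = 0  # sequence number of the last confirmed event

    def run(self, events):
        # events: list of (seqno, action) pairs sorted by seqno
        for seqno, action in events:
            if seqno <= self.last_confirmed:
                continue  # already applied and confirmed, skip it
            action()      # an exception here kills the worker before confirming
            self.last_confirmed = seqno


def sync():
    """Stand-in for an event that always applies cleanly."""


def bad_ddl():
    """Stand-in for the script that fails on this one node."""
    raise ScriptFailed("column field_new does not exist")


def restart(node, events):
    """One start attempt: True if the worker gets through the whole queue."""
    try:
        node.run(events)
        return True
    except ScriptFailed:
        return False


events = [(1, sync), (2, bad_ddl), (3, sync)]
node = ToySubscriber()

# Every restart replays event 2, because it was never confirmed.
attempts = [restart(node, events) for _ in range(3)]
stuck_at = node.last_confirmed
print(attempts, stuck_at)  # [False, False, False] 1

# Hypothetical fix: the stored script is replaced by one that succeeds.
events[1] = (2, sync)
recovered = restart(node, events)
print(recovered, node.last_confirmed)  # True 3
```

In this model the node stays stuck at the last event it confirmed until the pending event either succeeds or is no longer delivered, which corresponds to the three options listed in the message. How to do any of that safely on a real cluster is exactly the open question here, and the sketch does not answer it.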