Jan Wieck JanWieck
Sat Jul 22 03:54:47 PDT 2006
On 7/22/2006 12:59 PM, Gavin Hamill wrote:

> On Sat, 22 Jul 2006 05:19:14 -0400
> Jan Wieck <JanWieck at Yahoo.com> wrote:
>  
>> Please get me the output of the following two queries for any table 
>> where this still occurs:
> 
> Attached.. 

It is evident from that output that the table was altered after setup 
for replication. The log trigger configuration knows the table as having 
27 user attributes, whereas the table currently has 42 columns.

The problem here is not a dropped column, but that the log trigger 
doesn't check the length of the attribute kind (attkind) string. It simply 
reads whatever memory follows the string, finds a 'k' there and tries to 
add that column to the where clause of the log entry.

Please read the documentation on EXECUTE SCRIPT carefully and advise all 
DBAs (or whoever has the ability to execute DDL statements on your 
servers) that running DDL on replicated tables any other way can be as 
harmful as what you just experienced.

Note to self: Add a check to logTrigger to ensure attkind string is 
exactly as long as the number of non-dropped user columns.
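
A minimal sketch of the check described in that note, written in Python for illustration only (the real log trigger is C code, and Python cannot read past the end of a string, so this shows only the guard, not the bug). The function name `key_columns`, the column names, and the contents of the stale attkind string are hypothetical; the 'k' = key / 'v' = value encoding is an assumption, and only the counts 27 and 42 come from this thread.

```python
from typing import List


def key_columns(attkind: str, live_columns: List[str]) -> List[str]:
    """Return the names of the key columns for a log row.

    attkind:      one character per column, 'k' for key and 'v' for value
                  (assumed encoding).
    live_columns: names of the table's non-dropped user columns, in order.
    """
    # The proposed guard: refuse to continue if the trigger configuration
    # and the table definition disagree about the number of columns.
    if len(attkind) != len(live_columns):
        raise ValueError(
            "attkind has %d entries but the table has %d columns"
            % (len(attkind), len(live_columns))
        )
    return [name for kind, name in zip(attkind, live_columns) if kind == "k"]


# Hypothetical data with the counts from this case: the trigger was
# configured for 27 columns, the table now has 42.
stale_attkind = "k" + "v" * 26
current_columns = ["col%d" % i for i in range(1, 43)]

try:
    key_columns(stale_attkind, current_columns)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

With the guard in place, the mismatch is reported as an error instead of producing a where clause built from whatever follows the string in memory.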


> 
>> as well as a full schema description of the table, including primary 
>> keys, unique and not null constraints. 
> 
> I have attached the output of 'pg_dump -s -t User laterooms' - is this all that is required?
> 
> The Booking table is probably more important, but I am interested to hear your views on the data from the User table - hopefully then I'll be able to learn and apply the same technique to the Booking table.
> 
> In fact, given that I've dropped Booking from replication, it might even be likely that it 'fixes itself' when I add a new temporary set, then merge back into set 1.

Executing anything via EXECUTE SCRIPT would have caused the trigger's 
key/value list to be rebuilt according to the current definition. Even 
an empty DDL script.
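
To illustrate what "rebuilt according to the current definition" means, here is a rough Python sketch. It is not Slony code: the `Column` type, `build_attkind`, and the example table are all hypothetical, and the one-character-per-non-dropped-column encoding ('k' for key, 'v' for value) is an assumption for the sake of the example.

```python
from typing import List, NamedTuple, Set


class Column(NamedTuple):
    name: str
    dropped: bool = False


def build_attkind(columns: List[Column], key_names: Set[str]) -> str:
    """Derive the key/value string from the table as it is defined now."""
    return "".join(
        "k" if col.name in key_names else "v"
        for col in columns
        if not col.dropped  # dropped columns get no position at all
    )


keys = {"email"}

# Table as it looked when replication was set up (hypothetical).
at_setup = [Column("id"), Column("legacy"), Column("email")]
configured = build_attkind(at_setup, keys)   # "vvk"

# Someone drops a column directly, bypassing EXECUTE SCRIPT.
now = [Column("id"), Column("legacy", dropped=True), Column("email")]
rebuilt = build_attkind(now, keys)           # "vk"

# The stored string is stale: wrong length, and 'k' is in the wrong place.
is_stale = configured != rebuilt
```

In this toy example the stored string still says the key is the third column, while the table now has only two live columns. Rebuilding from the current definition brings the two back in line.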

However, I would not trust the replica at this point and would instead 
rebuild the entire slave by dropping the whole set, recreating it and 
resubscribing.

> 
>> The only way I could imagine how slony got into that state is that you 
>> have executed DDL like ALTER TABLE DROP COLUMN directly while the table 
>> was replicated (directly meaning not done via EXECUTE SCRIPT). Doing so 
>> would get the key attributes out of sync with respect to their position 
>> within all non-attisdropped columns.
> 
> Interesting, it's entirely possible, but the thing that doesn't make sense is why this only started happening at 4am this morning, just after the nightly backups + vacuum :)

Things like that never happen at 10am on Tuesday ... not in this world.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck at Yahoo.com #


