Jeff threshar at torgo.978.org
Thu May 20 08:02:30 PDT 2010
On May 19, 2010, at 10:36 PM, Jan Wieck wrote:

> If the sl_action_seq has a cache_value of 1 (as it should be), then  
> this bump would mean that between the DELETE and the INSERT, which  
> are consecutive statements inside the same PL/pgSQL function, other  
> transactions were able to call nextval('sl_action_seq') 1,000+ times.
>

cache is set to 1. and they are consecutive statements. like I said,  
sometimes the db is pretty write heavy.
Looking at my stats collector collector, yesterday I got the following  
counts for insert, upd, and del:
  46249049 | 1859054 | 42489642

>> oes that happen inside of a transaction context that involves other  
>> triggers and/or constraints? As per my tests, AFTER user triggers  
>> are fired from the queue at the end of each statement. The slon  
>> logger firing out of order of those statements could still point  
>> towards a bug in the trigger queue, which would be extremely serious.
>

Yes.

"sourcereport" has an FK to another table, in addition to 3 triggers -  
the slon logger, the event adder and the summary updater.  Since PG  
executes triggers in alpha order the ordering is slony, event trig,  
summary trig.

if an event is added that table also has an fk and some triggers on it  
(and is replicated)

the summary table has an FK back to the 'sourcereport' table.  Here's  
the whole sl_log for that txn in question (tablenames have been  
changed to protect the innocent)

the exact chain of events was they deleted from sourcereport, which  
caused a cascade of events being removed (a report can cause multiple  
events), then the summary being updated.

   log_xid   | log_actionseq | log_tableid |    tab_relname    |  
log_cmdtype
------------+---------------+-------------+------------------- 
+-------------
  1153890130 |    2800679112 |        7600 | soucereport    | D
  1153890130 |    2800679113 |        6600 | events     | D
  1153890130 |    2800679114 |        7900 | events_log | I
  1153890130 |    2800679115 |        6600 | events     | D
  1153890130 |    2800679116 |        7900 | events_log | I
  1153890130 |    2800679117 |        6600 | events     | D
  1153890130 |    2800679118 |        7900 | events_log | I
  1153890130 |    2800679119 |        5700 | summary      | I
  1153890130 |    2800679120 |        5700 | summary      | D

Like I said, this only happens once in a while - happened twice  
yesterday (db was a bit less loaded than normal actually) before that  
maybe a week or two previous?

Here's most of the body of the trigger that fires to update summary  
(from sourcereport) - again changed some var names
(it is an after trigger)

DECLARE
	v_sourceid int;
BEGIN
	
	if TG_OP = 'DELETE' then
		v_sourceid := OLD.sourceid;
	else
		v_sourceid := NEW.sourceid;
	end if;

	delete from summary
		where sourceid = v_cik;

	insert into summary (sourceid, ... )
		select sourceid, ....
		from sourcereport
		where sourceid = v_sourceid
		order by publishdate desc, id desc limit 1;

	return NULL;
END

huh. return null is a bit odd, but since it is an after it works.

--
Jeff Trout <jeff at jefftrout.com>
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/





More information about the Slony1-general mailing list