[Slony1-general] Slonik segfaulting.

Thu Jan 10 20:07:54 PST 2008

Christopher Browne wrote:
> Marcus Gustafsson <marcus.gustafsson at visionten.net> writes:
>> Hi all!
>>
>> Are there any known issues with the slony RPM
>> postgresql-slony1-engine-1.1.5-1_PG8.1.4.i686.rpm?
>>
>> I have a small slony cluster where I removed node 1 for reinstallation,
>> but when I try to reinsert it (as a new node) with:
>>
>> cluster name = MyCluster;
>> node 2 admin conninfo = 'dbname=mycluster host=db2.internal user=slony
>> password=secret';
>> node 3 admin conninfo = 'dbname=mycluster host=db3.internal user=slony
>> password=secret';
>> node 4 admin conninfo = 'dbname=mycluster host=db1.internal user=slony
>> password=secret';
>> store node(id=4, event node=2);
>>
>> slonik crashes with a segmentation fault.
>> GDB gives me a backtrace which looks like this:
>>
>> #0  0x0804ed5a in slon_appendquery_int ()
>> #1  0x0804f047 in slon_mkquery ()
>> #2  0x0804cf31 in slonik_store_node ()
>> #3  0x0804e8dd in script_exec_stmts ()
>> #4  0x0804ea41 in script_exec ()
>> #5  0x0804ebed in main ()
>>
>> I'll obviously keep debugging, but before I jump headfirst into the
>> code, does anyone else have an idea of what is wrong?
> 
> Nothing jumps out at me; this sounds consistent with the notion of
> slonik trying to build a query, and finding that one of the components
> was NULL.
> 
> I don't see anything reported as fixed subsequent to 1.1.5; I vaguely
> remember fixing a problem with this in the 1.2 series, but don't see
> anything expressly in the release notes.  I would suggest considering
> a much newer version of Slony-I; you're on a version that was released
> 2 years ago, and a great number of bugs have been fixed since.
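
To illustrate the NULL-component idea above: a minimal sketch in C, not
Slony-I code. It assumes the query builder expands a "%s" from a string
argument; a hand-written formatter that calls strlen() on that argument
without checking it would fault on NULL, much like the backtrace shows.
The names build_query, safe_str and example_function are hypothetical
and exist only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replace a NULL string with a visible marker
 * so that formatting code never dereferences a NULL pointer. */
static const char *safe_str(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Hypothetical query builder (an assumption, not the Slony-I API).
 * Writes the query into buf and returns its length, or -1 if the
 * query does not fit into buflen bytes. */
static int build_query(char *buf, size_t buflen,
                       const char *cluster, int node_id)
{
    assert(buf != NULL && buflen > 0);

    /* Guard the string component before it reaches the formatter. */
    int n = snprintf(buf, buflen,
                     "select \"_%s\".example_function(%d);",
                     safe_str(cluster), node_id);

    if (n < 0 || (size_t) n >= buflen)
        return -1;          /* encoding error or truncated output */
    return n;
}
```

With a guard like this, a missing component shows up as "(null)" in the
generated query text instead of crashing the process, which makes the
offending statement much easier to spot.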

Oh! I wasn't aware that the release we were running was that far behind.
I should probably schedule an upgrade rather than spend time debugging
problems which might already be solved.
Which release is currently recommended for production use? I'd prefer
not to run a bleeding-edge version on this particular cluster. Is
1.2.12-1, available as a source RPM on the Slony web page, a good choice?

> Looking at the code directly... I notice slon_mkquery() gets called 16
> times in slonik_store_node(), and slon_appendquery() gets called 3
> times (which may be totally a red herring!).
> 
> Does GDB give you any idea as to where it was in slonik_store_node()?

Not really; I don't have the source for that version on the machine I
was running slonik on. The function starts at 0x804cd60 and calls
slon_mkquery at 0x804cf2c (slonik_store_node+460), which should be just
before it calls the first "db_exec_command".
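
For anyone checking the arithmetic: the +460 is simply the call address
minus the function start (0x804cf2c - 0x804cd60 = 0x1cc). The address in
frame #2 of the backtrace, 0x0804cf31, is 5 bytes further on, which is
the length of a 32-bit x86 near call, so it is the return address of
that same call. A small sketch in C; the helper name offset_in_function
is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of an instruction address from
 * the start of the function that contains it. */
static uint32_t offset_in_function(uint32_t func_start, uint32_t addr)
{
    assert(addr >= func_start);   /* addr must lie at or after the start */
    return addr - func_start;
}
```

Calling offset_in_function(0x804cd60, 0x804cf2c) gives 460 for the call
itself, and offset_in_function(0x804cd60, 0x804cf31) gives 465 for the
return address reported by GDB.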

Anyways, I'll try to upgrade first and see if that solves the problem.

Thanks for the help!

/Marcus