David Sainty David.Sainty
Tue Oct 19 22:48:01 PDT 2004
Hmmm, I guess I saw some form of bug then.

The database setup will include replication health checks by
comparing the data (or rather, a hash of the data).  So if it
becomes a problem I'll know about it.  I'll grab debug details if
I can replicate it.
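
A rough sketch of what such a check could look like, assuming each
node's table is dumped as text rows in the same order (for example
sorted by primary key). The function names and the choice of FNV-1a
are illustrative assumptions here, not anything from Slony-I or the
actual setup:

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a 64-bit constants (standard published values). */
#define FNV_OFFSET_BASIS 14695981039346656037ULL
#define FNV_PRIME        1099511628211ULL

/* Fold one NUL-terminated row into a running FNV-1a hash.
 * A '\n' is mixed in after each row so that row boundaries
 * influence the result. */
static uint64_t hash_row(uint64_t h, const char *row)
{
    for (const unsigned char *p = (const unsigned char *)row; *p; p++) {
        h ^= *p;
        h *= FNV_PRIME;
    }
    h ^= (unsigned char)'\n';
    h *= FNV_PRIME;
    return h;
}

/* Hash a whole table dump. Hypothetical helper: the rows must be
 * presented in the same order on both nodes. */
uint64_t hash_table(const char *const rows[], size_t nrows)
{
    uint64_t h = FNV_OFFSET_BASIS;
    for (size_t i = 0; i < nrows; i++)
        h = hash_row(h, rows[i]);
    return h;
}

/* Returns 1 when both copies have the same row count and hash,
 * 0 otherwise. */
int replicas_in_sync(const char *const origin[], size_t n_origin,
                     const char *const subscriber[], size_t n_sub)
{
    return n_origin == n_sub &&
           hash_table(origin, n_origin) == hash_table(subscriber, n_sub);
}
```

A mismatch only says that the copies differ at the moment of the
check, so on a live system the comparison would have to allow for
normal replication lag before raising an alarm.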

Perhaps the global errno is to allow old shared libraries to be
used by threaded code without a (run time) link error.  But they
will malfunction if errno is ever intended to be set, so
obviously it isn't a good solution :)

>>> Jan Wieck <JanWieck at Yahoo.com> 19/10/2004 19:00:38 >>>
On 10/18/2004 11:35 PM, David Sainty wrote:

> As usual, I'm having trouble reproducing it now...  I've also
> discovered the user error with using 7.4.2 (as a side effect of
> re-working through to reproduce my results :) so I have that test
> case working too.
> 
> I'm starting to have doubts now, it is just possible that the
> original 7.4.5 failure was against a build without
> --enable-thread-safety...
> 
> The original fault was definitely stuck, I even left it
> "replicating" overnight to make sure, and the databases never
> sync'd up.  But if it WAS built without --enable-thread-safety, I
> imagine that would explain it?  Correct?
> 
> Is your expectation that 7.4.5 doesn't need to be patched for
> --enable-thread-safety to work properly on Solaris?

Not at all. The issue is that a libpq configured without
--enable-thread-safety is not looking at the right errno variable
on Solaris. The compiler directives given with that option cause
errno to be defined as some strange integer pointer resolving
function call that leads the current thread to the right errno
value. The "global" errno still exists in a program, but as soon
as the program is linked against the thread safe libc, it will
stay zero forever. Thus, in a program linked against the thread
safe libc but using a shared libpq linked against the non thread
safe libc, a libc call will report an error, and libpq cannot
figure out the errno and bails out with Error "0". It is stupid,
and in my opinion the thread safe libc should not define a global
errno ... but there are for sure a lot of smart guys working for
Sun who can explain why it is a good thing to do so.
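
The effect can be imitated in a few lines of C. This is only a
simulation under stated assumptions: stale_global_errno is a
hypothetical stand-in for the old process-wide errno that nothing
updates any more, and the path is made up so that open() fails. It
does not reproduce the actual Solaris linkage, it only shows why
the caller ends up seeing error 0:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* HYPOTHETICAL stand-in for the legacy process-wide errno that a
 * non-thread-safe build would read. Nothing ever writes to it. */
static int stale_global_errno = 0;

/* Made-up path that is assumed not to exist, so open() fails. */
static const char *MISSING_PATH = "/nonexistent-dir-for-errno-demo/file";

/* What a thread-safe build does: read errno through <errno.h>,
 * which resolves to the calling thread's own error value. */
int error_seen_by_threadsafe_build(void)
{
    errno = 0;
    int fd = open(MISSING_PATH, O_RDONLY);
    if (fd != -1) {          /* unexpected success: clean up */
        close(fd);
        return 0;
    }
    return errno;            /* the real error code, e.g. ENOENT */
}

/* What the mismatched build effectively does: the call fails,
 * but the variable it inspects was never set. */
int error_seen_by_mismatched_build(void)
{
    int fd = open(MISSING_PATH, O_RDONLY);
    if (fd != -1) {
        close(fd);
        return 0;
    }
    return stale_global_errno;   /* still 0: 'Error "0"' */
}
```

Both functions hit the same failing call; only the place they look
for the error code differs, which is the whole problem.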


Jan

> 
> Thanks,
> 
> Dave
> 
>>>> Jan Wieck <JanWieck at Yahoo.com> 19/10/2004 13:18:25 >>>
> On 10/18/2004 7:25 PM, David Sainty wrote:
> 
>> :) Yes, I am aware.  In both cases (7.4.2 and 7.4.5) I used the
>> --enable-thread-safety flag and built from scratch.
> 
> I'd like to see the output of the slon process from the
> subscriber (slave) started with -d2, from the beginning to the
> point where you think it got stuck.
> 
> 
> Jan


