Mon Feb 7 21:47:32 PST 2005
- Previous message: [Slony1-commit] By cbbrowne: More commentary about pathologies
- Next message: [Slony1-commit] By cbbrowne: Reshaped sectioning a bit to make things flow better
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Log Message:
-----------
Evidently the "dup key" problem isn't SIG 11 and isn't (in an obvious
way) a corrupted index...

Modified Files:
--------------
    slony1-engine/doc/adminguide:
        faq.sgml (r1.14 -> r1.15)

-------------- next part --------------
Index: faq.sgml
===================================================================
RCS file: /usr/local/cvsroot/slony1/slony1-engine/doc/adminguide/faq.sgml,v
retrieving revision 1.14
retrieving revision 1.15
diff -Ldoc/adminguide/faq.sgml -Ldoc/adminguide/faq.sgml -u -w -r1.14 -r1.15
--- doc/adminguide/faq.sgml
+++ doc/adminguide/faq.sgml
@@ -14,7 +14,7 @@
 <para>Recheck the connection configuration.  By the way, since
 <link linkend="slon"><application>slon</application></link> links to
 libpq, you could have password information stored in <filename>
-<envar>$HOME</envar>/.pgpass</filename>, partially filling in
+$HOME/.pgpass</filename>, partially filling in
 right/wrong authentication information there.</para>
 </answer>
 </qandaentry>
@@ -127,8 +127,7 @@
 can see passwords on the command line.</para></question>
 <answer>
 <para>Take the passwords out of the Slony configuration, and
-put them into
-<filename><envar>$(HOME)</envar>/.pgpass.</filename></para>
+put them into <filename>$(HOME)/.pgpass.</filename></para>
 </answer></qandaentry>
 
 <qandaentry>
@@ -679,41 +678,26 @@
 to diminish the number of network round trips.</para></question>
 
 <answer><para> A <emphasis>certain</emphasis> cause for this has not
-yet been arrived at.  The factors that <emphasis>appear</emphasis> to
-go together to contribute to this scenario are as follows:
+yet been arrived at. 
-
-<itemizedlist>
-
-<listitem><para> The <quote>glitch</quote> has occasionally coincided
-with some sort of outage; it has been observed both in cases where
-databases were suffering from periodic <quote>SIG 11</quote> problems,
-where backends were falling over, as well as when temporary network
-failure seemed likely.</para></listitem>
-
-<listitem><para> The scenario seems to involve a delete transaction
-having been missed by <productname>Slony-I</productname>. </para>
-</listitem>
-
-</itemizedlist></para>
-
-<para>By the time we notice that there is a problem, the missed delete
-transaction has been cleaned out of <envar>sl_log_1</envar>, so there
-is no recovery possible.</para>
-
-<para>What is necessary, at this point, is to drop the replication set
-(or even the node), and restart replication from scratch on that
+
+<para>By the time we notice that there is a problem, the seemingly
+missed delete transaction has been cleaned out of
+<envar>sl_log_1</envar>, so there appears to be no recovery possible.
+What has seemed necessary, at this point, is to drop the replication
+set (or even the node), and restart replication from scratch on that
 node.</para>
 
 <para>In <productname>Slony-I</productname> 1.0.5, the handling of
-purges of sl_log_1 are rather more conservative, refusing to purge
+purges of sl_log_1 became more conservative, refusing to purge
 entries that haven't been successfully synced for at least 10 minutes
-on all nodes.  It is not certain that that will prevent the
+on all nodes.  It was not certain that that will prevent the
 <quote>glitch</quote> from taking place, but it seems likely that it
 will leave enough sl_log_1 data to be able to do something about
 recovering from the condition or at least diagnosing it more exactly.
 And perhaps the problem is that sl_log_1 was being purged too
 aggressively, and this will resolve the issue completely.</para>
 </answer>
+
 <answer><para> Unfortunately, this problem has been observed in
 1.0.5, so this still appears to represent a bug still in
 existence.</para>
@@ -722,7 +706,21 @@
 to break replication down into multiple sets in order to diminish the
 work involved in restarting replication.  If only one set has broken,
 you only unsubscribe/drop and resubscribe the one set.
-</para></answer>
+</para>
+
+<para> In one case we found two lines in the SQL error message in the
+log file that contained <emphasis> identical </emphasis> insertions
+into <envar> sl_log_1 </envar>.  This <emphasis> ought </emphasis> to
+be impossible as is a primary key on <envar>sl_log_1</envar>.  The
+latest punctured theory that comes from <emphasis>that</emphasis> was
+that perhaps this PK index has been corrupted (representing a
+<productname>PostgreSQL</productname> bug), and that perhaps the
+problem might be alleviated by running the query:
+<programlisting>
+# reindex table _slonyschema.sl_log_1;
+</programlisting>
+
+</answer>
 </qandaentry>
 
 <qandaentry>
@@ -788,6 +786,7 @@
 delete the new rows in the child as well.
 </para>
 </answer>
+</qandaentry>
 
 <qandaentry><question><para> What happens with rules and triggers on
 <productname>Slony-I</productname>-replicated tables?</para>
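The two .pgpass hunks in the diff above both concern the same libpq behavior: since slon links to libpq, credentials can be kept out of slon's command line by placing them in ~/.pgpass. A minimal sketch of that practice follows; the host, database, user, and password values are hypothetical, made up for illustration:

```shell
# Append a hypothetical entry to ~/.pgpass; the libpq format is
# hostname:port:database:username:password, one connection per line.
echo 'masterhost:5432:payroll:slony:sekrit' >> "$HOME/.pgpass"

# libpq ignores the file unless it is readable only by its owner,
# so restrict the permissions to 0600.
chmod 600 "$HOME/.pgpass"
```

With such an entry in place, slon (or any libpq client) connecting as that user to that database picks up the password automatically, so nothing secret need appear in `ps` output or in the Slony configuration.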