<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Title" content="">
<meta name="Keywords" content="">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"Apple Color Emoji";
        panose-1:0 0 0 0 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:Calibri;
        color:windowtext;}
span.msoIns
        {mso-style-type:export-only;
        mso-style-name:"";
        text-decoration:underline;
        color:teal;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-family:Calibri;color:black">Hello Slony-I community,<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:.5in"><span style="font-family:Calibri;color:black">I’m hoping someone can advise on a strange and serious problem. We performed a Slony service failover yesterday and, for the first time ever, the FAILOVER operation errored out. We recently expanded our cluster to 7 consumers from a single provider. There are no load issues during normal operations. As the error output below shows, though, our node 4 and node 5 consumers never received the events they needed. Here’s where it gets weird: closer inspection shows that the node 2->4 and node 2->5 path data went missing from the service at some point. That seems to be the main issue, yet both node 4 and node 5 continued to find and process node 2 SYNC events for a full week! The logs show this happened despite multiple restarts.<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:.5in"><span style="font-family:Calibri;color:black">How can this happen? If missing path data stymies the failover, wouldn’t it also prevent normal SYNC processing?<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:.5in"><span style="font-family:Calibri;color:black">If a failover is started with incomplete path data, what’s the best resolution? Can the missing path data be re-applied quickly so that the failover can succeed?<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:Calibri;color:black"> Thanks in advance for any insights.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-family:Calibri;color:black"><o:p> </o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">---- failover error ----<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: NOTICE: calling restart node 1<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:55: 2017-06-26 18:33:02<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 2<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 3<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 4<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 5<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 6<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 7<o:p></o:p></p>
<p class="MsoNormal">executing preFailover(1,1) on 8<o:p></o:p></p>
<p class="MsoNormal">NOTICE: executing "_ams_cluster".failedNode2 on node 2<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 8 only on event 5000061654, node 4 only on event 5000061654, node 5 only on event 5000061655, node 3 only on event 5000061662, node 6 only on event 5000061654, node 7 only on event 5000061656<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061657, node 5 only on event 5000061663, node 3 only on event 5000061663, node 6 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663, node 6 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
<p class="MsoNormal">/tmp/ams-tool/ams-slony1-fastfailover-1-FR_80.67.75.105.slk:56: waiting for event (2,5000061664). node 4 only on event 5000061663, node 5 only on event 5000061663<o:p></o:p></p>
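<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">For anyone who wants to summarize output like the above, here is a minimal Python sketch that pulls the lagging nodes out of one "waiting for event" line. It assumes the line format is exactly what slonik printed above; the function name lagging_nodes and both regular expressions are illustrative helpers, not part of Slony-I.<o:p></o:p></p>

```python
import re

# One "node N only on event E" fragment (format assumed from the output above).
LAG_RE = re.compile(r"node (\d+) only on event (\d+)")
# The event being waited on, e.g. "waiting for event (2,5000061664)".
WAIT_RE = re.compile(r"waiting for event \((\d+),(\d+)\)")


def lagging_nodes(line):
    """Return (origin, wanted_event, {node: last_event_seen}), or None
    if the line is not a "waiting for event" line."""
    wait = WAIT_RE.search(line)
    if wait is None:
        return None
    origin, wanted = int(wait.group(1)), int(wait.group(2))
    behind = {int(node): int(event) for node, event in LAG_RE.findall(line)}
    return origin, wanted, behind


# Sample copied from the repeated tail of the failover output.
sample = (
    "waiting for event (2,5000061664). node 4 only on event 5000061663, "
    "node 5 only on event 5000061663"
)

origin, wanted, behind = lagging_nodes(sample)
for node, seen in sorted(behind.items()):
    print(f"node {node} is {wanted - seen} event(s) behind origin {origin}")
```

<p class="MsoNormal">Run against the repeated tail of the output, this reports only nodes 4 and 5, each one event short of (2,5000061664).<o:p></o:p></p>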
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">---- node 4 log archive ----<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">bos-mpt5c:odin-9353 ttignor$ egrep 'disableNode: no_id=2|storePath: pa_server=2 pa_client=4|restart notification' prod4/node4-pathconfig.out <o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:14:00 UTC [5688] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:14:10 UTC [8431] CONFIG storePath: pa_server=2 pa_client=4 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:53:00 UTC [8431] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:53:10 UTC [23701] CONFIG storePath: pa_server=2 pa_client=4 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-16 17:29:13 UTC [10253] CONFIG storePath: pa_server=2 pa_client=4 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-16 20:43:42 UTC [2707] CONFIG storePath: pa_server=2 pa_client=4 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-19 15:11:45 UTC [2707] CONFIG disableNode: no_id=2<o:p></o:p></p>
<p class="MsoNormal">2017-06-19 15:11:45 UTC [2707] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-20 18:40:15 UTC [31224] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-21 14:31:42 UTC [6253] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-21 14:35:26 UTC [32367] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-26 18:21:25 UTC [9278] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-26 18:33:04 UTC [28839] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-26 18:33:30 UTC [1785] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">bos-mpt5c:odin-9353 ttignor$ <o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">---- node 5 log archive ----<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">bos-mpt5c:odin-9353 ttignor$ egrep 'disableNode: no_id=2|storePath: pa_server=2 pa_client=5|restart notification' prod5/node5-pathconfig.out <o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:13:56 UTC [20700] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:14:06 UTC [20374] CONFIG storePath: pa_server=2 pa_client=5 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:53:01 UTC [20374] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-15 15:53:11 UTC [2859] CONFIG storePath: pa_server=2 pa_client=5 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-16 17:28:19 UTC [2859] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-16 17:28:29 UTC [10753] CONFIG storePath: pa_server=2 pa_client=5 pa_conninfo="dbname=ams<o:p></o:p></p>
<p class="MsoNormal">2017-06-19 15:11:40 UTC [10753] CONFIG disableNode: no_id=2<o:p></o:p></p>
<p class="MsoNormal">2017-06-19 15:11:40 UTC [10753] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-20 18:40:11 UTC [450] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-21 14:31:41 UTC [22300] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-21 14:35:28 UTC [26777] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-26 18:21:27 UTC [28366] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-26 18:33:04 UTC [29345] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">2017-06-26 18:33:27 UTC [1299] INFO localListenThread: got restart notification<o:p></o:p></p>
<p class="MsoNormal">bos-mpt5c:odin-9353 ttignor$ <o:p></o:p></p>
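<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">The pattern in both archives can be checked mechanically. This is a rough Python sketch, assuming the grep output is in chronological order and that slon logs a storePath line whenever it loads a path from node 2 (an assumption on my part, not something confirmed from the Slony-I source). The function name path_restored_after_disable is hypothetical, and the sample lines are shortened copies of the node 4 entries above.<o:p></o:p></p>

```python
def path_restored_after_disable(lines):
    """Return False if the most recent disableNode line is not followed
    by any storePath line; True otherwise."""
    restored = True
    for line in lines:
        if "disableNode:" in line:
            restored = False   # node was disabled; path state is now in doubt
        elif "storePath:" in line:
            restored = True    # a path was (re)loaded after that point
    return restored


# Shortened copies of the node 4 log lines (conninfo omitted).
node4_tail = [
    "2017-06-16 20:43:42 UTC [2707] CONFIG storePath: pa_server=2 pa_client=4",
    "2017-06-19 15:11:45 UTC [2707] CONFIG disableNode: no_id=2",
    "2017-06-19 15:11:45 UTC [2707] INFO localListenThread: got restart notification",
    "2017-06-20 18:40:15 UTC [31224] INFO localListenThread: got restart notification",
]

print(path_restored_after_disable(node4_tail))
```

<p class="MsoNormal">For these lines the check returns False: no storePath for node 2 appears after the disableNode entry on 2017-06-19, even though several restarts follow.<o:p></o:p></p>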
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Tom <span style="font-family:'Apple Color Emoji'">☺</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>