Richard Yen dba at richyen.com
Wed May 11 13:29:38 PDT 2011
Thanks Steve.  I've put the dumps for the master and my two slaves at
http://richyen.com/slony/

Thanks again,
--Richard



On Wed, May 11, 2011 at 1:21 PM, Steve Singer <ssinger at ca.afilias.info>wrote:

> On 11-05-11 04:18 PM, Richard Yen wrote:
> > Thanks Jan,
> >
> > I'll go ahead and do as recommended.  I was actually hoping that maybe
> > we could scour the sl_confirm and/or sl_event tables for clues, but if
> > you think it's not possible to find out what happened, I guess I'll just
> > delete the row and move on.
> >
> > Thanks again,
> > --Richard
>
> If you can get a dump of sl_event and sl_confirm of both the master and
> the slave and send it to one of us we can take a look.
>
> If you happen to have slon logs of the period it can't hurt to send
> those as well.
>
>
> >
> >
> >
> > On Wed, May 11, 2011 at 1:12 PM, Jan Wieck <JanWieck at yahoo.com
> > <mailto:JanWieck at yahoo.com>> wrote:
> >
> >     On 5/11/2011 2:56 PM, Richard Yen wrote:
> >
> >         Hello,
> >
> >         running slon version 2.0.6 here...
> >
> >
> >     This is the first time I've heard about an operation missing. I
> >     remember in some old version of PostgreSQL (8.1 I think) that we
> >     experienced duplicate sl_log rows due to index corruption. I was
> >     actually able to get a result set with a duplicate ctid. But never
> >     was a row missing so far.
> >
> >     To fix your replica(s), you should be able to manually DELETE the
> >     offending row using psql and doing
> >
> >         set session_replication_role to "replica";
> >         delete from cron_lock where ...
> >
> >     You need to be a superuser to do so. After that SET statement, the
> >     psql prompt will bypass the deny_access triggers and the DELETE
> >     statement will behave exactly as if it was coming from slon.
> >
> >     I assume that by now, a log switch has probably destroyed all traces
> >     that could be used to debug the problem further.
> >
> >
> >     Jan
> >
> >
> >
> >         Would anyone be able to help me track down why slony missed a
> >         DELETE?
> >           It seems that my replication is broken as the subscribers are
> >         trying
> >         to process an INSERT, but a primary key is being violated.  The
> >         origin
> >         machine does not have the offending tuple, which leads me to
> believe
> >         that a DELETE was processed, but wasn't propagated to the
> >         subscribers
> >
> >           From my origin machine:
> >         my_db=# select * from cron_lock;
> >                  id         |      lock_until_time
> >         -------------------+----------------------------
> >           anonymous_marking | 2011-05-11 11:40:02.456091
> >         (1 row)
> >
> >           From my subscriber machines:
> >         my_db=# select * from cron_lock ;
> >                  id         |      lock_until_time
> >         -------------------+----------------------------
> >           anonymous_marking | 2011-05-11 10:40:02.123721
> >         (1 row)
> >
> >           From the origin's sl_log_* tables:
> >         my_db=# select * from _sac_uk.sl_log_2 where log_tableid =190;
> >           log_origin |  log_txid  | log_tableid | log_actionseq |
> >         log_cmdtype |
> >                                                      log_cmddata
> >
> ------------+------------+-------------+---------------+-------------+-----------------------------------------------------------------------------------------------------
> >                    1 | 1369072247 |         190 |     239698918 | I
> >                |
> >         ("id","lock_until_time") values ('anonymous_marking','2011-05-11
> >         11:40:02.456091')
> >                    1 | 1369182578 |         190 |     239728797 | I
> >                |
> >         ("id","lock_until_time") values
> >         ('process_past_due_pm_assignments.pl
> >         <http://process_past_due_pm_assignments.pl>
> >         <http://process_past_due_pm_assignments.pl>','2011-05-11
> >         11:00:23.944101')
> >
> >                    1 | 1369182587 |         190 |     239728806 | D
> >                |
> >         "id"='process_past_due_pm_assignments.pl
> >         <http://process_past_due_pm_assignments.pl>
> >         <http://process_past_due_pm_assignments.pl>'
> >
> >                    1 | 1369182626 |         190 |     239728830 | I
> >                |
> >         ("id","lock_until_time") values
> >         ('process_past_due_pm_assignments.pl
> >         <http://process_past_due_pm_assignments.pl>
> >         <http://process_past_due_pm_assignments.pl>','2011-05-11
> >
> >         11:00:24.525818')          1 | 1369182671 |         190 |
> >         239728833
> >         | D           | "id"='process_past_due_pm_assignments.pl
> >         <http://process_past_due_pm_assignments.pl>
> >         <http://process_past_due_pm_assignments.pl>'
> >
> >         (5 rows)
> >
> >         my_db=# select * from _sac_uk.sl_log_1 where log_tableid =190;
> >           log_origin |  log_txid  | log_tableid | log_actionseq |
> >         log_cmdtype |
> >                                             log_cmddata
> >
> ------------+------------+-------------+---------------+-------------+------------------------------------------------------------------------------------
> >                   1 | 1369393174 |         190 |     239789790 | D
> >              |
> >         "id"='anonymous_marking'
> >                    1 | 1369403276 |         190 |     239793047 | I
> >                |
> >         ("id","lock_until_time") values ('anonymous_marking','2011-05-11
> >         12:40:02.970433')(2 rows)
> >
> >         On the subscriber logs:May 11 11:54:40 uk-sdb2 postgres[30851]:
> >         [926-1]
> >         2011-05-11 11:54:40.755 PDT [user=slony,db=my_db
> 10.1.0.149(47318)
> >         PID:30851 XID:1509911291]ERROR:  duplicate key value violates
> unique
> >         constraint "cron_lock_pkey"
> >         May 11 11:54:40 uk-sdb2 postgres[30851]: [926-2] 2011-05-11
> >         11:54:40.755
> >         PDT [user=slony,db=my_db 10.1.0.149(47318) PID:30851
> >         XID:1509911291]STATEMENT:  update only "public"."m_user" set
> >         "last_login"='2011-05-11 10:40:02.429308' where "id"='2459339';
> >
> >         Any help would be much appreciated.
> >         --Richard
> >
> >
> >
> >         _______________________________________________
> >         Slony1-general mailing list
> >         Slony1-general at lists.slony.info
> >         <mailto:Slony1-general at lists.slony.info>
> >         http://lists.slony.info/mailman/listinfo/slony1-general
> >
> >
> >
> >     --
> >     Anyone who trades liberty for security deserves neither
> >     liberty nor security. -- Benjamin Franklin
> >
> >
> >
> >
> > _______________________________________________
> > Slony1-general mailing list
> > Slony1-general at lists.slony.info
> > http://lists.slony.info/mailman/listinfo/slony1-general
>
> _______________________________________________
> Slony1-general mailing list
> Slony1-general at lists.slony.info
> http://lists.slony.info/mailman/listinfo/slony1-general
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.slony.info/pipermail/slony1-general/attachments/20110511/d3e3d0f6/attachment.htm 


More information about the Slony1-general mailing list