Tue Jun 2 13:17:19 PDT 2009
- Previous message: [Slony1-general] How to shut-down slony replication
- Next message: [Slony1-general] How to rename a database
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Jeff Frost wrote:
> Andrew Sullivan wrote:
>> On Tue, Jun 02, 2009 at 10:30:45AM -0500, Sean Staats wrote:
>>
>>> I created a new replication cluster. It turns out that starting the
>>> table IDs at id=1 and the sequence IDs at id=1001 didn't make any
>>> difference, as slony gave me the same error (sequence ID 1001 has
>>> already been assigned). Increasing the log verbosity to 4 doesn't
>>> produce any more useful debugging information. Time for another
>>> approach.
>>>
>>> Would it make sense to create 2 different sets - one to replicate the
>>> tables and one to replicate the sequences? Is there a downside to
>>> this kind of workaround?
>>
>> It'd be better to figure out what the duplication is caused by. Have
>> a look in the _slony tables and check to see what's in there. Where's
>> the collision?
>>
> I've seen this issue recently when the initial sync fails. If you
> scroll further back in your logs, do you have a failure for the initial
> copy_set? When this happens to me, it seems that slony leaves the
> slave DB in a half-replicated state, but reattempts the initial sync,
> finds that the sequences are already in the _cluster.sl_sequence table,
> and then errors out. This requires dropping the node and starting over.
> This is with version 1.2.16. I recall previous versions being able to
> recover from a failed initial sync without intervention, but my memory
> could be mistaken.

In fact, here's how it looks in my logs:

Jun  2 13:09:36 localhost slon[1867]: [274-1] 2009-06-02 13:09:36 PDT ERROR  remoteWorkerThread_1: "select "_engage_cluster".tableHasSerialKey('"archive"."invitation"');"
Jun  2 13:09:36 localhost slon[1867]: [274-2]  could not receive data from server: Connection timed out
Jun  2 13:09:36 localhost slon[1867]: [275-1] 2009-06-02 13:09:36 PDT WARN   remoteWorkerThread_1: data copy for set 1 failed - sleep 30 seconds
Jun  2 13:09:36 localhost postgres[1880]: [26-1] NOTICE:  there is no transaction in progress
Jun  2 13:10:06 localhost slon[1867]: [276-1] 2009-06-02 13:10:06 PDT DEBUG1 copy_set 1
Jun  2 13:10:06 localhost slon[1867]: [277-1] 2009-06-02 13:10:06 PDT DEBUG1 remoteWorkerThread_1: connected to provider DB
Jun  2 13:10:09 localhost slon[1867]: [278-1] 2009-06-02 13:10:09 PDT ERROR  remoteWorkerThread_1: "select "_engage_cluster".setAddSequence_int(1, 4,
Jun  2 13:10:09 localhost slon[1867]: [278-2] '"public"."tracking_sequence"', 'public.tracking_sequence sequence')" PGRES_FATAL_ERROR ERROR:  Slony-I: setAddSequence_int():
Jun  2 13:10:09 localhost slon[1867]: [278-3] sequence ID 4 has already been assigned
Jun  2 13:10:09 localhost slon[1867]: [279-1] 2009-06-02 13:10:09 PDT WARN   remoteWorkerThread_1: data copy for set 1 failed - sleep 60 seconds

The DB in question is 144GB and it's being replicated over a relatively
slow link. It seems to do about 1GB/hr, but never gets past 10GB. It
always dies at that same point.

-- 
Jeff Frost, Owner       <jeff at frostconsultingllc.com>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 916-647-6411     FAX: 916-405-4032
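
To follow up on Andrew's suggestion of looking in the _slony tables for
the collision, a query along these lines will show what the subscriber
already has registered. This is only a sketch: it assumes the cluster
schema is "_engage_cluster" as in the logs above, and the Slony-I 1.2
sl_sequence layout (seq_id, seq_set, seq_reloid, seq_comment).

    -- Run on the node that reports "sequence ID ... has already been assigned".
    -- Lists which relation is registered under each sequence ID, so you can
    -- see what collides with the sequence being added.
    SELECT s.seq_id,
           s.seq_set,
           n.nspname || '.' || c.relname AS sequence_name,
           s.seq_comment
      FROM "_engage_cluster".sl_sequence s
      JOIN pg_class c ON c.oid = s.seq_reloid
      JOIN pg_namespace n ON n.oid = c.relnamespace
     ORDER BY s.seq_id;

If a half-finished copy_set is the cause, the colliding rows should be
leftovers from the earlier attempt rather than duplicates in the set
definition itself.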
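
Dropping the half-replicated node, as Jeff describes, can be done with a
short slonik script. A minimal sketch, assuming node 1 is the provider
and node 2 is the failed subscriber; the conninfo strings below are
placeholders:

    # drop_node.slonik - remove the half-copied subscriber so the initial
    # copy_set can be retried from scratch. DROP NODE also uninstalls the
    # _engage_cluster schema on node 2, so it can be stored and subscribed again.
    cluster name = engage_cluster;

    node 1 admin conninfo = 'host=provider.example.com dbname=archive user=slony';
    node 2 admin conninfo = 'host=subscriber.example.com dbname=archive user=slony';

    drop node (id = 2, event node = 1);

After this, STORE NODE, STORE PATH, and SUBSCRIBE SET can be rerun for
node 2 once the cause of the connection timeout on the slow link has
been sorted out.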