Thu Mar 18 22:19:58 PDT 2010
- Previous message: [Slony1-general] Slony replication problem - logswitch failure
- Next message: [Slony1-general] Slony Install Issues - undefined reference to `pg_qsort'
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi Brad,

Thanks for the quick reply. Please see inline.

On Thu, Mar 18, 2010 at 5:58 PM, Brad Nicholson <bnichols at ca.afilias.info> wrote:
> Venkatraju wrote:
>> Hi,
>>
>> I have a two-node Slony cluster running version 1.2.14 on PostgreSQL
>> 8.1.11 on CentOS 5. I noticed that the subscriber had not been
>> replicating for almost 6 hours (from sl_status). The Slony logs on
>> the publisher and subscriber contain the following messages:
>>
>> NOTICE:  Slony-I: log switch to sl_log_2 still in progress - sl_log_1
>> not truncated
>> WARN   cleanupThread: "select "_cluster".logswitch_weekly(); " -
>> ERROR:  Previous logswitch still in progress
>> CONTEXT:  SQL statement "SELECT "_cluster".logswitch_start()"
>>
>> Both sl_log_1 and sl_log_2 contain a large number of rows
>> (~500,000 rows in sl_log_1 and ~8 million in sl_log_2). Some
>> questions:
>>
>> 1) We found some long-running transactions active on the publisher.
>> Could this cause the logswitch to fail? The long-running transactions
>> were against a database that is not used in replication - could those
>> queries hold up the logswitch?
>
> The long running transaction could block the log switch. The truncate
> of sl_log_1 and switch to sl_log_2 is waiting.
>
> Check pg_locks for ungranted locks - I imagine you will find that the
> backend doing the log switch is waiting on a lock.

I did not find any entries in pg_locks with granted = false. Is there
something else I should be looking for?

Also, would a query running on database2 affect Slony replicating
database1? I am trying to understand this better, to see whether I
should look for some other root cause for the logswitch failure (other
than the long-running queries). I know that a COPY_SET is blocked by
transactions on other databases; does the same hold for cleanup and/or
logswitch as well?

> I would kill the long running transactions. Slony does not function
> well on systems that have long running transactions against the
> provider.
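[For reference, a sketch of the queries being discussed. The pg_locks join and column names (`procpid`, `current_query`) match the 8.1-era catalogs. Note that transaction IDs are shared across all databases in a single PostgreSQL instance, which is why a long-running transaction in an unrelated database can still hold back Slony's log switch.]

```sql
-- Ungranted locks, joined to the query each waiting backend is running:
SELECT l.pid, l.locktype, l.mode, l.granted, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted;

-- Even with no ungranted locks, old transactions anywhere in the
-- instance matter, because xids are instance-wide. List the oldest
-- active queries across all databases:
SELECT procpid, datname, query_start, current_query
FROM pg_stat_activity
ORDER BY query_start
LIMIT 10;
```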
> Run a vacuum on pg_listener after killing them as well.
>
>> 2) The Slony maintenance page
>> (http://www.slony.info/documentation/maintenance.html) says:
>>
>> "That means that on a regular basis, these tables are completely
>> cleared out, so that you will not suffer from them having grown to
>> some significant size, due to heavy load, after which they are
>> incapable of shrinking back down."
>>
>> "incapable of shrinking back down"? Does this mean there is no way to
>> recover from this state without rebuilding the DBs?
>
> You don't have to worry about that for the sl_log_1/2 tables; truncate
> gets rid of the dead space altogether. Other tables in your system,
> however, may be bloated.
>
> You don't need to rebuild your database to recover from this, but you
> do need to run some disruptive operations: VACUUM FULL + REINDEX
> (really slow), CLUSTER (not MVCC-safe in 8.1 - be 100% sure there are
> no old transactions accessing your tables before you start if you
> don't want to lose data), or a no-op ALTER TABLE. All of these
> operations will block access to your table while running.
>
> Depending on the size of your DB and how badly it is bloated,
> reloading may be quicker.
>
> --
> Brad Nicholson  416-673-4106
> Database Administrator, Afilias Canada Corp.
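[The three recovery options mentioned above, sketched against a placeholder table `mytable` (the table, index, and column names are illustrative, not from the thread). All three take exclusive locks for their full duration:]

```sql
-- Option 1: reclaim dead space and rebuild indexes (slow, blocks the table)
VACUUM FULL mytable;
REINDEX TABLE mytable;

-- Option 2: CLUSTER rewrites the table in index order (8.1 syntax).
-- Not MVCC-safe on 8.1: be certain no old transactions can still read
-- the table, or they may see incorrect results.
CLUSTER mytable_pkey ON mytable;

-- Option 3: a "no-op" ALTER TABLE that forces a full rewrite,
-- e.g. altering a column to the type it already has:
ALTER TABLE mytable ALTER COLUMN some_col TYPE text;
```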