Mon Feb 28 22:49:15 PST 2005
- Previous message: [Slony1-general] Nagios plugin to check slony replication
- Next message: [Slony1-general] Some basic newbie-like questions - long
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
I'm trying to get a handle on how this stuff all works so I can explain it to some team members. I know that it does work, but it still looks like magic to me :-) I apologize in advance for the length of this message, but I would greatly appreciate any help. The best answer(s) would point me to some already existing documentation. Thanks in advance. Here's the scenario: Let's say that I have a small chain of auto-part stores. One in Homeville, one in Nearville, and one in Farville (three imaginary cities). Each store keeps an inventory table that keeps track of the parts in that store. The Homeville store keeps its inventory table in a database named homeville, the Nearville store keeps its inventory in a database named Nearville, and the Farville store keeps its inventory in a database named Farville. I want the Homeville store to replicate its homeville.inventory table to Nearville and Farville. Likewise, Nearville and Farville each replicate their databases to the other two. That way, any store can find (reasonably) current stock levels for all three stores. Question 1 - Is this do-able? Can a single node be both a master and a slave (Homeville would be a master for homeville.inventory and would subscribe to nearville.inventory and farville.inventory)? Assuming the answer to question 1 is yes, here's how I think I should set up my replication cluster (I'll call it 'stores'). First, I need a single cluster with three nodes (homeville, nearville, and farville). The Homeville node creates a single SET(set number 1) that exposes homeville.inventory to anyone who's interested. The Nearville node creates another SET (set number 2) that exposes nearville.inventory. The Farville node creates a third SET (set number 3) that exposes farville.inventory. Homeville subscribes to sets 2 and 3, nearville subscribes to sets 1 and 3, and farville subscribes to sets 1 and 2. Question 2 - Did I get that right? Can I do all of this in a single replication cluster or do I need three separate clusters? Question 3 - Since Farville is a long way away from Homeville, I want Farville to pull it's copy of homeville.inventory from Nearville instead of directly from Homeville. That's a cascading (or chained) slave, right? How do I do that? Now the meaty stuff... I've been browsing through the Slony source trying to understand how the magic works. Here's my understanding so far: When I define set number 1 (homeville.inventory), slonik adds an AFTER UPDATE/INSERT/DELETE trigger to my homeville.inventory table. When I modify a row in homeville.inventory, that trigger (_Slony_I_logTrigger) records my changes in _stores.sl_log_1. When the trigger writes to sl_log_1, it also increments the sl_action_seq sequence. Question 4 - That's it - the trigger doesn't NOTIFY anybody and it doesn't push any data to the slaves, is that correct? The slon process that services node homeville (I'll call that slon.homeville) wakes up every 10 seconds and checks sl_action_seq to see if any modifications have been made to tables exposed by node homeville. If no modifications have been made, slon.homeville goes back to sleep. Except, every sixty seconds, slon.homeville throws out a SYNC event even if there isn't any work to do. Question 5 - Why does slon.homeville sleep for a pre-determined (but adjustable) amount of time? Why doesn't it LISTEN for a NOTIFY thrown by _Slony_I_logTrigger instead? My guess is that slon.homeville would get flooded by notifications (and the postgres processes would waste a lot of time sending the notifications), is that right? If slon.homeville wakes up and finds that sl_action_seq has been changed (or that 60 seconds have elapsed), it creates a SYNC event. From what I can see, slon.homeville writes a row to the _stores.sl_event table and then does a NOTIFY _storesEvent. Here's where I start to get a little confused. I know that slon.nearville has a connection to the homeville node (while slon.nearville is running, I can see a backend process running on homeville that's connected to slon.nearville). I also know that the node_1_listen thread in slon.nearville is LISTENing for a notification on homeville._storesEvent (I can see that in homeville.pg_listenters). So, when homeville does a NOTIFY _storesEvent, slon.nearville gets the notification. When slon.nearville receives the SYNC event, it reads through the sl_log_1 table on homeville, converts each entry into an UPDATE, INSERT, or DELETE statement, and executes that statement, effectively replicating the modifications from homeville to nearville. Question 6 - Why does slon.homeville have to write a SYNC event to sl_event and then do a NOTIFY? Why isn't the NOTIFY sufficient to tell nearville that it needs to read through sl_log_1? I guess it's because the _storesEvent notification can signal replication data or it can signal an administrative message - the sl_event record tells slon.nearville which event actually occurred. Right so far? Question 7 - If slon.nearville isn't running, it obviously misses the NOTIFY that slon.homeville signalled. Is that what the 60-second timeout thing is all about? When slon.nearville finally gets up and running, it gets a SYNC notification so that it can catch up. Question 8 - How does homeville know that it can start cleaning out sl_log_1? homeville can't remove a modification record from sl_log_1 until it knows that all subscribers have pulled that modification. Is that what confirmations are all about? Does homeville batch together modification records (maybe each batch corresponds to a SYNC event?) and keep track of which subscribers have pulled each batch? Question 9 - How does homeville know that it can remove an entry from sl_event? Thanks again ++ Murphy Question 10 - (sorry, one more bonus question). Instead of keeping three databases on each system, can I keep a single database and three schemas? -------------- next part -------------- An HTML attachment was scrubbed... URL: http://gborg.postgresql.org/pipermail/slony1-general/attachments/20050228/7ef3f173/attachment.html
- Previous message: [Slony1-general] Nagios plugin to check slony replication
- Next message: [Slony1-general] Some basic newbie-like questions - long
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Slony1-general mailing list