[Slony1-general] sl_status what does it mean?

Tue Nov 18 12:19:51 PST 2008

Geoffrey <lists at serioustechnology.com> writes:
> Can someone tell me what is meant by the following data from sl_status?
>
>  st_origin | st_received | st_last_event |      st_last_event_ts      | st_last_received |    st_last_received_ts    | st_last_received_event_ts  | st_lag_num_events | st_lag_time
> -----------+-------------+---------------+----------------------------+------------------+---------------------------+----------------------------+-------------------+-------------
>          1 |           2 |         73226 | 11/18/2008 09:52:03.588658 |            73226 | 11/18/2008 09:54:45.17624 | 11/18/2008 09:52:03.588658 |                 0 | @ 2.25 secs
> (1 row)
>
> That is, 'st_origin' == 1.  Does 1 indicate a particular status?  Same
> for 'st_received', what does the value 2 tell me?
>
> Is 2.25 secs st_lag_time good?
>
> Thanks for any info.

As per the documentation...

   This view shows the local node's last event sequence number and how
   far all remote nodes have processed events.

- st_origin is the ID of the origin node for events, and will always
  be the node ID of the node where this query is being run.

- st_received is the ID of the subscriber node.

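- st_last_event is the sequence number of the last event generated on
  the origin (the local node); st_last_received is the last of those
  events that the subscriber has confirmed.

- st_last_event_ts is the timestamp at which that last event was
  generated (which will be somewhat in the past).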

You can look in src/backend/slony1_funcs.sql to see more details about
how the data is derived.

If the subscriber is lagging by 0 events (as is the case in the
example), then the subscriber is clearly keeping up well.
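
To make the arithmetic concrete, here is a minimal sketch.  It assumes
that st_lag_num_events is simply st_last_event minus st_last_received,
which is consistent with the example row; the view definition in
slony1_funcs.sql is the authority on that.  The SlStatusRow class is a
hypothetical holder for illustration, not part of Slony-I.

```python
from dataclasses import dataclass


@dataclass
class SlStatusRow:
    """Hypothetical holder for a few sl_status columns."""
    st_origin: int         # node ID where the query was run
    st_received: int       # subscriber node ID
    st_last_event: int     # value of the st_last_event column
    st_last_received: int  # value of the st_last_received column

    def lag_num_events(self) -> int:
        # Assumed relationship: last event minus last received event.
        return self.st_last_event - self.st_last_received


# Values copied from the example output in the question.
row = SlStatusRow(st_origin=1, st_received=2,
                  st_last_event=73226, st_last_received=73226)
print(row.lag_num_events())  # 0, matching st_lag_num_events in the example
```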

st_lag_time indicates how far behind, in time, the subscriber is, and
2.25s is most likely perfectly reasonable.  There are three scenarios
that seem interesting, offhand:

1.  If updates are going in on the origin more or less continually,
then lag time will likely be pretty low, as long as replication is
keeping up.

2.  If the origin sees long periods where there are no updates, lag
time could get pretty high, but you may still be in a position of
having no "lagging events."

That's what I see in your example.

3.  It could be that there's a lot of load, and that replication is
lagging behind.  In *that* case, you'll see the event lag go up
(st_lag_num_events), as well as the lag time (st_lag_time).

That's not what I see in your example.
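
If you want to turn the three scenarios into a monitoring check, a
rough sketch might look like the following.  The function name, the
labels, and both thresholds (max_events, max_time) are assumptions made
up for illustration; sensible values depend entirely on your cluster.

```python
from datetime import timedelta


def classify_lag(lag_events: int, lag_time: timedelta,
                 max_events: int = 10,
                 max_time: timedelta = timedelta(minutes=5)) -> str:
    """Map st_lag_num_events / st_lag_time onto the three scenarios.

    max_events and max_time are hypothetical thresholds, not Slony-I
    defaults.
    """
    if lag_events > max_events:
        # Scenario 3: an event backlog is building up; lag time
        # normally grows along with it.
        return "falling behind"
    if lag_time > max_time:
        # Scenario 2: large time lag but few or no lagging events,
        # e.g. the origin has seen no updates for a while.
        return "idle origin"
    # Scenario 1: event lag and time lag are both small.
    return "keeping up"


print(classify_lag(2, timedelta(seconds=3)))      # keeping up
print(classify_lag(0, timedelta(hours=1)))        # idle origin
print(classify_lag(500, timedelta(minutes=30)))   # falling behind
```

The event count is checked first because it is the more direct sign
that the subscriber is actually behind; a large lag time on its own
can just mean a quiet origin.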
-- 
select 'cbbrowne' || '@' || 'linuxfinances.info';
http://www3.sympatico.ca/cbbrowne/sgml.html
"How should I know if it  works?  That's what beta testers are for.  I
only  coded  it."   (Attributed  to  Linus Torvalds,  somewhere  in  a
posting)