Tue Nov 18 14:22:34 PST 2008
- Previous message: [Slony1-general] Do finishTableAfterCopy and ANALYZE need to be serialized with data copy?
- Next message: [Slony1-general] Do finishTableAfterCopy and ANALYZE need to be serialized with data copy?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Richard Yen <dba at richyen.com> writes:
> This might be moot with the coming release of Slony 2.0.0, but I was
> wondering if there are any thoughts about the following question:
>
> Do the finishTableAfterCopy() and ANALYZE of each table need to happen
> in serial with the data copy from stdin? i.e., can we create a new
> thread that will do these two things while slon proceeds to copy the
> data of the next table?
>
> I raise this question because for large data sets, I think the
> copy_set process time could be improved by 30-40% if we can split
> these two stages. I have some large tables that take 30 min or so to
> copy, then another 15-20 min to finishTableAfterCopy() and ANALYZE.
>
> Thought I'd throw this out to get some feedback, before I go and
> mangle code...any thoughts?

I don't think the point is moot; no, indeed, there is considerable
value to this idea.

Jan and I have been bouncing this one around for a while. We took the
idea further in concept (if not the implementation!); the further
thought is to do this two steps cleverer than you describe...

Step 1. Allow as many extra connections as the administrator requests.

  Thus, we have a "number_of_finish_connections" parameter (which
  presumably has a better name than that), and throw the
  finishTableAfterCopy()/ANALYZE requests to a "connection pool."

  I'd expect this to have diminishing returns, and that the useful
  maximum would be around 4.

Step 2. Order the requests so as to maximize parallelism.

  Thus, we subscribe to tables in reverse order of their estimated
  size (pg_class.relpages should be a reasonable approximation).

  This means that we tend to push the bigger tables onto the "reindex
  queue" as early as possible in the subscription process.

Haven't had the Round Tuits to get to it; if you could provide the
beginnings of it, that would make it easier to find the (hopefully
fewer, if effort is shared!) hours of implementation effort.
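To make the two steps above a bit more concrete, here is a rough sketch in Python rather than slon's own C. It is only an illustration of the shape of the idea, not how slon does or would implement it: subscribe(), order_by_size(), copy_table and finish_table are hypothetical names, and the (name, relpages) pairs are assumed to have been read from pg_class.relpages beforehand. The copy stays serial on one connection, while the finish work goes to a pool of workers.

```python
from concurrent.futures import ThreadPoolExecutor


def order_by_size(tables):
    """Return (name, relpages) pairs sorted largest first (Step 2)."""
    return sorted(tables, key=lambda t: t[1], reverse=True)


def subscribe(tables, copy_table, finish_table, number_of_finish_connections=4):
    """Copy tables one at a time; hand the finish work to a pool (Step 1).

    copy_table and finish_table are hypothetical callables standing in
    for the COPY step and the finishTableAfterCopy()/ANALYZE step.
    Each pool worker stands in for one extra database connection.
    """
    with ThreadPoolExecutor(max_workers=number_of_finish_connections) as pool:
        futures = []
        for name, _relpages in order_by_size(tables):
            # The data copy itself remains serial.
            copy_table(name)
            # Queue the finish work and move straight on to the next copy.
            futures.append(pool.submit(finish_table, name))
        # Wait for all finish work; results come back in submission order.
        return [f.result() for f in futures]
```

Because the biggest tables are copied first, their finish work is queued first and has the whole remainder of the subscription to run alongside the later copies.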
-- 
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://linuxdatabases.info/info/internet.html
As of next Monday, MACLISP will no longer support list structure.
Please downgrade your programs.