Jan Wieck JanWieck
Wed Jun 7 14:08:18 PDT 2006
On 6/7/2006 2:48 PM, Christopher Browne wrote:
> Andreas Pflug wrote:
>> Christopher Browne wrote:
>>
>>>
>>> I don't see there being a material benefit to be found in switching over
>>> to a BINARY format. 
>>
>> This might be true if the binary data is only small; in case of large
>> data and small indexes (something like 100GB TOAST, 20MB index size)
>> this is certainly not the case.
> 
>>
>>  Certainly not when it would lead to a loss of
>>> portability.
>>
>> I agree, though for my use-case (only i86) it is acceptable, I used a
>> hard-coded slon version. I'd prefer a better COPY format anyway.
>>
>> Regards,
>> Andreas
> 
> On an otherwise-idle system, I took that same table and dumped/loaded in
> both formats.
> 
> org=# select now(); copy trans_log to '/opt/rg/data_org/etl-text';
> select now(); copy trans_log to '/opt/rg/data_org/etl-binary' with
> binary; select now();
>              now            
> -----------------------------
>  2006-06-07 17:17:02.0727+00
> (1 row)
> 
> COPY
>              now             
> ------------------------------
>  2006-06-07 17:32:40.91574+00
> (1 row)
> 
> COPY
>               now             
> -------------------------------
>  2006-06-07 17:37:34.133093+00
> (1 row)
> 
> org=# select now(); copy etl2 from '/opt/rg/data_org/etl-text'; select
> now(); copy etl2 from '/opt/rg/data_org/etl-binary' with binary; select
> now();
>               now             
> -------------------------------
>  2006-06-07 18:15:10.604745+00
> (1 row)
> 
> COPY
>              now             
> ------------------------------
>  2006-06-07 18:23:07.64762+00
> (1 row)
> 
> COPY
>               now             
> -------------------------------
>  2006-06-07 18:30:21.711226+00
> (1 row)
> 
> 
> pgorg at TOR-570-DB9011:/opt/home/pgorg $ ls -l /opt/rg/data_org/etl*
> -rw-r--r--   1 pgorg    postgres 2840059926 Jun  7 17:37
> /opt/rg/data_org/etl-binary
> -rw-r--r--   1 pgorg    postgres 2482854457 Jun  7 17:32
> /opt/rg/data_org/etl-text
> 
> Note that the binary form is about 14% larger than the text form; that
> would add to network costs...
> 
> Summarizing the timings observed:
> 
>             Text           Binary
>           --------------------------
> Dump time   15:38          04:53
> Load time   07:57          07:14
> 
> It is interesting to note that the binary dump was over 3x faster than
> the text dump.  Mind you, since this is, in Slony-I, streaming from
> source to destination, that savings is entirely illusory.  The actual
> savings would be more likely to be the 9.9% savings in loading time.  To
> that we must add indexing time; in the case of this particular table, we
> know it takes about 44 minutes, which means that the savings to be had,
> here, is 43 seconds over a cost of ~52 minutes, which amounts to a
> savings of about 1.3%.
> 
> I don't have anything handy that makes interesting use of TOAST tables. 
> 
> I'd want to see a benchmark showing a *compelling* benefit before
> considering it.  The above certainly isn't compelling.

Which means that the easiest way to save some significant time would be 
to copy the data of all tables, rollback the transaction on the provider 
immediately once all data has been copied, then reindex all the tables 
in that set.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck at Yahoo.com #



More information about the Slony1-general mailing list