Christopher Browne cbbrowne at ca.afilias.info
Tue Jul 31 07:53:31 PDT 2007
Weslee Bilodeau <weslee.bilodeau at hypermediasystems.com> writes:
> Christopher Browne wrote:
>> Weslee Bilodeau <weslee.bilodeau at hypermediasystems.com> writes:
>>> Small question on this.
>>>
>>> I would imagine one could easily find a lot of edge cases that can break
>>> the current parser.
>>>
>>> I'm guessing they found them in psql, which is why psql stole the lexer
>>> from the backend itself.
>>>
>>> Is there any reason why slony didn't go the same route?
>>>
>>> http://developer.postgresql.org/cvsweb.cgi/pgsql/src/bin/psql/psqlscan.l?rev=1.21;content-type=text%2Fx-cvsweb-markup
>>>
>>> It was written for the exact same task -
>>>
>>> ------------------------------------------------------------
>>> This code is mainly needed to determine where the end of a SQL statement
>>> is: we are looking for semicolons that are not within quotes, comments,
>>> or parentheses.  The most reliable way to handle this is to borrow the
>>> backend's flex lexer rules, lock, stock, and barrel.  The rules below
>>> are (except for a few) the same as the backend's, but their actions are
>>> just ECHO whereas the backend's actions generally do other things.
>>> ------------------------------------------------------------
>>>
>>>
>>> Or has this already been discussed and dismissed?
>> 
>> If memory serves, I initially tried to introduce a second parser, but
>> ran into trouble with the notion of linking in 2 parsers.
>> 
>> I then reviewed syntax, and figured I had enough of it covered, which
>> was apparently not correct.
>> 
>> I'll take another crack at it...
>
> I figure the $_$, $$, etc. edge cases would be another fun one to roll
> into a custom parser.
>
> CREATE FUNCTION test( ) RETURNS text AS $_$ SELECT ';', E'\';\'',
> '"";""', E'"\';' ; SELECT 'OK'::text ; $_$ LANGUAGE SQL ;
>
> SELECT $_$ hello; this ; - is '\" a '''' test $_$ ;
>
> SELECT $$ $ test ; $ ;  $$ ;
>
> All really funky, but perfectly valid.
>
>
> And yeah, rolling two lexers into one, that does have its own challenges.
>
> Maybe expand the first lexer and add an SQL state, so it can parse SQL
> within the first one directly?

Actually, the existing logic *is* smart enough to cope with all of
these cases; here's the output from the parser test program when I
added these in:

-- Some more torturing per Weslee Bilodeau

-- I figure the $_$, $$, etc. edge cases would be another fun one to roll
-- into a custom parser.

CREATE FUNCTION test( ) RETURNS text AS $_$ SELECT ';', E'\';\'',
'"";""', E'"\';' ; SELECT 'OK'::text ; $_$ LANGUAGE SQL ;
statement 19
-------------------------------------------


SELECT $_$ hello; this ; - is '\" a '''' test $_$ ;
statement 20
-------------------------------------------


SELECT $$ $ test ; $ ;  $$ ;
statement 21
-------------------------------------------


-- All really funky, but perfectly valid.

-- Force a query to be at the end...

create table foo;
statement 22
-------------------------------------------

For now, I'm not inclined to change *everything* around to a new
parser, not when the existing one is only about 3 pages of code...
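
To make the splitting rules concrete, here is a minimal sketch in Python.
It is not the Slony parser (which is C) and not psql's flex rules; the
names split_statements, SAMPLE, and the helper functions are hypothetical
and exist only for this illustration. It assumes that backslash escapes
apply only inside E'...' strings, and it does not check whether a "$" is
part of a longer identifier. It looks for semicolons that are outside
quotes, dollar-quoted bodies, comments, and parentheses:

```python
import re

# Opening delimiter of a dollar-quoted string: $$ or $tag$.
DOLLAR_TAG = re.compile(r"\$(?:[A-Za-z_][A-Za-z_0-9]*)?\$")


def _is_escape_string(text, i):
    """True if the quote at text[i] follows a standalone E or e prefix."""
    if i == 0 or text[i - 1] not in "eE":
        return False
    return i == 1 or not (text[i - 2].isalnum() or text[i - 2] == "_")


def _skip_quoted(text, i, quote, backslash_escapes):
    """Return the index just past the quoted run that opens at text[i]."""
    i += 1
    n = len(text)
    while i < n:
        if backslash_escapes and text[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif text.startswith(quote * 2, i):
            i += 2  # a doubled quote is a literal quote character
        elif text[i] == quote:
            return i + 1
        else:
            i += 1
    return n  # unterminated: consume the rest of the input


def _skip_block_comment(text, i):
    """Return the index just past the (possibly nested) comment at text[i]."""
    level = 0
    n = len(text)
    while i < n:
        if text.startswith("/*", i):
            level += 1
            i += 2
        elif text.startswith("*/", i):
            level -= 1
            i += 2
            if level == 0:
                return i
        else:
            i += 1
    return n


def split_statements(script):
    """Split script at semicolons outside quotes, comments and parentheses."""
    statements = []
    start = depth = i = 0
    n = len(script)
    while i < n:
        ch = script[i]
        if script.startswith("--", i):
            newline = script.find("\n", i)
            i = n if newline == -1 else newline + 1
        elif script.startswith("/*", i):
            i = _skip_block_comment(script, i)
        elif ch == "'":
            i = _skip_quoted(script, i, "'", _is_escape_string(script, i))
        elif ch == '"':
            i = _skip_quoted(script, i, '"', False)
        elif ch == "$" and DOLLAR_TAG.match(script, i):
            # Everything up to the matching tag is literal text.
            tag = DOLLAR_TAG.match(script, i).group(0)
            close = script.find(tag, i + len(tag))
            i = n if close == -1 else close + len(tag)
        elif ch == ";" and depth == 0:
            statements.append(script[start:i + 1].strip())
            i += 1
            start = i
        else:
            if ch == "(":
                depth += 1
            elif ch == ")" and depth > 0:
                depth -= 1
            i += 1
    # Keep any trailing text that lacks a terminating semicolon.
    tail = script[start:].strip()
    if tail:
        statements.append(tail)
    return statements


# The torture cases from this thread, plus the closing query.
SAMPLE = r"""
CREATE FUNCTION test( ) RETURNS text AS $_$ SELECT ';', E'\';\'',
'"";""', E'"\';' ; SELECT 'OK'::text ; $_$ LANGUAGE SQL ;
SELECT $_$ hello; this ; - is '\" a '''' test $_$ ;
SELECT $$ $ test ; $ ;  $$ ;
create table foo;
"""
```

Calling split_statements(SAMPLE) yields four statements, one per query
above, since the semicolons inside the $_$ and $$ bodies are skipped as
literal text.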
-- 
output = ("cbbrowne" "@" "acm.org")
http://cbbrowne.com/info/advocacy.html
"End users  are just test loads  for verifying that  the system works,
kind of like resistors in an electrical circuit."
-- Kaz Kylheku in c.o.l.d.s

