Quasi-Literal Expressions and Patterns in E

mzukowski@bco.com
Tue, 20 Apr 1999 14:44:22 -0700


The cool thing about Ka-Ping's Python example is that he builds regular
expressions by combining objects.  The label.*() method calls introduce
labels into the regular expression, which capture the matched data like your
@ids.  Addition concatenates regular expressions.  Multiplication repeats
them.  Some symbols like 'digit' are predefined, as are functions like
maybe() and anybut().  In this example he's parsing the output from 'ls'.  I
find this style more readable than even your heavily annotated updoc example.


At 02:19 PM 4/16/99, Ka-Ping Yee wrote:

flag = member(letters, '-') 
listing = (begline +
        label.mode(
            flag +                            # file type
            flag*3 + flag*3 + flag*3) +       # owner, group, world perms
        label.data(
            somespace + anything +            # links, owner, grp, sz, date
            somespace +
            digit*2 + maybe(':') + digit*2 +  # year or hh:mm
            somespace) +
        label.file(
            anybut('->')) +                   # anything before symlink
        maybe(label.link(
            somespace + '->' + anything)) +   # possible symlink
        endline)

into this:

print listing 
<Pattern ^\\(<mode>[A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-][A-Za-z-]\\)\\(<data>[ \011\015\012\014]+.*[ \011\015\012\014]+[0-9][0-9]\\(:\\)?[0-9][0-9][ \011\015\012\014]+\\)\\(<file>\\([^-]|-[^>]\\)*\\)\\(\\(<link>[ \011\015\012\014]+->.*\\)\\)?$>
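
To make the "combining objects" idea concrete, here is a minimal sketch of how such a builder could work on top of Python's standard re module. It is not Ka-Ping's actual implementation: the Pattern class, the as_pattern() helper, and the way label, maybe, digit, and somespace are defined here are all assumptions made for illustration, and the output uses Python's (?P<name>...) named groups rather than the syntax printed above.

```python
import re


class Pattern:
    """Wraps a regex source string; operators build larger patterns."""

    def __init__(self, source):
        self.source = source

    def __add__(self, other):
        # Addition concatenates two patterns (plain strings are escaped).
        return Pattern(self.source + as_pattern(other).source)

    def __radd__(self, other):
        return Pattern(as_pattern(other).source + self.source)

    def __mul__(self, count):
        # Multiplication repeats the pattern a fixed number of times.
        return Pattern(f"(?:{self.source}){{{count}}}")

    def compile(self):
        return re.compile(self.source)


def as_pattern(value):
    """Treat plain strings as literal text (hypothetical helper)."""
    return value if isinstance(value, Pattern) else Pattern(re.escape(value))


def maybe(value):
    """Make a pattern optional."""
    return Pattern(f"(?:{as_pattern(value).source})?")


class _Label:
    """label.NAME(pattern) wraps the pattern in a named group."""

    def __getattr__(self, name):
        return lambda value: Pattern(f"(?P<{name}>{as_pattern(value).source})")


label = _Label()
digit = Pattern(r"[0-9]")
somespace = Pattern(r"[ \t]+")

# The "hh:mm" fragment from the listing, with each half labeled.
clock = label.hour(digit * 2) + maybe(":") + label.minute(digit * 2)
match = clock.compile().fullmatch("14:44")
print(match.group("hour"), match.group("minute"))
```

The labels come back as named groups on the match object, which plays roughly the same role as the @ids bindings.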


>On Thu, 15 Apr 1999, Mark S. Miller wrote:
>> If I'd tried to write this without regex's, it might 
>> be locally more readable, but would also be a lot more code -- an 
>> interesting tradeoff. How does it compare with regex use in Perl or 
>> Python?
>
>It looks pretty much par for the course, except for the cryptic
>(?ms) option in front and the unfamiliarity of the @bindings.

The (?m) flag enables multi-line mode, so that ^ and $ match at the start
and end of each line rather than only at the ends of the whole input. The
(?s) flag allows newlines to be treated as a normal character, so that, for
example, . will match it. The (?x) flag, as mentioned in a previous message,
allows embedded whitespace and comments. If I weren't worried about the
installed base of expectations, I'd turn all of these on by default. I'm
puzzled why anyone would want them off.
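
A small sketch of what each flag changes, written against Python's re module, which follows the same Perl-style flag conventions. That E's regex engine behaves identically is an assumption here, and the sample text and variable names are made up for illustration.

```python
import re

text = "first line\nsecond line"

# Without (?m), ^ matches only at the very start of the input.
plain_starts = re.findall(r"^\w+", text)
# With (?m), ^ also matches immediately after each newline.
multiline_starts = re.findall(r"(?m)^\w+", text)

# Without (?s), . refuses to match the newline between the two lines.
dot_plain = re.search(r"line.second", text)
# With (?s), . matches any character, including the newline.
dot_all = re.search(r"(?s)line.second", text)

# With (?x), whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r"""(?x)
    (\w+)   # first word
    \s+     # separating whitespace
    (\w+)   # second word
""")
words = verbose.match(text).groups()

print(plain_starts, multiline_starts)
print(dot_plain is None, dot_all is not None)
print(words)
```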

>It's surprising how much it causes E to look like Perl. I guess
>even despite all the punctuation that Perl uses everywhere else,
>a sufficiently large fraction is in regular expressions to cause
>E-with-regexes to have a similar look.

Horrible though it may seem, given my goals, this is a good thing. Maximize
contagion! Minimize incubation period! 

>> Here's a readability test: By reading this code, can you easily tell what
>> syntax it parses?
>
>Not in a brief glance. I wanted to study it a little more before
>replying, but i thought i ought to point you at something i did a
>little while ago to make regular expressions more Python-like
>(i.e. readable).

How do you feel about the overly commented version? 

I didn't understand your Python example. 

        Cheers,
        --MarkM