More Perl regex stuff in E

Mark S. Miller markm@caplet.com
Thu, 15 Apr 1999 23:21:39 -0700


I don't know whether to laugh or cry.  Before I started *using* Perl regular 
expressions in E, I felt my code was readable.  The following code doesn't 
look readable to me, but is this just because I'm not used to reading this 
style?  In any case, it's small, was easy to write, worked the first time, 
and does something a bit complicated: parsing updoc scripts.  It 
successfully parsed the output of html2txt() applied to some of my 
documentation pages.  If I'd tried to write this without regexes, it might 
be locally more readable, but would also be a lot more code -- an 
interesting tradeoff.  How does it compare with regex use in Perl or 
Python?

Here's a readability test: By reading this code, can you easily tell what 
syntax it parses?

(If your mailer wrapped the lines, please unwrap them before reading)


define parseUpdoc(script) {
    define result := [] diverge
    while (script =~ rx`(?ms)(@comment.*?)\n?[ \t]*\?[ \t]?(@code.*?)\n(@rest.*)`) {
        script := rest
        while (script =~ rx`(?ms)[ \t]*>[ \t]?(@moreCode.*?)\n(@rest.*)`) {
            code += "\n" + moreCode
            script := rest
        }
        if (script =~ rx`(?ms)[ \t]*#[ \t]+?(@label\w*):[ \t]*(@output.*?)[ \t]*\n(@rest.*)`) {
            script := rest
            while (script =~ rx`(?ms)[ \t]*#[ \t]*(@moreOutput.*?)[ \t]*\n(@rest.*)`) {
                output += "\n" + moreOutput
                script := rest
            }
            result push(CarrotMaker(comment, code, label, output))
        } else {
            result push(CarrotMaker(comment, code, null, null))
        }
    }
    result snapshot
}
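
For the Python side of the comparison, here is a rough sketch of the same 
parser using Python's re module.  It is only an approximation under some 
assumptions: that E's =~ has to match the whole string (hence fullmatch), 
that Python 3.8 or later is available (for :=), and that a plain record is 
a fair substitute for CarrotMaker, whose definition isn't shown above.  The 
names Carrot and parse_updoc are made up for this sketch.

```python
import re
from typing import NamedTuple, Optional


class Carrot(NamedTuple):
    """Hypothetical stand-in for CarrotMaker(comment, code, label, output)."""
    comment: str
    code: str
    label: Optional[str]
    output: Optional[str]


# Same flags as the (?ms) prefix in the E patterns.
FLAGS = re.M | re.S

# "? code" line, preceded by any amount of comment text.
FIRST_CODE = re.compile(
    r"(?P<comment>.*?)\n?[ \t]*\?[ \t]?(?P<code>.*?)\n(?P<rest>.*)", FLAGS)
# "> more code" continuation line.
MORE_CODE = re.compile(
    r"[ \t]*>[ \t]?(?P<more>.*?)\n(?P<rest>.*)", FLAGS)
# "# label: output" line.
FIRST_OUTPUT = re.compile(
    r"[ \t]*#[ \t]+?(?P<label>\w*):[ \t]*(?P<output>.*?)[ \t]*\n(?P<rest>.*)",
    FLAGS)
# "# more output" continuation line.
MORE_OUTPUT = re.compile(
    r"[ \t]*#[ \t]*(?P<more>.*?)[ \t]*\n(?P<rest>.*)", FLAGS)


def parse_updoc(script: str) -> list:
    result = []
    # Assumption: fullmatch mirrors E's =~, which is taken to match the
    # entire specimen rather than search within it.
    while (m := FIRST_CODE.fullmatch(script)):
        comment, code, script = m.group("comment", "code", "rest")
        while (m := MORE_CODE.fullmatch(script)):
            code += "\n" + m["more"]
            script = m["rest"]
        label = output = None
        if (m := FIRST_OUTPUT.fullmatch(script)):
            label, output, script = m.group("label", "output", "rest")
            while (m := MORE_OUTPUT.fullmatch(script)):
                output += "\n" + m["more"]
                script = m["rest"]
        result.append(Carrot(comment, code, label, output))
    return result
```

Named groups take the place of the @name bindings, so each match has to be 
unpacked by hand; that accounts for most of the extra lines.  The patterns 
themselves are otherwise unchanged.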