[e-lang] Is the SuchThatPattern always a bad idea?

Mark Miller markm at cs.jhu.edu
Mon Aug 8 01:10:53 EDT 2005

Regarding Kevin's old bug "Preserving match failure info",
here's a thought.

If we dispense with getSynEnv(), then there's no reason to keep the guard 
qualifier coupled to a variable definition pattern. Instead, without 
increasing the size of the language, and while being upwards compatible from 
the present language, we could generalize the guard qualifier to appear to the 
right of any pattern.

If '<p1>' is the pattern '<p2> :<e2>', then the meaning is: first evaluate 
<e2> to a guard, coerce the specimen by this guard, and then match the 
coercion result against <p2>. Note that the coercion and the match against 
<p2> both use the optEjector used for <p1> as a whole, so match failure info 
is preserved by this composition.
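
To make the composition concrete, here is a rough sketch in present-day E that 
models a pattern as a function of (specimen, optEjector). The names 'guarded', 
'subMatcher', and 'matcher' are hypothetical, made up for this illustration; 
only the guard's coerce/2 protocol is taken from the description above.

def guarded(subMatcher, guard) :any {
    def matcher(specimen, optEjector) :any {
        # Coercion failure exits through the ejector for <p1> as a whole.
        def coerced := guard.coerce(specimen, optEjector)
        # The match against <p2> is handed the very same ejector.
        return subMatcher(coerced, optEjector)
    }
    return matcher
}

Since neither step substitutes an ejector of its own, whichever step fails 
reports its problem directly to the caller's ejector.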

Given this, I propose the SuchThatPattern, '<p3> ? <e3>' (read "<p3> such that 
<e3> evaluates to true"), is unnecessary, should be deprecated, and eventually 
removed from E and Kernel-E.

I think Dean made a suggestion along these lines, which I had previously 
rejected so that getSynEnv() could provide enough information for auditing -- 
so the contained variable definition patterns would contain the guard 
expressions that I imagined auditors would examine.

To examine this hypothesis, I enumerate below all the SuchThatPatterns 
introduced by expanding E to Kernel-E. These turned out much better than I 
expected. Each expansion is shown in the following format:

     # E pattern
=>  # current expansion to Kernel-E
->  # proposed new expansion to Kernel-E

def ... # supplemental definitions needed by the proposed expansion


Since these definitions would be used by expansions, they would need to be in 
the universal scope and be unshadowable. To avoid polluting the namespace, 
we'd probably prefix those not of general interest with "__", not shown below. 
Since these would be universal and unshadowable, we stand a good chance of 
eventually inlining the uses shown below, in which case we can eventually 
avoid paying for the apparent additional indirection.

For readability, the expansions below show generated temporaries as "t1", 
"t2", ..., rather than actual temporary names like "foo__1". I was pleasantly 
surprised to find that *none* of the proposed new expansions below need to 
generate temporary names. They are also all shorter, and they're able to 
generate much better diagnostic info. It's arguable which expansions are more 
understandable, but the new expansions can each be understood in terms of 
fewer primitive concepts.


     ==3
=>  t1 ? (t1 == 3)
->  _ :Is[3]

def Is {
     to get(standard) :Guard {
         def IsGuard extends __makeGuard(IsGuard) {
             to coerce(specimen, optEjector) :any {
                 if (standard == specimen) {
                     return specimen
                 } else {
                     throw.eject(optEjector,
                                 `$specimen isn't $standard`)
                 }
             }
         }
         return IsGuard
     }
}

The expanded form is sufficiently pleasant that we may want to deprecate the 
"==" pattern sugar as well.
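
As a hedged illustration of the diagnostic, and assuming the Is definition 
above is in scope, the guard can be exercised directly through coerce/2 in 
present-day E. The names 'outcome', 'ej', and 'problem' are for illustration 
only.

def outcome := escape ej {
    Is[3].coerce(4, ej)
    "matched"
} catch problem {
    # problem is whatever Is handed to the ejector, here the text
    # built from `$specimen isn't $standard`
    `no match: $problem`
}

The point of the sketch is only that the reason for the failure travels with 
the ejection, rather than being reduced to a true/false test result.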


     bind foo
=>  t1 ? (foo__Resolver.resolve(t1); true)
->  _ :Bind[foo__Resolver]

def Bind {
     to get(resolver) :Guard {
         def BinderGuard extends __makeGuard(BinderGuard) {
             to coerce(specimen, optEjector) :any {
                 resolver.resolve(specimen)
                 return specimen
             }
         }
         return BinderGuard
     }
}

In this case, there's no change to the behavior at all. If the resolver is 
already resolved, this does and should throw an exception rather than 
reporting a match failure.


     bind foo :int
=>  t1 :int ? (foo__Resolver.resolve(t1); true)
->  _ :Bind[foo__Resolver] :int

It's a bit strange that a multiply-guarded pattern should be read from right 
to left but a multiply-guarded expression should be read from left to right. 
At least the latter is only accepted if it's parenthesized: '(3 :int) :int'
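
Spelled out as a sketch in present-day E, the pattern 
'_ :Bind[foo__Resolver] :int' would behave roughly as follows, assuming the 
Bind definition above. The helper name 'matchBindInt' and its parameter names 
are hypothetical.

def matchBindInt(specimen, optEjector, fooResolver) :void {
    # Rightmost guard first: coerce the specimen to an int.
    def coerced := int.coerce(specimen, optEjector)
    # Then the guard to its left: resolve foo to the coerced value.
    Bind[fooResolver].coerce(coerced, optEjector)
}

This matches the order of the current expansion, where t1 is guarded by int 
before the resolver sees it.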


     ["x" => x] | _
=>  t1 ? (t1.optExtract("x") =~ [x, _])
->  [x, _] :Extract["x"]

def Extract {
     to get(key) :Guard {
         def ExtractorGuard1 extends __makeGuard(ExtractorGuard1) {
             to coerce(specimen, optEjector) :any {
                 def value := specimen.fetch(key, thunk {
                     throw.eject(optEjector, `$key not found`)
                 })
                 return [value, specimen.without(key)]
             }
         }
         return ExtractorGuard1
     }
     to get(key, instead) :Guard {
         def ExtractorGuard2 extends __makeGuard(ExtractorGuard2) {
             to coerce(specimen, _) :any {
                 def value := specimen.fetch(key, thunk {
                     return [instead, specimen]
                 })
                 return [value, specimen.without(key)]
             }
         }
         return ExtractorGuard2
     }
}

The '| <rest-pattern>' on the end of the original pattern means: Match the 
remainder of the map, after the matching associations have been extracted, 
against <rest-pattern>. <rest-pattern> here is '_', which means: Ignore the 
specimen and report a successful match.

I could have made the expansions here simpler by using the existing extract/2 
and optExtract/1 methods on EMaps. Perhaps this is the right thing to do -- it 
would certainly be more upward compatible from the existing system. But as an 
experiment in further concept reduction, I instead moved the logic of the 
default implementations of these methods into the code above.
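
For illustration, and assuming the Extract definition above, the proposed 
pattern '[x, _] :Extract["x"]' amounts to a list pattern applied to the 
guard's coercion result. In present-day E that could be written out by hand 
roughly as below. The map contents and the name 'rest' are made up, and 
passing null as the ejector means a missing key would throw rather than eject.

def specimen := ["x" => 1, "y" => 2]
def [x, rest] := Extract["x"].coerce(specimen, null)
# x is the value found under "x"; rest is the map without that association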


     ["x" => x := 3] | _
=>  t1 ? (t1.extract("x", 3) =~ [x, _])
->  [x, _] :Extract["x", 3]

This means: If there's no association for "x", bind 3 (the default value) to x 
and match <rest-pattern> against the entire collection.


     ["x" => x]
=>  t1 ? (t1.optExtract("x") =~ [x, t2 ? (t2.size() == 0)])
->  [x, _ :Empty] :Extract["x"]

def Empty extends __makeGuard(Empty) {
     to coerce(specimen, optEjector) :any {
         if (specimen.size() == 0) {
             return specimen
         } else {
             throw.eject(optEjector, `$specimen isn't empty`)
         }
     }
}

Without the vertical bar, this means: After the enumerated associations are 
extracted, the remaining collection must be empty.

Introducing a new guard just to test 'specimen.size() == 0' seems like 
overkill. This is the one case where it's tempting to keep the SuchThatPattern.


     foo`$x@y`
=>  t1 :__MatchContext ?
       (def [t2, t3] := t1
        foo__quasiParser.matchMaker("${0}@{0}") \
          .matchBind([x],t2,t3) =~ [y])
->  [y] :MatchBind[foo__quasiParser.matchMaker("${0}@{0}"), [x]]

def MatchBind {
     to get(matchMaker, args) :Guard {
         def MatchBindGuard extends __makeGuard(MatchBindGuard) {
             to coerce(specimen, optEjector) :any[] {
                 return matchMaker.matchBind(args, specimen, optEjector)
             }
         }
         return MatchBindGuard
     }
}

The old expansion was already terribly long in order to provide the matchMaker 
with the right optEjector, so it could report match failure info. However, if 
the extracted patterns (here, 'y') reported match failure, this report would 
still be lost. The new expansion painlessly preserves match failure info from 
both. We could then also deprecate __MatchContext, which was kludgy and hard 
to explain.


Except for better diagnostics, none of the above changes would affect the 
normal E programmer. Regarding the size of E and Kernel-E, on the one hand we 
get to deprecate and eventually remove:

     ?, ==<e>, EMap#extract/2, EMap#optExtract/1, __MatchContext

OTOH, we'd need to add:

     Is, Bind, Extract, Empty, MatchBind

to the unshadowable universal scope, with some of these possibly prefixed 
with "__".

None of the above is adequate to subsume the functionality of the experimental 
trinary-define proposal. But perhaps it takes care of all the motivating cases 
for this proposal, in which case we could drop it as well.


Text by me above is hereby placed in the public domain

