Announcing E 0.8.4: The Birthday Release
Tyler Close
tyler@waterken.com
Tue, 1 Jun 1999 10:48:27 -0400
> >So maybe E tables aren't tables but binary trees.
>
> [-] They happen to be hash tables.
>
> One can of course have other implementations, but the current implementation
> is just a normal multi-probe hashtable (descending from work by Danfuzz
> and Sidney Markowitz).
Well, I don't know who Danfuzz and Sidney Markowitz are, so I should probably
keep mum, but I won't. In my own experimentation with hashtables for the
Waterken libraries, I found that internal hashing sucks compared to external
hashing. By "sucks", I mean that there were always more collisions. I also had
trouble finding a search method that was guaranteed not to produce collisions
forever. I think the problem stems from the fact that in internal hashing, you
use the array both to locate elements and to store elements, where as in
external hashing, you only use the array to locate elements. With external
hashing you also have the nice property of being able to use the binomial
equation to predict the number of collisions.
> Chip's 'implementsSameMapping' would be no harder to implement and no less
> efficient than other things EMaps already do, like "&" and "|".
>
> >Union, intersection, difference, etc. operations are all better implemented
> >with
> >tree based sets.
>
> [+] I would like to see this.
> My hash-table-based algorithms for these are simple brute force, and I'd like
> to eventually do better.
The code for these algorithms is in the C++ STL. I'd recommend looking at the
SGI implementation, it's much more readable than others. You could also look at
the implementation in my library.
> >This means the whole
> >non-determinism thing goes away since there's only one possible ordering. It
> >also means that:
> >
> > ? ["foo" => 3, "bar" => 4]
> > # value: ["bar" => 4, "foo" => 3]
> >
> > ? ["foo" => 3, "bar" => 4] == ["bar" => 4, "foo" => 3]
> > # value: true
>
> [-] No deterministic sort possible.
> If I understand you, the deterministic order you have in mind is the sort
> order. If all keys were only transitively-PassByCopy objects, this would be
> fine.
Given that both hashtables and binary trees break if you modify the key of an
inserted element and that E already must support transitively immutable objects
(for exceptions), give me a good reason for allowing mutable keys. Better than
that, tell me why mutable keys are even something you'd want. E's answer should
be: "You don't want them AND you can't have them." Mutable keys are nothing but
a source of bugs AND, it seems, security holes.
> However, two non-PassByCopy objects that are otherwise identical are
> still different according to their creation identity.
I thought E's maps were maps and not multimaps.
> Unless we want to
> internally do the bookkeeping to keep track of creation order, there's no
> way to deterministically sort these.
Tree based sorts are stable sorts. This means that the ordering of equal
elements is preserved. This means that in the "one true order" of a binary tree,
more recently inserted elements are placed after previously inserted elements. I
believe this is the same kind of determinism you deem acceptable for E's current
hashtables.
> It took me two weeks to stop looking for determinism in the keys, where it
> can't be found,
If you're talking about determinism in an order which is based on hash value,
then of course the keys don't contain the necessary information. The capacity of
the hash map also plays a role in the order and the keys cannot possibly know
about this.
If you're talking about determinism in an order based on element value, then the
keys necessarily must contain the required determinism. There is no other
information used in constructing the order; therefore, the keys determine the
order.
I feel like there's something fundamental here that I am not understanding. What
am I missing?
> and start looking for it in the means by which a table comes
> to hold those keys. Therefore, in E,
>
> ? ["foo" => 3, "bar" => 4]
> # value: ["foo" => 3, "bar" => 4]
>
> ? ["bar" => 4, "foo" => 3]
> # value: ["bar" => 4, "foo" => 3]
>
> ? ["foo" => 3, "bar" => 4] == ["bar" => 4, "foo" => 3]
> # value: false
>
> To see why this is the only possible solution, take the quotes off of "foo"
> and "bar" and imagine that these two variables hold two
> identical-except-for-identity non-PassByCopy objects.
No, I won't because I think it's a bad idea. See above. Is this the only part of
the non-determinism issue that I am not understanding?
> >The problem here is that E overloads [] to mean both a list and a table.
> >You're not allowed to re-order a list from what was entered by the user.
>
> [-] It's just notation overload, not semantic unity.
> Depending on whether it contains "=>"s, E's [] represent either a list or a
> map.
Looks like I've fallen behind. I thought [] without => was a set. What's a set
now?
> Up through 0.8.3, before I realized the determinism problem, I was
> indeed reordering maps but not lists. Determinism doesn't require that I
> don't reorder, just that I specify a semantically deterministic algorithm for
> calculating the order.
I meant you cannot re-order a list because this changes the meaning of the list,
not because of any determinism argument. (i before e except after c).
> >If:
> > ("markm", 42)
> >was a list/array/tuple and maybe also a hashtable, then using [] for a tree
> >would be ok. I also think it would be neat if:
> > birthday("markm", 42)
> >was a typed tuple and:
> > e_lang birthday("markm", 42)
> >meant "send e_lang the following birthday tuple".
> >
> >But now I've probably gone too far afield.
>
> [-] Even when it's too late for such changes, it's good to raise them.
I disagree and I wish I hadn't. I'll try to keep my responses here brief.
> > ("markm", 42)
> I don't understand the difference between an untyped/unlabeled ordered tuple
> and a list. I also don't understand how a tuple is like a hashtable.
An ordered tuple (map) is not necessarily equal to the list that was used to
initialize it. If the list was not in sorted order, then the ordered tuple (map)
has a different ordering.
Your E 0.8.4 [] is both a tuple and a hashtable. The fact that the ordering of
the hashtable is solely determined by insertion order makes it like a tuple.
>
> > birthday("markm", 42)
> "birthday" can be a function for constructing an object representing a
> birthday, and this expression would then be a function call.
I see and I like this. Ok, maybe it's just , that constructs a list. No brackets
required. (Look ma, no hands!). [] constructs an ordered map from the list and
() does nothing but clarify order of operations. name() is then always a method
invocation. All method invocations take a list as their only parameter.
Receiving a method invocation like:
foo (bar, baz) {
}
assigns a name to each list element and
foo in {
}
assigns a name to the list as a whole.
Minus the receiving a list as a whole bit, does the current E definition allow
me to think of things in this way?
Tyler