[e-lang] Prize and Milestone as mutable state idioms
David Wagner
daw at cs.berkeley.edu
Wed Jul 2 14:13:02 CDT 2008
Tyler Close writes:
>1. Do the Prize and Milestone classes make the JSON serialization code
>more reviewable?
>2. Do you suspect these classes would make other algorithms more reviewable?
>3. Is the ref_send library the right place to host a collection of
>these utility classes for capability programming idioms?
Yes, good idea!
I think they make the example code somewhat more reviewable, although
I think this particular example was small enough and self-contained
enough that they were probably not essential. Lacking these encapsulated
classes I'm sure I still could have worked out the invariants with a bit
more work and checking (particularly if the invariants were documented).
e.g., the following might have sufficed
// marked after output method is done. once true, stays true forever
private boolean written;
// 1st output method takes ownership and sets this field to null.
// once null, stays null forever
private Writer output;
though I would have then had to do additional checks to verify that
the claimed invariants were indeed respected. So I think these classes
did help, and might help even more in other cases.
To put some perspective on it, I spent more time worrying about quoting
issues and funky interleaving of method calls than about these invariants,
though it was helpful to not have to think about these invariants,
and anything that helps code review even incrementally is a good thing.
Yes, I suspect these classes might make other code somewhat more
reviewable. I like this observation.
ref_send feels like not the perfect place for it, because this has
nothing to do with ref_send. Perhaps org.joe_e.util?
I know this is deeply subjective and personal, and it's hardly the most
important issue, but I'm not sure about whether these are the best names,
for a general-purpose utility class. I guess that names should have
two purposes: (a) help programmers guess at the intended purpose of the
class, (b) help programmers who have already familiarized themselves
with those classes at some point in the past remember what the class
does with ease. On (a), the names Prize and Milestone didn't help me
to guess the purpose. I might have found names like SingleUse, OneShot,
UseOnce, Once more helpful than Prize, and MonotonicBoolean, Unclearable,
Fuse more helpful than Milestone, but I'm just one datapoint. I concede
that the names Prize and Milestone do have the benefits of being short
and vivid, so maybe they would satisfy goal (b) better than anything I
was able to come up with.
In case it's helpful to give a broader perspective, I'll try to
reconstruct a log of my thoughts, reactions, etc. while I was reading
the attached ValueWriter.java file in your original email:
* I found it challenging to reconstruct how this class was supposed
to be used. Perhaps an example calling sequence?
* This class leaves the flow of control up to the caller to a large
extent, and that forced me to reason about complex interleavings
of method calls and the state of multiple objects (e.g., the
ValueWriter and derived ObjectWriter/ArrayWriters it may return
and any derived ValueWriters they may return, recursively). This
made it hard for me to be confident I've covered all the cases.
Basically, I think there is some invariant among the entire tree of
{Value,Object,Array}Writers that are associated with each other: at
most one of them can be "active", and calling a method on an inactive
one must throw an exception and not modify the state of the tree.
Moreover there are certain points in the code that hand off who is
"active", and this must only happen at the appropriate points.
(Is that right?) This invariant is tricky to verify.
There is an alternative that may be easier to reason about but
may not meet your requirements. An alternative would be to have
JSONObject and JSONArray classes that know how to unparse
themselves (or JSONObject and JSONArray classes, and a JSONWriter
class that accepts those objects). In this alternative approach,
the caller would populate those objects and then make a single
call to ask the root object to unparse itself to a Writer, and
all the control flow would be up to these objects -- not up to the
caller. The downside is that this is less flexible for callers:
it forces callers into an idiom where they first construct an
explicit tree of JSON objects, then make a single call to unparse
them. In comparison, in your scheme the JSON objects never need
be explicitly represented in memory; they can be unparsed on
the fly in a flexible and streaming fashion.
* Among all the write[primitive] methods, writeFloat() and
writeDouble() were the ones that made me pause the longest.
The question was whether Float.toString() and Double.toString()
would always output values that would be interpreted the same way
by the Javascript interpreter. But after a few seconds, I said,
naah, I can't see how this could create security problems, this
feels far-fetched.
Even the writeInt() etc. methods aren't entirely trivial: one
has to be confident that Integer.toString() won't write out an
int with a leading 0 (since that might be interpreted as a number
in octal, depending upon the Javascript interpreter). Should all
be OK but I had to take a second to check.
Also, one has to check for numbers that would overflow the precision
of Javascript's builtin number type. Actually, what do you do about
the latter?
I didn't think about +0 and -0, or +0.0 vs -0.0. Probably there
is nothing to think about.
Don't the following examples output invalid JSON?
new ValueWriter("", wr).writeInt(3); // outputs: 3
new ValueWriter("", wr).writeString("foo"); // outputs: "foo"
Doesn't JSON technically require that the top level be either an
object or an array, not a primitive?
* I like how you always use writeStringTo() for all strings,
including field names -- in particular I like that you quote
all strings, whether or not it seems necessary.
* I spent a long time staring at writeStringTo(). That's the
crucial method. I think it's OK. The Caja review had the
"</SCRIPT>" and "!]]" problems fresh in my mind, so if this is
going to be embedded in a HTML page or in a XML CDATA section,
you need to ensure that nothing you output can close a tag
or can close a CDATA section. But based on the fact that this
code didn't do check for that, I was guessing that this code
was intended to be used for JSON objects that are contained in
their own file, not embedded in some other content type.
* Likewise, you'd better hope that your caller takes care of
ensuring the charset encoding is reasonable. If this can get
stuck in something that will be interpreted in a crazy charset
(UTF-7, anyone?), you may be screwed.
* I didn't check the whitelist of character types in writeStringTo().
I got lazy.
I hope it doesn't whitelist character 0x85 (NEW LINE).
I didn't think about charsets, Unicode encoding, whether the
JSON output is a character stream or a byte stream, etc. I confess
they make my brain hurt.
* This code raises the question whether the receiver will treat the
field name "@" specially, and if so, to check whether
this could create any opportunity for confusion.
startElement("@").writeString(URL) and writeLink(URL) are
indistinguishable. If the receiver treats "@"-links specially (i.e.,
it is granted more authority or trusted more than other fields), then
this creates a peculiar kind of hazard for the caller. If someone
creates a facet of ValueWriter that allows arbitrary startObject()
and startElement() calls but removes writeLink(), thinking that
this will prevent the user of the facet from writing out "@"-links,
they'll shoot themselves in the foot.
* This code doesn't do any validity checks on the member name passed
to startElement(). Is the following valid JSON?
ObjectWriter o = new ValueWriter("", wr).startObject();
o.startElement("_foo").writeInt(3);
o.close(); // outputs: { "_foo": 3 }
* Stylistically, I'm not sure what to think about this idiom of
overloading NullPointerException to indicate that the caller violated
the preconditions of this class: e.g., NullPointerException may be
thrown to indicate that the caller interleaved method calls to these
classes in an invalid way. This idiom has the benefit of making the
code of ValueWriter.java a lot shorter and thus easier to read/review,
I think, but it might confusing for callers.
* The documentation ought to make clearer that if you start
writing to a ValueWriter, think you've finished, and then check
isWritten() and discover it is false, you've just corrupted your
output writer: you may have written a syntactically incorrect or
incomplete JSON value to the output writer and this class isn't going
to help you back out and undo. Also it should probably document the
assumption/precondition/requirement that the caller must not use the
Writer passed to ValueWriter's constructor until after isWritten()
on that ValueWriter returns true, and at that point the ValueWriter
must be discarded and never used again. (i.e., caller must not use
the Writer directly during the logical lifetime of the ValueWriter)
* Where's the public constructor for a ValueWriter? Why isn't
ValueWriter public? Why isn't it final?
* Very minor, matter of personal taste: I noticed this:
for (i=0; i!=n; i++) { ... }
Stylistically, I prefer:
for (i=0; i<n; i++) { ... }
because now more of the loop invariant is syntactically present
("i<n" is present in the latter, whereas I have to reconstruct it
in the former). OK, OK, don't shoot me! I said it was very minor.
I suspect many of the above are bogus or irrelevant or dead ends;
it's an attempt to write down some of what I was thinking.
Then, I went and read the most recent version from the Sourceforge
web page you posted later in the thread. An attempt at reconstructing
my thoughts:
* Oops, you got me with the writeFloat() and writeDouble() bugs.
I missed them. Even more embarassing, I got slightly suspicious in
that vicinity, and then dismissed my suspicions as far-fetched.
* The Character.SIZE change went over my head. If that was a bug,
oops, you got me: I missed it. But I actually don't understand
how this is an improvement. Can Character.SIZE ever be anything
other than 16? If it can be, then it seems like the revised code
has a bug, too, since Javascript interprets \uXXXX as a Unicode
escape, and always looks at exactly four digits following the
\u (not less, not more).
* Oops, you got me with "</".
I know the Caja folks are pretty confident it's enough to ensure
the output stream don't contain the exact substring "</". I'm still
not entirely sure whether to believe it. I can't come up with any
counterexample. But I've come to distrust browsers. Can I really
trust browsers not to treat "< /SCRIPT>" as a closing SCRIPT tag?
What about if we replace the space ' ' with a crazy Unicode character?
Are there Unicode characters that get stripped out or treated as
"not even present" by browsers (e.g., byte-order marks, non-breaking
space, who knows what). OK, maybe your whitelist of character
types in writeStringTo() ensures that none of those characters can
ever make it into the output stream, in which case my paranoia is
totally unfounded.
More information about the e-lang
mailing list