[E-Lang] remote comms: Timeouts and Connection Failure
zooko@zooko.com
Thu, 05 Apr 2001 00:30:23 -0700
MarcS wrote:
>
> Do you not expect it to be the common case that a broken connection is a
> good reason to stop waiting? This surprises me.
Well... Um...
First, I don't think that the notion of a "broken connection" is a good one,
since it implies the notion of a non-broken connection, which is impossible to
implement, and since it uses resources in order to implement faster-discovery-
of-non-responsiveness in a way that may or may not be what was needed by the
higher-level code, and said faster discovery can lead to false (unnecessary)
"breakages" of "connections", and in addition (something I did not properly
emphasize in my previous messages) since it sours the "familiar O-O message
send but it works remotely" abstraction which E offers and (something else very
important that I've neglected to emphasize) since it makes it impossible for
the programmer to send messages which must be delivered at-most-once and which
must be delivered before other messages (without implementing his own
non-connectiony abstraction atop the connectiony one).
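To make the at-most-once and ordering point concrete, here is a minimal sketch in Python of receiver-side state that is keyed to the sender rather than to any "connection". It is only an illustration under stated assumptions: the name OrderedInbox, the per-sender sequence numbers, and the in-memory storage are all hypothetical and are not taken from E or from the Evil Geniuses Transport Engine.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class OrderedInbox:
    """Hypothetical receiver-side state for one sender (illustration only)."""
    deliver: Callable[[Any], None]   # application callback
    next_seq: int = 0                # lowest sequence number not yet delivered
    pending: Dict[int, Any] = field(default_factory=dict)  # early arrivals

    def receive(self, seq: int, payload: Any) -> None:
        # A retransmission of something already seen is dropped: at-most-once.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with what came before.
        while self.next_seq in self.pending:
            self.deliver(self.pending.pop(self.next_seq))
            self.next_seq += 1


delivered = []
inbox = OrderedInbox(deliver=delivered.append)
# Messages arrive out of order, with retransmitted duplicates mixed in.
for seq, payload in [(1, "b"), (0, "a"), (1, "b"), (2, "c"), (0, "a")]:
    inbox.receive(seq, payload)
```

Nothing in this state is reset when the counterparty goes quiet for a while, which is the property a connection abstraction tends to lose when it "breaks" and is re-established. A real implementation would presumably also need to persist next_seq and bound the pending buffer.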
Second, I think that all programmers, including apparently many of those
participating in this discussion, consider "connections" to be such a familiar
and natural abstraction that they cannot wrench their minds away from using it,
even when engaged in a discussion about whether it is an appropriate
abstraction. ;-)
This bothers me a lot since my Evil Geniuses Transport Engine provides a
non-connectiony comms abstraction, and it may be that I'll have to layer a
"connection abstraction" atop it, providing my users with the resource wastage,
the inappropriate one-size-fits-all impatience policy (== unnecessary
breakages), the inability to have automatic at-most-once messages, the
inability to have automatic persistent message-ordering, and the misleading
sense of being "connected" that they crave.
BUT,
Third, I have realized that there is an even *more* natural and familiar
abstraction at hand, which will hopefully rescue me from the fate of having to
deal with "connections" in other people's E code. This even better abstraction
is the notion of references and message sending in O-O.
Hopefully there is a way to define the kind of featureful and flexible comms
abstraction that I want in terms of objects and message-sends which programmers
will find natural and unsurprising.
Oh by the way, I still haven't answered your question about whether I think a
broken connection is a good reason to stop waiting. I'll attempt to answer it
now, and please do try to re-phrase your questions in non-connectiony terms
next time and see if that helps us converge.
I think that if a programmer is given a connection abstraction, then the common
case is definitely to give up as soon as the comms implementation says that the
counterparty is non-responsive (by "breaking" the "connection"). *But*,
I think that this is usually a mistake, because usually a higher-level
consideration (most often, the user) should determine the impatience policy
top-down. As a thought experiment, consider how many apps could be effectively
hung or DoS'ed by a malicious counterparty sending all the right keepalives but
otherwise not doing anything.
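The thought experiment can be sketched in a few lines of Python with asyncio. This is a toy model, not E code: silent_peer and ask are hypothetical names, and the "keepalives" are just entries appended to a list. The point it illustrates is that a keepalive-based liveness check would never fire against such a peer, while a deadline chosen by the caller does.

```python
import asyncio


async def silent_peer(alive: list) -> None:
    """Hypothetical counterparty: sends keepalives forever, never answers."""
    while True:
        alive.append("keepalive")   # a keepalive-based transport sees a healthy peer
        await asyncio.sleep(0.01)


async def ask(patience: float) -> str:
    loop = asyncio.get_running_loop()
    reply = loop.create_future()    # would be resolved by a real answer; never is here
    alive: list = []
    peer = asyncio.create_task(silent_peer(alive))
    try:
        # The caller, not the transport, decides how long an answer is worth waiting for.
        return await asyncio.wait_for(reply, timeout=patience)
    except asyncio.TimeoutError:
        return f"gave up after {len(alive)} keepalives"
    finally:
        peer.cancel()


result = asyncio.run(ask(patience=0.05))
```

Here the patience value is passed in from above, so a user-facing operation and a background job could choose very different values for the same remote call.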
Now normally I would *never* embark on a crusade to persuade programmers to
relinquish crufty old abstractions which they love (well... not in the context
of E, at least, which already does way too much crusading. I might embark on
such a crusade for the Evil Geniuses Transport Engine.), but I think maybe
since the familiar abstraction of O-O references and O-O message send doesn't
come with this idea of breakable connections, perhaps in this case programmers
will actually be *more* comfortable with a different, simpler "message sending"
abstraction than with the current SturdyRef/LiveRef, which is a union of
"connections" and "O-O message sending".
> At moments like these, I yearn to see you write a few E apps with E as it
> is, with the sturdy/live separation, and see if your objections-in-theory
> really are objections-in-practice.
This is a very reasonable suggestion. Of course, I have very limited time
right now, and I consider this issue to be so important and urgent that I
thought it best to speak up now rather than remain silent until I have time
to write E apps.
I haven't finished responding to your message, but I need to go now. (I'll
leave some quotes below to show what I might respond to later.)
Regards,
Zooko
> Somehow, I feel like I just have to be able to find out that the
> "connection" has "broken". Connection breakage is one of the basic truths of
> distributed computing, and not being able to implement different policies
> for different requirements in the presence of breakage just sounds like
> madness to me. I don't want to have to create impatience policies for all my
> distributed connections when in fact the only thing that breaks in practice
> is the "connection"--even though I made it easy with promiseFirstResolved
> etc., to create impatience policies, why make me do it over and over again?
> Though I guess you are really arguing for replaceable-default impatience
> policies, in which case I find myself saying, okay, but I have found the
> current E impatience policy (based on a long-interval keep-alive) to be a
> great default-default. I find the case for replaceable defaults positive and
> interesting but not compelling for version 1.0 of E.
> > (In Mojo Nation, there are no operations which *require* fast discovery of
> > non-responsiveness, although it would possibly be a performance
> > improvement if
<yadda yadda yadda>
> > aside, this also makes our protocol safer against active attacks, but the
> > motivation was firewall-hopping, not paranoid security considerations.)
>
> This makes sufficiently little sense to me, I now have to wonder if you and
> I mean the same thing when we say, "connection". I'm afraid my definition of
> connection is a little too concrete to be helpful, however--for me a
> connection is "lost" when E catches a problem in the when-catch block :-)
> And a "connection" is the infrastructure that I never really see directly,
> which allows me to send an eventual message: I do indeed have a "connection"
> to anything I can send to and get a non-problem resolution of the promise
> (i.e., a fulfillment :-) Perhaps this definition of connection, while too
> concrete in one way, is actually a higher-level concept than the one you are
> using?
> > (In fact, now that I think about it, the next performance improvement in
> > Mojo Nation, which will effectively fix the performance losses due to
> > accidentally relying on "connections", amounts to doing faster discovery of
> > non-responsiveness from the top-down requirements and knowledge rather
> This sounds like it is very easy to do in E. The keep-alives are, as Bill
> highlighted recently, so infrequent they hardly constitute a drain on
> resources. You just write a smart timebomb for use with the code I posted
> earlier, put it in the appropriate when-catches, and poof, it is done. It
> would be easier with replaceable-default impatience policies if you wanted
> to use the same one everywhere...but it doesn't compel me to feel a need to
> eliminate liverefs from E.
> experiences. I don't even know what questions to ask to figure out how these
> experiences could be so different. Certainly, Mojo Nation is a much bigger
> undertaking than any of my little E programs, so it is a very important
> example. But my limited knowledge of Mojo Nation suggests that the E
> programmer's conceptualization of a "connection" would not particularly get
> in the way of solving the problem.
> A place where the rubber meets the road with sturdy refs that no one has
> mentioned yet is in persistence. A sturdyref is meaningless unless you are
> persisting the ref.