[e-lang] Properties of E scheduling
Matej Kosik
kosik at fiit.stuba.sk
Wed Jul 6 06:35:49 EDT 2005
Hello
I would like to join this discussion and add some related problems
concerning E's concurrency. May be they will be interesting, maybe I am
wrong, I am not expert.
Dan Bornstein wrote:
> I wrote:
>
>> While reading these paragraphs, I began wondering: Does E
>> currently process the queueing of incoming eventual-sends
>> effectively as separate vat turns? E.g., if during some turn a
>> piece of code queues up two messages, is it possible for the comm
>> system to interleave a third message in the middle of those two?
>
>
> David Hopwood writes:
>
>> This *should* be possible. The reason is that guaranteeing that
>> messages-to-self take priority over messages from other vats would
>> impede fairness: it would be too easy for a vat to inadvertently
>> starve itself from receiving any messages from other vats,
>> increasing the probability of livelock.
>
>
> I think you may have misunderstood the situation I described, or at
> least I was too vague in the statement of it. In particular, I didn't
> mean to imply that a vat would wait for its event queue to be empty and
> for the vat to be quiescent before queueing remote messages, merely for
> it to be in between turns.
>
> For example, let's say that the queue contains (in order) "A, B, C". The
> turn for A starts, and it causes D and E to be added to the queue. At
> (approximately) the same time, a remote message comes in, resulting in R
> being added to the queue. My question is: After both A and the remote
> message are processed, is "B, C, D, R, E" a possible state of the queue?
It would be very surprising if that was not a possible state in the
queue. In some cases, it need not to be a problem (when D and E are
"unrelated") but in general it is a problem. What if E relies on the
fact that the status of the receiver is exactly the same as after the
processing of the D message? I think that these problem stem out from
the fact that message ordering in E cannot be called as partial ordering
http://www.erights.org/elib/concurrency/partial-order.html
Although it would have sense to say what are the nodes and what are the
edges of that partial ordering and to prove that respective properties
of the relation between the nodes (reflexive, antisymmetric,
transitive). If we speak about E and if we consider messages as nodes
and the relation as (message A will be processed before message B) then
it would be possible to prove that such relation is not transitive. That
is, when you do this (in some what)
object1 <- message1 (differentObject)
object1 <- message2 (differentObject)
then E guarantees that `message1' will be processed before `message2'.
OK, but is this _enough_ (i.e. strong enough policy to enable us to
express everything we need)? There are situations when it would have
sense to expect that all the messages sent by `object1' during
processing of `message1' to `differentObject' will be processed by
`differentObject' _before all the message_ sent to `different' during
the activation of the `object1' by `message2'. E does not have mechanism
to express such thing even though it would be completely natural (at
least for me) to expect such ordering of processing message is enforced
(implicitely) by the above code.
-------------------------------------------------------------------------
(copied from some of my mail sent privately to Mark Miller)
My objections against E are mainly from the area of concurrency. If one
information system is mapped into a single vat, then that information
couldn't be scalable. Suppose that you have an information system which
keeps track of the bank accounts registered in a bank. Suppose that
there comes million request to make some operation. For the simplicity's
sake lets assume that all those million clients want to reveal their
balance. When information system in the bank works in asingle vat, than
serving all the operations (milion turns of the same vat) would make the
customers to wait a long time. E programmer thus has to break the
theinformation system into finer and finer units (vats) in order to gain
real parallelism (when those vats are running on different machines). I
do not like the idea of managing the paralelism explicitely. It is
inconvenient. Worse, E does not have enough strong concurrency
mechanisms to enable to make arbitrarily fine concurrency units work
properly.
There are certain things which I cannot express in E. For example if
there were Transcript object known to all applications (whatever that
means) running on the same computer. If that object supports operation
printLine(string)
which does nothing more but adds a new line at the end of the receiver.
Now, assume that there are two concurrent activities (for example two
different simulations). The first simulation at the end does this:
Transcript <- printLine("The first line of the FIRST report.");
Transcript <- printLine("The second line of the FIRST report.");
Transcript <- printLine("The third line of the FIRST report.");
The second simulation similarly does this:
Transcript <- printLine("The first line of the SECOND report.");
Transcript <- printLine("The second line of the SECOND report.");
Transcript <- printLine("The third line of the SECOND report.");
When they are performed at the same time, then according to the E's
semantics the following text may appear in the Transcript
The first line of the FIRST report
The first line of the SECOND report.
The second line of the FIRST report.
The second line of the SECOND report.
The third line of the FIRST report.
The third line of the SECOND report.
and that is bad because it does not have sense. That is not what you
wanted when you programmed the particular simulation. Either this:
The first line of the FIRST report
The second line of the FIRST report.
The third line of the FIRST report.
The first line of the SECOND report.
The second line of the SECOND report.
The third line of the SECOND report.
is acceptable or this:
The first line of the SECOND report.
The second line of the SECOND report.
The third line of the SECOND report.
The first line of the FIRST report
The second line of the FIRST report.
The third line of the FIRST report.
both cases are good because those two activities are _unrelated_ and
thus it does not matter on the order in which particular reports appear
in the Transcript, provided that they are continuous.
The problem in E (and in concurrent environment) is that the references
to remote objects are stateless. This causes problems when we deal with
stateful objects in concurrent environment. For example
simulation1 <- runAndReportTo(Transcript)
simulation2 <- runAndReportTo(Transcript)
simulation3 <- runAndReportTo(Transcript)
Is a problem because, reference to Transcript is (is normal systems)
stateless (I do not know of any system where references are "stateful",
i.e. part of some cleverly designed partial ordering together with
mails). You again cannot besure about the order in which the particular
reports will appear in the Transcript. They could be even intermixed.
-------------------------------------------------------------------------
I do not know if the problems I have mentioned above follow from the
fact that I have disregarded something concerning E. I am interested in
these problems so if I am wrong (concerning E) I would like to be corrected.
Best regards
--
Matej Kosik
More information about the e-lang
mailing list