[e-lang] Properties of E scheduling

Matej Kosik kosik at fiit.stuba.sk
Wed Jul 6 06:35:49 EDT 2005


Hello

I would like to join this discussion and add some related problems 
concerning E's concurrency. May be they will be interesting, maybe I am 
wrong, I am not expert.

Dan Bornstein wrote:
> I wrote:
> 
>> While reading these paragraphs, I began wondering: Does E
>> currently process the queueing of incoming eventual-sends
>> effectively as separate vat turns? E.g., if during some turn a
>> piece of code queues up two messages, is it possible for the comm
>> system to interleave a third message in the middle of those two?
> 
> 
> David Hopwood writes:
> 
>> This *should* be possible. The reason is that guaranteeing that
>> messages-to-self take priority over messages from other vats would
>> impede fairness: it would be too easy for a vat to inadvertently
>> starve itself from receiving any messages from other vats,
>> increasing the probability of livelock.
> 
> 
> I think you may have misunderstood the situation I described, or at 
> least I was too vague in the statement of it. In particular, I didn't 
> mean to imply that a vat would wait for its event queue to be empty and 
> for the vat to be quiescent before queueing remote messages, merely for 
> it to be in between turns.
> 
> For example, let's say that the queue contains (in order) "A, B, C". The 
> turn for A starts, and it causes D and E to be added to the queue. At 
> (approximately) the same time, a remote message comes in, resulting in R 
> being added to the queue. My question is: After both A and the remote 
> message are processed, is "B, C, D, R, E" a possible state of the queue?

It would be very surprising if that was not a possible state in the 
queue. In some cases, it need not to be a problem (when D and E are 
"unrelated") but in general it is a problem. What if E relies on the 
fact that the status of the receiver is exactly the same as after the 
processing of the D message? I think that these problem stem out from 
the fact that message ordering in E cannot be called as partial ordering

	http://www.erights.org/elib/concurrency/partial-order.html

Although it would have sense to say what are the nodes and what are the 
edges of that partial ordering and to prove that respective properties 
of the relation between the nodes (reflexive, antisymmetric, 
transitive). If we speak about E and if we consider messages as nodes 
and the relation as (message A will be processed before message B) then 
it would be possible to prove that such relation is not transitive. That 
is, when you do this (in some what)

	object1 <- message1 (differentObject)
	object1 <- message2 (differentObject)

then E guarantees that `message1' will be processed before `message2'. 
OK, but is this _enough_ (i.e. strong enough policy to enable us to 
express everything we need)? There are situations when it would have 
sense to expect that all the messages sent by `object1' during 
processing of `message1' to `differentObject' will be processed by 
`differentObject' _before all the message_ sent to `different' during 
the activation of the `object1' by `message2'. E does not have mechanism 
to express such thing even though it would be completely natural (at 
least for me) to expect such ordering of processing message is enforced 
(implicitely) by the above code.

-------------------------------------------------------------------------

(copied from some of my mail sent privately to Mark Miller)

My objections against E are mainly from the area of concurrency. If one 
information system is mapped into a single vat, then that information 
couldn't be scalable. Suppose that you have an information system which 
keeps track of the bank accounts registered in a bank. Suppose that 
there comes million request to make some operation. For the simplicity's 
sake lets assume that all those million clients want to reveal their 
balance. When information system in the bank works in asingle vat, than 
serving all the operations (milion turns of the same vat) would make the 
customers to wait a long time. E programmer thus has to break the 
theinformation system into finer and finer units (vats) in order to gain 
real parallelism (when those vats are running on different machines). I 
do not like the idea of managing the paralelism explicitely. It is 
inconvenient. Worse, E does not have enough strong concurrency 
mechanisms to enable to make arbitrarily fine concurrency units work 
properly.

There are certain things which I cannot express in E. For example if 
there were Transcript object known to all applications (whatever that 
means) running on the same computer. If that object supports operation

	printLine(string)

which does nothing more but adds a new line at the end of the receiver.

Now, assume that there are two concurrent activities (for example two 
different simulations). The first simulation at the end does this:

     Transcript <- printLine("The first line of the FIRST report.");
     Transcript <- printLine("The second line of the FIRST report.");
     Transcript <- printLine("The third line of the FIRST report.");

The second simulation similarly does this:

     Transcript <- printLine("The first line of the SECOND report.");
     Transcript <- printLine("The second line of the SECOND report.");
     Transcript <- printLine("The third line of the SECOND report.");

When they are performed at the same time, then according to the E's 
semantics the following text may appear in the Transcript

     The first line of the FIRST report
     The first line of the SECOND report.
     The second line of the FIRST report.
     The second line of the SECOND report.
     The third line of the FIRST report.
     The third line of the SECOND report.

and that is bad because it does not have sense. That is not what you 
wanted when you programmed the particular simulation. Either this:

     The first line of the FIRST report
     The second line of the FIRST report.
     The third line of the FIRST report.
     The first line of the SECOND report.
     The second line of the SECOND report.
     The third line of the SECOND report.

is acceptable or this:

     The first line of the SECOND report.
     The second line of the SECOND report.
     The third line of the SECOND report.
     The first line of the FIRST report
     The second line of the FIRST report.
     The third line of the FIRST report.

both cases are good because those two activities are _unrelated_ and 
thus it does not matter on the order in which particular reports appear 
in the Transcript, provided that they are continuous.

The problem in E (and in concurrent environment) is that the references 
to remote objects are stateless. This causes problems when we deal with 
stateful objects in concurrent environment. For example

     simulation1 <- runAndReportTo(Transcript)
     simulation2 <- runAndReportTo(Transcript)
     simulation3 <- runAndReportTo(Transcript)

Is a problem because, reference to Transcript is (is normal systems) 
stateless (I do not know of any system where references are "stateful", 
i.e. part of some cleverly designed partial ordering together with 
mails). You again cannot besure about the order in which the particular 
reports will appear in the Transcript. They could be even intermixed.
-------------------------------------------------------------------------
I do not know if the problems I have mentioned above follow from the 
fact that I have disregarded something concerning E. I am interested in 
these problems so if I am wrong (concerning E) I would like to be corrected.

Best regards
-- 
Matej Kosik


More information about the e-lang mailing list