[e-lang] Strange lock-ups with simple E program
Thomas Leonard
tal at it-innovation.soton.ac.uk
Mon Nov 23 13:11:55 PST 2009
Hi Mark,
OK, I've got it now! I think I can explain the two exceptions ("OldFarRef may
only be smashed" and "ViciousCycleException") and the hangs.
First, this is my understanding of how it's supposed to work:
When passing an object between vats in a single JVM, we use BootRefHandler.packageArg().
There are two places where this is used:
1. To package the arguments to a method call. In this case, the arguments are all in the
current vat (and therefore won't change under us).
2. To package the promise of the result. Here, the argument is a SwitchableRef in another vat, which the other vat is possibly in the process of resolving.
Case 2 seems pretty dangerous in general, especially knowing the optimisations
Java likes to make when things aren't synchronized.
I think something like this is happening:
1. The calling vat does "obj<-run()".
2. This calls BootRefHandler.handleSendAll().
3. In the calling thread, we create a SwitchableRef for the result.
4. We give the SwitchableRef to the target vat (synchronized on the runner's queue).
5. We call packageArg on the SwitchableRef in the calling thread.
6. packageArg sees that this is EVENTUAL.
7. The called vat resolves the SwitchableRef and calls commit() on it.
8. commit() briefly sets its target to TheViciousRef to detect cycles.
9. In the calling thread, we create a new BootRefHandler for the SwitchableRef.
10. We create a DelayedRedirector and call ViciousCycleException<-__whenMoreResolved(redirector)
11. In the called thread, we set the target to the real answer (null in the test case).
12. When we get the reply from ViciousCycleException, the DelayedRedirector can't set the new target to ViciousCycleException because the target is already set to null. (BTW, why isn't a problem DeepFrozen, and why isn't the handler fresh when this happens?)
If the SwitchableRef resolves before we create the BootRefHandler, we get "OldFarRef may
only be smashed", but the program continues OK with the already-working OldFarRef.
Otherwise, we have an OldRemotePromise that never gets set to anything and the program hangs.
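To make the interleaving in steps 6-11 easier to follow, here is a rough single-threaded replay in Java. It is only a sketch: ToySwitchableRef, beginCommit(), finishCommit() and the two sentinel objects are hypothetical stand-ins invented for illustration, not the real ELib classes, and commit() is split in two purely so the bad ordering can be reproduced deterministically.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for illustration only; not the ELib code.
final class CommitRaceSketch {

    /** Plays the role of TheViciousRef, the cycle-detection sentinel. */
    static final Object VICIOUS = new Object() {
        @Override public String toString() { return "TheViciousRef"; }
    };

    /** Marks a ref that has not been resolved yet. */
    static final Object UNRESOLVED = new Object();

    /** Toy ref whose commit is split into two steps so we can interleave. */
    static final class ToySwitchableRef {
        private Object target = UNRESOLVED;
        private Object pending;

        boolean isEventual() { return target == UNRESOLVED; }

        // Step 8: commit() briefly points the target at the sentinel.
        void beginCommit(Object answer) { pending = answer; target = VICIOUS; }

        // Step 11: commit() installs the real answer.
        void finishCommit() { target = pending; }

        // Unsynchronized read, as done from the calling thread.
        Object peek() { return target; }
    }

    /** Replays steps 6-11 in one thread and returns what each side observed. */
    static List<String> replay() {
        List<String> log = new ArrayList<>();
        ToySwitchableRef result = new ToySwitchableRef();

        // Step 6: the caller tests the state and decides it is EVENTUAL.
        log.add("caller: saw EVENTUAL = " + result.isEventual());

        // Steps 7-8: the called vat starts resolving (answer is null, as in the test case).
        result.beginCommit(null);

        // Steps 9-10: the caller acts on its stale test and subscribes to
        // whatever the target happens to be right now.
        log.add("caller: subscribes to " + result.peek());

        // Step 11: the called vat finishes the commit.
        result.finishCommit();
        log.add("callee: final target = " + result.peek());
        return log;
    }

    public static void main(String[] args) {
        replay().forEach(System.out::println);
    }
}
```

The point of the replay is that the test in step 6 and the action in steps 9-10 are not atomic, so the caller can end up subscribed to the transient sentinel rather than to the real answer.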
Perhaps it would make more sense to package the SwitchableRef before giving it to the target vat? You could then use a cut-down version of packageArg for that, and make packageArg itself only package things in its own vat.
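A minimal Java sketch of that ordering follows, under heavy assumptions: ToyPromise, attach(), resolve() and sendAll() are hypothetical names, a single-threaded ExecutorService stands in for the target vat's runner queue, and a CompletableFuture stands in for the proxy handed back to the caller. It is not a patch for BootRefHandler; it only shows why packaging before publication avoids the race.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of "package first, then enqueue"; not the ELib code.
final class PackageFirstSketch {

    /** Toy result promise, deliberately unsynchronized. */
    static final class ToyPromise {
        private Consumer<Object> redirector; // set by the caller before publication
        private boolean resolved;

        void attach(Consumer<Object> r) {
            if (resolved) {
                throw new IllegalStateException("too late: already resolved");
            }
            redirector = r;
        }

        void resolve(Object answer) {
            resolved = true;
            if (redirector != null) {
                redirector.accept(answer);
            }
        }
    }

    /** Caller side: package the promise while it is still private, then publish it. */
    static CompletableFuture<Object> sendAll(ExecutorService targetVat, Object answer) {
        ToyPromise result = new ToyPromise();
        CompletableFuture<Object> proxy = new CompletableFuture<>();

        // No other thread can see 'result' yet, so it cannot change under us.
        result.attach(proxy::complete);

        // Only now does the target vat get a reference. Submitting to the
        // executor also makes the write in attach() visible to the vat thread.
        targetVat.execute(() -> result.resolve(answer));
        return proxy;
    }

    /** Runs one send and returns what the caller's proxy resolved to. */
    static Object demo() {
        ExecutorService targetVat = Executors.newSingleThreadExecutor();
        try {
            return sendAll(targetVat, "answer").get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            targetVat.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the redirector is attached before the promise is handed over, the called vat can never resolve it "in between", so the caller never has to inspect a ref that another thread may be committing.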
--
Dr Thomas Leonard
IT Innovation Centre
2 Venture Road
Southampton
Hampshire SO16 7NP
Tel: +44 0 23 8076 0834
Fax: +44 0 23 8076 0833
mailto:tal at it-innovation.soton.ac.uk
http://www.it-innovation.soton.ac.uk
-----Original Message-----
From: e-lang-bounces at mail.eros-os.org on behalf of Mark Miller
Sent: Sun 2009-11-22 8:49 PM
To: Discussion of E and other capability languages
Subject: Re: [e-lang] Strange lock-ups with simple E program
Hi Thomas, thanks for tracking this down! This has been a
long-standing irritation that I had not been able to diagnose. I hope to be
able to get to this either over the Thanksgiving weekend or soon
thereafter.
On Tue, Nov 17, 2009 at 6:43 AM, Thomas Leonard
<tal at it-innovation.soton.ac.uk> wrote:
> OK, I think I'm making progress here. In BootRefHandler.packageArg(), we
> have:
>
> //arg is EVENTUAL (or was at the time of the test), and is not
> //handled by a BootRefHandler, so we treat it as a promise in the
> //src vat.
>
> BootRefHandler handler = new BootRefHandler(src, arg);
> DelayedRedirector rdr = new DelayedRedirector(handler.myResolver);
> //handler and rdr are in the dest vat
> Object[] args = {packageArg(rdr, dest, src, currentVat)};
> src.qSendAllOnly(arg, false, "__whenMoreResolved", args);
> //Is it ok to ignore the E.sendOnly return result here?
> return handler.myResolver.getProxy();
>
> DelayedRedirector will create either OldFarRef or OldRemotePromise
> objects, depending on whether "arg" has identity when it checks in its
> constructor.
>
> OldRemotePromise.setTarget will accept anything, but OldFarRef.setTarget
> complains if you try to set it.
>
> Adding a sleep makes it print the error message in all cases:
>
> diff --git a/src/jsrc/org/erights/e/elib/vat/BootRefHandler.java b/src/jsrc/org/erights/e/elib/vat/BootRefHandler.java
> index e6d0818..4ca874a 100644
> --- a/src/jsrc/org/erights/e/elib/vat/BootRefHandler.java
> +++ b/src/jsrc/org/erights/e/elib/vat/BootRefHandler.java
> @@ -239,6 +239,15 @@ class BootRefHandler implements EProxyHandler {
> //handled by a BootRefHandler, so we treat it as a promise in the
> //src vat.
>
> + // TAL: but if it has an identity by the time we create BootRefHandler,
> + // then it will be an OldFarRef, not an OldRemotePromise, and subscribing to
> + // __whenMoreResolved will only cause trouble...
> +
> + try {
> + Thread.sleep(200);
> + } catch (Exception ex) {
> + }
> +
> BootRefHandler handler = new BootRefHandler(src, arg);
> DelayedRedirector rdr = new DelayedRedirector(handler.myResolver);
> //handler and rdr are in the dest vat
>
> I'm not sure what the correct fix is, though.
>
>
> On Sun, 2009-07-26 at 19:56 -0700, Mark Miller wrote:
>> Hi Thomas, I haven't found the bug yet, but I have narrowed it down to
>> a race condition bug in the boot-comm system. Your same stress test,
>> with your
>>
>> def seedVat := seedVatAuthor(<unsafe>)
>>
>> replaced by
>>
>> introducer.onTheAir()
>> def seedVat := seedVatAuthor(<unsafe>).virtualize(introducer)
>>
>> seems to never lock up. The only difference between these is that the
>> latter uses captp instead of boot-comm. In the boot-comm case, the
>> form of lock up looks like a lost signal: all vats are quiescent
>> waiting for a message. I have also discovered many things that the bug
>> is not ;). Unfortunately, the clues so far seem to point to a
>> multi-threading race condition bug.
>>
>> Unfortunately, I won't have time for further investigation for another
>> week. If you're blocked on this, might the above change be a useful
>> workaround for you?
>>
>
>
> _______________________________________________
> e-lang mailing list
> e-lang at mail.eros-os.org
> http://www.eros-os.org/mailman/listinfo/e-lang
>
--
Text by me above is hereby placed in the public domain
Cheers,
--MarkM