[e-lang] Defensive consistency and integer overflow

David Hopwood david.hopwood at industrial-designers.co.uk
Sun Sep 23 13:07:41 EDT 2007


Tyler Close wrote:
> Hi David,
> On 9/22/07, David Wagner <daw at cs.berkeley.edu> wrote:
>> One minor nitpick on the inner class Generator:
>>
>>>        public void
>>>        write(final byte[] b, final int off, final int len) {
>>>            if (size + len > buffer.length) {
>>>                System.arraycopy(buffer,0, buffer = new byte[size+len],0, size);
>>>            }
>>>            System.arraycopy(b, off, buffer, size, len);
>>>            size += len;
>>>        }
>> If someone with a reference to a Generator g calls
>>
>>     g.write(.., .., -1);
>>
>> then this destroys the Generator's internal invariants and renders
>> it in an inconsistent state.  In particular, the body of the method
>> will create a new byte array that is a bit shorter and whose elements
>> are initialized to zero, set g.buffer to that new byte array, and
>> then throw IndexOutOfBoundsException.
> 
> I don't think it would happen that way. The size is always less than,
> or equal to, buffer.length, which is always at least 1. So (size - 1)
> is also always less than buffer.length, so the if test is false, so
> there's no new array allocated or assigned. The System.arraycopy
> method will then throw IndexOutOfBoundsException, because len is
> negative. Execution then exits the function, without any of the
> internal state variables being modified.

The other non-obvious case is that when size + len overflows, the 'if'
body will not be executed, even though it would have been under
arbitrary-precision arithmetic. But then the arraycopy outside the 'if'
will throw an exception (because the buffer array cannot be larger
than Integer.MAX_VALUE), and the fact that the spec of System.arraycopy
allows part of the array to have been copied to the buffer (beyond
buffer[size]) happens not to matter.

<http://www.uni-koblenz.de/~beckert/pub/ifm2004.pdf> argues against
accepting "incidentally correct" programs like this one -- it suggests
that the program should be required to have an explicit test for overflow,
to be considered correct.

In this case, it would also work to write the 'if' condition as
"(size > buffer.length - len)". Many of the range checks in Sun's
implementation of the Java API are written in that style.

In general, it would be much nicer to be working in a language with
arbitrary-precision integers and integer range types (e.g. off >= 0 and
len >= 0 here), or at the very least unsigned types. (Hmm, I wonder
where I would find such a language? :-) There is a paper about the
JML specfication language which makes this point in more detail:
<http://users.encs.concordia.ca/~chalin/papers/TR-2003-001.3.pdf>.

[The proposal in that paper has since been implemented for JML; see
<http://www.cs.caltech.edu/~cs141/2003-2004/resources/JML/NEWS.txt>.]

-- 
David Hopwood <david.hopwood at industrial-designers.co.uk>



More information about the e-lang mailing list