[e-lang] Joe-E taming error in java.lang.String.taming
David Hopwood
david.hopwood at industrial-designers.co.uk
Sat Sep 22 13:38:23 EDT 2007
David Hopwood wrote:
> Tyler Close wrote:
>> So now that I've tamed away all the string encoding methods from
>> java.lang.String, I find I need another method in the
>> org.joe_e.charset.ASCII API. In particular,
>>
>> /**
>> * Decodes a US-ASCII string.
>> * @return The corresponding string
>> */
>> static public String
>> decode(final byte[] buffer, final int off, final int len) {
>> try {
>> return new String(buffer, off, len, "US-ASCII");
From the documentation of this constructor
<http://java.sun.com/javase/6/docs/api/java/lang/String.html#String(byte[],%20int,%20int,%20java.lang.String)>:
# The behavior of this constructor when the given bytes are not valid in the
# given charset is unspecified. The CharsetDecoder class should be used when
# more control over the decoding process is required.
This behaviour needs to be specified for Joe-E. Here is a possible
implementation (the Charset.decode method uses a thread-locally cached
CharsetDecoder):
import java.nio.charset.Charset;
import java.nio.ByteBuffer;
static private final Charset charset = Charset.forName("US-ASCII");
/**
* Decodes a US-ASCII string. Each byte not corresponding to a US-ASCII
* character decodes to the Unicode replacement character U+FFFD.
* @return The corresponding string
* @throws java.lang.IndexOutOfBoundsException if off and len do not
*         describe a valid range of buffer
*/
static public String
decode(final byte[] buffer, final int off, final int len) {
// Charset.decode returns a CharBuffer, so convert it to a String.
return charset.decode(ByteBuffer.wrap(buffer, off, len)).toString();
}
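To make the difference between the two decoding policies concrete, here is a minimal sketch, not a proposed Joe-E API. The class name AsciiDecodeSketch and the method names decodeLenient and decodeStrict are hypothetical. The lenient variant relies on the documented behaviour of Charset.decode, which always replaces malformed input; the strict variant is an assumed alternative that configures a CharsetDecoder to report errors instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

// Hypothetical illustration class; not part of org.joe_e.charset.
final class AsciiDecodeSketch {
    private static final Charset ASCII = Charset.forName("US-ASCII");

    // Lenient policy: Charset.decode always replaces malformed input
    // with the charset's replacement string (U+FFFD for US-ASCII).
    static String decodeLenient(byte[] buffer, int off, int len) {
        return ASCII.decode(ByteBuffer.wrap(buffer, off, len)).toString();
    }

    // Strict policy (an assumed alternative): a fresh decoder set to
    // REPORT throws on the first byte that is not valid US-ASCII.
    static String decodeStrict(byte[] buffer, int off, int len)
            throws CharacterCodingException {
        return ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(buffer, off, len))
                .toString();
    }
}
```

With the input bytes { 'a', 0x80, 'b' }, decodeLenient yields the three-character string a, U+FFFD, b, while decodeStrict throws a CharacterCodingException. Either policy gives a specified result, which is the point; which one Joe-E should adopt is a separate design choice.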
> A corresponding method should also be added to org.joe_e.charset.UTF8.
Same issue here, and in addition, it should be specified that the
UTF8 class does nothing special with initial byte-order marks.
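As an illustration of "nothing special" for byte-order marks, the following sketch assumes the UTF8 class would decode the same way as the ASCII example, via Charset.decode on the platform's UTF-8 charset. The class name Utf8BomSketch is hypothetical. Java's UTF-8 decoder does not strip a leading EF BB BF sequence; it decodes it as the ordinary character U+FEFF.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;

// Hypothetical illustration class; not part of org.joe_e.charset.
final class Utf8BomSketch {
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Decodes UTF-8 bytes; malformed sequences become U+FFFD and an
    // initial byte-order mark is kept as the character U+FEFF.
    static String decode(byte[] buffer, int off, int len) {
        return UTF8.decode(ByteBuffer.wrap(buffer, off, len)).toString();
    }
}
```

Decoding the four bytes EF BB BF 61 with this method gives a two-character string, U+FEFF followed by "a", so a caller that wants the mark removed has to do so explicitly.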
--
David Hopwood <david.hopwood at industrial-designers.co.uk>