Java direct ByteBuffer - decode the characters


I would like to read the bytes into the direct ByteBuffer and then decode them without rewrapping the original buffer into the byte[] array to minimize memory allocations.

Hence I'd like to avoid using StandardCharsets.UTF_8.decode(), as it allocates a new buffer on the heap.

I'm stuck on how to decode the bytes. Consider the following code, which writes a string into the buffer and then reads it again.

ByteBuffer byteBuffer = ByteBuffer.allocateDirect(2 << 16);

byteBuffer.put("Hello Dávid".getBytes(StandardCharsets.UTF_8));

byteBuffer.flip();

CharBuffer charBuffer = byteBuffer.asCharBuffer();
for (int i = charBuffer.position(); i < charBuffer.length(); i++) {
    System.out.println(charBuffer.get());
}

The code output:

䡥汬漠

How can I decode the buffer?


There are 2 answers

5
john16384

You can't specify the encoding of a CharBuffer. See here: What Charset does ByteBuffer.asCharBuffer() use?

Also, since buffers are mutable and Strings are always immutable, I don't see how you could ever create a String from a buffer without another memory allocation.
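If a String is not strictly required, one possible route is to decode into a CharBuffer that is allocated once and reused, using CharsetDecoder.decode(ByteBuffer, CharBuffer, boolean). The sketch below is only an illustration under assumptions: the class name DirectUtf8Decode, the helper decodeInto and the buffer sizes are made up for this example, and it only avoids copying the bytes into a byte[] in your own code; it makes no claim about what the JDK allocates internally.

```java
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class DirectUtf8Decode {

    // Hypothetical helper: decodes the remaining bytes of `in` into `out`,
    // then flips `out` so it is ready for reading.
    static void decodeInto(CharsetDecoder decoder, ByteBuffer in, CharBuffer out) {
        decoder.reset();
        out.clear();
        try {
            // endOfInput = true: `in` holds the complete input
            CoderResult result = decoder.decode(in, out, true);
            if (result.isUnderflow()) {
                result = decoder.flush(out);
            }
            if (!result.isUnderflow()) {
                // overflow (out too small) or malformed/unmappable input
                result.throwException();
            }
        } catch (CharacterCodingException e) {
            throw new UncheckedIOException(e);
        }
        out.flip();
    }

    public static void main(String[] args) {
        ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1 << 16);
        // The question's sample string; \u00e1 is the accented 'a'
        byteBuffer.put("Hello D\u00e1vid".getBytes(StandardCharsets.UTF_8));
        byteBuffer.flip();

        // Both objects can be created once and reused for every decode
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer chars = CharBuffer.allocate(1 << 16);

        decodeInto(decoder, byteBuffer, chars);

        // Printing converts the buffer to a String, which does allocate;
        // work on the CharBuffer directly to stay allocation-light.
        System.out.println(chars);
    }
}
```

The output CharBuffer here lives on the heap, but it is created a single time; each later call to decodeInto overwrites its contents instead of allocating a new one.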

0
nandsito

I would like to read the bytes into the direct ByteBuffer and then decode them without rewrapping the original buffer into the byte[] array to minimize memory allocations.

ByteBuffer.asCharBuffer() does fit that need, since the byte view and the char view share the same underlying memory.

This method's javadoc says:

The new buffer's position will be zero, its capacity and its limit will be the number of bytes remaining in this buffer divided by two

Although the Javadoc doesn't say so explicitly, the division by two is a hint that the CharBuffer view treats every two bytes as one UTF-16 code unit, read in the byte buffer's byte order (big-endian by default). Since you can't control which encoding the CharBuffer uses, your only option is to write the character bytes in that encoding.

// UTF_16BE rather than UTF_16: the UTF_16 encoder prepends a byte-order mark (U+FEFF)
byteBuffer.put("Hello Dávid".getBytes(StandardCharsets.UTF_16BE));
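As a rough end-to-end sketch of this approach (the class name Utf16View and the helper viewAsChars are made up for illustration), the bytes can be written as big-endian UTF-16 and then read back through the char view. It uses UTF_16BE because Java's plain UTF_16 encoder writes a byte-order mark first, which would show up as an extra U+FEFF char at the start of the view. It also assumes the buffer keeps its default big-endian byte order.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

public class Utf16View {

    // Hypothetical helper: writes `text` into a direct buffer as UTF-16BE
    // and returns a char view over the same memory.
    static CharBuffer viewAsChars(String text) {
        ByteBuffer byteBuffer = ByteBuffer.allocateDirect(1 << 16);

        // No BOM is written, and big-endian matches the default
        // byte order of a freshly allocated ByteBuffer.
        byteBuffer.put(text.getBytes(StandardCharsets.UTF_16BE));
        byteBuffer.flip();

        // The view covers the bytes between position and limit,
        // two bytes per char; nothing is copied.
        return byteBuffer.asCharBuffer();
    }

    public static void main(String[] args) {
        // The question's sample string; \u00e1 is the accented 'a'
        CharBuffer charBuffer = viewAsChars("Hello D\u00e1vid");

        // toString() returns the remaining chars as a String
        System.out.println(charBuffer.toString());
    }
}
```

Note that getBytes() still creates a heap byte[] on the writing side; the sketch only demonstrates that the char view reads the text back correctly once the bytes are stored as UTF-16.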

One note about your printing loop: CharBuffer.length() is the number of chars remaining between the buffer's position and its limit, so it shrinks every time you call the relative CharBuffer.get(). That is why only part of the buffer gets printed. Either use the absolute get(int) or change the loop's termination condition to limit().
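To make the difference concrete, here is a small sketch (the class and method names are made up for this example) that runs the question's loop shape and a corrected one over the same six chars.

```java
import java.nio.CharBuffer;

public class CharBufferLoop {

    // Same loop shape as in the question: length() shrinks with every
    // relative get(), so the loop stops about halfway through.
    static String readWithShrinkingLength(CharBuffer charBuffer) {
        StringBuilder sb = new StringBuilder();
        for (int i = charBuffer.position(); i < charBuffer.length(); i++) {
            sb.append(charBuffer.get());
        }
        return sb.toString();
    }

    // Corrected loop: absolute get(int) leaves the position alone,
    // and limit() is a stable upper bound.
    static String readAll(CharBuffer charBuffer) {
        StringBuilder sb = new StringBuilder();
        for (int i = charBuffer.position(); i < charBuffer.limit(); i++) {
            sb.append(charBuffer.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(readWithShrinkingLength(CharBuffer.wrap("Hello!"))); // prints "Hel"
        System.out.println(readAll(CharBuffer.wrap("Hello!")));                 // prints "Hello!"
    }
}
```

An equivalent alternative is to keep the relative get() and loop with while (charBuffer.hasRemaining()).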