java java-ffm

What is the difference between `JAVA_LONG` and `JAVA_LONG_UNALIGNED` in `java.lang.foreign.ValueLayout`?


The java.lang.foreign.ValueLayout API, finalized in Java 22, provides a convenient way to declare the layout of manually managed memory so you don't need to do byte arithmetic on reads and writes. There are layouts available for all the common primitive types, and many have an _UNALIGNED variant. Taking JAVA_LONG as an example, here is how it is described:

A value layout constant whose size is the same as that of a Java long, (platform-dependent) byte alignment set to ADDRESS.byteSize(), and byte order set to ByteOrder.nativeOrder().

Here is how JAVA_LONG_UNALIGNED is described:

An unaligned value layout constant whose size is the same as that of a Java long and byte order set to ByteOrder.nativeOrder().

These descriptions sound very similar to me, so I'm not sure what the difference is. However, I have noticed one difference in practice: _UNALIGNED variants do not support the atomic VarHandle.compareAndSet APIs, as also documented in this answer.

What is the benefit of using _UNALIGNED, then? Is it only really relevant if you are dealing with memory that some other application laid out that way, and if you're allocating the memory yourself, should you always use the aligned memory layout?


Solution

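  • Alignment means that the address of a value must be an exact multiple of a given size, typically the width of the data type -- 8 bytes for a long on a 64-bit platform. JAVA_LONG carries that constraint; JAVA_LONG_UNALIGNED has a byte alignment of 1, so it can be read or written at any address. Aligned access is generally more efficient and allows more complex operations, like compareAndSet as you have noticed.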

    However, achieving alignment can sometimes mean you are "wasting" memory. Imagine that you need to store records that each contain one byte and one long, a total of 9 bytes. To achieve aligned access, each record would need 16 bytes, with 7 bytes wasted to make sure the long is on an address that's a multiple of 8 bytes. Or, for example, you may be working with UTF-8, where code points can take between one and four bytes, and you may want to be clever and grab four bytes at a time into an int -- but code points don't always start on an address that's a multiple of four.

    What's appropriate depends on your application and on how exactly your data is organized in memory.
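The byte-plus-long case above can be sketched roughly as follows. This is only an illustration, assuming Java 22 or later; the class name AlignmentDemo, the field names "tag" and "value", and the helper methods are hypothetical and not part of the API. It builds a packed 9-byte layout and a padded 16-byte layout, then shows that a long at offset 1 is accessible through JAVA_LONG_UNALIGNED while the same access through JAVA_LONG is rejected.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;

// Hypothetical demo class (not part of the JDK); assumes Java 22 or later.
final class AlignmentDemo {

    // One byte followed immediately by a long: 9 bytes, the long sits at offset 1.
    static final StructLayout PACKED = MemoryLayout.structLayout(
            ValueLayout.JAVA_BYTE.withName("tag"),
            ValueLayout.JAVA_LONG_UNALIGNED.withName("value"));

    // Same fields, with 7 explicit padding bytes so the long sits at offset 8.
    static final StructLayout PADDED = MemoryLayout.structLayout(
            ValueLayout.JAVA_BYTE.withName("tag"),
            MemoryLayout.paddingLayout(7),
            ValueLayout.JAVA_LONG.withName("value"));

    // Writes a long at offset 1 using the unaligned layout and reads it back.
    static long roundTripUnaligned(long value) {
        try (Arena arena = Arena.ofConfined()) {
            // 16 bytes whose base address is a multiple of 8.
            MemorySegment segment = arena.allocate(16, 8);
            segment.set(ValueLayout.JAVA_LONG_UNALIGNED, 1, value);
            return segment.get(ValueLayout.JAVA_LONG_UNALIGNED, 1);
        }
    }

    // Returns true if writing at offset 1 with the aligned layout is rejected.
    static boolean alignedAccessRejected() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(16, 8);
            try {
                segment.set(ValueLayout.JAVA_LONG, 1, 42L);
                return false;
            } catch (IllegalArgumentException e) {
                // Offset 1 violates JAVA_LONG's alignment constraint.
                return true;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("JAVA_LONG alignment:           "
                + ValueLayout.JAVA_LONG.byteAlignment());
        System.out.println("JAVA_LONG_UNALIGNED alignment: "
                + ValueLayout.JAVA_LONG_UNALIGNED.byteAlignment());
        System.out.println("packed record size: " + PACKED.byteSize());
        System.out.println("padded record size: " + PADDED.byteSize());
        System.out.println("unaligned round trip: " + roundTripUnaligned(42L));
        System.out.println("aligned access at offset 1 rejected: "
                + alignedAccessRejected());
    }
}
```

The packed layout saves 7 bytes per record at the cost of giving up aligned access to the long, which is the trade-off described above.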