I have a need to construct a MethodHandle
with signature (InternalContext)->Object[]
from a List<MethodHandle>
each with the signature (InternalContext)->Object
.
If the list is small <253 elements I can do something like
static MethodHandle makeArrayUsingCollector(List<MethodHandle> elements) {
// (Object[]) -> Object[]
var handle = MethodHandles.identity(Object[].class);
// (Object,Object...Object) -> Object[]
handle = handle.asCollector(Object[].class, elements.size());
// (InternalContext,InternalContext..InternalContext) -> Object[]
handle = MethodHandles.filterArguments(handle, 0, elements.toArray(new MethodHandle[0]));
// (InternalContext) -> Object[]
handle = MethodHandles.permuteArguments(
handle, methodType(Object[].class, InternalContext.class), new int[elements.size()]);
return handle;
}
However, if the list is larger the asCollector()
call will fail since the resulting handle will have too many parameters.
So I tried something like this:
private static final int MAX_ARITY = 255;
static MethodHandle buildLargeArrayNaive(List<MethodHandle> elementFactories) {
if (elementFactories.size() < MAX_ARITY) {
return makeArrayUsingCollector(elementFactories);
}
var setter = MethodHandles.arrayElementSetter(Object[].class);
// (Object[], InternalContext) -> void
MethodHandle handle = null;
for (int i = 0; i < elementFactories.size(); i++) {
// (Object[], InternalContext) -> void
var setElement =
MethodHandles.filterArguments(
MethodHandles.insertArguments(setter, 1, i), 1, elementFactories.get(i));
if (handle == null) {
handle = setElement;
} else {
handle = MethodHandles.foldArguments(setElement, handle);
}
}
// (Object[], InternalContext) -> Object[]
handle =
MethodHandles.foldArguments(
MethodHandles.dropArguments(
MethodHandles.identity(Object[].class), 1, InternalContext.class),
handle);
return MethodHandles.foldArguments(
handle,
MethodHandles.insertArguments(
MethodHandles.arrayConstructor(Object[].class), 0, elementFactories.size()));
}
Which basically constructs an array explicitly and then sets each element combining everything with foldArguments
This works, but the resulting MethodHandle
produces StackOverflowError
when constructing large arrays (say 100K elements). From the docs on foldArguments
this makes sense since each call sets up a 'tail call' structure.
If i was generating literal bytecode to do this I would probably generate a method with as many aastore
instructions as could fit and then set up a kind of 'tree' structure to call all these methods.
static makeArray(InternalContext ctx) {
var a = new Object[<num-elements];
fillArray0(a, context)
fillArray1(a, context)
...
return a;
}
static void fillArray0(Object[] a, InternalContext context) {
a[0] = <element-factory>(context);
...
}
I did discover i could solve this with recursion
static MethodHandle buildLargeArrayRecursive(List<MethodHandle> elementFactories) {
// (Object[], InternalContext) -> void
var handle = doBuildLargeArrayRecursive(0, elementFactories);
// (Object[], InternalContext) -> Object[]
handle =
MethodHandles.foldArguments(
MethodHandles.dropArguments(
MethodHandles.identity(Object[].class), 1, InternalContext.class),
handle);
;
return MethodHandles.foldArguments(
handle,
MethodHandles.insertArguments(
MethodHandles.arrayConstructor(Object[].class), 0, elementFactories.size()));
}
static final MethodHandle ARRAY_SETTER = MethodHandles.arrayElementSetter(Object[].class);
static MethodHandle doBuildLargeArrayRecursive(int offset, List<MethodHandle> elementFactories) {
int size = elementFactories.size();
if (size == 0) {
return MethodHandles.empty(methodType(void.class, Object[].class, InternalContext.class));
}
if (size == 1) {
return MethodHandles.filterArguments(
MethodHandles.insertArguments(ARRAY_SETTER, 1, offset), 1, elementFactories.get(0));
}
int half = size / 2;
var left = elementFactories.subList(0, half);
var right = elementFactories.subList(half, size);
return MethodHandles.foldArguments(
doBuildLargeArrayRecursive(offset + half, right), doBuildLargeArrayRecursive(offset, left));
}
which seems to scale to at least 100K elements. But clearly this will consume O(logN) stack space.. that is probably fine but this is also a lot of method calls...
A few things occurred while staring at this:
Should i create helper 'bulk-set' methods to handle more base cases? e.g. static void set4(Object[], int offset, Object o1...Object o3){...}
could maybe help reduce recursion? Or should i just rely on the VM to do this via inlining?
I see that Setting array elements is an 'intrinsic' operation that compiles to an aastore
instruction, is there some way to combine things so that an arbitrary number of those could be in the same LambdaForm
?
Would it be better to construct small arrays and patch them together using System.arraycopy
?
Using an adhoc JMH benchmark to try different ideas
normalLoop
This scenario creates a simple method
static Object[] makeFromHandles(MethodHandle[] handles, Context context) throws Throwable {
Object[] array = new Object[handles.length];
for (int i = 0; i < handles.length; i++) {
array[i] = (Object) handles[i].invokeExact(context);
}
return array;
}
then binds the delegate handles to it. This definitely wins for simplicity and in theory this loop will be trivial for the VM to optimize (the handles
parameter will be a constant according to the VM)
asCollector
asCollector
approach described in the OP but only works for small numbers of parameters. The second simplest and probably the most intuitive solution.recursive
foldArguments
approach described by the OP. Scales easily and is not too complex.recursiveWithHelpers
set2
, set4
, set8
and set16
. Then it breaks the larger list down by powers of 2 to target those helpers using foldArguments
to manage the rest. Definitely more complicated but perhaps is useful by making the tree produced by recursive
shallower?arrayCopySetter
asCollector
approach to build up small arrays and then uses System.arraycopy
to merge them into the target array.asCollector
approach.According to the benchmark results all these approaches are actually quite similar. Here is a snippet of the results... (see the gist for full results/code)
Benchmark (size) Mode Cnt Score Error Units
ObjectArrayBenchmarks.arrayCopySetter 10 thrpt 2 55.668 ops/us
ObjectArrayBenchmarks.arrayCopySetter 100 thrpt 2 3.549 ops/us
ObjectArrayBenchmarks.arrayCopySetter 1000 thrpt 2 0.156 ops/us
ObjectArrayBenchmarks.asCollector 10 thrpt 2 59.297 ops/us
ObjectArrayBenchmarks.asCollector 100 thrpt 2 1.742 ops/us
ObjectArrayBenchmarks.normalLoop 10 thrpt 2 35.753 ops/us
ObjectArrayBenchmarks.normalLoop 100 thrpt 2 3.273 ops/us
ObjectArrayBenchmarks.normalLoop 1000 thrpt 2 0.143 ops/us
ObjectArrayBenchmarks.recursive 10 thrpt 2 60.146 ops/us
ObjectArrayBenchmarks.recursive 100 thrpt 2 4.161 ops/us
ObjectArrayBenchmarks.recursive 1000 thrpt 2 0.220 ops/us
ObjectArrayBenchmarks.recursiveWithHelpers 10 thrpt 2 59.053 ops/us
ObjectArrayBenchmarks.recursiveWithHelpers 100 thrpt 2 4.137 ops/us
ObjectArrayBenchmarks.recursiveWithHelpers 1000 thrpt 2 0.201 ops/us
So the recursive
approach seems like a narrow winner. The handwritten 'helpers' do not pay for themselves. But it is interesting to see that the arrayCopySetter
approach is so competitive, I'm guessing the JVM can easily leverage escape analysis to eliminate the allocations for the temporaries and maybe even fuse the copies. normalLoop
is appealing but fails for basic 'megamorphic MethodHandle' reasons.
I suppose the one reason that the normalLoop
approach might be desirable is simply due to code size. With the other solutions the size of the resulting MethodHandle
is proportional to the number of input handles, but with normalLoop
it is essentially O(1) overhead