Let's say I have a large IStreamMap on a large cluster and I only want to run an operation on a few keys. I could just write a filter expression as shown below, but my understanding is that this will run on all nodes, so 99% of the nodes will be forced to stream the map even though ultimately nothing comes out of it. Is there a way to get the Hazelcast Jet cluster to run the operation ONLY on the nodes that own those keys? The code below ought to work, but I don't think it's efficient. (In my case, I might be running this operation many times on large distributed maps, so I would not want every node to execute it if I can tell ahead of time that 99% of the nodes are not relevant to the selected keys.)
final IStreamMap<String, Integer> streamMap = instance1.getMap("source");
// stream of entries, you can grab keys from it
streamMap.stream()
.filter(key -> key == 1 || key == 9999999)
.forEach(key -> <do something interesting>);
IStreamMap was removed from Hazelcast Jet three years ago, I think. You should use Jet through its Pipeline API.
You can try using a map source with a predicate:
Pipeline p = Pipeline.create();
BatchStage<Entry<K, V>> stage = p.readFrom(Sources.map("name",
(Map.Entry<K, V> mapEntry) -> myCondition(mapEntry),
e -> e));
This will still scan the entire map, though. If you simply have a set of keys you're interested in, then perhaps a better match for your use case is IMap.executeOnKeys().
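Here is a minimal sketch of what the executeOnKeys() approach could look like. It assumes Hazelcast 4.x or later, where EntryProcessor takes three type parameters and can be written as a lambda, and it runs a single embedded member for illustration. The class name, the sample entries, and the "double the value" operation are placeholders I made up, not anything from your setup. As far as I understand it, the entry processor is dispatched only to the partitions that own the given keys, rather than being run against the whole map.

```java
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.map.IMap;

import java.util.Map;
import java.util.Set;

public class ExecuteOnKeysSketch {
    public static void main(String[] args) {
        // Assumption: one embedded member, just to keep the sketch self-contained.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        try {
            IMap<String, Integer> map = hz.getMap("source");

            // Hypothetical sample data.
            map.put("1", 10);
            map.put("9999999", 20);
            map.put("other", 30);

            // The keys we care about; other entries are not touched.
            Set<String> keys = Set.of("1", "9999999");

            // The lambda is an EntryProcessor<String, Integer, Integer>.
            // It reads each selected entry and returns a result to the caller
            // (placeholder operation: double the value).
            Map<String, Integer> results = map.executeOnKeys(keys, entry -> {
                Integer value = entry.getValue();
                return value == null ? null : value * 2;
            });

            results.forEach((k, v) -> System.out.println(k + " -> " + v));
        } finally {
            hz.shutdown();
        }
    }
}
```

If you need to modify the entries rather than just read them, call entry.setValue(...) inside the processor. On a real multi-member cluster, the processor class also has to be available on the members' classpath.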