java, stream, garbage-collection, chronicle, chronicle-map

Chronicle Map produces garbage when used with the Streams API


Today I was experimenting with Chronicle Map. Here is a code sample:

package experimental;

import net.openhft.chronicle.core.values.IntValue;
import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.values.Values;

public class Tmp {

    public static void main(String[] args) {

        try (ChronicleMap<IntValue, User> users = ChronicleMap
                .of(IntValue.class, User.class)
                .name("users")
                .entries(100_000_000)
                .create();) {

            User user = Values.newHeapInstance(User.class);
            IntValue id = Values.newHeapInstance(IntValue.class);

            for (int i = 1; i < 100_000_000; i++) {

                user.setId(i);
                user.setBalance(Math.random() * 1_000_000);

                id.setValue(i);
                users.put(id, user);

                if (i % 100 == 0) {
                    System.out.println(i + ". " +
                            users.values()
                                    .stream()
                                    .max(User::compareTo)
                                    .map(User::getBalance)
                                    .get());
                }
            }
        }
    }

    public interface User extends Comparable<User> {

        int getId();
        void setId(int id);
        double getBalance();
        void setBalance(double balance);

        @Override
        default int compareTo(User other) {
            return Double.compare(getBalance(), other.getBalance());
        }
    }
}

As you can see in the code above, I create a User object, put it into the Chronicle Map, and after every 100th record print the maximum balance. Unfortunately, this produces garbage. When I monitored it with VisualVM, I got the following:

[VisualVM screenshot]

It seems that using streams with Chronicle Map produces garbage regardless.

So my questions are:
* Does this mean that I should not use the Streams API with Chronicle Map?
* Are there any other solutions/ways of doing this?
* What is the proper way to filter/search a Chronicle Map? I have use cases other than just putting data in and getting it out.


Solution

  • ChronicleMap's entrySet().iterator() (as well as the iterators of keySet() and values()) is implemented so that it dumps all objects in a Chronicle Map segment into memory before iterating over them.

    You can check how many segments you have by calling map.segments(). You can also configure the number of segments when constructing the ChronicleMap; see the ChronicleMapBuilder Javadoc.

    So, during iteration, you should expect approximately numEntries / numSegments entries to be dumped into memory at once, repeatedly, where numEntries is the size of your Chronicle Map.
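
As a rough illustration of that estimate, the sketch below computes the approximate batch size for the 100,000,000-entry map from the question. The segment count of 1,024 is a made-up figure used only for the arithmetic; the real value is whatever users.segments() reports. The class and method names are hypothetical, and the calculation assumes entries are spread evenly across segments.

```java
public class SegmentEstimate {

    // Approximate number of entries materialized at once during iteration,
    // assuming entries are distributed evenly across segments.
    public static long entriesPerSegment(long numEntries, int numSegments) {
        if (numSegments <= 0) {
            throw new IllegalArgumentException("numSegments must be positive");
        }
        return numEntries / numSegments;
    }

    public static void main(String[] args) {
        long numEntries = 100_000_000L; // map size from the question
        int numSegments = 1_024;        // hypothetical; use users.segments() in practice
        System.out.println(entriesPerSegment(numEntries, numSegments)); // prints 97656
    }
}
```

With these assumed numbers, each step of the iteration would put on the order of 100,000 User objects on the heap, which fits the kind of sawtooth allocation pattern a profiler would show.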

    You can process a Chronicle Map in a streaming fashion without creating much garbage by reusing objects via the Segment Context API:

        // Requires imports: net.openhft.chronicle.map.MapEntry,
        // net.openhft.chronicle.map.MapSegmentContext
        User[] maxUser = new User[1];
        for (int i = 0; i < users.segments(); i++) {
            // Iterate one segment at a time; the context is released on close.
            try (MapSegmentContext<IntValue, User, ?> c = users.segmentContext(i)) {
                c.forEachSegmentEntry((MapEntry<IntValue, User> e) -> {
                    User user = e.value().get();
                    if (maxUser[0] == null || user.compareTo(maxUser[0]) > 0) {
                        // Note that you cannot just assign `maxUser[0] = user`:
                        // this object will be reused by the SegmentContext later
                        // in the iteration, and its contents will be rewritten.
                        // Check out the doc for Data.get().
                        if (maxUser[0] == null) {
                            maxUser[0] = Values.newHeapInstance(User.class);
                        }
                        // Copy the current value into the holder we own.
                        User newMaxUser = e.value().getUsing(maxUser[0]);
                        // assert the object is indeed reused
                        assert newMaxUser == maxUser[0];
                    }
                });
            }
        }
    

    See the Javadoc for Data.get().

    The code of the above example is adapted from here.
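
The warning in the code comment about not assigning the reused object directly can be reproduced without Chronicle Map. The sketch below is plain Java and does not use any Chronicle API: Balance and forEachReusing are hypothetical stand-ins for a reused value object and an iteration that hands the same instance to the callback for every entry. It only illustrates why copying into an object you own is needed.

```java
import java.util.function.Consumer;

public class ReusedObjectPitfall {

    // Hypothetical mutable holder standing in for a reused value object.
    static final class Balance {
        double amount;
    }

    // Hypothetical iteration that passes the SAME instance to the callback
    // for every entry, overwriting its contents each time.
    static void forEachReusing(double[] amounts, Consumer<Balance> action) {
        Balance shared = new Balance();
        for (double a : amounts) {
            shared.amount = a;
            action.accept(shared);
        }
    }

    // Wrong: keeps a reference to the shared instance, so the result is
    // whatever entry was visited last. Assumes a non-empty array.
    public static double maxByReference(double[] amounts) {
        Balance[] max = new Balance[1];
        forEachReusing(amounts, b -> {
            if (max[0] == null || b.amount > max[0].amount) {
                max[0] = b;
            }
        });
        return max[0].amount;
    }

    // Right: copy the contents into a holder we own, allocated once.
    // Returns 0.0 for an empty array.
    public static double maxByCopy(double[] amounts) {
        Balance max = new Balance();
        boolean[] seen = {false};
        forEachReusing(amounts, b -> {
            if (!seen[0] || b.amount > max.amount) {
                max.amount = b.amount;
                seen[0] = true;
            }
        });
        return max.amount;
    }

    public static void main(String[] args) {
        double[] amounts = {10.0, 50.0, 20.0};
        System.out.println(maxByReference(amounts)); // 20.0, the last entry visited
        System.out.println(maxByCopy(amounts));      // 50.0, the true maximum
    }
}
```

In the Segment Context example, getUsing(maxUser[0]) plays the role of the copy in maxByCopy: the data is written into an object the caller owns, so later iterations cannot overwrite it.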