java, neo4j, chronicle-map

What would be a proper way to replace Roaring64NavigableMap with Chronicle-Map in Java?


I have code in a Neo4j plugin that uses Roaring64NavigableMap to store the long IDs of nodes, obtained via getId() from the Neo4j API.
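For context, the current usage looks roughly like this (a minimal sketch; the variable names and the sample id are illustrative, in the plugin the id comes from node.getId()):

import org.roaringbitmap.longlong.Roaring64NavigableMap;

// Current approach (sketch): node ids are kept in an in-memory Roaring bitmap.
Roaring64NavigableMap nodeIds = new Roaring64NavigableMap();
long id = 42L;                      // in the plugin this would come from node.getId()
nodeIds.addLong(id);
boolean alreadySeen = nodeIds.contains(id);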

I would like to use Chronicle-Map. I see this example:

ChronicleSet<Long> ids =
    ChronicleSet.of(Long.class)
        .name("ids")
        .entries(1_000_000)
        .create();
  1. What if I don't know how many values to anticipate? Does .entries(1_000_000) limit the cache or the DB max number of entries?
  2. Is there a way to handle a really large amount of data, around a billion entries?
  3. Is there a more efficient way to create a Chronicle-Map?
  4. Can I control the size of the cache it uses?
  5. Can I control the volume where the DB is stored?

Solution

  • What if I don't know how many values to anticipate? Does .entries(1_000_000) limit the cache or the DB max number of entries?

    From the Javadoc of the entries() method:

    Configures the target number of entries that are going to be inserted into the hash containers created by this builder. If ChronicleHashBuilder.maxBloatFactor(double) is configured to 1.0 (as it is by default), this number of entries is also the maximum. If you try to insert more entries than the configured maxBloatFactor multiplied by the given number of entries, an IllegalStateException might be thrown.

    This configuration should represent the expected maximum number of entries in a stable state, and maxBloatFactor the maximum bloat-up coefficient during exceptional bursts.

    To be more precise: try to configure entries so that the created hash container is going to serve about 99% of requests while being at or below this number of entries in size.

    You shouldn't put an additional margin over the actual target number of entries. This bad practice was popularized by the HashMap.HashMap(int) and HashSet.HashSet(int) constructors, which accept a capacity that should be multiplied by the load factor to obtain the actual maximum expected number of entries. ChronicleMap and ChronicleSet don't have a notion of load factor.

    So this is effectively the maximum number of entries unless you specify maxBloatFactor(2.0) (or 10.0, etc.). Currently, Chronicle Map doesn't support the case "I really don't know how many entries I will have; maybe 1, maybe 1 billion; but I want to create a Map that will grow organically to the required size". This is a known limitation. A small sizing sketch follows.
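    As a rough sketch of that sizing in practice (the set name, the entry count, and the sample id 42L are illustrative placeholders, not values from the question):

    import net.openhft.chronicle.set.ChronicleSet;

    // Size for the expected steady-state number of node ids and allow the set
    // to bloat up to ~2x that during exceptional bursts via maxBloatFactor.
    ChronicleSet<Long> nodeIds = ChronicleSet
        .of(Long.class)
        .name("node-ids")         // placeholder name
        .entries(1_000_000)       // expected entries in a stable state
        .maxBloatFactor(2.0)      // tolerate bursts up to 2x the configured entries
        .create();

    nodeIds.add(42L);             // a node id as returned by getId()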

  • Is there a way to handle a really large amount of data, around a billion entries?

    Yes, if you have a sufficient amount of memory. Although it is memory-mapped, Chronicle Map is not designed to work efficiently when the amount of data is significantly larger than memory. Use LMDB, RocksDB, or something similar in that case.
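    If the data does fit in memory, one possible approach (sketched here with an assumed file path and entry count) is to back the set with a memory-mapped file via the builder's createPersistedTo instead of create:

    import java.io.File;
    import net.openhft.chronicle.set.ChronicleSet;

    // Sketch only: persist the set to a memory-mapped file on a volume with
    // enough space. createPersistedTo throws IOException, so declare or catch it.
    ChronicleSet<Long> nodeIds = ChronicleSet
        .of(Long.class)
        .name("node-ids")                 // placeholder name
        .entries(1_000_000_000L)          // target ~1 billion entries
        .createPersistedTo(new File("/data/node-ids.dat"));  // placeholder path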