javacollectionssortedset

Java SortedSet Implementation that allows you to get smallest/largest element in O(1)


is there a sorted set implementation that allows one to get the first element in O(1)? C++'s std::set can do this, so I don't see why we can't do this in Java. Thanks!


Solution

  • I presume you've seen the first and last methods:

    E first()
    

    Returns the first (lowest) element currently in this set.

    E last()
    

    Returns the last (highest) element currently in this set.

    The Java API docs don't say what their Big-O performance is. They could conceivably be O(1). SortedSet is an interface, so it depends on the concrete implementation you use.

    Tree set

    SortedSets are usually TreeSets, so in practice they are O(log n).

    TreeSet leverages TreeMap, and TreeMap doesn't cache the first and last nodes. When you call first or last, or create an iterator, it starts at the root and traverses down and left (or down and right) to find the target node. That takes O(log n) time.

    See the OpenJDK source:

    /**
     * Returns the first Entry in the TreeMap (according to the TreeMap's
     * key-sort function).  Returns null if the TreeMap is empty.
     */
    final Entry<K,V> getFirstEntry() {
        Entry<K,V> p = root;
        if (p != null)
            while (p.left != null)
                p = p.left;
        return p;
    }
    
    /**
     * Returns the last Entry in the TreeMap (according to the TreeMap's
     * key-sort function).  Returns null if the TreeMap is empty.
     */
    final Entry<K,V> getLastEntry() {
        Entry<K,V> p = root;
        if (p != null)
            while (p.right != null)
                p = p.right;
        return p;
    }
    

    Practically speaking, O(log n) is pretty close to O(1). In a sorted set with a million elements it would only take 20 traversals to find the first/last node (log2 1,000,000 ≈ 20).

    Heap or priority queue

    If you want true O(1) performance then a heap—i.e., a PriorityQueue—is a better data structure than a tree. A heap doesn't maintain the entire set of elements in sorted order. Yet you can always get the head element in O(1) time. Removing the head takes O(log n) time, after which the new head is available for instant lookup.

    Skip list

    There is also a lesser used ConcurrentSkipListSet class. Per Wikipedia:

    A skip list is a probabilistic data structure that allows O(log n) search complexity as well as O(log n) insertion complexity within an ordered sequence of n elements. Thus it can get the best features of a sorted array (for searching) while maintaining a linked list-like structure that allows insertion, which is not possible with a static array.

    Finding the first element is O(1). There's a loop but it only loops more than once if it needs to clean up nodes that have been marked for deletion. It returns immediately when it doesn't have any deletions to process. See the OpenJDK source:

    /**
     * Gets first valid node, unlinking deleted nodes if encountered.
     * @return first node or null if empty
     */
    final Node<K,V> findFirst() {
        Node<K,V> b, n;
        if ((b = baseHead()) != null) {
            while ((n = b.next) != null) {
                if (n.val == null)
                    unlinkNode(b, n);
                else
                    return n;
            }
        }
        return null;
    }
    

    Finding the last element, on the other hand, is... um... not O(1).

    /**
     * Specialized version of find to get last valid node.
     * @return last node or null if empty
     */
    final Node<K,V> findLast() {
        outer: for (;;) {
            Index<K,V> q; Node<K,V> b;
            VarHandle.acquireFence();
            if ((q = head) == null)
                break;
            for (Index<K,V> r, d;;) {
                while ((r = q.right) != null) {
                    Node<K,V> p;
                    if ((p = r.node) == null || p.val == null)
                        RIGHT.compareAndSet(q, r, r.right);
                    else
                        q = r;
                }
                if ((d = q.down) != null)
                    q = d;
                else {
                    b = q.node;
                    break;
                }
            }
            if (b != null) {
                for (;;) {
                    Node<K,V> n;
                    if ((n = b.next) == null) {
                        if (b.key == null) // empty
                            break outer;
                        else
                            return b;
                    }
                    else if (n.key == null)
                        break;
                    else if (n.val == null)
                        unlinkNode(b, n);
                    else
                        b = n;
                }
            }
        }
        return null;
    }