data-structuresbinary-treered-black-treesortedset

Rope and self-balancing binary tree hybrid? (i.e Sorted set with fast n-th element lookup)


Is there a data structure for a sorted set allows quick lookup of the n-th (i.e. the least but n-th) item? That is, something like a a hybrid between a rope and a red-black tree.

Seems like it should be possible to either keep track of the size of the left subtree and update it through rotations or do something else clever and I'm hoping someone smart has already worked this out.


Solution

  • Seems like it should be possible to either keep track of the size of the left subtree and update it through rotations […]

    Yes, this is quite possible; but instead of keeping track of the size of the left subtree, it's a bit simpler to keep track of the size of the complete subtree rooted at a given node. (You can then get the size of its left subtree by examining its left-child's size.) It's not as tricky as you might think, because you can always re-calculate a node's size as long as its children are up-to-date, so you don't need any extra bookkeeping beyond making sure that you recalculate sizes by working your way up the tree.

    Note that, in most mutable red-black tree implementations, 'put' and 'delete' stop walking back up the tree once they've restored the invariants, whereas with this approach you need to walk all the way back up the tree in all cases. That'll be a small performance hit, but at least it's not hard to implement. (In purely functional red-black tree implementations, even that isn't a problem, because those always have to walk the full path back up to create the new parent nodes. So you can just put the size-calculation in the constructor — very simple.)


    Edited in response to your comment:

    I was vaguely hoping this data structure already had a name so I could just find some implementations out there and that there was something clever one could do to minimize the updating but (while I can find plenty of papers on data structures that are variations of balanced binary trees) I can't figure out a good search term to look for papers that let one lookup the nth least element.

    The fancy term for the nth smallest value in a collection is order statistic; so a tree structure that enables fast lookup by order statistic is called an order statistic tree. That second link includes some references that may help you — not sure, I haven't looked at them — but regardless, that should give you some good search terms. :-)