javaalgorithm

Performance of frequently reordering a short list


I have an application where I have a Rule object that has a list of Filter predicates. I will be comparing many Events to the predicates, and want to know if they all match. Most Events will not match against the Rule, so I want to fail as fast as possible.

I figure, if I reorder the list so that the most-recently failing Filter is first, the Filters that are low-value will naturally sink down to the bottom of the list. Each Filter will hold a task roughly equivalent to a case-insensitive string comparison, so I think reordering the list will be more performant than doing an unnecessary test.

I know from studying Big-O complexity that a Linked List is the correct solution here for algorithmic complexity. However, given the fact that I expect this list to be somewhere in the range of 5-50 elements long, I have a suspicion that storing the Filters in an ArrayList will actually be the most performant, as that forces the list to be stored in one block of physical memory, instead of spread across the heap.

Is there some other data structure that would be even more performant for this task?

public class Rule {
  private List<Predicate<Event>> filters;

  public boolean doesEventMatch(final Event event) {
    for (int i = 0; i < filters.size(); i++) {
      if (!filter.test(event)) {
        // remove the active filter and insert it at the front of the list
        if (i != 0) {
          final Predicate<Event> activeFilter = filters.remove(i);
          filters.add(0, activeFilter);
        }

        return false;
      }
    }
    return true;
  }
}

Solution

  • Imagine your list has 20 filters in it, and you repeatedly reorder filter at index 0 and filter at index 1. This sounds quite plausible.

    Your operation will end up performing 18 'move steps' (to remove(1), ArrayList's impl moves elements 2-19 down by 1), and then 19 'move steps' (to add(0, ..), the impl moves 0-18 up by 1).

    Whereas flipping element 0 and element 1 around, which is all that you really want, would be faster.

    Your instincts are highly likely correct. LinkedList is essentially always wrong. There's always some data type (usually, but not always, ArrayList) that will do better. Regardless of the algorithmic complexity: Computers aren't von neumann machines; they operate almost nothing like it. Hence, the algorithmic complexity stuff only works on a large enough scale (in fairness to the 'point' of algorithmic complexity, that's what it is about: It just means eventually, with enough input data, the performance characteristic will coalesce into this formula. With the 'eventually' doing a lot of work.

    But, we're getting ahead of ourselves.

    The golden rule

    Because computers aren't von neumann machines and java VMs also aren't, it is completely ridiculous to imagine you can guess what the performance is going to be. It's just stupid at this point. Don't, ever, do it.

    If you want to know if something is more or less performant than an alterative, measure it. You use JMH to set up a test (Don't time it yourself, you'll do it wrong, use JMH, it knows the ways to manipulate a VM to give you reliable timing info), and compare both implementations. If you don't do that, you are just fumbling about in the dark and have no idea what you are doing. It's reminiscent of the 4 humours kind of 'science'. Just pontificating without an iota of reality.

    ... ah, but what do I compare?

    I'd go with 3 options which sound like plausible strategies:

    1. What you have now.
    2. Don't reorder.

    And stop there, run with JMH. Because it's somewhat likely you end up with 'no significant difference' and thus there's no further need to write more code.

    Your JMH code will need to emulate a realistic load, which means you have to answer a bunch of questions. What is a realistic load. Do you have real data? How often does the 'low value filter' thing actually come up?

    But, if the difference is significant enough, or you feel like you really wanna try and make things go faster:

    1. Write a custom method that implements 'move to front'.

    Unfortunately, to efficiently do the move to front thing, you need access to the backing array of arraylist which you do not have. Hence, you'd have to copy/paste ArrayList. Which you can; it's not magic, and the source is available.

    Then, it would look like this:

    public void moveToFront(int idx) {
      if (idx == 0) return;
      Object front = array[idx];
      System.arraycopy(array, 0, idx - 1, array, 1, idx - 1);
      array[0] = front;
    }