javasortingcomparatortransitivity

How to write a transitive comparator when "equality" implies "order doesn't matter"?


I have a set of items that are serialized to a file. Some items can rely on other items, but circular references are not allowed. Thus, they need to be serialized in a way such that if A relies on B, B is serialized first in the file.

I wrote my Comparator that uses a reliesOn() function to determine if two items are linked:

Collections.sort(itemsToSort, new Comparator<Item>() {
    @Override
    public int compare(Item first, Item second) {
        boolean firstReliesOnSecond = reliesOn(first, second);
        if (firstReliesOnSecond) {
            return 1;
        }
        boolean secondReliesOnFirst = reliesOn(second, first);
        if (secondReliesOnFirst) {
            return -1;
        }
        return 0;
    }
});

This works for some cases, but not all. In debugging, it's obvious that the sort relies on the transitive nature of the Comparator, and understandably does not compare every possible pairing of items.

For example, with five items A through E, if:

A -> B
B -> E
C
D
E

Then one possible ordering would be:

E, B, A, C, D

At the very least, E comes before B, and B comes before A.

However, during the comparison phase (paraphrasing as an example), what happens is that C is compared to E, returning 0 because they have no relation. And then C is compared to B, and also returns 0.

And as a result, the sorting algorithm assumes B = E, which is not the case. (Even though I broke the Comparator contract.) How can I write my compare() method in a way that ensures transitivity?

Edit: It was pointed out that I'm performing a topological sort on a directed acyclic graph. I'm having flashbacks to my Data Structures course. Fortunately Wikipedia seems to have a good linear-time algorithm for performing this sort - I'll give it a shot.


Solution

  • How can I write my compare() method in a way that ensures transitivity?

    As you've discovered the contract of a Comparator forces you to make a decision based on two given objects, while their relation in the overall sort may involve other objects.

    What you have here is a DAG and what you're trying to do is a topological sort. The only way I see that it would be possible to do using a Comparator would be to first do a topological sort, and then using the indexes of the objects in this sorting as keys when implementing the comparator. But then of course there's no need for the comparator since you have already sorted the elements.