javaarraylistsetequalshashcode

Define a object to be unique in a Set, irrespective of order of List elements within the given object


I have a Class B defined as below -->

Class B {
   private List<String> obj1;
   private List<A> obj2;

   // Need to implement hashcode() and equals() methods
}

In my code, I am using below logic to identify duplicate(non-unique) elements of object type B -->

Set<B> setBObjects = new HashSet<>();
for (B tempB : listOfBTypeObjects) {
     boolean isUnique = setBObjects.add(tempB);
     if (!isUnique) {
         // Print non-unique elements
     }
}

Need to find unique B type elements, irrespective of order of list elements in object B e.g.

suppose 1 instance of object B(B1) is as below ->

B1 ->
{
   obj1 -> ["R", "G", "B"]
   obj2 -> [objA1, objA2, objA3]
}

Another instance of object B(B2) is as below ->

B2 ->
{
   obj1 -> ["B", "G", "R"] <<<< order of list is changed
   obj2 -> [objA3, objA2, objA1] <<<< order of list is changed
}

Now, I want B1 and B2 to be same in above set -> add() operation, where we find unique elements. So B1 and B2 will be non-unique.


In order to achieve this how should I define hashcode() and equals() methods for class B?


Solution

  • The entire design is somewhat questionable, but getting to the actual point of your question, which I take to be:

    Given a collection with elements ["B", "G", "R"] and another one with elements ["R", "G", "B"], how would I set them up so that they are considered equal to each other?

    The answer'd be: XOR the hashCodes of all elements (because A XOR B XOR C is the same value as C XOR A XOR B - it is a commutative operation).

    For equality, that's trickier; you define equality to be that the sets contain the same data in arbitrary order. To implement that:

    For each element in A check if it exists in B. If B is a list, that's O(n^2); if it is a set, it's O(n). This is important if you want the algorithms you write with this not to fall off a cliff, performancewise, once inputs gets large, but if the inputs are small (no more than a few thousand per list) it doesn't matter.

    You're going to have to figure out the answer to situations like ["R", "G", "B", "B"] - is that equal to ["B", "R", "G"]? If not, size becomes a factor and for the hashcoding, you run into some trouble (because the hashcode of ["B", "B"] if applying the above algorithm, i.e. just XOR everything together, is 0, i.e. any paired elements eliminate themselves, because x XOR x is 0 for any x), and things would then be a lot simpler if you actually end up storing {"R": 1, "G": 1, "B": 2} instead.