javacollectionssetmutablegeneric-collections

Set Collection for mutable objects in Java


In Java, a set only checks for equality of an object with objects already in the set only at insertion time. That means that if after an object is already present in the set, it becomes equal to another object in the set, the set will keep both equal objects without complaining.

EDIT: For example, consider a simple object, and assume that hashCode and equals are defined following best practices/

class Foo {
    int foo;

    Foo(int a){ foo = a; }
    //+ equals and hashcode based on "foo"
}

Foo foo1 = new Foo(1);
Foo foo2 = new Foo(2);
Set<Foo> set = new HashSet<Foo>();
set.add(foo1);
set.add(foo2);
//Here the set has two unequal elements.
foo2.foo = 1;
//At this point, foo2 is equal to foo1, but is still in the set 
//together with foo1.

How could a set class could be designed for mutable objects? The behavior I would expected would be the following: If at any time one of the objects in the set becomes equal to another object in the set, that object is deleted from the set by the set. Is there one already? Is there a programming language that would make this easier to accomplish?


Solution

  • I don't think this can be reliably done in Java in the general sense. There is no general mechanism for ensuring a certain action on mutation of an object.

    There a few approaches for solutions that may be sufficient for your use case.

    1. Observe elements for changes

    You could try to enforce an observer like construction where your Set class is registered as an Observer to all its items. This implies you'd need to control the types of objects that can be put into the Set (only Observable objects). Furthermore, you'd need to ensure that the Observables notify the observer for every change that can affect hashcode and equals. I don't know of any class like this that exists already. Like Ray mentions below, you'll need to watch out for potential concurrency problems as well. Example:

    package collectiontests.observer;
    
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.Observable;
    import java.util.Observer;
    import java.util.Set;
    
    public class ChangeDetectingSet<E extends Observable> implements Set<E>, Observer {
    
        private HashSet<E> innerSet;
    
        public void update(Observable o, Object arg) {
            innerSet.remove(o);
            innerSet.add((E)o); 
        }
        public int size() {
            return innerSet.size();
        }
        public boolean isEmpty() {
            return innerSet.isEmpty();
        }
        public boolean contains(Object o) {
            return innerSet.contains(o);
        }
        public Iterator<E> iterator() {
            return innerSet.iterator();
        }
        public Object[] toArray() {
            return innerSet.toArray();
        }
        public <T> T[] toArray(T[] a) {
            return innerSet.toArray(a);
        }
        public boolean add(E e) {
            e.addObserver(this);
            return innerSet.add(e);
        }
        public boolean remove(Object o) {
            if(o instanceof Observable){
                ((Observable) o).deleteObserver(this);
            }
            return innerSet.remove(o);
        }
        public boolean containsAll(Collection<?> c) {
            return innerSet.containsAll(c);
        }
        public boolean addAll(Collection<? extends E> c) {
            boolean result = false;
            for(E el: c){
                result = result || add(el);
            }
            return result;
        }
        public boolean retainAll(Collection<?> c) {
            Iterator<E> it = innerSet.iterator();
            E el;
            Collection<E> elementsToRemove = new ArrayList<E>();
            while(it.hasNext()){
                el = it.next();
                if(!c.contains(el)){
                    elementsToRemove.add(el); //No changing the set while the iterator is going. Iterator.remove may not do what we want.
                }
            }
            for(E e: elementsToRemove){
                remove(e);
            }
            return !elementsToRemove.isEmpty(); //If it's empty there is no change and we should return false
        }
        public boolean removeAll(Collection<?> c) {
            boolean result = false;
            for(Object e: c){
                result = result || remove(e);
            }
            return result;
        }
        public void clear() {
            Iterator<E> it = innerSet.iterator();
            E el;
            while(it.hasNext()){
                el = it.next();
                el.deleteObserver(this);
            }
            innerSet.clear();
        }
    }
    

    This incurs a performance hit every time the mutable objects change.

    2. Check for changes when Set is used

    If the objects in your set change often, but the set itself is used rarely, you could try Joe's solution below. He suggests to check whether the Set is still correct whenever you call a method on it. As a bonus, his method will work on any set of objects (no having to limit it to observables). Performance-wise his method would be problematic for large sets or often used sets (as the entire set needs to be checked at every method call).

    Possible implementation of Joe's method:

    package collectiontests.check;
    
    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.HashSet;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Set;
    
    public class ListBasedSet<E> {
    
        private List<E> innerList;
    
        public ListBasedSet(){
            this(null);
        }
    
        public ListBasedSet(Collection<E> elements){
            if (elements != null){
                innerList = new ArrayList<E>(elements);
            } else {
                innerList = new ArrayList<E>();
            }
        }
    
        public void add(E e){
            innerList.add(e);
        }
    
        public int size(){
            return toSet().size();
        }
    
        public Iterator<E> iterator(){
            return toSet().iterator();
        }
    
        public void remove(E e){
            while(innerList.remove(e)); //Keep removing until they are all gone (so set behavior is kept)
        }
    
        public boolean contains(E e){
            //I think you could just do innerList.contains here as it shouldn't care about duplicates
            return innerList.contains(e);
        }
    
        private Set<E> toSet(){
            return new HashSet<E>(innerList);
        }
    }
    

    And another implementation of the check always method (this one based on an existing set). This is the way to go if you want to reuse the existing sets as much as possible.

    package collectiontests.check;
    
    import java.util.Collection;
    import java.util.Comparator;
    import java.util.Iterator;
    import java.util.NavigableSet;
    import java.util.SortedSet;
    import java.util.TreeSet;
    
    public class ChangeDetectingSet<E> extends TreeSet<E> {
    
        private boolean compacting = false;
    
        @SuppressWarnings("unchecked")
        private void compact(){
            //To avoid infinite loops, make sure we are not already compacting (compact also gets called in the methods used here)
            if(!compacting){ //Warning: this is not thread-safe
                compacting = true;
                Object[] elements = toArray();
                clear();
                for(Object element: elements){
                    add((E)element); //Yes unsafe cast, but we're rather sure
                }
                compacting = false;
            }
        }
        @Override
        public boolean add(E e) {
            compact();
            return super.add(e);
        }
        @Override
        public Iterator<E> iterator() {
            compact();
            return super.iterator();
        }
        @Override
        public Iterator<E> descendingIterator() {
            compact();
            return super.descendingIterator();
        }
        @Override
        public NavigableSet<E> descendingSet() {
            compact();
            return super.descendingSet();
        }
        @Override
        public int size() {
            compact();
            return super.size();
        }
        @Override
        public boolean isEmpty() {
            compact();
            return super.isEmpty();
        }
        @Override
        public boolean contains(Object o) {
            compact();
            return super.contains(o);
        }
        @Override
        public boolean remove(Object o) {
            compact();
            return super.remove(o);
        }
        @Override
        public void clear() {
            compact();
            super.clear();
        }
        @Override
        public boolean addAll(Collection<? extends E> c) {
            compact();
            return super.addAll(c);
        }
        @Override
        public NavigableSet<E> subSet(E fromElement, boolean fromInclusive, E toElement, boolean toInclusive) {
            compact();
            return super.subSet(fromElement, fromInclusive, toElement, toInclusive);
        }
        @Override
        public NavigableSet<E> headSet(E toElement, boolean inclusive) {
            compact();
            return super.headSet(toElement, inclusive);
        }
        @Override
        public NavigableSet<E> tailSet(E fromElement, boolean inclusive) {
            compact();
            return super.tailSet(fromElement, inclusive);
        }
        @Override
        public SortedSet<E> subSet(E fromElement, E toElement) {
            compact();
            return super.subSet(fromElement, toElement);
        }
        @Override
        public SortedSet<E> headSet(E toElement) {
            compact();
            return super.headSet(toElement);
        }
        @Override
        public SortedSet<E> tailSet(E fromElement) {
            compact();
            return super.tailSet(fromElement);
        }
        @Override
        public Comparator<? super E> comparator() {
            compact();
            return super.comparator();
        }
        @Override
        public E first() {
            compact();
            return super.first();
        }
        @Override
        public E last() {
            compact();
            return super.last();
        }
        @Override
        public E lower(E e) {
            compact();
            return super.lower(e);
        }
        @Override
        public E floor(E e) {
            compact();
            return super.floor(e);
        }
        @Override
        public E ceiling(E e) {
            compact();
            return super.ceiling(e);
        }
        @Override
        public E higher(E e) {
            compact();
            return super.higher(e);
        }
        @Override
        public E pollFirst() {
            compact();
            return super.pollFirst();
        }
        @Override
        public E pollLast() {
            compact();
            return super.pollLast();
        }
        @Override
        public boolean removeAll(Collection<?> c) {
            compact();
            return super.removeAll(c);
        }
        @Override
        public Object[] toArray() {
            compact();
            return super.toArray();
        }
        @Override
        public <T> T[] toArray(T[] a) {
            compact();
            return super.toArray(a);
        }
        @Override
        public boolean containsAll(Collection<?> c) {
            compact();
            return super.containsAll(c);
        }
        @Override
        public boolean retainAll(Collection<?> c) {
            compact();
            return super.retainAll(c);
        }
        @Override
        public String toString() {
            compact();
            return super.toString();
        }
    }
    

    3. Use Scala sets

    You could cheat and do away with mutable objects (in the sense that instead of mutating, you'd create a new one with one property changed) in your set. You can look at the set in Scala (I thought it was possible to call Scala from Java, but I'm not 100% sure): http://www.scala-lang.org/api/current/scala/collection/immutable/IndexedSeq.html