java collections duplicates set hashset

remove duplicates in Set


How can I avoid inserting duplicate elements in a Set? If I have:

Set<User> user = new HashSet<>();
User user1 = new User("11", "Mark", null, "1");
User user2 = new User("11", "Mark", null, "1");
User user3 = new User("12", "Helen", null, "2");

user.add(user1);
user.add(user2);
Log.d("main_activity_user", "the size is " + user.size());

And the User class is:

public class User {
    public String uid;
    public String name;
    public String pversion;
    public String upicture;

    public User(String uid, String name, String upicture, String pversion) {
        this.uid = uid;
        this.name = name;
        this.upicture = upicture;
        this.pversion = pversion;
    }

    public String get_uid() {
        return uid;
    }

    public String get_name() {
        return name;
    }

    public String get_pversion() {
        return pversion;
    }

    public String get_upicture() {
        return upicture;
    }

    @Override
    public boolean equals(Object obj) {
        User newObj = (User) obj;
        if (this.get_uid().equals(newObj.get_uid()))
            return true;
        else
            return false;
    }
}

Now the Set also stores the duplicates and reports 3 elements instead of two. Why?

I have never used a Set before and I don't understand it. Do I have to override the equals method every time I use a Set? Why? Doesn't it remove duplicates automatically?


Solution

  • As already noted in the comments, your User class needs to honor the equals/hashCode contract by overriding both the equals() and hashCode() methods.

    https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode()

    In your code, you're using a HashSet, which is backed by a HashMap under the hood. A HashMap, in turn, is implemented as an array of buckets, where each entry is placed in a bucket chosen from its key's hashCode(). Different keys may yield the same hash code, so several entries can end up in the same bucket. To find the right entry within a bucket, the HashMap falls back on the key's equals() method. This sequence of operations is performed, for example, when an element is added, retrieved or replaced. Your User class overrides only equals(), so it still inherits Object's hashCode(), which typically returns a different value for each instance: user1 and user2 almost always land in different buckets, and your equals() is never even called.

    Lastly, a bucket is a generic data structure. Up to Java 7, HashMap's buckets were implemented as linked lists; from Java 8 on, a hybrid solution is used in which a bucket that accumulates too many collisions is converted from a linked list into a balanced tree. For more details, read JEP 180.
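
    The role of the two methods described above can be observed with a small experiment. The sketch below is only an illustration: Key is a hypothetical class (not part of your code) whose hash code is passed in by the caller, which deliberately allows breaking the contract in order to show what a HashSet does in each case.

```java
import java.util.HashSet;
import java.util.Set;

class BucketDemo {
    // Hypothetical key whose hash code is supplied by the caller,
    // so the effect of hashCode() on a HashSet can be observed directly.
    static final class Key {
        final String id;
        final int hash;

        Key(String id, int hash) {
            this.id = id;
            this.hash = hash;
        }

        @Override
        public boolean equals(Object o) {
            // Two keys are equal when their ids match; the hash is ignored here.
            return o instanceof Key && id.equals(((Key) o).id);
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    // Adds both keys to a fresh HashSet and reports how many were kept.
    static int sizeAfterAdding(Key first, Key second) {
        Set<Key> set = new HashSet<>();
        set.add(first);
        set.add(second);
        return set.size();
    }

    public static void main(String[] args) {
        // Equal ids, different hash codes (contract broken): both are kept.
        System.out.println(sizeAfterAdding(new Key("11", 1), new Key("11", 2))); // 2
        // Equal ids, same hash code: the second one is recognized as a duplicate.
        System.out.println(sizeAfterAdding(new Key("11", 1), new Key("11", 1))); // 1
        // Different ids, same hash code: a collision, resolved by equals().
        System.out.println(sizeAfterAdding(new Key("11", 1), new Key("12", 1))); // 2
    }
}
```

    The first case is essentially what happens with your User class: the two objects are equal according to equals(), but their hash codes differ, so the set keeps both.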

    This brief explanation shows why it is crucial to define hashCode() and equals() properly: a HashMap relies heavily on both methods.

    https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html

    Here is a proper implementation of your User class, where two users are considered identical if they have the same uid, name, pversion and upicture. If two users should instead be considered identical based on only some of these fields, update the equals() and hashCode() implementations accordingly (both of them must be based on the same fields).

    import java.util.Objects;

    public class User {
        public String uid;
        public String name;
        public String pversion;
        public String upicture;
    
        public User(String uid, String name, String upicture, String pversion) {
            this.uid = uid;
            this.name = name;
            this.upicture = upicture;
            this.pversion = pversion;
        }
    
        public String getUid() {
            return uid;
        }
    
        public String getName() {
            return name;
        }
    
        public String getPversion() {
            return pversion;
        }
    
        public String getUpicture() {
            return upicture;
        }
    
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            User user = (User) o;
            return Objects.equals(uid, user.uid)
                    && Objects.equals(name, user.name)
                    && Objects.equals(pversion, user.pversion)
                    && Objects.equals(upicture, user.upicture);
        }
    
        @Override
        public int hashCode() {
            return Objects.hash(uid, name, pversion, upicture);
        }
    }
    

    Test Main

    import java.util.HashSet;
    import java.util.Set;

    public class Main {
        public static void main(String[] args) {
            Set<User> user = new HashSet<>();
            User user1 = new User("11", "Mark", null, "1");
            User user2 = new User("11", "Mark", null, "1");
            User user3 = new User("12", "Helen", null, "2");
    
            user.add(user1);
            user.add(user2);
            System.out.println("Size is: " + user.size());
        }
    }
    

    Output

    Size is: 1
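
    If, as the equals() in your question suggests, two users should count as the same whenever their uid matches, both methods can be reduced to that single field. A minimal sketch, assuming a hypothetical trimmed-down class UserByUid (only two fields, to keep it short) and a helper distinctCount that is not part of your code:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical variant of User in which only uid determines identity.
class UserByUid {
    final String uid;
    final String name;

    UserByUid(String uid, String name) {
        this.uid = uid;
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        // Only uid takes part in the comparison.
        return Objects.equals(uid, ((UserByUid) o).uid);
    }

    @Override
    public int hashCode() {
        // Derived from the same field as equals().
        return Objects.hashCode(uid);
    }

    // Illustrative helper: number of distinct users according to equals()/hashCode().
    static int distinctCount(UserByUid... users) {
        Set<UserByUid> set = new HashSet<>(Arrays.asList(users));
        return set.size();
    }

    public static void main(String[] args) {
        int count = distinctCount(
                new UserByUid("11", "Mark"),
                new UserByUid("11", "Marco"),
                new UserByUid("12", "Helen"));
        System.out.println("Size is: " + count); // Size is: 2
    }
}
```

    Here the first two users share a uid and collapse into one element even though their names differ, so the set ends up with two elements.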