Java - equals() and hashCode() quagmire

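The equals() contract, as specified in java.lang.Object, demands an equivalence relation. For non-null references x, y and z, the following properties must hold:

Reflexive: x.equals(x) returns true.
Symmetric: x.equals(y) returns true if and only if y.equals(x) returns true.
Transitive: if x.equals(y) and y.equals(z) return true, then x.equals(z) returns true.
Consistent: repeated calls to x.equals(y) return the same result as long as nothing used in the comparison changes.
Non-null: x.equals(null) returns false.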

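The hashCode() contract demands the following:

Calling hashCode() repeatedly on the same object returns the same value within one execution of the application, as long as nothing used in equals() changes.
If x.equals(y) returns true, x.hashCode() and y.hashCode() must return the same value.
If x.equals(y) returns false, the hash codes are not required to differ, although distinct values improve hash table performance.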
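As a minimal sketch of a class that honours both contracts, consider a hypothetical Point value class (the name and fields are assumptions made for illustration, not part of the discussion above). It is final, so the inheritance problems covered later do not arise.

```java
import java.util.Objects;

// Hypothetical value class used only for illustration.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;                 // reflexive shortcut
        }
        if (!(other instanceof Point)) {
            return false;                // also covers other == null
        }
        Point that = (Point) other;
        return x == that.x && y == that.y;
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields compared in equals(),
        // so equal objects always share a hash code.
        return Objects.hash(x, y);
    }
}
```

Because both fields are immutable, the result of equals() and hashCode() cannot change while the object sits in a collection.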

ORM objects

One thing to remember while overriding hashCode() and equals() for objects that are used in ORM tools is to not use the primary key variable (usually named id) in hashCode(), because the primary key is set only after the data is persisted. So if we create an object and put it in a HashSet while the id field is used in hashCode(), the hash code changes once the id is assigned, and we will not find the object in the Set after it's persisted.
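The effect can be reproduced without any ORM. In the sketch below, User is a hypothetical entity and the call to setId() merely stands in for the ORM assigning the primary key on persist; both are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity; setId() simulates the ORM assigning the key on persist.
class User {
    private Long id;                 // null until "persisted"
    private final String email;

    User(String email) {
        this.email = email;
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getEmail() { return email; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof User)) return false;
        return Objects.equals(getId(), ((User) o).getId());
    }

    @Override
    public int hashCode() {
        // Problematic: depends on a field that changes after the object is stored.
        return Objects.hashCode(getId());
    }
}

class IdHashDemo {
    static boolean foundAfterPersist() {
        Set<User> users = new HashSet<>();
        User user = new User("a@example.com");
        users.add(user);             // stored under the hash code of a null id
        user.setId(42L);             // simulated persist changes the hash code
        return users.contains(user); // looks in the wrong place, returns false
    }
}
```

One common workaround, depending on the domain, is to base both methods on a business key that is known at construction time (here, possibly the email).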

Also, since getters force the loading of lazily loaded objects, it's better to use getter methods rather than accessing the variables directly in both the equals() and hashCode() methods. The other object may be an uninitialized proxy whose fields are still empty.

Inheritance impact for equals() contract

Inheritance throws a wrench into the equals() contract. The question is whether a subclass should override the equals() method, because doing so can easily break the symmetry requirement of equals().

Using instanceof to check the type of the method argument in equals() breaks symmetry once a subclass overrides equals(). If class B extends class A and each class checks instanceof against its own type, then a.equals(b) returns true but b.equals(a) returns false, because the check a instanceof B fails. It's better to make the class final if we are going to use the instanceof operator.
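A small sketch of the asymmetry, using classes named A and B as in the paragraph above. The fields x and y are assumptions added only to give the classes something to compare.

```java
// Illustrative classes: B adds a field and tightens equals() with instanceof B.
class A {
    private final int x;

    A(int x) {
        this.x = x;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof A)) return false;   // a B is also an A, so this passes for b
        return x == ((A) o).x;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(x);
    }
}

class B extends A {
    private final int y;

    B(int x, int y) {
        super(x);
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof B)) return false;   // a plain A fails this check
        return super.equals(o) && y == ((B) o).y;
    }

    // hashCode() is deliberately inherited from A, so objects that A.equals()
    // treats as equal still share a hash code.
}
```

With A a = new A(1) and B b = new B(1, 2), a.equals(b) is true while b.equals(a) is false, which is exactly the symmetry violation described above.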

Another approach is to use getClass() in equals(), which returns false in both cases. But using getClass() violates the Liskov Substitution Principle and can lead to unexpected behavior: because equality depends on the run-time class, an instance of a subclass is never equal to an instance of the superclass, even when the subclass adds no state. Also be careful when using getClass() in Spring and ORM environments, because they make heavy use of proxy objects whose run-time class can differ from the class they wrap.
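A minimal sketch of the getClass() variant. Base and Derived are hypothetical names; Derived adds no state at all, roughly what a trivial subclass or a generated subclass proxy might look like.

```java
import java.util.Objects;

// Hypothetical classes for illustration.
class Base {
    private final String name;

    Base(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        // Exact run-time class match: symmetric, but rejects every subclass.
        if (o == null || getClass() != o.getClass()) return false;
        return Objects.equals(name, ((Base) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

// Adds no fields and overrides nothing.
class Derived extends Base {
    Derived(String name) {
        super(name);
    }
}
```

Here new Base("n") and new Derived("n") are unequal in both directions even though they carry identical state, so a Derived cannot stand in for a Base wherever equality matters.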

equals() and hashCode() play a big role in the Collections library. When an object is added to a hash-based collection such as HashSet or HashMap, the hashCode() of the provided object is calculated and the object is placed in the bucket mapped to this hash code. If another object has the same hash code, it is also added to the same bucket.

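Retrieving the object involves the following steps:

Calculate the hashCode() of the object being looked up.
Locate the bucket mapped to that hash code.
Compare the object against each entry in that bucket using equals() and return the entry that matches.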

If hashCode() and equals() are not implemented properly, in a way that satisfies the required contracts, the behavior of the collection is implementation dependent.
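The bucket placement and lookup described above can be sketched with a toy hash table. ToyMap is a hypothetical class written for illustration only; java.util.HashMap is considerably more involved (resizing, hash spreading, tree bins), and this sketch does not support null keys. Like HashMap, it checks reference identity before calling equals().

```java
import java.util.ArrayList;
import java.util.List;

// Toy hash table for illustration only, not how HashMap is actually written.
class ToyMap<K, V> {
    private static final int BUCKETS = 16;

    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> table = new ArrayList<>();

    ToyMap() {
        for (int i = 0; i < BUCKETS; i++) {
            table.add(new ArrayList<>());
        }
    }

    // hashCode() decides which bucket a key belongs to.
    private List<Entry<K, V>> bucketFor(K key) {
        return table.get(Math.floorMod(key.hashCode(), BUCKETS));
    }

    void put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> e : bucket) {
            if (e.key == key || key.equals(e.key)) {
                e.value = value;     // key already present: replace the value
                return;
            }
        }
        bucket.add(new Entry<>(key, value));
    }

    V get(K key) {
        // Only entries in the selected bucket are compared, using == and then equals().
        for (Entry<K, V> e : bucketFor(key)) {
            if (e.key == key || key.equals(e.key)) {
                return e.value;
            }
        }
        return null;
    }
}
```

If hashCode() sends equal keys to different buckets, or equals() rejects a key that is in the right bucket, get() simply never finds the entry.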

Example

class Obj {
    private final String name;

    public Obj(final String name) {
        this.name = name;
    }

    public String getName() {
        return this.name;
    }

    @Override
    public int hashCode() {
        return 3;
    }

    @Override
    public boolean equals(Object obj) {
        return false;
    }

    @Override
    public String toString() {
        return this.name;
    }
}

The implementations of equals() and hashCode() in the above class are terrible (they are for learning purposes only).

Every object of class Obj returns the same hash code, and the equals() method always returns false.

HashMap<Obj, String> map = new HashMap<>();

Obj obj1 = new Obj("p"), 
    obj2 = new Obj("q");

map.put(obj1, "java");
map.put(obj2, "spring");

System.out.println(map.get(new Obj("p")));
System.out.println(map.get(obj1));
System.out.println(map.get(new Obj("q")));
System.out.println(map.get(obj2));

Results

null     // map.get(new Obj("p"))
java     // map.get(obj1)
null     // map.get(new Obj("q"))
spring   // map.get(obj2)

Internally, the map saves each (key, value) pair as a Map.Entry object. Since the hash code is the same for both obj1 and obj2, there will be 2 entries in the same bucket.

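For retrieval, the entries in that particular bucket are picked and the key of each entry is compared against the object passed to map.get(). If any comparison succeeds, the value of that entry is returned. HashMap checks reference identity (==) before calling equals(), which is why map.get(obj1) and map.get(obj2) still find their entries even though equals() always returns false, while lookups with new Obj instances return null.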

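In the above code, consider that the equals() method always returns true. Then the value that we get is implementation dependent. On my machine, I get the last added value, i.e. spring, for all 4 retrievals. With java.util.HashMap this happens because the second put() finds obj1 in the bucket, treats it as equal to obj2, and overwrites its value, so the map ends up with a single entry.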
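The always-true case can be checked with a small variant of the example. AlwaysEqual is a hypothetical name for a copy of Obj whose equals() returns true; the sketch assumes java.util.HashMap, and other Map implementations may behave differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variant of Obj: constant hash code, equals() always true.
class AlwaysEqual {
    private final String name;

    AlwaysEqual(String name) {
        this.name = name;
    }

    @Override
    public int hashCode() {
        return 3;
    }

    @Override
    public boolean equals(Object obj) {
        return true;
    }

    @Override
    public String toString() {
        return this.name;
    }
}

class AlwaysEqualDemo {
    static Map<AlwaysEqual, String> build() {
        Map<AlwaysEqual, String> map = new HashMap<>();
        map.put(new AlwaysEqual("p"), "java");
        // Same bucket and equals() says "same key", so the value is replaced
        // instead of a second entry being added.
        map.put(new AlwaysEqual("q"), "spring");
        return map;
    }
}
```

The resulting map holds one entry, keyed by the first object, and any AlwaysEqual instance used as a lookup key retrieves spring.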

Conclusion

In short, there is no simple way to properly implement the equals() method. There are different workarounds to resolve these issues. Have a look here and here. They discuss multiple approaches to resolve these issues and the new problems that come with each approach. We need to consider the tradeoffs for each of the different choices.