Running on Java 22-ea+27-2262 (Preview)
Home of The JavaSpecialists' Newsletter

309TreeSet with Custom Comparator

Author: Dr Heinz M. KabutzDate: 2023-05-31Java Version: 17Category: Tips and Tricks
 

Abstract: We can construct our TreeSet with our own Comparator, however, we need to be careful that it conforms to the specification. The Comparator needs to be consistent with equals() and hashCode() of the elements, otherwise we might end up with a TreeSet with non-symmetric equals() behaviour.

 

Welcome to the 309th edition of The Java(tm) Specialists' Newsletter, sent from one of the most fun Java conference in the world - JPrime in Sofia, Bulgaria. The organizers won the coveted Duke's Choice Award in 2018. I arrived by plane via Athens and there was a smiling Ivan St Ivanov, the leader of the conference, waiting to take me to the speakers' hotel. Their program is packed with such interesting talks, that it is difficult to decide which one to go to. Right now I'm listening to Grace Jansen talk about JITServer. Everyone at the conference looks so happy and chilled, it's a pleasure to be here.

Our speaker hotel is great and not too far from the conference venue. It certainly is walking distance. Having already done my morning run, I wasn't too keen on the walk, especially carrying my heavy laptop bag, but thought it was close enough to make it in time for Dmytro Vyazelenko and Maurice Nafatlin's Collection Puzzlers. So I set off. As I was trudging along, Google Maps kept on adjusting the remaining time to match my current position. After about 1/3 of the way, I realized I wasn't going to make it in time. So I upped my pace, pearls of sweat appearing on my forehead, laptop bouncing around on my shoulderstrap. A taxi whizzed past, but the light was off. The taxi had to stop in traffic and I caught up, wondering whether by some wonder I could persuade the driver to take me the rest of the way. But it was occupied. I looked carefully, and it was one of my friends :-))) I pointed at the front seat, he gave me a thumbs-up and I hopped in, much to the amazement of the driver. Got to the conference with plenty of time to spare and secured my seat on the front row, ready for the Collection Puzzlers.

javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.

TreeSet with Custom Comparator

One of the puzzles involved a TreeSet with a String.CASE_INSENSITIVE_ORDER comparator. It produced an inconsistent equals result when compared to another set. For example, something like this:

public class CaseInsensitiveTreeSet {
    public static void main(String... args) {
        Set<String> set1 = Stream.of("a", "b", "c")
                .collect(Collectors.toCollection(
                        () -> new TreeSet<>(
                                String.CASE_INSENSITIVE_ORDER)));
        Set<String> set2 = Stream.of("A", "B", "C")
	        .collect(Collectors.toSet());
        System.out.println(set1.equals(set2));
        System.out.println(set2.equals(set1));
    }
}
  

The output from this code is:

true
false

The equals() method in java.lang.Object specifies that it needs to be reflexive (x.equals(x)), symmetric (x.equals(y) iff y.equals(x)), transitive (x.equals(y) && y.equals(z) => x.equals(z)) and consistent. The code above clearly violates the symmetric requirement, because set1.equals(set2) is true, but set2.equals(set1) is false.

Tagir Valeev was sitting next to me and pointed me to the specification for TreeSet, where it clearly states:

"Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the {@code Set} interface. (See {@code Comparable} or {@code Comparator} for a precise definition of consistent with equals.) This is so because the {@code Set} interface is defined in terms of the {@code equals} operation, but a {@code TreeSet} instance performs all element comparisons using its {@code compareTo} (or {@code compare}) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the {@code Set} interface.

The TreeSet we constructed had an explicit comparator, but the Strings were not necessarily equal using the standard equals() method of String. Thus we violated the general contract of Set, but our TreeSet would still be considered well-defined. However, if we compare it to other Set instances, we cannot expect it to give us the results we were hoping for.

Let's try the same thing, but this time using a wrapper class for the String that implements both compareTo() and equals() to be case insensitive:

public class BetterCaseInsensitiveTreeSet {
    record CaseInsensitive(String value)
            implements Comparable<CaseInsensitive>{
        public boolean equals(Object obj) {
            return obj instanceof CaseInsensitive that
                    && this.value.equalsIgnoreCase(that.value);
        }

        public int compareTo(CaseInsensitive that) {
            return String.CASE_INSENSITIVE_ORDER.compare(
                    this.value, that.value
            );
        }
    }
    public static void main(String... args) {
        Set<CaseInsensitive> set1 = Stream.of("a", "b", "c")
                .map(CaseInsensitive::new)
                .collect(Collectors.toCollection(TreeSet::new));
        Set<CaseInsensitive> set2 = Stream.of("A", "B", "C")
                .map(CaseInsensitive::new)
                .collect(Collectors.toSet());
        System.out.println(set1.equals(set2));
        System.out.println(set2.equals(set1));
    }
}

Unfortunately, the output is still:

true
false

The reason should be obvious: the type of set returned by Collectors.toSet() is currently a HashSet, and our CaseInsensitive record has not implemented the hashCode() method. It would thus be a case sensitive hash code based on the field value. For this to work correctly, we need to also implement a case insensitive hashCode() version.

public class EvenBetterCaseInsensitiveTreeSet {
    record CaseInsensitive(String value)
            implements Comparable<CaseInsensitive>{
        public boolean equals(Object obj) {
            return obj instanceof CaseInsensitive that
                    && this.value.equalsIgnoreCase(that.value);
        }

        /**
         * Case insensitive hashCode() on the characters of value, which
         * converts each of the chars to lower case.
         */
        public int hashCode() {
            if (value == null) return 0;
            int h = 0, length = value.length();
            for (int i = 0; i < length; i++) {
                h = 31 * h + Character.toLowerCase(value.charAt(i));
            }
            return h;
        }

        public int compareTo(CaseInsensitive that) {
            return String.CASE_INSENSITIVE_ORDER.compare(
                    this.value, that.value
            );
        }
    }
    public static void main(String... args) {
        Set<CaseInsensitive> set1 = Stream.of("a", "b", "c")
                .map(CaseInsensitive::new)
                .collect(Collectors.toCollection(TreeSet::new));
        Set<CaseInsensitive> set2 = Stream.of("A", "B", "C")
                .map(CaseInsensitive::new)
                .collect(Collectors.toSet());
        System.out.println(set1.equals(set2));
        System.out.println(set2.equals(set1));
    }
}

This time, it gives a symmetric output:

true
true

It would be nice if it were possible for TreeSet to check whether the comparator passed into the constructor is consistent with equals() and hashCode(). However, that would require it to also have a sample of all the possible objects that we intend to insert into the set. IMHO the restriction in the Javadocs specification is sufficient. I tweeted about this interesting puzzle and the author of the original TreeSet responded with "Yeah, this is one part of the collections framework that I don't love, but I couldn't come up with a better solution at the time (1997), and still can't. And this TreeSet isn't useless, just quirky." @joshbloch

An example of a Comparator that would be correct is Comparator.reverseOrder(), like this:

public class ReversedTreeSet {
    public static void main(String... args) {
        Set<String> set1 = Stream.of("a", "b", "c")
                .collect(Collectors.toCollection(
                        () -> new TreeSet<>(
                                Comparator.<String>reverseOrder())));
        Set<String> set2 = Stream.of("a", "c", "b")
                .collect(Collectors.toCollection(
                        TreeSet::new));
        Set<String> set3 = Stream.of("c", "a", "b")
                .collect(Collectors.toSet());
        System.out.println(set1.equals(set2));
        System.out.println(set2.equals(set1));
        System.out.println(set3.equals(set1));
        System.out.println(set1.equals(set3));
        System.out.println(set2.equals(set3));
        System.out.println(set3.equals(set2));
    }
}

This time, all of the equals() return true.

It is an interesting quirkiness that we do not see with the ConcurrentSkipListSet. The first two examples would return false for both equals() methods, and the last two would all return true.

King regards

Heinz

 

Comments

We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)

When you load these comments, you'll be connected to Disqus. Privacy Statement.

Related Articles

Browse the Newsletter Archive

About the Author

Heinz Kabutz Java Conference Speaker

Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Superpack '23

Superpack '23 Our entire Java Specialists Training in one huge bundle more...

Free Java Book

Dynamic Proxies in Java Book
Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

We can help make your Java application run faster and trouble-shoot concurrency and performance bugs...