Abstract: We can construct our TreeSet with our own Comparator, however, we need to be careful that it conforms to the specification. The Comparator needs to be consistent with equals() and hashCode() of the elements, otherwise we might end up with a TreeSet with non-symmetric equals() behaviour.
Welcome to the 309th edition of The Java(tm) Specialists' Newsletter, sent from one of the most fun Java conference in the world - JPrime in Sofia, Bulgaria. The organizers won the coveted Duke's Choice Award in 2018. I arrived by plane via Athens and there was a smiling Ivan St Ivanov, the leader of the conference, waiting to take me to the speakers' hotel. Their program is packed with such interesting talks, that it is difficult to decide which one to go to. Right now I'm listening to Grace Jansen talk about JITServer. Everyone at the conference looks so happy and chilled, it's a pleasure to be here.
Our speaker hotel is great and not too far from the conference venue. It certainly is walking distance. Having already done my morning run, I wasn't too keen on the walk, especially carrying my heavy laptop bag, but thought it was close enough to make it in time for Dmytro Vyazelenko and Maurice Nafatlin's Collection Puzzlers. So I set off. As I was trudging along, Google Maps kept on adjusting the remaining time to match my current position. After about 1/3 of the way, I realized I wasn't going to make it in time. So I upped my pace, pearls of sweat appearing on my forehead, laptop bouncing around on my shoulderstrap. A taxi whizzed past, but the light was off. The taxi had to stop in traffic and I caught up, wondering whether by some wonder I could persuade the driver to take me the rest of the way. But it was occupied. I looked carefully, and it was one of my friends :-))) I pointed at the front seat, he gave me a thumbs-up and I hopped in, much to the amazement of the driver. Got to the conference with plenty of time to spare and secured my seat on the front row, ready for the Collection Puzzlers.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
One of the puzzles involved a TreeSet
with a String.CASE_INSENSITIVE_ORDER
comparator.
It produced an inconsistent equals result when compared to another set. For example,
something like this:
public class CaseInsensitiveTreeSet { public static void main(String... args) { Set<String> set1 = Stream.of("a", "b", "c") .collect(Collectors.toCollection( () -> new TreeSet<>( String.CASE_INSENSITIVE_ORDER))); Set<String> set2 = Stream.of("A", "B", "C") .collect(Collectors.toSet()); System.out.println(set1.equals(set2)); System.out.println(set2.equals(set1)); } }
The output from this code is:
true false
The equals() method in java.lang.Object
specifies that it needs to be reflexive
(x.equals(x)), symmetric (x.equals(y) iff y.equals(x)), transitive
(x.equals(y) && y.equals(z) => x.equals(z)) and consistent. The code
above clearly violates the symmetric requirement, because set1.equals(set2)
is true, but set2.equals(set1)
is false.
Tagir Valeev was sitting
next to me and pointed me to the specification for TreeSet
, where it clearly
states:
"Note that the ordering maintained by a set (whether or not an explicit comparator is provided) must be consistent with equals if it is to correctly implement the {@code Set} interface. (See {@code Comparable} or {@code Comparator} for a precise definition of consistent with equals.) This is so because the {@code Set} interface is defined in terms of the {@code equals} operation, but a {@code TreeSet} instance performs all element comparisons using its {@code compareTo} (or {@code compare}) method, so two elements that are deemed equal by this method are, from the standpoint of the set, equal. The behavior of a set is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the {@code Set} interface.
The TreeSet
we constructed had an explicit comparator, but the Strings were
not necessarily equal using the standard equals()
method of String
. Thus we
violated the general contract of Set, but our TreeSet would still be
considered well-defined. However, if we compare it to other Set instances,
we cannot expect it to give us the results we were hoping for.
Let's try the same thing, but this time using a wrapper class for the String
that implements both compareTo()
and equals()
to be case insensitive:
public class BetterCaseInsensitiveTreeSet { record CaseInsensitive(String value) implements Comparable<CaseInsensitive>{ public boolean equals(Object obj) { return obj instanceof CaseInsensitive that && this.value.equalsIgnoreCase(that.value); } public int compareTo(CaseInsensitive that) { return String.CASE_INSENSITIVE_ORDER.compare( this.value, that.value ); } } public static void main(String... args) { Set<CaseInsensitive> set1 = Stream.of("a", "b", "c") .map(CaseInsensitive::new) .collect(Collectors.toCollection(TreeSet::new)); Set<CaseInsensitive> set2 = Stream.of("A", "B", "C") .map(CaseInsensitive::new) .collect(Collectors.toSet()); System.out.println(set1.equals(set2)); System.out.println(set2.equals(set1)); } }
Unfortunately, the output is still:
true false
The reason should be obvious: the type of set returned by Collectors.toSet()
is
currently a HashSet
, and our CaseInsensitive
record has not implemented the
hashCode()
method. It would thus be a case sensitive hash code based on the field value.
For this to work correctly, we need to also implement a case insensitive hashCode()
version.
public class EvenBetterCaseInsensitiveTreeSet { record CaseInsensitive(String value) implements Comparable<CaseInsensitive>{ public boolean equals(Object obj) { return obj instanceof CaseInsensitive that && this.value.equalsIgnoreCase(that.value); } /** * Case insensitive hashCode() on the characters of value, which * converts each of the chars to lower case. */ public int hashCode() { if (value == null) return 0; int h = 0, length = value.length(); for (int i = 0; i < length; i++) { h = 31 * h + Character.toLowerCase(value.charAt(i)); } return h; } public int compareTo(CaseInsensitive that) { return String.CASE_INSENSITIVE_ORDER.compare( this.value, that.value ); } } public static void main(String... args) { Set<CaseInsensitive> set1 = Stream.of("a", "b", "c") .map(CaseInsensitive::new) .collect(Collectors.toCollection(TreeSet::new)); Set<CaseInsensitive> set2 = Stream.of("A", "B", "C") .map(CaseInsensitive::new) .collect(Collectors.toSet()); System.out.println(set1.equals(set2)); System.out.println(set2.equals(set1)); } }
This time, it gives a symmetric output:
true true
It would be nice if it were possible for TreeSet
to check whether
the comparator passed into the constructor is consistent with
equals()
and hashCode()
. However, that would require it to also have
a sample of all the possible objects that we intend to insert into the
set. IMHO the restriction in the Javadocs specification is sufficient.
I tweeted
about this interesting puzzle and the author of the original
TreeSet responded with "Yeah, this is one part of the collections
framework that I don't love, but I couldn't come up with a better
solution at the time (1997), and still can't. And this TreeSet isn't
useless, just quirky." @joshbloch
An example of a Comparator
that would be correct is Comparator.reverseOrder()
, like this:
public class ReversedTreeSet { public static void main(String... args) { Set<String> set1 = Stream.of("a", "b", "c") .collect(Collectors.toCollection( () -> new TreeSet<>( Comparator.<String>reverseOrder()))); Set<String> set2 = Stream.of("a", "c", "b") .collect(Collectors.toCollection( TreeSet::new)); Set<String> set3 = Stream.of("c", "a", "b") .collect(Collectors.toSet()); System.out.println(set1.equals(set2)); System.out.println(set2.equals(set1)); System.out.println(set3.equals(set1)); System.out.println(set1.equals(set3)); System.out.println(set2.equals(set3)); System.out.println(set3.equals(set2)); } }
This time, all of the equals() return true
.
It is an interesting quirkiness that we do not see with the ConcurrentSkipListSet.
The first two examples would return false
for
both equals() methods, and the last two would all return true
.
King regards
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.