Running on Java 24-ea+24-2960 (Preview)
Home of The JavaSpecialists' Newsletter

303Null Keys and Values in Maps

Author: Dr Heinz M. KabutzDate: 2022-08-30Java Version: 19Category: Tips and Tricks
 

Abstract: Some Map implementations allow null keys and values. This leads to funky behaviour when calling putIfAbsent() and the compute functions. In this newsletter we look a bit more closely at the issues at hand when allowing nulls in maps.

 

Welcome to the 303rd edition of The Java(tm) Specialists' Newsletter, sent from the wonderful Island of Crete. Last month we had a small version of JCrete, with some of the folks staying on for a few days. We socialized like we haven't for two years, and as a result, I missed the July edition of this newsletter. But we are back!

javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.

Null Keys and Values in Maps

A few days ago, my friend Dmitry Vyazelenko mentioned the strange behavior of nulls in maps. That tweet inspired this newsletter. Thanks Dmitry :-)

When I started coding in Java, the only collections we had available were Vector and Hashtable. Anything else, we had to write ourselves. Well, there was also Stack, a subclass of Vector, and Properties, a subclass of Hashtable. But we did not use them much. At least in those days job interviews were quick: "If you needed to store key/value pairs, which data structure would you use?" "Um, Vector?" "Sorry, next candidate please."

Soon a collection framework was created to fill the gap. It was based on the standard template library for C++ and was named JGL by a company called ObjectSpace. It had some cool features like one-to-many mappings in maps, similar to Eclipse Collection's Multimap. We used JGL for a while, but when Java 1.2 was released, switched over to the Java Collections framework.

Old Hashtable had a quirk that we could insert neither a null key nor a null value. If we desperately wanted to add null, we would create a dummy object representing null and add that instead, and then convert that back to null if we found it. It was messy, and we wished for the ability to add null. Be careful what you wish for, it may be granted!

The Map interface specified that an implementation might accept null keys and values, but it may also reject them. Thus we might see NullPointerExceptions on put() or get(). When the Map interface was introduced in Java 1.2, we had four Map implementations with five different behaviours. What fun:

  • Hashtable would always throw a NullPointerException for either null keys or values.
  • HashMap would be happy with both null keys and values.
  • TreeMap would accept null values, but would throw a NullPointerException with null keys. However, if the very first pair had a null key, then it accepted it happily. And if we created the TreeMap with a Comparator that worked with null, then the tree would accept a null key.
  • WeakHashMap would accept null keys, but would never remove that entry again.

In Java 1.4, LinkedHashMap and IdentityHashMap were added, and these behaved the same as HashMap.

In Java 5, several things happened. First off, generics were added to the language. These were sneaked in without breaking backwards compatibility. Also a ConcurrentHashMap was added, which had much faster read performance than the Hashtable. The ConcurrentHashMap used an internal lock, thus we could no longer participate in the locking from outside. Hashtable synchronized on "this". With ConcurrentHashMap, any compound actions would need to be supported by the map itself. However, in those days we did not have default methods in interfaces, and so a new interface was born, the ConcurrentMap. The most useful of these compound methods was putIfAbsent(). The only public implementations of the ConcurrentMap in the JDK are the ConcurrentHashMap from Java 5, and later the ConcurrentSkipListMap from Java 6. They both throw a NullPointerException if we attempt to add a null key or value. However, the Javadoc for ConcurrentMap does not prevent an implementation from accepting null, but when the map does allow null, the behavior becomes a bit funky. The meaning of "absent" is that the key either does not exist at all or that the value is null. Thus the following two snippets of code are not equivalent if we can have null values:

if (!map.containsKey(-1)) map.put(-1, 0L);
if (map.get(-1) == null) map.put(-1, 0L);

However, if we want to be pedantic, only the second one can be converted to putIfAbsent() without change in semantics. For example, if we have put (-1, null) into the map, then the first code snippet would not overwrite it, but the second would.

We see here that if the map contains a mapping to (-1, null), then it is not overwritten in the first if statement, but it is in the second:

public class PutIfAbsentFun {
  public static void main(String... args) {
    Map<Integer, Long> map = HashMap.newHashMap(1); // Java 19
    map.put(-1, null); // start with a null value
    System.out.println("map = " + map); // {-1=null}
    if (!map.containsKey(-1)) map.put(-1, 0L);
    System.out.println("map = " + map); // {-1=null}
    if (map.get(-1) == null) map.put(-1, 0L);
    System.out.println("map = " + map); // {-1=0}
  }
}

At first glance, it seems that computeIfAbsent() works like putIfAbsent(). If the value is null, then the computation is done. However, there is a subtlety. If the computation function returns null, then the entry is not inserted into the map. And if we use computeIfPresent() or compute() and their function return null, then the entry is removed from the map. This behaviour makes sense if maps do not allow null keys and values. But it can get confusing when they do.

public class ComputeIfFun {
  public static void main(String... args) {
    Map<Integer, Long> map = HashMap.newHashMap(1);
    System.out.println("map = " + map); // {}
    map.computeIfAbsent(-1, key -> {
      System.out.println("computing the value");
      return null; // this will not add an entry into the map
    });
    System.out.println("map = " + map); // {}
    map.computeIfAbsent(-1, key -> {
      System.out.println("computing the value");
      return 0L; // the map now contains -1=0L
    });
    System.out.println("map = " + map); // {-1=0L}
    map.computeIfPresent(-1, (key, value) -> {
      System.out.println("computing the value again");
      return null; // this removes the entry from the map
    });
    System.out.println("map = " + map); // {}
    map.put(-1, null);
    System.out.println("map = " + map); // {-1=null}
    map.compute(-1, (key, value) -> null); // removes the entry
    System.out.println("map = " + map);
  }
}

The merge() function is subtly different, yet still mostly the same. The value parameter cannot be null, otherwise we get a NullPointerException. However, if the value in the map is null, then the merge() function is not called and instead, the value is simply added. Similarly to the compute() methods, if our function returns null, then the entry is removed. Here is an example:

public class MergeFun {
  public static void main(String... args) {
    Map<Integer, Long> map = HashMap.newHashMap(1);
    System.out.println("map = " + map); // {}
    map.put(-1, null);
    System.out.println("map = " + map); // {-1=null}
    map.merge(-1, 1L, (value1, value2) -> 2L);
    System.out.println("map = " + map); // {-1=1}
    map.merge(-1, 1L, (value1, value2) -> 2L);
    System.out.println("map = " + map); // {-1=2}
    map.merge(-1, 1L, (value1, value2) -> null);
    System.out.println("map = " + map); // {}
  }
}

The compound methods remove(key, value), replace(key, value), and replace(key, oldValue, newValue) treat a null value as being present:

public class ReplaceRemoveFun {
  public static void main(String... args) {
    Map<Integer, Long> map = HashMap.newHashMap(1);
    System.out.println(map.replace(-1, null)); // null
    map.put(-1, 0L);
    System.out.println(map.replace(-1, null)); // 0

    System.out.println(map.replace(-1, null, null)); // true

    System.out.println(map.remove(-1, null)); // true
    System.out.println(map.remove(-1, null)); // false
  }
}

Allowing null keys in maps has some interesting implications. I sent out some Twitter polls to see how many programmers would know the correct answers:

What does new TreeMap().put(null, null) produce? with possible answers of "No output" (15%), NullPointerException (61%) and "Either, depending on ..." (24%). The correct answer is the last, because it depends on the Java version. Versions 1.2 through 1.6 they accepted the first key to be null, and would only throw a NullPointerException on subsequent entries. This was made more consistent in Java 1.7 to always throw a NullPointerException unless the TreeMap was created with a Comparator that could cope with null keys. See Bug 5045147.

In a WeakHashMap.put(null, "Hello world"), when will the entry be removed in #Java? with possible answers of "Won't be added" (35%), "At the next GC" (14%), "A while after the next GC" (14%), and "Never" (37%). The correct answer is the last one, since when the key is null, a private static final object is used instead as a key. Since this object is always reachable, the WeakReference will never be cleared, and thus the entry will remain in the map forever.

What is the computational time complexity of HashMap.get(null) in #Java, where n is the number of keys with a hash code of 0? Possible answers were "linear - O(n)" (29%), "quadratic - O(n²)" (3%), "constant - O(1)" (47%), and "logarithmic - O(log n)" (21%). The correct answer was the first - linear. Since Java 8, when a lot of keys have the same hash code, the HashMap builds up a red-black tree of entries. Thus ordinarily the complexity would be logarithmic. However, this is only true if the key is Comparable. Since the key is null, it does not implement Comparable, and thus the map needs to search for it linearly.

A little example for the last quiz, where we create a lot of Strings with hashCode=0 and then also add a null key. This time we are using a HashSet instead, which internally delegates to a HashMap:

public class ZeroHashFun {
  private static final String[] zeroHashCodes = {
      "ARbyguv", "ARbygvW", "ARbyhVv", "ARbyhWW", "ARbzHuv",
      "ARbzHvW", "ARbzIVv", "ARbzIWW", "ARcZguv", "ARcZgvW",
      "ARcZhVv", "ARcZhWW", "ASCyguv", "ASCygvW", "ASCyhVv",
      "ASCyhWW", "ASCzHuv", "ASCzHvW", "ASCzIVv", "ASCzIWW",
      "ASDZguv", "ASDZgvW", "ASDZhVv", "ASDZhWW",};

  public static void main(String... args) {
    for (int i = 16; i <= 24; i++) {
      System.out.println("Testing with size " + (1 << i));
      test(1 << i);
    }
  }

  private static void test(int size) {
    HashSet<String> set = HashSet.newHashSet(size);
    for (int i = 0; i < size - 1; i++) {
      StringBuilder sb = new StringBuilder();
      for (int j = 1, index = 0; j <= i; j <<= 1, index++) {
        if ((i & j) != 0)
          sb.append(zeroHashCodes[index % zeroHashCodes.length]);
      }
      set.add(sb.toString());
    }
    set.add(null); // also adding a null key
    System.out.println("set.size() = " + set.size());

    System.out.println("Searching for key not in the set");
    String notInMap = zeroHashCodes[1] + zeroHashCodes[0];
    for (int repeats = 0; repeats < 10; repeats++) {
      testLookup(set, notInMap);
    }
    System.out.println("Searching for null key");
    for (int repeats = 0; repeats < 10; repeats++) {
      testLookup(set, null);
    }
  }

  private static void testLookup(Set<String> set, String key) {
    long time = System.nanoTime();
    try {
      set.contains(key);
    } finally {
      time = System.nanoTime() - time;
      System.out.printf("time = %dms%n", (time / 1000000));
    }
  }
}

Summary of the output. For each of the times, we only show the median. It is clear though from the timing that the contains() method has linear compexity to the number of keys with hashCode==0, since as we double that size, we also take twice as long to call contains():

Testing with size 65536
set.size() = 65536
Searching for key not in the set
time = 0ms
Searching for null key
time = 0ms
Testing with size 131072
set.size() = 131072
Searching for key not in the set
time = 0ms
Searching for null key
time = 3ms
Testing with size 262144
set.size() = 262144
Searching for key not in the set
time = 0ms
Searching for null key
time = 7ms
Testing with size 524288
set.size() = 524288
Searching for key not in the set
time = 0ms
Searching for null key
time = 13ms
Testing with size 1048576
set.size() = 1048576
Searching for key not in the set
time = 0ms
Searching for null key
time = 25ms
Testing with size 2097152
set.size() = 2097152
Searching for key not in the set
time = 0ms
Searching for null key
time = 55ms
Testing with size 4194304
set.size() = 4194304
Searching for key not in the set
time = 0ms
Searching for null key
time = 104ms
Testing with size 8388608
set.size() = 8388608
Searching for key not in the set
time = 0ms
Searching for null key
time = 230ms
Testing with size 16777216
set.size() = 16777216
Searching for key not in the set
time = 0ms
Searching for null key
time = 454ms

(Even though this could be classified as an extremely mild threat, I don't think I'll bother submitting a bug report. It takes longer to create the set than to call contains().)

None of this weirdness with null keys and values would happen if we used maps that did not allow them in the first place. My concluding recommendation thus is to use ConcurrentHashMap instead of HashMap and ConcurrentSkipListMap instead of TreeMap. No more null weirdness, and race conditions are also taken care of.

I hope you enjoyed reading this as much as I enjoyed writing it :-)

Kind regards from Chorafakia

Heinz

 

Comments

We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)

When you load these comments, you'll be connected to Disqus. Privacy Statement.

Related Articles

Browse the Newsletter Archive

About the Author

Heinz Kabutz Java Conference Speaker

Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Superpack '23

Superpack '23 Our entire Java Specialists Training in one huge bundle more...

Free Java Book

Dynamic Proxies in Java Book
Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

We can help make your Java application run faster and trouble-shoot concurrency and performance bugs...