Abstract: In this newsletter we measure the memory requirements of various types of hash maps available in Java. Maps usually need to be threadsafe, but non-blocking is not always the most important requirement.
Welcome to the 193rd issue of The Java(tm) Specialists' Newsletter, sent from the amazing Island of Crete. A few weeks ago my Greek teacher introduced me to "aoristos", the simple past. This is great, because now I can bore my Greek friends with wild tales of life in Africa when I was a child. "Ipia me tous elefantes" - I drank with the elephants. The beauty of being from Africa is that Europeans will believe anything you say. "Ahh, so you had a lion as a pet? I knew it!"
In my previous newsletter, I challenged you to explain why the anonymous class sets the this$0 field before calling super(). Kai Windmoeller was the first to send me a partial reason and Wouter Coekaerts was the first with a perfect explanation. Both Kai and Wouter subsequently sent me other clever ideas that I would like to incorporate in future newsletters. [And the explanation is ....... you'll have to figure that out yourself :-) A hint though: if you compile the class with -source 1.3 and -target 1.3, it does not do that. See what issues that can cause and you will see why we need this.]
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
My newsletter is strongly connected to my courses. When I research Java for my newsletter, ideas emerge on how to improve my advanced Java courses. Questions asked during the courses often stimulate ideas for new research topics. It is a delicate ecosystem. They cannot exist without each other.
A few months ago, during one of my master's courses, one of the programmers mentioned that they had noticed a memory bottleneck with the ConcurrentHashMap. They were creating about 100,000 maps and wanted them to be threadsafe. The natural choice seemed to be the ConcurrentHashMap, since it, well, is supposed to work with concurrent access.
The ConcurrentHashMap splits the bucket table into a number of segments, thus reducing the probability that you would have contention when modifying the map. It is quite a clever design and scales nicely to about 50 cores. Above 50 cores, you would be better off using Cliff Click's Highly Scalable Libraries. Since my friends did not need high scalability, the ConcurrentHashMap seemed fine.
When we did a memory profile with JVisualVM, the top culprit turned out to be the ConcurrentHashMap.Segment class. The default number of segments per ConcurrentHashMap is 16. The HashEntry tables within the segments would probably be tiny, but each Segment is a ReentrantLock. Each ReentrantLock contains a Sync, in this case a NonfairSync, which is a subclass of Sync, which in turn extends AbstractQueuedSynchronizer. The AbstractQueuedSynchronizer maintains a queue of nodes that track the state of the threads contending for the lock and is used to determine fairness. This queue and its nodes use a lot of memory.
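To picture why an empty map already carries this weight, here is a rough sketch of the structure (the class and field names are mine, paraphrasing the JDK 6 layout, not the real source):

import java.util.concurrent.locks.ReentrantLock;

public class SegmentStructureSketch {
  // Each segment of a ConcurrentHashMap is itself a lock
  // guarding its own small bucket table.
  static class Segment extends ReentrantLock {
    Object[] table = new Object[2]; // stands in for HashEntry[]
  }

  public static void main(String[] args) {
    Segment[] segments = new Segment[16]; // default concurrency level
    for (int i = 0; i < segments.length; i++) {
      segments[i] = new Segment();
    }
    // 16 ReentrantLocks, each with its AbstractQueuedSynchronizer
    // state, exist before we have put a single entry into the map.
    System.out.println(segments.length + " locks allocated up front");
  }
}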
Many years ago, I wrote a newsletter that demonstrated a simple memory test bench. It would construct an object, then release it again with System.gc() and measure the difference. Here is a slightly updated version of the MemoryTestBench. It still works in virtually the same way; the only enhancement is that I sleep a bit after each System.gc() call:
public class MemoryTestBench {
  public long calculateMemoryUsage(ObjectFactory factory) {
    Object handle = factory.makeObject(); // warm up: load classes, etc.
    long memory = usedMemory();
    handle = null; // release the object so that it can be collected
    lotsOfGC();
    memory = usedMemory(); // baseline without the object
    handle = factory.makeObject();
    lotsOfGC();
    return usedMemory() - memory; // size of the second object
  }

  private long usedMemory() {
    return Runtime.getRuntime().totalMemory() -
        Runtime.getRuntime().freeMemory();
  }

  private void lotsOfGC() {
    for (int i = 0; i < 20; i++) {
      System.gc();
      try {
        Thread.sleep(100);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  public void showMemoryUsage(ObjectFactory factory) {
    long mem = calculateMemoryUsage(factory);
    System.out.println(
        factory.getClass().getSimpleName() + " produced " +
            factory.makeObject().getClass().getSimpleName() +
            " which took " + mem + " bytes");
  }
}
I created an ObjectFactory interface and some basic classes to test that it still worked correctly:
public interface ObjectFactory {
  Object makeObject();
}

public class BasicObjectFactory implements ObjectFactory {
  public Object makeObject() {
    return new Object();
  }
}

public class IntegerObjectFactory implements ObjectFactory {
  public Object makeObject() {
    return new Integer(333);
  }
}

public class LongObjectFactory implements ObjectFactory {
  public Object makeObject() {
    return new Long(333L);
  }
}
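A small driver, in case you want to reproduce the numbers yourself (the class name is mine; any of the factories in this newsletter can be passed in the same way):

public class MemoryTestBenchDemo {
  public static void main(String[] args) {
    MemoryTestBench bench = new MemoryTestBench();
    bench.showMemoryUsage(new BasicObjectFactory());
    bench.showMemoryUsage(new IntegerObjectFactory());
    bench.showMemoryUsage(new LongObjectFactory());
  }
}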
I tried this out with Java 1.6.0_24 on my Mac OS X 10.6.8 and with a self-built Java 7 based on the OpenJDK. I also tried 32 vs. 64 bit, server vs. client on the 32 bit and the flag -XX:+UseCompressedOops on the Server Hotspot Compiler. The -server and -client made the least difference, so I only include the -server results. Also, the results for Java 7 were similar enough to 1.6.0_24 that I will only show the 1.6 data:
Java Version 1.6.0_24 with -server (sizes in bytes)

32/64bit          64     64     32
CompressedOops    No     Yes    N/A
Object            24     24     8
Long              24     24     16
Integer           24     24     16
It looks as if the -XX:+UseCompressedOops flag has no effect on these objects. You will only see the difference with more complex objects that have pointers to others. This flag can also speed up your application substantially if you are using a 64 bit machine.
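To see the flag actually make a difference, you could measure an object that consists mostly of references, for example with a factory like this (my own addition, not part of the original benchmark):

// Hypothetical factory: an Object[] with 1000 reference slots. On a
// 64 bit VM each slot costs 8 bytes without compressed oops, but only
// 4 bytes with -XX:+UseCompressedOops, so the measured size should
// roughly halve.
public class ReferenceArrayFactory implements ObjectFactory {
  public Object makeObject() {
    return new Object[1000];
  }
}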
Here are some factories for creating various hash maps. The first is not threadsafe, the other two are:
import java.util.*; // each class in its own file, with this import

public class HashMapFactory implements ObjectFactory {
  public Object makeObject() {
    return new HashMap();
  }
}

public class SynchronizedHashMapFactory implements ObjectFactory {
  public Object makeObject() {
    return Collections.synchronizedMap(new HashMap());
  }
}

public class HashtableFactory implements ObjectFactory {
  public Object makeObject() {
    return new Hashtable();
  }
}
We see that even basic hash tables differ greatly in size between implementations. If memory space is a major issue, as it was for my friends, then the Java 1.0 Hashtable class might work best. Hashtable is fully synchronized, which means it will cause contention when accessed from more than one core at a time. It also uses integer division to locate the correct bucket, which is slower than the bit masking approach used by HashMap since Java 1.4 (see the sketch after the table below). However, if memory is your bottleneck, then Hashtable might be a good solution. Here are the memory sizes:
32/64bit          64     64     32
CompressedOops    No     Yes    N/A
HashMap           216    128    120
SynchronizedMap   272    160    152
Hashtable         176    112    96
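To make the division-versus-masking point concrete, here is a simplified comparison of the two bucket lookups (paraphrased from the JDK sources; HashMap additionally runs the hash code through a supplemental hash function first):

public class BucketIndexDemo {
  public static void main(String[] args) {
    int hash = "key".hashCode();
    int length = 16; // bucket table size

    // Hashtable (since Java 1.0): integer division on every access.
    int byDivision = (hash & 0x7FFFFFFF) % length;

    // HashMap (since Java 1.4): the table length is a power of two,
    // so a single cheap AND replaces the division.
    int byMasking = hash & (length - 1);

    System.out.println(byDivision + " vs " + byMasking);
  }
}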
The ConcurrentHashMap allows us to construct it with a concurrency level, which is used to calculate the number of segments the map will contain. The actual number of segments is the smallest power of two greater than or equal to the concurrency level. Thus, if we construct a map with a concurrency level of 200, it will create 256 segments. As mentioned above, every segment is a subclass of ReentrantLock. We will therefore show the sizes for a plain ReentrantLock and for ConcurrentHashMaps with concurrency levels of 16 (the default), 2 and 256:
import java.util.concurrent.*; // each class in its own file
import java.util.concurrent.locks.*;

public class ReentrantLockFactory implements ObjectFactory {
  public Object makeObject() {
    return new ReentrantLock();
  }
}

public class ConcurrentHashMapFactory implements ObjectFactory {
  public Object makeObject() {
    return new ConcurrentHashMap();
  }
}

public class SmallConcurrentHashMapFactory implements ObjectFactory {
  public Object makeObject() {
    return new ConcurrentHashMap(16, .75f, 2); // concurrency level 2
  }
}

public class BigConcurrentHashMapFactory implements ObjectFactory {
  public Object makeObject() {
    return new ConcurrentHashMap(16, .75f, 256); // concurrency level 256
  }
}
The results below show why the ConcurrentHashMap uses so much memory, even when it is empty.
32/64bit                64      64      32
CompressedOops          No      Yes     N/A
ReentrantLock           72      56      40
ConcurrentHashMap       2272    1664    1272
SmallConcurrentHashMap  480     312     272
BigConcurrentHashMap    34912   25664   19512
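The difference between SmallConcurrentHashMap and BigConcurrentHashMap also shows the power-of-two rounding at work. The rounding the constructor performs is roughly this (a sketch in the spirit of the JDK 6 source, not a verbatim copy):

public class SegmentCountDemo {
  static int segmentsFor(int concurrencyLevel) {
    // Round up to the next power of two, as the JDK 6
    // ConcurrentHashMap constructor does.
    int ssize = 1;
    while (ssize < concurrencyLevel) {
      ssize <<= 1;
    }
    return ssize;
  }

  public static void main(String[] args) {
    System.out.println(segmentsFor(2));   // 2
    System.out.println(segmentsFor(16));  // 16
    System.out.println(segmentsFor(200)); // 256
  }
}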
Another option is Cliff Click's Highly Scalable Libraries.
import org.cliffc.high_scale_lib.*;

public class HighlyScalableTableFactory implements ObjectFactory {
  public Object makeObject() {
    return new NonBlockingHashMap();
  }
}
Click's empty NonBlockingHashMap uses a lot less memory than the ConcurrentHashMap.
32/64bit            64      64      32
CompressedOops      No      Yes     N/A
ConcurrentHashMap   2272    1664    1272
NonBlockingHashMap  1312    936     872
Let's see what happens when we put some elements into the maps, using an ObjectFactory that fills each map with objects. Since we add autoboxed Integers from the integer cache as keys and the same constant String as value, we measure the overhead of the map itself rather than of the objects contained in it:
import java.util.*;

public class FullMapObjectFactory implements ObjectFactory {
  private final ObjectFactory factory;

  protected FullMapObjectFactory(ObjectFactory factory) {
    this.factory = factory;
  }

  public Object makeObject() {
    return fill((Map) factory.makeObject());
  }

  protected Map fill(Map map) {
    for (int i = -128; i < 128; i++) {
      map.put(i, "dummy"); // autoboxed keys come from the integer cache
    }
    return map;
  }
}
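The full-map measurements could then be driven like this (a hypothetical driver; it assumes all the classes live in the same package, since the constructor is protected):

public class FullMapDemo {
  public static void main(String[] args) {
    MemoryTestBench bench = new MemoryTestBench();
    ObjectFactory[] factories = {
        new ConcurrentHashMapFactory(),
        new SmallConcurrentHashMapFactory(),
        new BigConcurrentHashMapFactory(),
        new HashtableFactory(),
        new HighlyScalableTableFactory(),
    };
    for (ObjectFactory factory : factories) {
      bench.showMemoryUsage(new FullMapObjectFactory(factory));
    }
  }
}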
With this particular data set, the NonBlockingHashMap uses the least amount of memory, but I have seen other data sets where the Hashtable uses the least. You would have to try it out in your particular situation to find the best possible map for your needs. Also remember that the NonBlockingHashMap scales to hundreds of cores, whereas the Hashtable would have contention with two.
32/64bit                64      64      32
CompressedOops          No      Yes     N/A
ConcurrentHashMap       30024   21088   16872
SmallConcurrentHashMap  16736   10488   8400
BigConcurrentHashMap    48144   34040   26416
Hashtable               15440   9792    7728
NonBlockingHashMap      10912   6696    6632
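If we subtract the empty map sizes from the first column and divide by the 256 entries, we get a rough per-entry overhead on the 64-bit VM without compressed oops: (30024 - 2272) / 256 is about 108 bytes for ConcurrentHashMap, (15440 - 176) / 256 about 60 bytes for Hashtable, and (10912 - 1312) / 256 about 38 bytes for NonBlockingHashMap.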
As with all performance questions, you need to measure, not guess. Please only change your code if you have measured and verified that this is indeed your bottleneck.
Kind regards from sunny Crete
Heinz
P.S. You might have seen disturbing images of Greek rioting. Fortunately this is far away from where we live on the Island of Crete. The despair is quite real though.
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.