Abstract: Soft and weak references allow us to control how long objects stay live if they do not have strong references to them. Weak references can help us avoid memory leaks by breaking circular dependencies. Soft references can help us cache objects, recreating them when running low on memory.
Welcome to the 98th edition of The Java(tm) Specialists' Newsletter. Did you know that the world is split into six floral kingdoms? The largest in size is the Holarctic Kingdom (42%), consisting of North America and Eurasia. The smallest is Capensis (0.04%) consisting of Cape Town and surrounding areas. Down here at the most south-west part of Africa, we have the greatest density of diverse plant life on the planet, which we generically call "fynbos". This is wonderful for plant lovers, but terrible for hayfever sufferers like me. If it was up to me, I would chop down all the fynbos and cover it all with concrete! [ok, I know I made enemies with that statement, but if we could swap bodies for one week, you would agree with me ;-]
I usually run my Java courses at the Faculty Training Institute in Cape Town. From our balcony, we overlook the Kenilworth Race Track. When the horse racing track was built, the inside of the track was kept in its original state. This is the last remaining habitat of some micro frogs living in the original fynbos. What happened was that the horse race track guarded the original plants, protecting them from the alien plant invasion from Australia.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
How many articles have you read about soft, weak and phantom references? How many of those did you understand? I tried, and found most of the articles extremely hard to make sense of. This is an attempt to explain some of the behaviour of soft and weak references, and then to invent a new type of references that is, dare I say, more useful than a weak reference?
In newsletter 15, I described a SoftReference based HashMap, where the values were embedded in soft references. (The WeakHashMap has the keys embedded in weak references, so that would not be useful for building a cache.) At the time of writing, I was using JDK 1.3.x, and I did not notice any difference in the behaviour of soft vs. weak references. It turns out that there appeared to be some design flaw in JDK 1.3.x that rendered SoftReferences useless in the Sun JVM. Consider the following test code:
public class LargeObject { private final byte[] space = new byte[1024 * 1024]; private final int id; public LargeObject(int id) { this.id = id; } public int getId() { return id; } }
import java.lang.ref.*; import java.util.*; public class ReferenceTest { private static final int NUMBER_OF_REFERENCES = 30; public static void main(String[] args) { // the map contains reference objects as keys, and the id // as a value. final Map refs = new HashMap(); final ReferenceQueue queue = new ReferenceQueue(); // We need a thread that reads processed references from the // reference queue so that we can see in what order they are // reclaimed. new Thread() { { setDaemon(true); start(); } public void run() { try { while (true) { Reference ref = queue.remove(); Integer id = (Integer) refs.remove(ref); if (ref instanceof SoftReference) { System.out.println("SOFT " + id); } else if (ref instanceof WeakReference) { System.out.println("WEAK " + id); } else { throw new IllegalArgumentException(); } } } catch (InterruptedException e) { return; } } }; for (int i = 0; i < NUMBER_OF_REFERENCES; i++) { System.out.println("NEW " + i); Integer num = new Integer(i); // must keep a strong reference to the actual reference, // otherwise it will not be enqueued. refs.put(new SoftReference(new LargeObject(i), queue), num); refs.put(new WeakReference(new LargeObject(i), queue), num); } byte[][] buf = new byte[1024][]; System.out.println("Allocating until OOME..."); for (int i = 0; i < buf.length; i++) { buf[i] = new byte[1024 * 1024]; } } }
We need to run this with the various JVMs. Very few companies still use JDK 1.2.x, so I will not consider that case. In addition, I could not find a difference between JDK 1.4.x and JDK 5, so will only compare JDK 1.3.x with JDK 5.
Let us start by looking at Sun JDK 1.3.1_12. When we run the code, we can see that soft and weak references are released at about the same time. This is not dependent on how many references we have. It does not matter whether we have 10 or 1000 references. The JavaDocs (hard to understand) of the reference classes seem to hint that weak references should be released before soft references. However, in JDK 1.3.x, this was not the case:
java version "1.3.1_12" Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_12-b03) Java HotSpot(TM) Client VM (build 1.3.1_12-b03, mixed mode) NEW 0 SOFT 0 soft ref released first! NEW 1 WEAK 0 NEW 2 SOFT 1 WEAK 1 NEW 3 NEW 4 SOFT 3 WEAK 2 SOFT 2 WEAK 3 NEW 5 SOFT 4 WEAK 4 NEW 6 SOFT 5 WEAK 5 NEW 7 SOFT 6 WEAK 6 SOFT 7 NEW 8 WEAK 7 NEW 9 SOFT 8 WEAK 8 Allocating until OOME... SOFT 9 hardly any soft references still left on heap WEAK 9 Exception in thread "main" java.lang.OutOfMemoryError <<no stack trace available>>
The JDK 5 behaves more closely to what we would expect. When we run the code, we can see that weak references are typically released first, and soft references are mainly released when the memory becomes low:
java version "1.5.0" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-b64) Java HotSpot(TM) Client VM (build 1.5.0-b64, mixed mode, sharing) NEW 0 NEW 1 SOFT 0 WEAK 0 NEW 2 WEAK 1 NEW 3 WEAK 2 NEW 4 SOFT 2 SOFT 3 SOFT 1 WEAK 3 NEW 5 WEAK 4 NEW 6 WEAK 5 NEW 7 WEAK 6 NEW 8 WEAK 7 NEW 9 WEAK 8 Allocating until OOME... WEAK 9 SOFT 4 lots of soft references still left on heap SOFT 8 SOFT 5 SOFT 9 SOFT 6 SOFT 7 Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
I put this observation to Mark Reinhold, who was kind enough to send me a quick response:
> [I know you are busy, so a simple: "Yes, JDK 1.3.x had a useless > implementation." will suffice :-] In fact that's exactly the case. HotSpot didn't have a useful implementation of soft references until 1.4.
There we go. From the author of the java.lang.ref package himself. No wonder Sydney and I needed such a strange approach in newsletter 15.
I then wrote a new SoftReference based HashMap, but this time based on generics, and using my discoveries of the behaviour of the new SoftReferences. (Thanks to Jason Walton for pointing out two possible memory leaks in the original code.)
import java.lang.ref.*; import java.util.*; import java.io.Serializable; public class SoftHashMap <K, V> extends AbstractMap<K, V> implements Serializable { /** The internal HashMap that will hold the SoftReference. */ private final Map<K, SoftReference<V>> hash = new HashMap<K, SoftReference<V>>(); private final Map<SoftReference<V>, K> reverseLookup = new HashMap<SoftReference<V>, K>(); /** Reference queue for cleared SoftReference objects. */ private final ReferenceQueue<V> queue = new ReferenceQueue<V>(); public V get(Object key) { expungeStaleEntries(); V result = null; // We get the SoftReference represented by that key SoftReference<V> soft_ref = hash.get(key); if (soft_ref != null) { // From the SoftReference we get the value, which can be // null if it has been garbage collected result = soft_ref.get(); if (result == null) { // If the value has been garbage collected, remove the // entry from the HashMap. hash.remove(key); reverseLookup.remove(soft_ref); } } return result; } private void expungeStaleEntries() { Reference<? extends V> sv; while ((sv = queue.poll()) != null) { hash.remove(reverseLookup.remove(sv)); } } public V put(K key, V value) { expungeStaleEntries(); SoftReference<V> soft_ref = new SoftReference<V>(value, queue); reverseLookup.put(soft_ref, key); SoftReference<V> result = hash.put(key, soft_ref); if (result == null) return null; reverseLookup.remove(result); return result.get(); } public V remove(Object key) { expungeStaleEntries(); SoftReference<V> result = hash.remove(key); if (result == null) return null; return result.get(); } public void clear() { hash.clear(); reverseLookup.clear(); } public int size() { expungeStaleEntries(); return hash.size(); } /** * Returns a copy of the key/values in the map at the point of * calling. However, setValue still sets the value in the * actual SoftHashMap. */ public Set<Entry<K,V>> entrySet() { expungeStaleEntries(); Set<Entry<K,V>> result = new LinkedHashSet<Entry<K, V>>(); for (final Entry<K, SoftReference<V>> entry : hash.entrySet()) { final V value = entry.getValue().get(); if (value != null) { result.add(new Entry<K, V>() { public K getKey() { return entry.getKey(); } public V getValue() { return value; } public V setValue(V v) { entry.setValue(new SoftReference<V>(v, queue)); return value; } }); } } return result; } }
We can now use the SoftHashMap just like an ordinary HashMap, except that the entries will disappear if we are running low on memory. Ideal if you want to build a cache. Here is some test code:
import java.util.*; public class SoftHashMapTest { private static void print(Map<String, Integer> map) { System.out.println("One=" + map.get("One")); System.out.println("Two=" + map.get("Two")); System.out.println("Three=" + map.get("Three")); System.out.println("Four=" + map.get("Four")); System.out.println("Five=" + map.get("Five")); } private static void testMap(Map<String, Integer> map) throws InterruptedException { System.out.println("Testing " + map.getClass()); map.put("One", new Integer(1)); map.put("Two", new Integer(2)); map.put("Three", new Integer(3)); map.put("Four", new Integer(4)); map.put("Five", new Integer(5)); print(map); Thread.sleep(2000); print(map); try { byte[] block = new byte[200 * 1024 * 1024]; } catch (OutOfMemoryError ex) { ex.printStackTrace(); } print(map); } public static void main(String[] args) throws InterruptedException { testMap(new HashMap<String, Integer>()); testMap(new SoftHashMap<String, Integer>()); } }
When you run this code, you get:
Testing class java.util.HashMap One=1 Two=2 Three=3 Four=4 Five=5 One=1 Two=2 Three=3 Four=4 Five=5 java.lang.OutOfMemoryError: Java heap space One=1 Two=2 Three=3 Four=4 Five=5 Testing class SoftHashMap One=1 Two=2 Three=3 Four=4 Five=5 One=1 Two=2 Three=3 Four=4 Five=5 java.lang.OutOfMemoryError: Java heap space One=null Two=null Three=null Four=null Five=null
Did you know that you can write your own reference? A few months ago, I was having dinner at Mariner's Wharf in the village of Hout Bay near Cape Town, with three of Sun's Java evangelists. One of the points of discussion was that none of us had ever found a good use for the PhantomReference. The only suggestion that I could come up with is with object counting, and to use it to determine when objects are garbage collected (newsletter 38).
I spent several hours today thinking about PhantomReference.
I did a search on google and could not find any good use case
for the PhantomReference. On very careful inspection, I
discovered some differences between the phantom and weak
references. Both are released rather quickly, but the
phantom reference is enqueued in the reference queue before
it's referent is cleared, whereas the weak reference is
enqueued after the referent is cleared. One hitch -
the PhantomReference.get() method always returns
null
. Hah - time for reflection!
Another difference is that the PhantomReference is enqueued
only after the finalize()
method has
been called.
Another hitch with references is this: If you want the reference to be added to the reference queue, you have to keep a strong reference to the reference. To solve this problem, I keep a Collection of currently active references inside my GhostReference. As is the case with all PhantomReference objects, you have to call clear() on the reference once you have completed working with it.
import java.lang.ref.*; import java.lang.reflect.Field; import java.util.*; public class GhostReference extends PhantomReference { private static final Collection currentRefs = new HashSet(); private static final Field referent; static { try { referent = Reference.class.getDeclaredField("referent"); referent.setAccessible(true); } catch (NoSuchFieldException e) { throw new RuntimeException("Field \"referent\" not found"); } } public GhostReference(Object referent, ReferenceQueue queue) { super(referent, queue); currentRefs.add(this); } public void clear() { currentRefs.remove(this); super.clear(); } public Object getReferent() { try { return referent.get(this); } catch (IllegalAccessException e) { throw new IllegalStateException("referent should be accessible!"); } } }
We now have a more sensible Reference, with a cool name. It is not necessary to have a strong reference to the Reference object, in order for it to be enqueued, and it is possible to get hold of the referent (the object that we want a reference to) from the enqueued Reference:
import java.lang.ref.*; public class GhostReferenceTest { private static final int NUMBER_OF_REFERENCES = 10; public static void main(String[] args) { final ReferenceQueue queue = new ReferenceQueue(); new Thread() { { setDaemon(true); start(); } public void run() { try { while (true) { GhostReference ref = (GhostReference) queue.remove(); LargeObject obj = (LargeObject) ref.getReferent(); System.out.println("GHOST " + obj.getId()); ref.clear(); } } catch (InterruptedException e) { return; } } }; for (int i = 0; i < NUMBER_OF_REFERENCES; i++) { System.out.println("NEW " + i); // We do not need to keep strong reference to the actual // reference anymore, and we also do not need a reverse // lookup anymore new GhostReference(new LargeObject(i), queue); } byte[][] buf = new byte[1024][]; System.out.println("Allocating until OOME..."); for (int i = 0; i < buf.length; i++) { buf[i] = new byte[1024 * 1024]; } } }
Warning: the PhantomReference is only called after the finalize() has completed. This means that we are resurrecting an object that will then never be finalized again. However, this is only a theoretical limitation, as in practice you should never use the finalize() method anyway.
I have still to this day (2009-11-12) not used the PhantomReference in real programming work. What is interesting is that it is also not used once in the whole of the JDK, whereas both weak and soft references are. The WeakReference is subclassed in order to implement Finalizers.
Kind regards
Heinz
P.S. If you can come up with a miracle cure that will solve my hayfever, I will reconsider my stance on the fynbos eradication ;-)
P.P.S. The fynbos is safe - I moved to Crete in Greece in 2006, which has a lot less pollen in the air.
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.