Abstract: Most programmers had a blind spot with the statement "arr[size++] = e;" We somehow think that size will be updated after the assignment. In this newsletter we look at this basic concurrency bug and also present a solution in the form of the Java 8 StampedLock.
Welcome to the 242nd edition of The Java(tm) Specialists' Newsletter, sent to you from rainy Crete. We always get amazed with the first real rain of the season. How dare water fall from the sky? And what is that grey up there? Our olive trees do need a slurp of water to fatten up the fruit for harvesting. Sadly, thanks to gale force winds, a large portion of our crop is now turning into compost. Oh well, maybe next year we can fill our vats again. Until then, rations!
It gives me great pleasure to welcome Burkina Faso to my list of confirmed subscriber countries, bringing the total to 138.
The Java Specialists Slack Team is on fire. Almost 2000 members already, exchanging great ideas, learning new things. We have team members in 22 time zones. This means you can find someone to chat to about Java in real time at any time of day or night. To join, please fill in your details here and you will automagically receive an invite from Slack (but please do check your SPAM folder).
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
We had a lot of fun last month. I sent out a concurrency puzzle that had Java experts around the world scratching their heads. So much so, that I also published some helpful hints.
What amazed me about the responses was how many had not bothered to run the code. Instead, they looked it over and clicked "Reply". None of these responses caught the bug that caused it to fail consistently. Worst of all, they missed out on a fantastic learning opportunity.
When looking for concurrency bugs, the very first step is to try to reproduce it. This can be quite hard. For example, the code that Jack sent me did not fail on my MacBookPro. I had to run it on my 2-4-1 server with older CPUs to cause the bug to appear. Here is a version that has a higher chance of failure, adapted for my friend Stuart Marks:
import java.util.*; public class MyArrayListStuart { private final Object READ_LOCK = new Object(); private final Object WRITE_LOCK = new Object(); private int[] arr = new int[10]; private int size = 0; public int size() { synchronized (READ_LOCK) { return size; } } public int get(int index) { synchronized (READ_LOCK) { rangeCheck(index); return arr[index]; } } public boolean add(int e) { synchronized (WRITE_LOCK) { if (size + 1 > arr.length) arr = Arrays.copyOf(arr, size + 10); arr[size++] = e; return true; } } public int remove(int index) { synchronized (WRITE_LOCK) { rangeCheck(index); int oldValue = arr[index]; int numMoved = size - index - 1; if (numMoved > 0) System.arraycopy(arr, index + 1, arr, index, numMoved); arr[--size] = 0; return oldValue; } } private void rangeCheck(int index) { if (index >= size) throw new IndexOutOfBoundsException( "Index: " + index + ", Size: " + size); } private static volatile boolean goAdd; public static void main(String[] args) { for (int i = 0; i < 100000; i++) { goAdd = false; MyArrayListStuart list = new MyArrayListStuart(); new Thread(new Main(list, true)).start(); new Thread(new Main(list, false)).start(); new Thread(new Main(list, false)).start(); new Thread(new Main(list, false)).start(); } } static class Main implements Runnable { MyArrayListStuart list; boolean update; public Main(MyArrayListStuart list, boolean update) { this.list = list; this.update = update; } @Override public void run() { if (update) { while(!goAdd); goAdd = false; for (int i = 1; i < 1000; i++) { list.add(i); } for (int i = 1; i < 250; i++) { list.remove(7); } } else { // wait until we're certain // index 6 has a value while (list.size() < 7) {goAdd = true;} for (int i = 1; i < 1000; i++) { int x; if ((x = list.get(6)) != 7) { System.out.println(x + " and " + list.size()); } } } } } }
The concurrency bug is easier to manifest with the modified code. We can reproduce it consistently. It is almost impossible to do this with happens-before bugs. Those usually show their ugly faces in production, not in a dinky little test like ours. Visibility bugs usually go wrong and never recover, for example if we have a tight loop reading an unprotected field. In our case the bug is much more basic.
Second clue was that it happened even without remove(). Thus we could safely ignore the remove() and focus only on the add().
A third clue was that resizing had nothing to do with the error. Even if we presized the array to 10000, it would still fail.
The first good answer was by Andreas Senft. He
recognized that it was just a common concurrency bug that had
nothing to do with caching, happens-before, visibility, etc.
Just a plain and simple race condition on the add() method.
If we look inside add(), we will notice the statement
arr[size++] = e;
Most programmers read this
as "Set the array element to e and update the size". In
their minds, this is happening:
arr[size] = e; size = size + 1;
However, it would be more accurate to view it as this:
int temp = size; size = size + 1; arr[temp] = e;
It now becomes obvious that size
is first updated and then
the array element is set. If between the call to
size = size + 1;
and arr[temp] = e;
the thread is swapped out, the reader thread could try to
read an element that does not exist yet, since it is locking
on a difference monitor. If we replace the code with the
following, then errors occur less frequently (but it is still
incorrect):
arr[size] = e; size++;
Proving this is still broken will be more challenging. The test that we had in the previous newsletter would certainly not suffice.
The problem with the code was that we had multiple locks guarding fields that had some invariants across them. The whole motivation was to try to avoid having the write locks block threads that just want to read. I thought that this would be a good use case for Java 8 StampedLock. If any of this is new to you, may I suggest you take my "Extreme Java - Concurrency and Performance Course"? You will learn more in three days than you would ever have thought possible!
StampedLock was added in Java 8 to give us a possibility to
do optimistic reads on fields. This was useful when we had
multiple fields with an invariant across them. In our
example, that would be the fields arr
and
size
.
IntList
is a collaborative effort by Peter
Levart, Henri Tremblay, Hanko Gergely, Alex Snaps and myself.
Thanks also to further input by members of the Concurrency
Interest Mailing List such as Martin Buchholz, Nitsan Wakart,
Alex Otenko and Aleksey Shipilev. Lastly, thanks to Vladimir
Sitnikov for pointing out that the comment of size() could be
improved. Each of us has helped to refine the final solution
a little bit. It is a work in progress, so please let me
know if you find a way to improve it further. It almost
certainly contains some juicy undiscovered bugs.
First up is size()
. Since all we care about is
visibility of the size
field, we only need to
call the tryOptimisticRead()
method. This does
a volatile read on a field inside StampedLock, which has the
same effect as entering a synchronized block, thus
guaranteeing that we will see the correct value of the size
field.
The get()
method is the most complex. We try to
read the array element into a local variable between the
calls to tryOptimisticRead()
and
validate()
, making sure we do not cause an
ArrayIndexOutOfBoundsException
in the process.
If we are fast enough, we can do this before another thread
acquires a write lock. You'll notice that we try the
optimistic read three times and if we don't succeed, change
to a pessimistic non-exclusive read lock. This spinning can
be good, but it can also hurt us if we do too much of it.
The trimToSize()
function is meant to show how
we can use the tryOptimisticRead()
to avoid
acquiring a write lock if we don't need it.
The add()
method is straightforward, almost like using a
ReentrantLock.
We have two remove functions. The plain
remove()
is analogous to add()
,
using a pessimistic exclusive write lock. The
removeWithUpgrade()
acquires a read lock, then
if the index is not out of range, we try to upgrade that to a
write lock. A ReentrantReadWriteLock would deadlock, but the
StampedLock might succeed. This conditional write lock is
useful when we have a high chance of not needing a write
lock. However, in our case we would hope that most of the
time the index would be within range. Thus
remove()
would probably be a better solution.
Faster and easier to understand and maintain.
Over to the code. Have fun. We will discuss this on the #newsletter channel in the JavaSpecialists.slack.com Team. Please fill in your details here to become a member (no fees, no obligations, except to "be nice").
import java.util.*; import java.util.concurrent.locks.*; public class IntList { private static final int OPTIMISTIC_SPIN = 3; private final StampedLock sl = new StampedLock(); private int[] arr = new int[10]; private int size = 0; /** * The size() method cannot be used to perform compound * operations, such as getting the last element of a list. * This code could fail with an IndexOutOfBoundsException: * * <pre> * Thread1: * while(true) { * list.get(list.size()-1); * } * * Thread2: * for (int i = 0; i < 2000; i++) { * for (int j = 0; j < 100; j++) { * list.add(j); * } * for (int j = 0; j < 50; j++) { * list.remove(0); * } * } * </pre> */ public int size() { // Internally, the tryOptimisticRead() is reading a volatile // field. From a memory visibility perspective, reading a // volatile field is like entering a synchronized block. // We will thus not have an issue with stale values of size. // Note 1: volatile != synchronized. Volatile does not // magically make compound operations on fields mutually // exclusive. Race conditions are more probable on volatile // fields. // Note 2: We could also have made size volatile. From a // visibility perspective the tryOptimisticRead() will work, // but not if size was a long or if it could have // intermediate values that broke invariants. sl.tryOptimisticRead(); return this.size; } public int get(int index) { for (int i = 0; i < OPTIMISTIC_SPIN; i++) { long stamp = sl.tryOptimisticRead(); int size = this.size; int[] arr = this.arr; if (index < arr.length) { int r = arr[index]; if (sl.validate(stamp)) { rangeCheck(index, size); return r; } } } long stamp = sl.readLock(); try { rangeCheck(index, this.size); return this.arr[index]; } finally { sl.unlockRead(stamp); } } public void trimToSize() { long stamp = sl.tryOptimisticRead(); int currentSize = size; int[] currentArr = arr; if (sl.validate(stamp)) { // fast optimistic read to accelerate trimToSize() when // there is no work to do if (currentSize == currentArr.length) return; } stamp = sl.writeLock(); try { if (size < arr.length) { arr = Arrays.copyOf(arr, size); } } finally { sl.unlockWrite(stamp); } } public boolean add(int e) { long stamp = sl.writeLock(); try { if (size + 1 > arr.length) arr = Arrays.copyOf(arr, size + 10); arr[size++] = e; return true; } finally { sl.unlockWrite(stamp); } } // just to illustrate how an upgrade could be coded public int removeWithUpgrade(int index) { long stamp = sl.readLock(); try { while (true) { rangeCheck(index, size); long writeStamp = sl.tryConvertToWriteLock(stamp); if (writeStamp == 0) { sl.unlockRead(stamp); stamp = sl.writeLock(); } else { stamp = writeStamp; return doActualRemove(index); } } } finally { sl.unlock(stamp); } } public int remove(int index) { long stamp = sl.writeLock(); try { rangeCheck(index, size); return doActualRemove(index); } finally { sl.unlock(stamp); } } private int doActualRemove(int index) { int oldValue = arr[index]; int numMoved = size - index - 1; if (numMoved > 0) System.arraycopy(arr, index + 1, arr, index, numMoved); arr[--size] = 0; return oldValue; } private static void rangeCheck(int index, int size) { if (index >= size) throw new IndexOutOfBoundsException( "Index: " + index + ", Size: " + size); } }
I ran some performance figures comparing this against a correctly synchronized version of MyArrayList that we had in our previous newsletter. get() is roughly twice the speed and uses far less system CPU time thanks to reduced voluntary context switching. add() and remove() are slightly slower, but as expected, size() is ridiculously fast. I've now added this as an exercise to my concurrency course. Let's see how well professional Java programmers cope when they see StampedLock for the first time :-)
Kind regards
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.