Abstract: In this newsletter we examine what happens when a ReadWriteLock is swamped with too many reader or writer threads. If we are not careful, we can cause starvation of the minority in some versions of Java.
Welcome to the 165th issue of The Java(tm) Specialists' Newsletter, sent to you from Wall Street, New York City. New York has to date been the most expensive city that I have visited. Not that prices are particularly high in comparison to Crete, but you can purchase almost anything, twenty four hours a day! Usually when I go to New York, my bank sends me urgent emails confirming that I really am spending all that money on my credit card. In contrast, back home in Crete, items are often double the New York price, but there is very little temptation to spend.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
On my flight from Greece to New York, I decided to try a different approach to try cope with jet lag. Instead of taking a nap, I forced myself to stay awake, thus adapting immediately to the new time zone. No jet lag at all this time round! To keep myself occupied, I began investigating what happens with ReadWriteLocks in contention situations, which resulted in this newsletter.
I had always thought that with the ReadWriteLock, the write lock would get preference over the read lock. Since the read locks can get allocated concurrently, they could essentially tag-team each other and thus never give a chance to a write lock to enter the critical section. However, something in the back on my mind always told me to try it out. The documentation certainly did not promise that write locks would be given preference, on the contrary, it gave no guarantee.
Writing tests like this can be tricky, so I showed my findings to some of the leading Java concurrency experts, who confirmed that they had similar experiences. My test is probably not perfect, but it does demonstrate what happens when we have many threads acquiring read locks and only few threads acquiring write locks.
To make it more interesting, I gave each thread a bunch of instructions to do in the critical section. Doug Lea recommended in an email that the ReadWriteLock should only be used in cases where the critical section is at least 2000 code statements. Doug also mentioned that the Locks had been modified slightly in Java 6, which may have made the problem of starvation more pronounced. In my experiments, I found that Java 5 was much worse than Java 6 in this regard. Doug Lea also suggested that for most applications, we should rather use the concurrency classes, such as ConcurrentHashMap and ConcurrentLinkedQueue, rather than rely on the ReadWriteLock, which I wholeheartedly agree with.
To be able to test different types of locks, we define a LockFactory interface, with two implementations, ReadWriteLockFactory and StandardLockFactory:
import java.util.concurrent.locks.Lock; public interface LockFactory { Lock createLock(boolean readOnly); } import java.util.concurrent.locks.*; public class ReadWriteLockFactory implements LockFactory { private final ReadWriteLock rwl; public ReadWriteLockFactory(boolean fair) { rwl = new ReentrantReadWriteLock(fair); } public Lock createLock(boolean readOnly) { return readOnly ? rwl.readLock() : rwl.writeLock(); } } import java.util.concurrent.locks.*; public class StandardLockFactory implements LockFactory { private final Lock lock; public StandardLockFactory(boolean fair) { lock = new ReentrantLock(fair); } public Lock createLock(boolean readOnly) { return lock; } }
We then define a class that implements Runnable and will
acquire locks repeatedly, called the LockAcquirer. The
LockAcquirer will run until the shared AtomicBoolean
running
is set to false
.
We also give it the lock that must be acquired, which could
either be a read or a write lock. The CountDownLatches
ensure that the threads start and end at the same time.
Lastly, the readOnly field helps us to distinguish what the
number of calls for both read and write locks was.
In order to see real starvation of threads, I added a work() method, that adds some random numbers and then subtracts them again, thus fooling the hotspot compiler into not removing that particular code. As we've seen in other newsletters, a simple for() loop could be optimized away in a flash. On my dual-core MacBook Pro, we should see both cores at work when we use read locks, but only one at a time when we use write locks.
Here is the code for our LockAcquirer class:
import java.util.concurrent.CountDownLatch; import java.util.concurrent.atomic.AtomicBoolean; import java.util.concurrent.locks.Lock; import java.util.Random; public class LockAcquirer implements Runnable { private final AtomicBoolean running; private final Lock lock; private final CountDownLatch start; private final CountDownLatch finish; private final boolean readOnly; private long numberOfLockAcquisitions = 0; private int work = 50; private long offset = calculateOffset(); private long calculateOffset() { Random r = new Random(0); long result = 0; for(int i = 0; i<work; i++) { result += r.nextInt(); } return result; } public LockAcquirer(AtomicBoolean running, Lock lock, CountDownLatch start, CountDownLatch finish, boolean readOnly) { this.running = running; this.lock = lock; this.start = start; this.finish = finish; this.readOnly = readOnly; } public void run() { try { start.countDown(); start.await(); while (running.get()) { lock.lock(); try { work(); } finally { lock.unlock(); } } finish.countDown(); } catch (InterruptedException e) { return; } } private void work() { Random r = new Random(0); for(int i=0; i<work; i++) { numberOfLockAcquisitions += r.nextInt(); } numberOfLockAcquisitions -= offset; numberOfLockAcquisitions++; } public String toString() { return String.format("%s%,15d", (readOnly ? "RO" : "RW"), numberOfLockAcquisitions); } public long getNumberOfLockAcquisitions() { return numberOfLockAcquisitions; } public boolean isReadOnly() { return readOnly; } }
The last class in the puzzle is called TestLocks. It creates one thread for each LockAcquirer. We use different numbers of reader and writer threads to test whether starvation does indeed occur. If you run this, you can change the field DETAILED_OUTPUT to specify how much information is displayed.
import java.util.*; import java.util.concurrent.*; import java.util.concurrent.atomic.AtomicBoolean; import java.util.concurrent.locks.Lock; public class TestLocks { private static final int DURATION_SECONDS = 5; private static final int TOTAL_LOCKS = 16; private static final boolean DETAILED_OUTPUT = false; public static void main(String[] args) throws InterruptedException { test(false); test(true); } private static void test(boolean fair) throws InterruptedException { readWriteLockTest(fair); lockTest(fair); } private static void readWriteLockTest(boolean fair) throws InterruptedException { System.out.println("ReentrantReadWriteLock test"); for (int locks = 1; locks <= TOTAL_LOCKS; locks *= 2) { for (int reads = 0; reads <= locks; reads++) { int writes = locks - reads; test(new ReadWriteLockFactory(fair), reads, writes, fair); } } } private static void lockTest(boolean fair) throws InterruptedException { System.out.println("ReentrantLock test"); for (int locks = 1; locks < TOTAL_LOCKS; locks++) { test(new StandardLockFactory(fair), 0, locks, fair); } } public static void test(LockFactory factory, int reads, int writes, boolean fair) throws InterruptedException { System.out.printf("RO=%2d, RW=%2d, fair=%b ", reads, writes, fair); CountDownLatch start = new CountDownLatch(reads + writes); CountDownLatch finish = new CountDownLatch(reads + writes); AtomicBoolean running = new AtomicBoolean(true); ExecutorService pool = Executors.newFixedThreadPool( reads + writes); Collection<LockAcquirer> acquirers = new ArrayList<LockAcquirer>(); for (int i = 0; i < reads + writes; i++) { boolean isReadOnly = i >= writes; Lock lock = factory.createLock(isReadOnly); LockAcquirer acquirer = new LockAcquirer(running, lock, start, finish, isReadOnly); acquirers.add(acquirer); pool.submit(acquirer); } start.await(); TimeUnit.SECONDS.sleep(DURATION_SECONDS); running.set(false); finish.await(); long totalReads = 0; long totalWrites = 0; if (DETAILED_OUTPUT) System.out.println(); for (LockAcquirer acquirer : acquirers) { long numberOfLockAcquisitions = acquirer.getNumberOfLockAcquisitions(); if (DETAILED_OUTPUT) System.out.printf("## %,17d%n", numberOfLockAcquisitions); if (acquirer.isReadOnly()) { totalReads += numberOfLockAcquisitions; } else { totalWrites += numberOfLockAcquisitions; } } System.out.printf("%,17d%,17d%n", totalReads, totalWrites); pool.shutdown(); } }
With microbenchmarks, we usually need to have some idea of what output to expect, based on our understanding of the code. If the benchmark is wildly wrong then either our understanding of the code is wrong or our benchmark is flawed. My microbenchmark is possibly a bit simplistic, but Cliff Click confirmed that his more sophisticated benchmark showed similar weaknesses with the ReadWriteLock.
I always assumed that ReadWriteLock would give preference to the WriteLock. If not, the writer thread could be starved of resources if new reader threads kept on entering the critical section in a tag-team fashion. That would result in the WriteLock literally never getting a chance. We're talking of total starvation here! Surely ReadWriteLock would not do that!?
My TestLocks class produced different output for Java 5 and 6. I used JDK 1.6.0_07 and JDK 1.5.0_13 on my MacBook Pro 2.6 dual core. My suspicion is that the more cores you add, the more pronounced the benchmark will become. Unfortunately I don't own enough cool hardware to verify that.
First the output for Java 5. Note that as expected, when we have two threads using ReadOnly locks, we get better throughput than with one (first highlighted line). It gets interesting when we use more than two threads. I don't think the number of cores is as important here. When we have three threads accessing ReadLocks and one accessing a WriteLock, the readers access the critical section 2,757,365 times, whereas the writer only 2 times (second highlighted line). We see the same effect with 3 readers / 5 writers and 3 readers / 13 writers (third and fourth highlighted lines). Thus it confirms my worry that although we get added throughput due to the readers being able to access the critical section concurrently, if these threads keep on repeatedly accessing the critical section, they will lock out all writer threads.
ReentrantReadWriteLock test RO= 0, RW= 1, fair=false 0 2,117,034 RO= 1, RW= 0, fair=false 2,090,567 0 RO= 0, RW= 2, fair=false 0 1,238,352 RO= 1, RW= 1, fair=false 339,871 913,259 1. RO= 2, RW= 0, fair=false 2,728,331 0 RO= 0, RW= 4, fair=false 0 1,256,089 RO= 1, RW= 3, fair=false 229,381 1,019,947 RO= 2, RW= 2, fair=false 2,676,337 11,274 2. RO= 3, RW= 1, fair=false 2,757,365 2 RO= 4, RW= 0, fair=false 2,877,727 0 RO= 0, RW= 8, fair=false 0 1,246,185 RO= 1, RW= 7, fair=false 72,935 1,170,020 RO= 2, RW= 6, fair=false 2,649,444 19,693 3. RO= 3, RW= 5, fair=false 2,856,555 5 RO= 4, RW= 4, fair=false 2,854,140 4 RO= 5, RW= 3, fair=false 2,881,890 3 RO= 6, RW= 2, fair=false 2,958,654 2 RO= 7, RW= 1, fair=false 3,108,282 1 RO= 8, RW= 0, fair=false 3,233,558 0 RO= 0, RW=16, fair=false 0 1,234,477 RO= 1, RW=15, fair=false 61,381 1,181,331 RO= 2, RW=14, fair=false 976,371 749,589 4. RO= 3, RW=13, fair=false 2,855,692 13 RO= 4, RW=12, fair=false 2,941,074 12 RO= 5, RW=11, fair=false 2,918,760 11 RO= 6, RW=10, fair=false 2,921,576 10 RO= 7, RW= 9, fair=false 3,185,453 9 RO= 8, RW= 8, fair=false 3,206,795 8 RO= 9, RW= 7, fair=false 3,343,547 7 RO=10, RW= 6, fair=false 3,359,844 1,018 RO=11, RW= 5, fair=false 3,180,840 677 RO=12, RW= 4, fair=false 3,542,919 4 RO=13, RW= 3, fair=false 3,638,226 3 RO=14, RW= 2, fair=false 3,694,536 2 RO=15, RW= 1, fair=false 3,688,920 1 RO=16, RW= 0, fair=false 3,677,340 0
The normal Locks show typical behaviour, so it is not that interesting to look at. Here are the values for completeness:
ReentrantLock test RO= 0, RW= 1, fair=false 0 2,105,577 RO= 0, RW= 2, fair=false 0 1,233,695 RO= 0, RW= 3, fair=false 0 1,199,802 RO= 0, RW= 4, fair=false 0 1,231,030 RO= 0, RW= 5, fair=false 0 1,229,788 RO= 0, RW= 6, fair=false 0 1,216,533 RO= 0, RW= 7, fair=false 0 1,229,848 RO= 0, RW= 8, fair=false 0 1,285,035 RO= 0, RW= 9, fair=false 0 1,251,308 RO= 0, RW=10, fair=false 0 1,226,727 RO= 0, RW=11, fair=false 0 1,239,005 RO= 0, RW=12, fair=false 0 1,231,186 RO= 0, RW=13, fair=false 0 1,237,608 RO= 0, RW=14, fair=false 0 1,501,639 RO= 0, RW=15, fair=false 0 1,303,815
It gets interesting again when we add fairness to the picture. With single threads or when there are only reads happening, we see similar throughput as with unfair locks (highlighted line 1). We also see that there is no more starvation happening with our readers. However, the throughput is also much lower than before. In the past, the total throughput (reads plus writes) was between 2.5 and 3.5 million, now it hardly reaches 0.5 million (highlighted lines 2 and 3). We see that fairness is a throughput killer, also confirmed inJava Concurrency in Practice.
ReentrantReadWriteLock test RO= 0, RW= 1, fair=true 0 2,072,787 RO= 1, RW= 0, fair=true 2,060,310 0 RO= 0, RW= 2, fair=true 0 565,702 RO= 1, RW= 1, fair=true 275,904 275,624 RO= 2, RW= 0, fair=true 2,682,398 0 RO= 0, RW= 4, fair=true 0 443,952 RO= 1, RW= 3, fair=true 110,523 331,542 RO= 2, RW= 2, fair=true 234,798 234,772 RO= 3, RW= 1, fair=true 402,339 134,095 1. RO= 4, RW= 0, fair=true 2,918,238 0 RO= 0, RW= 8, fair=true 0 439,975 RO= 1, RW= 7, fair=true 54,965 384,712 RO= 2, RW= 6, fair=true 111,827 335,442 RO= 3, RW= 5, fair=true 171,148 285,232 RO= 4, RW= 4, fair=true 238,927 238,915 RO= 5, RW= 3, fair=true 318,833 191,296 RO= 6, RW= 2, fair=true 415,278 138,424 RO= 7, RW= 1, fair=true 556,224 79,317 RO= 8, RW= 0, fair=true 3,145,546 0 2. RO= 0, RW=16, fair=true 0 438,235 RO= 1, RW=15, fair=true 27,571 413,475 RO= 2, RW=14, fair=true 54,852 383,873 RO= 3, RW=13, fair=true 83,335 361,065 RO= 4, RW=12, fair=true 111,132 333,361 RO= 5, RW=11, fair=true 142,440 313,366 RO= 6, RW=10, fair=true 171,931 286,555 3. RO= 7, RW= 9, fair=true 211,226 271,570 RO= 8, RW= 8, fair=true 240,015 240,012 RO= 9, RW= 7, fair=true 278,216 216,392 RO=10, RW= 6, fair=true 322,074 193,246 RO=11, RW= 5, fair=true 374,487 170,221 RO=12, RW= 4, fair=true 422,841 140,949 RO=13, RW= 3, fair=true 488,080 112,634 RO=14, RW= 2, fair=true 565,221 80,748 RO=15, RW= 1, fair=true 678,649 44,844 RO=16, RW= 0, fair=true 3,721,636 0
Just how fair the ReentrantLock is (and how poor the throughput can be), is seen in the figures below:
ReentrantLock test RO= 0, RW= 1, fair=true 0 2,078,251 RO= 0, RW= 2, fair=true 0 555,530 RO= 0, RW= 3, fair=true 0 440,922 RO= 0, RW= 4, fair=true 0 443,428 RO= 0, RW= 5, fair=true 0 441,595 RO= 0, RW= 6, fair=true 0 441,867 RO= 0, RW= 7, fair=true 0 442,270 RO= 0, RW= 8, fair=true 0 441,479 RO= 0, RW= 9, fair=true 0 443,839 RO= 0, RW=10, fair=true 0 442,190 RO= 0, RW=11, fair=true 0 444,120 RO= 0, RW=12, fair=true 0 440,291 RO= 0, RW=13, fair=true 0 443,763 RO= 0, RW=14, fair=true 0 440,695 RO= 0, RW=15, fair=true 0 444,110
We saw that for Java 5, we can see serious starvation of our writer threads to the point where they never get a chance.
Let's now take a look at the figures for Java 6. The figures for fair locks and the normal ReentrantLock are very similar in their nature to Java 5, so I will just show unfair ReadWriteLocks. They do not degrade the writers as badly as in Java 5. An interesting phenomenon is that the readers seem to be starved by the writers (highlighted lines 1 and 2). Perhaps the locks were modified to counteract the starvation of Java 5 and this might have been overcompensated? We also see that as the readers increase, the writers get disadvantaged, but not as seriously as in Java 5 (highlighted line 3 and 4). Even though there are 6 readers and 10 writers, the readers manage to read more than 10 times more frequently (in total) than the writers. The detailed output for this case is shown below.
ReentrantReadWriteLock test RO= 0, RW= 1, fair=false 0 4,386,864 RO= 1, RW= 0, fair=false 3,929,583 0 RO= 0, RW= 2, fair=false 0 2,724,154 1. RO= 1, RW= 1, fair=false 8,515 2,630,659 RO= 2, RW= 0, fair=false 5,687,455 0 RO= 0, RW= 4, fair=false 0 2,515,201 RO= 1, RW= 3, fair=false 2,848 2,602,498 RO= 2, RW= 2, fair=false 28,315 2,476,875 RO= 3, RW= 1, fair=false 5,193,481 394,732 RO= 4, RW= 0, fair=false 5,902,569 0 RO= 0, RW= 8, fair=false 0 2,662,787 2. RO= 1, RW= 7, fair=false 647 2,662,818 RO= 2, RW= 6, fair=false 3,460 2,693,680 RO= 3, RW= 5, fair=false 3,474,177 1,234,051 3. RO= 4, RW= 4, fair=false 4,699,560 584,991 RO= 5, RW= 3, fair=false 5,296,241 293,361 RO= 6, RW= 2, fair=false 5,699,336 87,412 RO= 7, RW= 1, fair=false 5,781,451 54,931 RO= 8, RW= 0, fair=false 6,063,859 0 RO= 0, RW=16, fair=false 0 2,654,262 RO= 1, RW=15, fair=false 479 2,688,341 RO= 2, RW=14, fair=false 2,052 2,693,743 RO= 3, RW=13, fair=false 2,672,766 1,514,596 RO= 4, RW=12, fair=false 3,402,001 1,206,118 RO= 5, RW=11, fair=false 2,523,643 1,585,372 4. RO= 6, RW=10, fair=false 4,873,365 555,011 RO= 7, RW= 9, fair=false 5,094,895 362,255 RO= 8, RW= 8, fair=false 5,365,968 310,056 RO= 9, RW= 7, fair=false 5,394,899 283,084 RO=10, RW= 6, fair=false 5,650,911 220,294 RO=11, RW= 5, fair=false 5,971,020 171,484 RO=12, RW= 4, fair=false 5,916,624 159,303 RO=13, RW= 3, fair=false 6,193,914 117,573 RO=14, RW= 2, fair=false 5,969,253 74,979 RO=15, RW= 1, fair=false 6,187,903 32,918 RO=16, RW= 0, fair=false 6,415,678 0
Here is the detailed output for the case where we have 6 reader and 10 writer threads:
RO= 6, RW=10, fair=false ## 48,849 ## 40,850 ## 74,406 ## 83,103 ## 45,319 ## 54,664 ## 62,691 ## 56,161 ## 35,511 ## 53,457 ## 958,901 ## 857,711 ## 867,904 ## 772,812 ## 804,360 ## 611,677 4,873,365 555,011
We see that starvation has become better managed in Java 6, but it still appears to be a potential problem. We also see that when we have several readers, the throughput will be related to the number of actual cores available, which proves that the readers are happening in parallel.
In a way, I am surprised that ReadWriteLocks are as bad as they are. I really thought that write locks would get priority over read locks, since not doing that would result in obvious starvation scenarios. Instead, with Java 5, as soon as you have 3 or more read threads, you can expect complete starvation of the writer threads. With Java 6, the situation has improved somewhat.
In my understanding of these values, we should probably never use ReadWriteLocks in Java 5. It is just too risky. In Java 6, we can use the ReadWriteLock when we are willing to wait for our writers to get an opportunity to acquire the lock. It seems that they won't get locked out completely, but it might take some time before they are serviced. Fairness does not really help us consistenly, since it also reduces the throughput.
Instead of using ReadWriteLock, we rather recommend that you
use higher level concurrency classes such as the atomic
classes, the ConcurrentHashMap
and
ConcurrentLinkedQueue
.
Kind regards
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.