Abstract: Imagine a very stubborn viking father insisting that his equally stubborn child eat its lutefisk before going to sleep. In real life one of the "threads" eventually will give up, but in Java, the threads become deadlocked, with neither giving an inch. In this newsletter we discover how we can sometimes escape from such deadlocked situations in Java and learn why the stop() function should never ever ever be called.
Welcome to the 160th issue of The Java(tm) Specialists' Newsletter, sent to you from Belmont in San Francisco. I am here in the San Francisco Bay Area until the 21st of May, so give me a shout if you are in the vicinity and would like to get together one of the evenings or the coming weekend for a chat about Java or life in general.
I must apologize for the length of this newsletter. I try to keep them in nice bit-sized chunks. However, there is just too much material to cover on deadlocks. It should be worth reading though, even if it takes you a few days. In the end, we make some discoveries about how we can sometimes escape from deadlock situations in Java, something that was not possible in earlier versions of Java. Infact, I have not seen any of this information published elsewhere. This also brings us to the end of our series on concurrency. If this series interested you and you would like more information about concurrency and other advanced Java topics, please consider attending our Java Master Course.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
We are looking at a series of laws to make sense of Java concurrency. Just to remind you, here they are again. In this newsletter, we will look at the Law of the Uneaten Lutefisk.
Definition: Lutefisk - A traditional Scandinavian dish prepared by soaking air-dried cod in a lye solution for several weeks before skinning, boning, and boiling it, a process that gives the dish its characteristic gelatinous consistency.
Imagine a very stubborn viking father insisting that his equally stubborn child eat its lutefisk (eggplant, tomatoes, etc.) before going to bed. Each child seems to have at least one food that they will not eat, irrespective of the amount of coaxing by their parents. When I was a kid, I could not stomach hotdog sausages. This was particularly challenging, as they were the food of choice at birthday parties. Food can create deadlocks between parents and children. Fortunately in real life, one of the parties usually gives in after a while, which means the chances of getting to sleep improve. In Java, when we get into a deadlock situation, we do not always have an easy way out!
Java has several ways of locking critical sections. When we lock using synchronized, we call this monitor locking. The Java 5 locks would just be called locking.
Deadlocks can occur when two threads are locking on two different locks, in opposite order. Usually this happens inadvertently. The deadlock can occur between two monitors or between two locks or a hybrid of the two.
You have probably heard that Thread.stop() should never be called. There are several reasons why. One of them was sent to me this morning by Brian Goetz. What connects Brian and me is not just the last two letters of our surnames, but also a keen interest in Java concurrency. His book titled Java Concurrency in Practice belongs on every Java Specialist's book shelf.
Brian sent me this example of code that would break a class invariant if stop() is called at an inappropriate time on thread Foo.
public static final Object lock = new Object(); @GuardedBy("lock") public static int x; @GuardedBy("lock") public static int y; // invariant: x + y = 10 class Foo extends Thread { public void run() { for (int i=0; i<ONE_BILLION; i++) { synchronized(lock) { x++; y--; } } } }
I put Brian's idea into a class that you can run and which will predictably produce a failed class invariant. In addition, you can change it to terminate the thread using the interrupt() mechanism, by just setting a constant USE_INTERRUPT to true:
public class StopIsBad { private static final boolean USE_INTERRUPT = false; public static final Object lock = new Object(); public static int x = 5; public static int y = 5; // invariant: x + y = 10 static class Foo extends Thread { public void run() { while (!isInterrupted()) { synchronized (lock) { x++; sleepQuietly(); // make sure stop() happens here y--; } } } private void sleepQuietly() { try { sleep(10); } catch (InterruptedException ex) { Thread.currentThread().interrupt(); } } } public static void main(String[] args) throws InterruptedException { while (true) { System.out.println("Creating new foo"); Foo foo = new Foo(); foo.start(); Thread.sleep(50); if (USE_INTERRUPT) { foo.interrupt(); } else { foo.stop(); } synchronized (lock) { if (x + y != 10) { throw new IllegalStateException( "Invariant is broken: " + (x + y)); } } foo.join(); } } }
When we run this with USE_INTERRUPT==false, we almost immediately break the class invariant:
Creating new foo Exception in thread "main" java.lang.IllegalStateException: Invariant is broken: 11 at StopIsBad.main(StopIsBad.java:42)
When we use the interrupt() mechanism to shut down the code, assuming that we listened to The Law of the Sabotaged Doorbell, the class invariant stays intact:
Creating new foo Creating new foo Creating new foo Creating new foo Creating new foo Creating new foo ...
The Thread.stop() method is by its nature dangerous. During the course of this newsletter, I will show you two other occasions where stop() produces unexpected results.
Imagine a group of philosophers sitting around a table. Each philosopher has a bowl of rice and a chopstick on either side of the bowl, each of which he shares with his neighbour. Philosophers alternate between the states THINKING and EATING. To go from THINKING to eating, he first picks up the left chopstick then the right chopstick. If they all pick up the left chopstick at the same time, they will have a deadlock. None of them can go from state THINKING to EATING, because they each need the right chopstick, which is held by their neighbour.
Depending on how long he waits between taking each chopstick and the number of philosophers, the chances of him deadlocking can increase.
For example, here is our Philosopher, with a left and right chopstick (both synchronized monitors). In order to change from THINKING to EATING state, he needs to first acquire the left chopstick, followed by the right chopstick. In order to make the deadlock occur quicker, I make him ponder for a while inbetween moves.
public class Philosopher implements Runnable { private final Object left; private final Object right; public Philosopher(Object left, Object right) { this.left = left; this.right = right; } private void ponder() throws InterruptedException { Thread.sleep(((int) Math.random() * 10) + 10); } public void run() { try { while (true) { ponder(); // thinking synchronized (left) { ponder(); synchronized (right) { ponder(); // eating } } } } catch (InterruptedException e) { Thread.currentThread().interrupt(); return; } } }
Next we need a way to discover deadlocks. Fortunately the ThreadMXBean can help us find these. Before I show you my DeadlockChecker, here is the Dinner, with 5 philosophers:
import java.lang.management.ThreadInfo; import java.util.Collection; public class Dinner { public static void main(String[] args) throws Exception { final Philosopher[] philosophers = new Philosopher[5]; Object[] chopsticks = new Object[philosophers.length]; for (int i = 0; i < chopsticks.length; i++) { chopsticks[i] = new Object(); } for (int i = 0; i < philosophers.length; i++) { Object left = chopsticks[i]; Object right = chopsticks[(i + 1) % chopsticks.length]; philosophers[i] = new Philosopher(left, right); Thread t = new Thread(philosophers[i], "Phil " + (i + 1)); t.start(); } DeadlockChecker checker = new DeadlockChecker(); while (true) { Collection<ThreadInfo> threads = checker.check(); if (!threads.isEmpty()) { System.out.println("Found deadlock:"); for (ThreadInfo thread : threads) { System.out.println("\t" + thread.getThreadName()); } System.exit(1); } Thread.sleep(1000); } } }
Let's examine the output with our 5 philosophers:
Found deadlock: Phil 1 Phil 2 Phil 3 Phil 4 Phil 5
Setting up the dining philosophers is fairly straightforward. What is more complicated is finding out whether their eating habits have caused a thread deadlock in our system. To do that, we use the ThreadMXBean, as already described in Newsletter 130. I modified it here for our purposes of testing several scenarios in which deadlocks can occur:
import java.lang.management.*; import java.util.*; /** * Used to check whether there are any known deadlocks by * querying the ThreadMXBean. It looks for threads deadlocked * on monitors (synchronized) and on Java 5 locks. Call check() * to get a set of deadlocked threads (might be empty). * * We can also add threads to the ignore set by calling the * ignoreThreads(Thread[]) method. For example, once we have * established that a certain deadlock exists, we might want to * ignore that until we have shut down our program and instead * concentrate on new deadlocks. */ public class DeadlockChecker { private static final ThreadMXBean tmb = ManagementFactory.getThreadMXBean(); private final Set<Long> threadIdsToIgnore = new HashSet<Long>(); /** * Returns set of ThreadInfos that are part of the deadlock; * could be empty if there is no deadlock. */ public Collection<ThreadInfo> check() { Map<Long, ThreadInfo> map = new HashMap<Long, ThreadInfo>(); findDeadlockInfo(map, tmb.findMonitorDeadlockedThreads()); findDeadlockInfo(map, tmb.findDeadlockedThreads()); return map.values(); } public void ignoreThreads(Thread[] threads) { for (Thread thread : threads) { threadIdsToIgnore.add(thread.getId()); } } private void findDeadlockInfo(Map<Long, ThreadInfo> result, long[] ids) { if (ids != null && ids.length > 0) { for (long id : ids) { if (!threadIdsToIgnore.contains(id)) { result.put(id, tmb.getThreadInfo(id)); } } } } }
Now that we have the basic building blocks for figuring out when a deadlock occurs, let's examine what we can do once we find a deadlock.
Databases also have deadlock situations and they typically resolve these by choosing a deadlock victim. Killing one of the queries frees up the other one to finish its work. The deadlock victim then tries again, hopefully with more success this time.
When I tried this the last time in Java, the monitor locked threads could not break out of the deadlock, neither with stop() nor with interrupt(). When the thread is in state BLOCKED, it does not respond well to either of these signals.
However, with the new Java 5 locks, it might actually be possible to resolve a deadlock in Java code. In order to try out these ideas, I wrote an interface with two methods, method1() and method2(). These two methods should be called by two separate threads within a short time interval, which should then cause a deadlock. For example, here is the DeadlockingClass interface with the DeadlockedOnMonitors implementation. You will notice that the interface throws InterruptedException from the methods; we will use this in other examples to break out of deadlocks cleanly.
/** * Method1 and method2 should reliably deadlock every time they * get called at roughly the same time. */ public interface DeadlockingClass { void method1() throws InterruptedException; void method2() throws InterruptedException; } public class DeadlockedOnMonitors implements DeadlockingClass { private final Object lock1 = new Object(); private final Object lock2 = new Object(); public void method1() throws InterruptedException { synchronized (lock1) { Thread.sleep(1000); synchronized (lock2) { } } } public void method2() throws InterruptedException { synchronized (lock2) { Thread.sleep(1000); synchronized (lock1) { } } } }
Next we show two implementations of DeadlockingClass. The
first uses basic Java 5 locks, whereas the second locks
interruptibly on the Java 5 locks. This allows us to
interrupt a deadlock and thus exit cleanly. As you will see
by the results later on, the best approach for locking in
Java 5 is to always use at least
Lock.lockInterruptibly()
and maybe even
Lock.tryLock()
.
import java.util.concurrent.locks.*; public class DeadlockedOnLocks implements DeadlockingClass { private final Lock lock1 = new ReentrantLock(); private final Lock lock2 = new ReentrantLock(); public void method1() throws InterruptedException { lock1.lock(); try { Thread.sleep(1000); lock2.lock(); try { } finally { lock2.unlock(); } } finally { lock1.unlock(); } } public void method2() throws InterruptedException { lock2.lock(); try { Thread.sleep(1000); lock1.lock(); try { } finally { lock1.unlock(); } } finally { lock2.unlock(); } } } import java.util.concurrent.locks.*; public class DeadlockedOnLocksInterruptibly implements DeadlockingClass { private final Lock lock1 = new ReentrantLock(); private final Lock lock2 = new ReentrantLock(); public void method1() throws InterruptedException { lock1.lockInterruptibly(); try { Thread.sleep(1000); lock2.lockInterruptibly(); try { } finally { lock2.unlock(); } } finally { lock1.unlock(); } } public void method2() throws InterruptedException { lock2.lockInterruptibly(); try { Thread.sleep(1000); lock1.lockInterruptibly(); try { } finally { lock1.unlock(); } } finally { lock2.unlock(); } } }
The last two types of deadlocking classes involve hybrid solutions (in keeping with the modern craze everything ecologically sustainable). In the first, thread one locks on a monitor, whereas thread two locks on a Java 5 lock. In the second, it's the other way round. The case where the Java 5 lock is locked interruptibly is left as an exercise to the reader.
import java.util.concurrent.locks.*; public class DeadlockedOnMonitorThenLock implements DeadlockingClass { private final Object lock1 = new Object(); private final Lock lock2 = new ReentrantLock(); public void method1() throws InterruptedException { synchronized (lock1) { Thread.sleep(1000); lock2.lock(); try { } finally { lock2.unlock(); } } } public void method2() throws InterruptedException { lock2.lock(); try { Thread.sleep(1000); synchronized (lock1) { } } finally { lock2.unlock(); } } } import java.util.concurrent.locks.*; public class DeadlockedOnLockThenMonitor implements DeadlockingClass { private final Lock lock1 = new ReentrantLock(); private final Object lock2 = new Object(); public void method1() throws InterruptedException { lock1.lock(); try { Thread.sleep(1000); synchronized (lock2) { } } finally { lock1.unlock(); } } public void method2() throws InterruptedException { synchronized (lock2) { Thread.sleep(1000); lock1.lock(); try { } finally { lock1.unlock(); } } } }
Now that you have seen the five deadlocking classes, I need to explain how the test will be set up. For each deadlocking class, I create two threads (T1 and T2) that invoke methods method1() and method2() respectively. They should all be deadlocked within two seconds of starting the threads. In the first test, we call T1.stop(), wait for a second, then call T2.stop(). We should never call stop() in real code, but this is just to see what would happen if we did.
For the second test, we create another two threads (T3 and T4) and try out what happens if we interrupt() the threads. Here, the only class that would work is DeadlockedOnLocksInterruptibly.
In our tests, we can see that Java 5 locks are released when Thread.stop() is called. Threads locked on monitors with synchronized are not released. The hybrid deadlocks are released only because the thread locked on the Java 5 lock is stopped correctly. However, the stop() method does strange things to the threads deadlocked purely on monitors. You can see that it actually breaks the deadlock detection mechanism of Java. After we have called stop() on the DeadlockedOnMonitors class, we end up with a single thread being shown as deadlocked, which is clearly not correct. If you ever needed another reason not to call the Thread.stop() method, this has got to be it!
Since Thread.stop() is completely unsafe (with a third example of why to follow), the only way that we should attack a deadlock is with Thread.interrupt(). At the very least, we could try interrupt() as a first level attack, after choosing our deadlock victim. If that does not resolve the deadlock, we can interrupt the other thread. After that we could call stop() on both of the threads, to try to dislodge the deadlock. Irrespective of what we do, we also have to log a severe error that we have seen a deadlock in our system.
Here is my test code:
import java.lang.management.ThreadInfo; import java.util.Collection; import java.util.concurrent.TimeUnit; public class DeadlockTest { private static final DeadlockChecker checker = new DeadlockChecker(); public static void main(String[] args) throws Exception { DeadlockingClass[] problemClasses = { new DeadlockedOnLocks(), new DeadlockedOnLocksInterruptibly(), new DeadlockedOnMonitorThenLock(), new DeadlockedOnLockThenMonitor(), new DeadlockedOnMonitors(), }; DeadlockTest test = new DeadlockTest(); for (DeadlockingClass problemClass : problemClasses) { String name = problemClass.getClass().getSimpleName(); System.out.println(name); System.out.println(name.replaceAll(".", "=")); test.resolveByStop(problemClass); System.out.println(); test.resolveByInterrupt(problemClass); System.out.println(); System.out.println(); System.out.println(); } } private void resolveByStop(DeadlockingClass pc) { System.out.println("Testing deadlock resolve by stop()"); System.out.println("----------------------------------"); Thread[] threads = setupThreads(pc, "t1", "t2"); sleepQuietly(2); check(); System.out.println("Trying to resolve through t1.stop()"); threads[0].stop(); sleepQuietly(1); check(); System.out.println("Trying to resolve through t2.stop()"); threads[1].stop(); sleepQuietly(1); check(); checker.ignoreThreads(threads); } private void resolveByInterrupt(DeadlockingClass pc) throws InterruptedException { System.out.println("Testing deadlock resolve by interrupt()"); System.out.println("---------------------------------------"); Thread[] threads = setupThreads(pc, "t3", "t4"); sleepQuietly(2); check(); System.out.println("Trying to resolve by t3.interrupt()"); threads[0].interrupt(); sleepQuietly(1); check(); System.out.println("Trying to resolve by t4.interrupt()"); threads[1].interrupt(); sleepQuietly(1); check(); checker.ignoreThreads(threads); } private static void sleepQuietly(int seconds) { try { TimeUnit.SECONDS.sleep(seconds); } catch (InterruptedException e) { Thread.currentThread().interrupt(); } } private static Thread[] setupThreads( final DeadlockingClass problemClass, final String name1, final String name2) { Thread[] threads = { new Thread(name1) { public void run() { try { problemClass.method1(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); System.out.println(name1 + " interrupted"); } finally { System.out.println( "******* " + name1 + " Deadlock resolved"); } } }, new Thread(name2) { public void run() { try { problemClass.method2(); } catch (InterruptedException e) { System.out.println(name2 + " interrupted"); Thread.currentThread().interrupt(); } finally { System.out.println( "******* " + name2 + " Deadlock resolved"); } } }}; for (Thread thread : threads) { thread.setDaemon(true); thread.start(); System.out.println("Started thread " + thread.getName() + " (" + thread.getId() + ")"); } return threads; } private static void check() { printInfo(checker.check()); } private static void printInfo( Collection<ThreadInfo> threads) { if (threads.isEmpty()) { System.out.println("+++ No deadlock (detected)"); } else { System.out.print("--- We have detected a deadlock: "); for (ThreadInfo info : threads) { System.out.print(info.getThreadName() + " (id=" + info.getThreadId() + ") "); } System.out.println(); } } }
When we run this, we see the following output:
DeadlockedOnLocks ================= Testing deadlock resolve by stop() ---------------------------------- Started thread t1 (8) Started thread t2 (9) --- We have detected a deadlock: t1 (id=8) t2 (id=9) Trying to resolve through t1.stop() ******* t1 Deadlock resolved ******* t2 Deadlock resolved +++ No deadlock (detected) Trying to resolve through t2.stop() +++ No deadlock (detected) Testing deadlock resolve by interrupt() --------------------------------------- Started thread t3 (10) Started thread t4 (11) --- We have detected a deadlock: t3 (id=10) t4 (id=11) Trying to resolve by t3.interrupt() --- We have detected a deadlock: t3 (id=10) t4 (id=11) Trying to resolve by t4.interrupt() --- We have detected a deadlock: t3 (id=10) t4 (id=11) DeadlockedOnLocksInterruptibly ============================== Testing deadlock resolve by stop() ---------------------------------- Started thread t1 (12) Started thread t2 (13) --- We have detected a deadlock: t1 (id=12) t2 (id=13) Trying to resolve through t1.stop() ******* t1 Deadlock resolved ******* t2 Deadlock resolved +++ No deadlock (detected) Trying to resolve through t2.stop() +++ No deadlock (detected) Testing deadlock resolve by interrupt() --------------------------------------- Started thread t3 (14) Started thread t4 (15) --- We have detected a deadlock: t3 (id=14) t4 (id=15) Trying to resolve by t3.interrupt() ******* t4 Deadlock resolved t3 interrupted ******* t3 Deadlock resolved +++ No deadlock (detected) Trying to resolve by t4.interrupt() +++ No deadlock (detected) DeadlockedOnMonitorThenLock =========================== Testing deadlock resolve by stop() ---------------------------------- Started thread t1 (16) Started thread t2 (17) --- We have detected a deadlock: t2 (id=17) t1 (id=16) Trying to resolve through t1.stop() ******* t1 Deadlock resolved ******* t2 Deadlock resolved +++ No deadlock (detected) Trying to resolve through t2.stop() +++ No deadlock (detected) Testing deadlock resolve by interrupt() --------------------------------------- Started thread t3 (18) Started thread t4 (19) --- We have detected a deadlock: t4 (id=19) t3 (id=18) Trying to resolve by t3.interrupt() --- We have detected a deadlock: t4 (id=19) t3 (id=18) Trying to resolve by t4.interrupt() --- We have detected a deadlock: t4 (id=19) t3 (id=18) DeadlockedOnLockThenMonitor =========================== Testing deadlock resolve by stop() ---------------------------------- Started thread t1 (20) Started thread t2 (21) --- We have detected a deadlock: t2 (id=21) t1 (id=20) Trying to resolve through t1.stop() --- We have detected a deadlock: t2 (id=21) t1 (id=20) Trying to resolve through t2.stop() ******* t2 Deadlock resolved ******* t1 Deadlock resolved +++ No deadlock (detected) Testing deadlock resolve by interrupt() --------------------------------------- Started thread t3 (22) Started thread t4 (23) +++ No deadlock (detected) Trying to resolve by t3.interrupt() +++ No deadlock (detected) Trying to resolve by t4.interrupt() +++ No deadlock (detected) DeadlockedOnMonitors ==================== Testing deadlock resolve by stop() ---------------------------------- Started thread t1 (24) Started thread t2 (25) --- We have detected a deadlock: t2 (id=25) t1 (id=24) Trying to resolve through t1.stop() --- We have detected a deadlock: t2 (id=25) t1 (id=24) Trying to resolve through t2.stop() --- We have detected a deadlock: t2 (id=25) t1 (id=24) Testing deadlock resolve by interrupt() --------------------------------------- Started thread t3 (26) Started thread t4 (27) --- We have detected a deadlock: t4 (id=27) Trying to resolve by t3.interrupt() --- We have detected a deadlock: t4 (id=27) Trying to resolve by t4.interrupt() --- We have detected a deadlock: t4 (id=27)
As you can see from the output, the only locking mechanism that manages to recover from a deadlock situation cleanly is the Java 5 Lock.lockInterruptibly. Instead of putting the calling thread in a BLOCKED state, it enters the WAITING state, in which it can then be interrupted.
As if you needed another reason not to call stop(), but in the following code:
lock.lock(); try { while(list.isEmpty()) { condition.await(); } return list.removeFirst(); } finally { lock.unlock(); }
The lock.unlock() might throw an exception. In the synchronized construct, when you call stop() on a thread that is in the WAITING state, it first picks up the lock again before continuing. In Java 5 locks, they don't do that. Therefore, when we try to unlock(), we actually don't own the lock anymore. There is a workaround where we use the ReentrantLock interface to figure out whether our own thread holds the lock before unlocking, but that is also not really satisfactory. The best solution is to never ever ever. That is never with a big N. Never call thread.stop().
In addition, if you do get deadlocks, proceed with extreme caution. You should use my deadlock detector to find them more quickly, but then you need to get rid of them from your code. Interrupting the threads is a bit like putting a band aid onto a gaping wound. It might buy you time and keep your system limping along for a bit more, but should not be seen as a long-term solution.
Kind regards
Heinz
P.S. I will be in the San Francisco Bay Area for the next few days for those of you who would like to get together for a chat. Just send me a note please so we can arrange something.
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.