Abstract: How consistent is weakly-consistent iteration? And which collections return weakly-consistent iterators? How do we know? Join us as we explore what weakly-consistent means with concrete examples.
Welcome to the 288th edition of The Java(tm) Specialists' Newsletter, sent to you from the Island of Crete. I noticed yesterday that I had run for 902 days in a row. Some have a 30 day running streak on their bucket list, but this is that squared. I ran most of my miles with Salomon Speedcross GTX shoes. Winter can get wet here on Crete, and these keep my feet toasty warm while jumping in muddy puddles. Plus the Speedcross give great grip and protection on the rugged mountain paths. I've torn through many a sole since I started running 7 years ago, and almost a dozen bluetooth headsets. And yet, after all that, I am still a big guy. I love food too much :-)
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
In the book 97 Things Every Java Programmer Should Know, I wrote a piece entitled "Read OpenJDK Daily". We can learn a lot from the coding practices in the OpenJDK.
I decided to put this into practice by producing a series
of mini-courses to study the java.util.concurrent classes in
the minutest of detail. Since I expected that the
synchronization in ArrayBlockingQueue
would be trivial, I
decided to start with that. It also happened to be the first
alphabetically, but that is just a coincidence. Indeed, the
locking is trivial for the most part. One or two small
surprises, but otherwise easy to understand. You can get the
ArrayBlockingQueue
course here.
What I did not expect is just how complicated the iteration
is. The standard ArrayBlockingQueue
spans 1638 lines of code
(LOC), including comments. Strip out iteration, and we are
left with just 944 LOC. Thus over 42% of the class concerns
itself with iteration. In my ArrayBlockingQueue
Teardown Course, I spend more than an hour explaining
the iteration, and I think I rushed that section a bit.
For contrast, the LinkedBlockingQueue
has 1161 LOC and less
than 10% of that is its iterator. ConcurrentLinkedQueue
has
1075 LOC and 7.3% of that is the iterator. ArrayDeque
has a
similar mechanism of circular array to ArrayBlockingQueue
, but
only 8.5% of its 1235 LOC are needed to create two separate
iterators.
And yet, the iteration behaviour of ArrayBlockingQueue
is not perfect; it differs slightly from that of its cousins LinkedBlockingQueue
et al.
The first question that I want to ask is this: How do we know if an iterator is weakly-consistent? And the second is: What does weakly-consistent even mean? Can we see some examples please?
The iterator()
method returns an object that implements the
java.util.Iterator
interface. It is a classic gang-of-four
factory
method. The Iterator
has methods hasNext()
, next()
,
remove()
and forEachRemaining()
. However, it does not give us
any clues as to what characteristics it might possess.
Is it fail-fast like the iterator from ArrayList
? Or
is it perhaps a snapshot iterator, such as a
CopyOnWriteArrayList
would return? Or could it be
weakly-consistent, such as our dear ArrayBlockingQueue
gives us?
Where the Iterator
is silent, the
Spliterator
shouts from the rooftops. We can
thus use the spliterator()
method to figure out
the characteristics. Weakly-consistent spliterators are all
CONCURRENT
. Here is a demo program that checks
the characteristics of the collections from java.util.**:
import java.util.*; import java.util.concurrent.*; import java.util.function.*; import java.util.stream.*; import static java.util.Spliterator.*; public class WhatHasWeaklyConsistentIteration { public static void main(String... args) { List<Collection<?>> collections = List.of( new ArrayBlockingQueue<>(10), new ArrayDeque<>(), new ArrayList<>(), Collections.singleton(42), Collections.singletonList(42), Collections.emptyList(), Collections.emptySet(), Arrays.asList(1,2,3), ConcurrentHashMap.newKeySet(), new ConcurrentLinkedDeque<>(), new ConcurrentLinkedQueue<>(), new ConcurrentSkipListSet<>(), new CopyOnWriteArrayList<>(), new CopyOnWriteArraySet<>(), new DelayQueue<>(), new HashSet<>(), new LinkedBlockingDeque<>(), new LinkedBlockingQueue<>(), new LinkedHashSet<>(), new LinkedList<>(), new LinkedTransferQueue<>(), List.of(), new PriorityBlockingQueue<>(), new PriorityQueue<>(), Set.of(), new Stack<>(), new TreeSet<>(), new Vector<>() ); Predicate<Collection<?>> concurrent = c -> c.spliterator().hasCharacteristics(CONCURRENT); Predicate<Collection<?>> immutable = c -> c.spliterator().hasCharacteristics(IMMUTABLE); Consumer<Collection<?>> printer = c -> System.out.println("\t" + c.getClass().getName()); System.out.println("Weakly-Consistent (CONCURRENT)"); collections.stream() .filter(concurrent) .forEach(printer); System.out.println("Snapshot (IMMUTABLE)"); collections.stream() .filter(immutable) .forEach(printer); System.out.println("Neither"); collections.stream() .filter(concurrent.or(immutable).negate()) .forEach(printer); } }
The output for Java 16 is (this might change in future versions of Java):
Weakly-Consistent (CONCURRENT) java.util.concurrent.ArrayBlockingQueue java.util.concurrent.ConcurrentHashMap$KeySetView java.util.concurrent.ConcurrentLinkedDeque java.util.concurrent.ConcurrentLinkedQueue java.util.concurrent.ConcurrentSkipListSet java.util.concurrent.LinkedBlockingDeque java.util.concurrent.LinkedBlockingQueue java.util.concurrent.LinkedTransferQueue Snapshot (IMMUTABLE) java.util.Collections$SingletonSet java.util.Collections$SingletonList java.util.concurrent.CopyOnWriteArrayList java.util.concurrent.CopyOnWriteArraySet Neither java.util.ArrayDeque java.util.ArrayList java.util.Collections$EmptyList java.util.Collections$EmptySet java.util.Arrays$ArrayList java.util.concurrent.DelayQueue java.util.HashSet java.util.LinkedHashSet java.util.LinkedList java.util.ImmutableCollections$ListN java.util.concurrent.PriorityBlockingQueue java.util.PriorityQueue java.util.ImmutableCollections$SetN java.util.Stack java.util.TreeSet java.util.Vector
The first list is that of the weakly-consistent spliterators.
We can apply the same characteristics also to the iterators.
Fortunately there are only 8 such collections that we need
to investigate. Of the BlockingQueue
variety we have
ArrayBlockingQueue
, LinkedBlockingDeque/Queue
and
LinkedTransferQueue
. Then we have four concurrent collections
ConcurrentLinkedDeque/Queue
, ConcurrentSkipListSet
and a
set constructed from a ConcurrentHashMap
. They are all marked
as having CONCURRENT
spliterators, and thus we can presume
they are also weakly-consistent.
Next we have IMMUTABLE
iterators, the most important being
the CopyOnWrite collections, which make a copy of the backing
array whenever we change it. The iterator would hold the old
reference to the array, so that the iteration would always be
over the collection as it was when we created the iterator.
We call these snapshot iterators. They are fast to iterate,
but slow to change. The CopyOnWrite collections have proven
useful, but I have never used
singletonList/Set()
. IMMUTABLE
says that the
structure of the collection should not change, though the
contents may. Arrays.asList()
could be IMMUTABLE
, but it is
not a snapshot iterator.
The list under "Neither" is a mixed bag.
List.of(...)
, Set.of(...)
and Collections.emptyList/Set()
cannot throw a ConcurrentModificationException
. Neither can
Arrays.asList()
. The rest are fail-fast iterators and
absolutely can and will throw
ConcurrentModificationExceptions
if we look at them funny.
If nothing changes in a collection after we have made an iterator, we would expect to see all the elements in the collection, albeit not necessarily in the same order as we added them. However, if the collection changes after we have created an iterator, we might or might not see those changes.
To try this out, I wrote a series of tests that experiment with adding before or after the current iterator position, removing before, on, or after the current position, clearing the collection entirely, and a couple of other weird behaviours.
The simplest way of understanding weakly-consistent is that
we will never get a ConcurrentModificationException
, whilst
at the same time never seeing the same element twice (unless
it is there twice) and that we also won't skip over elements
that were always in the collection.
Vector
's Enumeration
was neither fail-fast nor
weakly-consistent. We could create an Enumeration, but even
if hasMoreElements()
was true, nextElement()
could throw a
NoSuchElementException
. Also, with Vector
if we removed an element
before our current position, then we would skip over the next
element. Similarly, adding an element before our current
position would make us see duplicate elements. These effects
do not happen with weakly-consistent iterators.
I wrote a test class that you can find here: WeaklyConsistentIteration.java. Since it is quite long, I will only show snippets in this newsletter.
For our first example, we want to test what happens if we create an iterator prior to adding one element:
test(collection -> { Iterator<Integer> it = collection.iterator(); collection.add(0); it.forEachRemaining(PRINTLN); });
The behaviour is consistent. None of the concurrent iterators print anything.
What if we change it to start with a single element, like so:
test(collection -> { collection.add(0); Iterator<Integer> it = collection.iterator(); collection.add(1); it.forEachRemaining(PRINTLN); });
ArrayBlockingQueue
prints only the 0, all the other
concurrent iterators print 0, 1. Why? Good that you ask,
and I happen to have a
comprehensive ArrayBlockingQueue course for that ;-)
Note though that the ArrayBlockingQueue is allowed to omit 1.
It would not be allowed to omit 0, since that was always
there, from when the iterator was created.
What if the collection starts with two or more elements, then we create the iterator and add one more element?
test(collection -> { collection.add(0); collection.add(1); Iterator<Integer> it = collection.iterator(); collection.add(2); it.forEachRemaining(PRINTLN); });
This time the behaviour is consistent again and we see
0, 1, 2 for all iterators. However, in Java 7, the
ArrayBlockingQueue
only showed 0, 1. That was legal
behaviour, but not desirable. It explains why they reworked
the iteration of ArrayBlockingQueue
for Java 8. In Java 7
this class was only 924 LOC with iteration taking up a mere
9.2%.
The next is a bit more tricky. I want to add an element
before our current iteration position. If the collection is
not ORDERED
, we can assume it is a hash map. If we know a bit
about the inner workings, we can make a wild guess that a
smaller Integer
would be before the bigger Integer
, at least
for small ranges of Integer. This was not the case prior to
Java 8. Similarly, if it is a SORTED
collection, then we can
again add a smaller number. Lastly, if it is a Deque, we can
use addFirst()
. Note that I'm using the Java 16 instanceof
pattern matching. It's a feature that I would love to never
need, but it has proven useful once or twice. For the other
collections we simply skip this test and move on.
test(collection -> { collection.add(5); collection.add(6); Iterator<Integer> it = collection.iterator(); if (!ordered(collection)) { System.out.println("\t\tunordered"); collection.add(0); // happens to put it before the 5 } else if (sorted(collection)) { System.out.println("\t\tsorted"); collection.add(0); } else if (collection instanceof Deque<Integer> deque) { System.out.println("\t\tDeque"); deque.addFirst(0); } else { System.out.println("\t\tskip ..."); return; } it.forEachRemaining(PRINTLN); });
All the collections where we were able to insert in front
showed 5, 6. Vector
under similar conditions would print
5, 5, 6, thus showing one element twice.
Next we want to try removing elements, starting with calling
clear()
after we have created our iterator:
collection.add(0); Iterator<Integer> it = collection.iterator(); collection.clear(); it.forEachRemaining(PRINTLN);
A bit unexpectedly, all the concurrent iterators print 0.
The thinking is, if we have created an iterator, we might
also have called hasNext()
by the time we call next()
.
Thus if there are any elements in the collection when we
create the iterator, it keeps a reference to that first
element in case it needs it. Vector
was more
gung-ho and simply threw NoSuchElementException
.
In our next test we remove the first element of three:
test(collection -> { collection.add(0); collection.add(1); collection.add(2); Iterator<Integer> it = collection.iterator(); collection.remove(0); it.forEachRemaining(PRINTLN); });
Taken what we have learned so far, we could assume that the answer would be 0, 1, 2. And we would be correct :-)
Next we remove an element after our current position:
test(collection -> { collection.add(0); collection.add(1); collection.add(2); Iterator<Integer> it = collection.iterator(); collection.remove(1); it.forEachRemaining(PRINTLN); });
This also works as expected and prints 0, 2.
The next test is a bit more tricky. We start with three elements. We then create an iterator and advance it to the second element. Next we remove the first element and see what is printed out:
test(collection -> { collection.add(0); collection.add(1); collection.add(2); Iterator<Integer> it = collection.iterator(); it.next(); // 0 collection.remove(0); it.forEachRemaining(PRINTLN); });
They all show 1, 2. A Vector/Enumeration
would omit 1 and
only print 2.
Next we add 5 elements, create the iterator and then remove the first two:
test(collection -> { collection.add(0); collection.add(1); collection.add(2); collection.add(3); collection.add(4); Iterator<Integer> it = collection.iterator(); collection.remove(0); collection.remove(1); it.forEachRemaining(PRINTLN); });
All of the iterators consistently print 0, 2, 3, 4. Vector
would have printed 2, 3, 4, which is a bit more correct in
this case, but the other faults make it unusable.
So far, so good. Now it gets peculiar. We fill the collection with 5 elements. We then create the iterator and remove and add another 5 elements.
test(collection -> { for (int i = 0; i < LENGTH; i++) collection.add(i); Iterator<Integer> it = collection.iterator(); for (int i = 0; i < LENGTH * 2; i++) { collection.remove(i); collection.add(i + LENGTH); } it.forEachRemaining(PRINTLN); });
Guess what happens? Well, ArrayBlockingQueue
shows 0, all
others show 0, 10, 11, 12, 13, 14. Both behaviours are legal
for weakly-consistent iterators. In Java 7 the
ArrayBlockingQueue
would have shown 10, 11, 12, 13, 14.
The last experiment tries to remove the last element by
calling it.remove()
after we are done iterating:
test(collection -> { collection.add(0); collection.add(1); collection.add(2); Iterator<Integer> it = collection.iterator(); it.forEachRemaining(PRINTLN); it.remove(); });
The ArrayBlockingQueue
throws an
IllegalStateException
. This is deliberate. In
the class we read this comment:
// Calling forEachRemaining is a strong hint that this // iteration is surely over; supporting remove() after // forEachRemaining() is more trouble than it's worth
I can imagine the JSR166 team late on a Friday night trying
to finish this code and wondering whether anyone on this
planet would ever want to call remove after
forEachRemaining()
. Doug Lea turns to Martin Buchholz and
shakes his head. Nope, no one. Well, there is this fellow
on the Island of Crete ... he might. But let's hope he does
not notice.
Kind regards
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.