Abstract: If we were to use a List as a Queue, which would be better: ArrayList or LinkedList? It turns out that it depends on the length of the list. In this newsletter we try to write a self-tuning queue that swaps over to a suitable implementation based on the current number of elements.
Welcome to the 24th issue of The Java(tm) Specialists' Newsletter. Winter has arrived in Cape Town with great rain and cold temperatures, much to the chagrin of the author. The only way to combat them is to sit next to the heater with a glass of South African 1997 Zonnebloem Cabernet Sauvignon and write Java newsletters. This newsletter has taken the most time to date - performance tests tend to have that effect on me :(((
Before I go on today's voyage, some feedback regarding the Socket Wheel idea:
I'm not quite sure who gave the best contributions regarding possible improvements for last newsletter's SocketWheel. Neil Bartlett (Algorithmics Canada) and John Vlissides (IBM USA) pointed out that the design was very close to the Reactor pattern by Schmidt. This is a great plus mark for patterns; they truly are things that OO developers around the world would come up with independently. Josh Rehman pointed out that it is not too clever to try to minimize memory, as memory is very cheap nowadays, which I agree with. He also pointed out that JDK 1.4 has non-blocking IO, making it possibly a lot easier to implement such a SocketWheel. Ecce Jezuch (Poland) suggested using the available() method to find out how many bytes would be available without blocking, but unfortunately under Windows the JDK always returned 0.
James Pereira provided some excellent information regarding sockets. It's quite technical, so I'm including it verbatim:
"Registered Ports, ports between 1024 and 49151, are listed by the IANA and on most systems can be used by applications or programs executed by users. Table C.2 specifies the port used by the server process as its contact port. The IANA registers uses of these ports as a convenience to the Internet community. To the extent possible, these same port assignments are used with UDP. The Registered Ports are in the numerical range of 1024-49151. The Registered Ports between 1024 and 5000 are also referred to as the Ephemeral Ports. At least on Windows, the TCP stack (OS) re-uses these ports internally on every socket connection, cycling from 1024...5000 and wrapping around to 1024 again. This could lead to some interesting problems if sockets are opened and closed very quickly, as there is usually a time delay before that port is made available again...
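As an aside, you can watch the OS hand out ephemeral ports by binding a socket to port 0, which asks the stack to assign any free port. Here is a minimal sketch; the actual number you get back is OS- and state-dependent, and I am using the modern try-with-resources syntax for brevity, which did not exist in the JDKs of this newsletter's era:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortDemo {
    public static void main(String[] args) throws IOException {
        // Port 0 tells the OS: pick any free ephemeral port for me
        try (ServerSocket server = new ServerSocket(0)) {
            // On the Windows versions described above, this would
            // typically land in the 1024..5000 range
            System.out.println("OS assigned port: " + server.getLocalPort());
        }
    }
}
```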
"Second, the number of user-accessible ephemeral ports that can be used to source outbound connections is configurable with the MaxUserPort registry entry (HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters). By default, when an application requests any socket from the system to use for an outbound call, a port between the values of 1024 and 5000 is supplied. You can use the MaxUserPort registry entry to set the value of the highest port number to be used for outbound connections. For example, setting this value to 10000 would make approximately 9000 user ports available for outbound connections. For more details, see RFC 793. See also the MaxFreeTcbs and MaxHashTableSize registry settings (HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters).
"BTW: The stack and memory size issue is only an NT issue. On NT the stack size is configurable via the linker and EditBin.exe; unfortunately, we don't want to mess with the VM."
This information should solve the problem that I encountered with too many sockets causing exceptions.
Ideally we should be able to specify stack size per thread as some threads will need a lot and others only a little, so I still think that we have a problem with many threads and their stack size.
Another excellent contribution was made by Craig Larman, author of "Java 2 Performance and Idiom Guide", who suggested using an approach of multiplexing sockets described in his book to minimize the number of sockets needed per client connection. Next week I will try to write about this idea and give you a solution that hopefully will work acceptably. I always recommend Craig's book as an excellent book for the Java fundi who wants to have his mind activated regarding performance ideas. It is very difficult to write anything about performance as it is so dependent on your implementation and hardware. This leads me to this week's newsletter on self-tuning FIFO queues....
HotSpot(tm) has caused a lot of trouble for the Java Specialist in the field, especially those of us with a few years' experience. All of a sudden, all the hard-earned performance knowledge was wiped out in one fell swoop. The only thing left for us to do was to write the simplest code that could possibly work and let the HotSpot compiler sort out performance for us. I liken it to the great financial crisis of Japan in the 90's, where no one knew whether he was coming or going and all the old certainties went out of the window. Luckily, unlike in Japan, we are not taking this too seriously, so our windows are still closed. Fortunately everyone knows that Java is slow ;-)
A lot of the performance tricks we used in old code actually make our code slower under HotSpot. Since we don't know what the performance of our code will be on a specific platform, would it be completely harebrained to write self-tuning code? Write three algorithms, let the program measure on the target platform which performs best, and choose that algorithm for the duration of the VM.
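The measuring step itself is simple to express. The sketch below is my own invention, not a standard API: it times one run of each candidate and returns the faster one.

```java
public class AlgorithmChooser {
    /**
     * Times one run of each candidate and returns the faster one.
     * A real tuner would first run warm-up iterations so that the
     * HotSpot compiler gets a chance to compile both candidates.
     */
    public static Runnable pickFaster(Runnable a, Runnable b) {
        return time(a) <= time(b) ? a : b;
    }

    private static long time(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }
}
```

Once chosen, the winning Runnable (or strategy object) would be used for the rest of the VM's lifetime.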
To illustrate this idea, I want to write a FIFO queue that is based on a java.util.List implementation. A while ago I discovered that java.util.ArrayList is sometimes faster than java.util.LinkedList for FIFO queue implementations. The switch-over occurs at a specific queue length, which we can measure beforehand. The switch-over point depends on the VM, whether we are using HotSpot(tm), etc. For example, on my little notebook with 256MB RAM and a Pentium III 700, the cross-over point is 50 elements in the list when I use HotSpot, but 500 elements when I switch off HotSpot compiling (sometimes!).
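The reason a cross-over exists at all: ArrayList.remove(0) has to shift every remaining element down by one slot (a System.arraycopy over the backing array), while LinkedList.remove(0) merely unlinks the head node; on the other hand, adding to the end of an ArrayList is a cheap array store, while every LinkedList.add() allocates a node object. Short queues are dominated by the allocation cost, long queues by the copying cost. Functionally the two are interchangeable as FIFO queues, as this little demo (my own, not from the newsletter's code) shows:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class FifoOrderDemo {
    // Uses any List as a FIFO queue: add at the tail, remove at index 0
    public static List drain(List list) {
        List out = new ArrayList();
        for (int i = 0; i < 5; i++) list.add(Integer.valueOf(i));
        while (!list.isEmpty()) out.add(list.remove(0));
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new ArrayList()));   // [0, 1, 2, 3, 4]
        System.out.println(drain(new LinkedList()));  // [0, 1, 2, 3, 4]
    }
}
```

Same FIFO order in both cases; only the cost per remove(0) differs.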
The interface that the FIFO queues will implement is very simple:
public interface FIFO {
  /** Add an object to the end of the FIFO queue */
  boolean add(Object o);
  /** Remove an object from the front of the FIFO queue */
  Object remove();
  /** Return the number of elements in the FIFO queue */
  int size();
}
We implement this interface and extend ArrayList and LinkedList:
// FIFOArrayList.java
import java.util.ArrayList;
public class FIFOArrayList extends ArrayList implements FIFO {
  public Object remove() {
    return remove(0);
  }
}

// FIFOLinkedList.java
import java.util.LinkedList;
public class FIFOLinkedList extends LinkedList implements FIFO {
  public Object remove() {
    return remove(0);
  }
}
We also write a SwappingOverFIFOQueue which has values for HIGH and LOW water marks. When we reach the HIGH water mark while using an ArrayList, we switch to a LinkedList. Conversely, when we drop to the LOW water mark while using a LinkedList, we switch back to an ArrayList.
Looking ahead to my next example, I have made it possible to set the water marks; doing so also re-checks the optimal list type for all the queues currently in the system. We have to be careful not to cause a memory leak by keeping references to the instances of SwappingOverFIFOQueue, so we hold them via WeakReferences. Have a look at newsletter 15 for a discussion of weak and soft references.
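For those who missed newsletter 15, WeakReference in one breath: it holds an object without preventing the garbage collector from reclaiming it, and get() returns null once the referent is gone (or after clear() is called). A minimal sketch:

```java
import java.lang.ref.WeakReference;

public class WeakReferenceDemo {
    public static void main(String[] args) {
        Object referent = new Object();
        WeakReference ref = new WeakReference(referent);
        System.out.println(ref.get() == referent);  // true: referent strongly held

        referent = null;  // drop the strong reference
        System.gc();      // only a hint; collection is never guaranteed
        System.out.println(ref.get());  // typically null once collected
    }
}
```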
// SwappingOverFIFOQueue.java
import java.util.*;
import java.lang.ref.WeakReference;

public final class SwappingOverFIFOQueue implements FIFO {
  /** The low value after which we switch over to ArrayList */
  private static int LOW = 30;
  /** The high value after which we switch over to LinkedList */
  private static int HIGH = 70;
  /** This list contains weak references to instances of this class */
  private static List instances = new LinkedList();

  /** We add the weak references in an instance initializer block */
  {
    instances.add(new WeakReference(this));
  }

  /** When we set the low and high water marks we go through all
      the existing instances and check for the optimal list type.
      If the referent is null we remove the WeakReference from our
      instance list. */
  public static void setWaterMarks(int low, int high) {
    LOW = low;
    HIGH = high;
    Iterator it = instances.iterator();
    while (it.hasNext()) {
      WeakReference ref = (WeakReference) it.next();
      SwappingOverFIFOQueue q = (SwappingOverFIFOQueue) ref.get();
      if (q == null) {
        it.remove();
      } else {
        q.checkOptimalListType();
      }
    }
  }

  private List list = new ArrayList();

  public Class getListType() {
    return list.getClass();
  }

  private int size = 0;

  public boolean add(Object o) {
    try {
      list.add(o);
      return true;
    } finally {
      if (++size == HIGH) checkOptimalListType();
    }
  }

  public Object remove() {
    try {
      return list.remove(0);
    } finally {
      if (--size == LOW) checkOptimalListType();
    }
  }

  public int size() {
    return size;
  }

  private void checkOptimalListType() {
    if (size >= HIGH && !(list instanceof LinkedList)) {
      list = new LinkedList(list);
    } else if (size <= LOW && !(list instanceof ArrayList)) {
      list = new ArrayList(list);
    }
  }
}
My test program takes the number of entries in the queue and then illustrates how often we can add/remove in 2 seconds for each of the queues. I found that you get the best performance results when you run your tests for about 2 seconds each, so I count iterations rather than milliseconds.
// SwappingOverFIFOQueueTest.java
import java.util.*;

public class SwappingOverFIFOQueueTest {
  private static int ENTRIES;

  public static void test(FIFO queue) {
    for (int i = 0; i < ENTRIES; i++) {
      queue.add(new Object());
    }
    long up_to = System.currentTimeMillis() + 2000;
    int iterations = 0;
    while (System.currentTimeMillis() <= up_to) {
      queue.add(new Object());
      queue.remove();
      iterations++;
    }
    System.out.println(queue.getClass());
    System.out.println("\t" + iterations + " iterations");
  }

  public static void main(String[] args) {
    if (args.length != 1) {
      System.out.println(
        "Usage: java SwappingOverFIFOQueueTest entries");
      System.exit(1);
    }
    ENTRIES = Integer.parseInt(args[0]);
    System.out.println("Entries = " + ENTRIES);
    test(new FIFOArrayList());
    test(new FIFOLinkedList());
    SwappingOverFIFOQueue q = new SwappingOverFIFOQueue();
    test(q);
    System.out.println("Current queue implementation "
      + q.getListType().getName());
  }
}
On my notebook, when I run this program with 0 entries, I get the following output:
Entries = 0
class FIFOArrayList
        4552883 iterations
class FIFOLinkedList
        2551017 iterations
class SwappingOverFIFOQueue
        3594810 iterations
With 100 entries I get:
Entries = 100
class FIFOArrayList
        1800877 iterations
class FIFOLinkedList
        2509328 iterations
class SwappingOverFIFOQueue
        2158451 iterations
And with 10000 entries I get:
Entries = 10000
class FIFOArrayList
        49500 iterations
class FIFOLinkedList
        812933 iterations
class SwappingOverFIFOQueue
        758657 iterations
We can thus see that the SwappingOverFIFOQueue is always faster than the worst case and slower than the best case, as one would logically expect. However, I chose the HIGH and LOW values from some tests that I made on my notebook, for that specific JVM. If I take the JDK 1.2.2 that comes with JBuilder, for 100 entries I get:
Entries = 100
class FIFOArrayList
        1434122 iterations
class FIFOLinkedList
        1307108 iterations
class SwappingOverFIFOQueue
        1178115 iterations
Or if I use the -Xint mode for JDK 1.3 under Windows to switch off the HotSpot compiler, for 100 entries I get:
Entries = 100
class FIFOArrayList
        497550 iterations
class FIFOLinkedList
        480599 iterations
class SwappingOverFIFOQueue
        392314 iterations
In both cases, the values of the SwappingOverFIFOQueue were worse than for both the ArrayList and the LinkedList.
We therefore write a profiler that finds the ideal HIGH/LOW water marks for the JVM running on your system and sets the SwappingOverFIFOQueue's water marks accordingly.
// SwappingOverFIFOQueueProfiler.java
import java.util.*;

/* For the sake of brevity we only consider two implementations of
   java.util.List, namely java.util.ArrayList and
   java.util.LinkedList. */
public class SwappingOverFIFOQueueProfiler {
  private static boolean isArrayListFaster(int entries) {
    System.out.println("isArrayListFaster(" + entries + ")");
    return countIterations(new ArrayList(), entries)
      > countIterations(new LinkedList(), entries);
  }

  private static int countIterations(List list, int entries) {
    for (int i = 0; i < entries; i++) {
      list.add(new Object());
    }
    long end = System.currentTimeMillis() + 1000;
    int iterations = 0;
    while (System.currentTimeMillis() <= end) {
      iterations++;
      list.add(new Object());
      list.remove(0);
    }
    return iterations;
  }

  static {
    int watermark = 1;
    int bestWatermark = 0;
    for (int i = 0; i < 16; i++) {
      if (isArrayListFaster(watermark)) {
        bestWatermark = Math.max(watermark, bestWatermark);
        watermark *= 2.0;
      } else {
        watermark *= 0.75;
        if (watermark <= bestWatermark) watermark *= 1.25;
      }
    }
    System.out.println("Best watermark = " + bestWatermark);
    int low = (int) (bestWatermark * 0.75);
    int high = (int) (bestWatermark * 1.25);
    System.out.println("Setting LOW to " + low
      + " and HIGH to " + high);
    SwappingOverFIFOQueue.setWaterMarks(low, high);
  }

  public static void main(String[] args) {
    SwappingOverFIFOQueueTest.main(new String[] {"0"});
    SwappingOverFIFOQueueTest.main(new String[] {"100"});
    SwappingOverFIFOQueueTest.main(new String[] {"10000"});
  }
}
If we load this class in our system, it will measure where the swap-over between ArrayList and LinkedList performance occurs. On my computer, with JDK 1.3 and HotSpot, the swap-over was measured to happen at about 32 entries in the list. When I switch off HotSpot, it occurs at about 121 entries, and under JDK 1.2.2 it happens at about 303.
After spending about 10 hours on this stupid newsletter, I have to conclude that it would be better to stick to a LinkedList for a FIFO queue as it is a better "average" performer. Perhaps the lesson I've learnt from this newsletter is that we must be careful of writing code which is too complicated as it tends to be more difficult to optimize. As performance guru Craig Larman pointed out though, we must be sure to not ignore performance altogether; our customers might just kill the project if the prototypes perform like dogs.
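In hindsight there is a third strategy that beats both list types for a FIFO queue: a circular array, with O(1) adds and removes and no per-element node allocation. (This is what java.util.ArrayDeque, added much later in Java 6, does internally.) Here is a minimal sketch of my own, with the same add/remove/size shape as the FIFO interface above:

```java
import java.util.NoSuchElementException;

// A ring-buffer FIFO: head and tail chase each other around a
// backing array that doubles in size when full.
public class CircularFIFO {
    private Object[] elements = new Object[16];
    private int head = 0;   // index of the next element to remove
    private int size = 0;

    public boolean add(Object o) {
        if (size == elements.length) grow();
        elements[(head + size) % elements.length] = o;
        size++;
        return true;
    }

    public Object remove() {
        if (size == 0) throw new NoSuchElementException();
        Object result = elements[head];
        elements[head] = null;  // let the GC reclaim the element
        head = (head + 1) % elements.length;
        size--;
        return result;
    }

    public int size() {
        return size;
    }

    private void grow() {
        Object[] bigger = new Object[elements.length * 2];
        for (int i = 0; i < size; i++) {
            bigger[i] = elements[(head + i) % elements.length];
        }
        elements = bigger;
        head = 0;  // the copy straightened the queue out again
    }
}
```

No water marks, no swapping, and no profiling run needed: the cost per operation is the same at every queue length.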
I always appreciate any feedback, both positive and negative, so please keep sending your ideas and suggestions. Please also remember to take the time to send this newsletter to others who are interested in Java.
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel.