Abstract: We now look at why the best-case scenario for a getMethod() call is O(n), not O(1) as we would expect. We also discover that the throughput of getMethod() has doubled in Java 9.
Welcome to the 254th edition of The Java(tm) Specialists' Newsletter. As a pimply-faced youth, when I needed to do some deep thinking, I'd gather my books and wander down to Botany Bay. I would sit on the rocks, watching the kelp bob up and down in the surf, and think. Def Leppard's "Pour Some Sugar On Me" bellowed from walkman into eardrums. It was the 80s after all. There's something about staring at salty water that seems to stir my creative juices.
Fast-forward thirty years to Crete. Since I am a "polyteknos" -- I have too many children -- we drive a Mercedes Viano. We use it on Sundays to cart the family off to church and a delicious lunch at Mitsos afterwards. The rest of the week it gathers dust, lots of. But not anymore. It is now my mobile executive suite. It looks rather conspicuous, all black with darkened windows. The previous owner chauffeured famous rock stars and royalty. It makes the perfect thinking place. I drive it to random beaches and sit in the back surrounded by Java books. There I philosophy about how the design patterns of old have a place in modern Java.
I will tell you more about my adventures in future newsletters.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
As I mentioned yesterday,
I was searching for examples of the GoF Builder in the JDK.
I thought that perhaps reflection would use it. I knew that
the Method
returned by Class.getMethod()
was a protective
copy. Turns out, this is rather an example of the Prototype.
In order to understand how Class.getMethod()
worked
internally, I stepped into the code with IntelliJ. Whilst
doing so, I noticed that even when we had found the method
we were looking for, it kept on searching. (BTW, if you love
IntelliJ or are curious what it can do, be sure to get my
free IntelliJ
Wizardry Course).
When we call Class.getMethod(name,parameterTypes)
,
it does a linear search through an internal
publicMethods
array, lazily created. The
methods are not sorted in any way and, as I
discovered in my tests, they are also not always stored in
the same order. This is why a linear search is necessary.
Most of the time, the number of methods is going to be fairly
small for one class, although we do include superclass methods
too. We would expect it to be less than 100 though. Is that
reasonable? I did some searching through the Java 8 rt.jar
file. I found almost 500 classes that had 100 or more
methods, including those from superclasses. The worst two
offenders were from CORBA, as you probably expected. The next
38 offenders were Swing or UI classes. The average number of
methods per class is 25, which includes methods from Object.
If we use getDeclaredMethods()
, the average number of methods
per class decreases to just 7.35.
Now comes the clincher: Even when we find a matching method in our jumbled up array, we don't stop searching. Why?
Java 5 gave us covariant return types. This means that if class A
has method Object foo()
, then subclass B can have
method Number foo()
. Same signature, different return type.
A further subclass C of B could contain method Integer foo()
:
public class A { public Object foo() { return new Object(); } } public class B extends A { public Number foo() { return new Integer(42); } } public class C extends B { public Integer foo() { return new Integer(42); } }
C.foo()
's return type must be assigneable from B. Thus we
cannot make it return String. This also means that class C
actually contains three methods foo()
:
import java.lang.reflect.*; public class ReflectionTest { public static void main(String... args) throws NoSuchMethodException { System.out.println(A.class.getMethod("foo")); System.out.println(B.class.getMethod("foo")); System.out.println(C.class.getMethod("foo")); System.out.println("All methods in class C:"); for (Method method : C.class.getMethods()) { System.out.println(method); } } }
When we run this, we see the following output:
public java.lang.Object A.foo() public java.lang.Number B.foo() public java.lang.Integer C.foo() All methods in class C: public java.lang.Object C.foo() public java.lang.Number C.foo() public java.lang.Integer C.foo() ... and all the Object methods
When I saw Class.searchMethods()
in operation,
I realized why they were continuing their search, even when
they had found a good candidate. The reason is that there
might be another method further down in the array that had a
more specific return type. If we call C.class.getMethod("foo"), we want public java.lang.Integer C.foo()
Since the average number of methods per class is small, they decided to not bother sorting the array. A binary search would find the desired method in O(log n) steps. However, sorting would take O(n * log n) steps, so unless we did a lot of lookups, we would never regain the performance loss from sorting. A similar argument would apply for a HashMap.
Usually, the best-case scenario for a search is O(1). For example, if we search through an unsorted list, we would start at the beginning and scan through until the end. If we were lucky, the first element would be the one we were looking for and we could stop. Average case for an unsorted list would be O(n). On average we would need n/2 lookups, but since it is a constant multiplier of 0.5, we would only consider the slowdown as n became larger. Since the average time would be in relation to the size of the list, it would also be written as O(n). The worst-case is what we consider most of the time for searching, and that is also O(n) for an unsorted list. If we have a sorted ArrayList, then best-case would be O(1) and the average and worst cases would be O(log n). If this is all a bit much for you, make sure to get our Data Structures in Java 9 Course, which explains complexity in a lot more detail.
In Class.searchMethods()
, the best-case is O(n),
as are the average and worst cases. However, this is only
since Java 5, when covariant return types were introduced
into the language. Prior to that, the computational
complexities were the same as for an unsorted list.
How can we demonstrate this beyond eyeballing the JDK code?
Here is a ClassGenerator that produces classes with increasing numbers of methods. I wrote the code in Java 1.4 syntax so that I could run it on an old machine. No generics, no Java 8 streams, no try-with-resource, no varargs. It felt a bit weird. Here we go:
import java.io.*; public class ClassGenerator { public static void main(String[] args) throws IOException { for (int methods = 4; methods < 65536; methods *= 5) { PrintStream out = new PrintStream( new FileOutputStream("Methods" + methods + ".java")); out.println("public class Methods" + methods + " {"); for (int i = 0; i < methods; i++) { out.println(" public void foo" + i + "() {}"); } out.println("}"); out.close(); } } }
ClassGenerator spits out 7 classes with the largest containing 62500 methods. The JDK 1.4.0-b92 javac compiler choked with a StackOverflowError on the largest class, so I used the Java 8 compiler with -source 1.4 and -target 1.4. That worked.
To demonstrate the performance degradation post Java 4, I wrote the following class that tests how many times we can find the first, last and middle "foo" methods in our array.
import java.lang.reflect.*; public class MethodLookupCost { public static void main(String[] args) throws Exception { System.out.println("Warmup"); for (int i = 0; i < 2; i++) { test(); } System.out.println(); System.out.println("Proper run"); for (int i = 0; i < 3; i++) { test(); } System.out.println(blackhole); } private static void test() throws Exception { for (int methods = 4; methods < 65536; methods *= 5) { test(Class.forName("Methods" + methods)); } } private static volatile Method blackhole; private static void test(Class clazz) throws Exception { System.out.println(clazz); System.out.println(clazz.toString().replaceAll(".", "-")); Method[] methods = clazz.getMethods(); int firstIndex = fooIndex(methods, 0, 1); Method first = methods[firstIndex]; int lastIndex = fooIndex(methods, methods.length - 1, -1); Method last = methods[lastIndex]; int middleStart = (lastIndex - firstIndex) / 2; int middleIndex = fooIndex(methods, middleStart, 1); Method middle = methods[middleIndex]; System.out.println("indexes = (" + firstIndex + ", " + middleIndex + ", " + lastIndex + ")"); System.out.println("foos = (" + first.getName() + ", " + middle.getName() + ", " + last.getName() + ")"); System.out.println(" first = " + methodFinds(first)); System.out.println(" middle = " + methodFinds(middle)); System.out.println(" last = " + methodFinds(last)); System.out.println(); } private static int fooIndex(Method[] methods, int start, int inc) { int idx = start; while (!methods[idx].getName().startsWith("foo")) idx += inc; return idx; } private static long methodFinds(Method method) throws Exception { return methodFinds(method.getDeclaringClass(), method.getName()); } /** * Counts how many times we can get the method in a second. */ private static long methodFinds(Class clazz, String method) throws Exception { long time = System.currentTimeMillis(); long methodFinds = 0; while (System.currentTimeMillis() - time < 1000) { blackhole = clazz.getMethod(method, (Class[]) null); methodFinds++; } return methodFinds; } }
The purpose of this benchmark was to confirm my suspicion
that prior to Java 5, the best-case lookup cost of
Class.getMethod()
was O(1) and is now O(n). However, we need
to consider the context of the benchmark. It is of almost no
relevance in real code, since we will seldom have more than
100 methods in a class.
Here is the output from Java 9.0.4:
class Methods4 -------------- indexes = (0, 1, 3) foos = (foo0, foo1, foo3) first = 10064862 middle = 10276604 last = 10112908 class Methods20 --------------- indexes = (0, 9, 19) foos = (foo13, foo8, foo4) first = 7009813 middle = 7290529 last = 7367027 class Methods100 ---------------- indexes = (0, 49, 99) foos = (foo25, foo74, foo4) first = 2238002 middle = 2166709 last = 3389455 class Methods500 ---------------- indexes = (0, 249, 499) foos = (foo25, foo249, foo499) first = 585933 middle = 399407 last = 407028 class Methods2500 ----------------- indexes = (0, 1249, 2499) foos = (foo25, foo1649, foo499) first = 72224 middle = 51273 last = 61625 class Methods12500 ------------------ indexes = (0, 6249, 12499) foos = (foo2796, foo9045, foo2795) first = 10671 middle = 10813 last = 10770 class Methods62500 ------------------ indexes = (0, 31249, 62499) foos = (foo2796, foo31545, foo2795) first = 1685 middle = 1174 last = 1732
Interesting is the output for Java 8. It starts off as half
the throughput of Java 9 on my machine. At about 100 methods,
they have the same throughput and then Java 8 ends up faster
than Java 9. I haven't researched this thoroughly, but I
suspect that the reason Java 9 is faster is that they don't
call intern()
on the method name on every single call. I'm
not sure why Java 8 is faster after we get to 100 methods,
but considering the average number of methods per class, I
think that's a reasonable compromise.
class Methods4 -------------- indexes = (0, 1, 3) foos = (foo0, foo1, foo3) first = 5711796 middle = 5611266 last = 5786185 class Methods20 --------------- indexes = (0, 9, 19) foos = (foo13, foo10, foo3) first = 5242688 middle = 5251910 last = 5254697 class Methods100 ---------------- indexes = (0, 49, 99) foos = (foo13, foo49, foo99) first = 3716464 middle = 3595559 last = 3628375 class Methods500 ---------------- indexes = (0, 249, 499) foos = (foo13, foo249, foo499) first = 1400637 middle = 1358047 last = 1380786 class Methods2500 ----------------- indexes = (0, 1249, 2499) foos = (foo13, foo1733, foo499) first = 228720 middle = 223003 last = 223820 class Methods12500 ------------------ indexes = (0, 6249, 12499) foos = (foo13, foo6733, foo499) first = 40339 middle = 40360 last = 40553 class Methods62500 ------------------ indexes = (0, 31249, 62499) foos = (foo12505, foo44215, foo62224) first = 5232 middle = 5391 last = 5397
I also ran this with Java 1.4.0 on a VMWare Fusion instance
with Windows XP. We can see two main differences between
pre and post Java 1.4. First is that the best case
performance stays the same, irrespective of the number of
methods. Second is that it appears that the Method[]
is
ordered, probably according to where the methods appear in
the source file. In the Java 8 output above, in Methods62500
the first method starting with "foo" was foo12505()
. The
middle was foo44215()
. And the last was foo62224()
. Java 9
is even stranger, with first foo2796()
, middle foo31545()
and last foo2795()
. Here is the output from Java 1.4.0:
class Methods4 -------------- indexes = (0, 1, 3) foos = (foo0, foo1, foo3) first = 1243931 middle = 1328329 last = 1372984 class Methods20 --------------- indexes = (0, 9, 19) foos = (foo0, foo9, foo19) first = 1410311 middle = 1482650 last = 1301043 class Methods100 ---------------- indexes = (0, 49, 99) foos = (foo0, foo49, foo99) first = 1305948 middle = 1147514 last = 960986 class Methods500 ---------------- indexes = (0, 249, 499) foos = (foo0, foo249, foo499) first = 1232767 middle = 658452 last = 437830 class Methods2500 ----------------- indexes = (0, 1249, 2499) foos = (foo0, foo1249, foo2499) first = 1230703 middle = 251969 last = 131139 class Methods12500 ------------------ indexes = (0, 6249, 12499) foos = (foo0, foo6249, foo12499) first = 1177412 middle = 56583 last = 29448 class Methods62500 ------------------ indexes = (0, 31249, 62499) foos = (foo0, foo31249, foo62499) first = 1240943 middle = 12028 last = 5926
As we saw, in Java 1.4, when we didn't have covariant return
types, we could return from our search as soon as we found
our first matching method. There could be only one, so there
was no point searching further. The throughput to find the
first method thus stays consistent, irrespective of the size
of the Method[]
. Classic O(1).
We can draw several conclusions from this. Firstly, the cost
of finding any method in a class is dependent on how many
methods are visible, not where in the class the method is
declared. Secondly, it appears that Java 9 has doubled the
throughput of Class.getMethod()
by getting rid of intern()
,
although that would need further research to confirm.
Thirdly, there was a hidden cost in covariant return types
that I had not considered up to now.
Kind regards from Crete
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.