Abstract: Import statements can get out of date quickly as we refactor our code. In this newsletter, our guest author Dr. Cay S. Horstmann shows us how we can clean up our import statements.
Welcome to the 51st edition of The Java(tm) Specialists' Newsletter sent to over 3800 readers in 84 countries, with latest additions of Chile and Kenya. I always get excited when I see an African name on my new subscriber list, since at the moment I only have subscribers in Africa from: Egypt, Ghana, Kenya, Mauritius, Morocco, Namibia, Nigeria, South Africa and Zimbabwe. I say "umkelekile" (welcome in Xhosa) to my new subscriber from Kenya.
This newsletter attracts the elite of Java programmers, since
we cover things that are not usually mentioned on Java
newsletters. If you are a subscriber to this newsletter, it
makes you part of the "elite" :-) A relatively new subscriber
to my newsletter is Dr. Cay S.
Horstmann, famous Java author in the "Core Java"
series. Cay very kindly pointed me to an article that he wrote
about a tool he created for cleaning up import
statements.
Seeing that I am an amateur writer, I did not dare "edit" the
article in case I completely messed it up, so I am sending it
to you as is, with just the syntax highlighting added and font changed
(almost like some students [and lecturers] do at universities).
Before we listen to what Dr. Horstmann has to say about the topic,
I would like to make a few of my own comments about the subject.
Having code with unnecessary import
statements looks unprofessional, but how do you keep them
up-to-date? An IDE which does this very nicely is
Eclipse. You click
on the class and say "Organize Imports", and bang! it
beautifies your import statements. I am sure there are other
IDEs out there that can do the same. However, if you don't have
such an IDE, the technique described in this newsletter is a
great way of solving this problem.
After my last Design Patterns Course, one of my students sent me the following note: "I really enjoyed the course. I origionally thought it was going to be about learning to draw UML diagrams. I was pleasantly surprised to discover the course was actually about different programming strategies. I learnt some very cool tricks and new way of aproaching the problems of designing a system. A must for those who wish to design maintainable systems." Please send me an email if your company would like to receive training in Design Patterns.
In the Java programming language, the names of classes that are defined inside packages always start with the package name. For example, the class name java.awt.Rectangle starts with the name of the package java.awt. As a convenience to programmers, the import mechanism can be used to reference classes without the package name. For example, it is tedious to write
java.awt.Rectangle box = new java.awt.Rectangle(5, 10, 20, 30);
You can use the more convenient form
Rectangle box = new Rectangle(5, 10, 20, 30);
provided you import the class.
Classes in the same package are automatically imported, as are the classes
in the java.lang
package. For all other classes, you must supply
an import statement to either import a specific class
import java.awt.Rectangle;
or to import all classes in a package, using the wildcard notation
import java.awt.*;
Importing classes can lead to ambiguities. A class name is ambiguous if it occurs in two packages that are imported by wildcards. For example, suppose a program contains the imports
import java.awt.*;
import java.util.*;
The class name List is now ambiguous because there are two classes, java.awt.List and java.util.List. You can resolve the ambiguity by adding a specific import of the class name:
import java.awt.*;
import java.util.*;
import java.util.List;
However, if you need to refer to both java.awt.List and java.util.List in the same source file, then you have crossed the limits of the import mechanism. You can use an import statement to shorten one of the names to List, but you need to reference the other by its full name whenever it occurs in the source text.
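A small sketch of that situation follows. The class and method names (ListNames, shortName, fullName) are invented for illustration, and the generic type parameters assume a Java version newer than the one this article was written for.

```java
import java.awt.*;
import java.util.*;
import java.util.List; // the single-type import takes precedence over both wildcards

class ListNames {
    // The short name List means java.util.List because of the explicit import.
    static List<String> shortName() {
        List<String> items = new ArrayList<>();
        items.add("first");
        return items;
    }

    // java.awt.List has to be written out in full wherever it is used.
    static String fullName() {
        Class<java.awt.List> awtList = java.awt.List.class;
        return awtList.getName();
    }
}
```

Removing the line import java.util.List; from this sketch should bring back the ambiguity error for the short name List.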
Ambiguities are unpleasant because they can arise over time, as libraries expand. For example, in JDK 1.1, there was no java.util.List class. Consider a program that imports java.awt.* and java.util.* and uses the name List as a shortened form of java.awt.List. That program compiles without errors under JDK 1.1 but fails to compile in Java 2.
Therefore, the use of wildcards for imports is somewhat dangerous. However, importing each class can lead to long import lists that are tedious to manage, especially as code is moved from one class to another during development.
To illustrate this, consider the import list in one of my recent files.
import java.awt.*;
import java.awt.geom.*;
import java.io.*;
import java.util.*;
It turned out that I really needed
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
(Apparently, the need for importing java.io.* had gone away at some point during the program's evolution.)
Thus, a problem that Java programmers face is how to keep import lists up-to-date when programs change.
One time-honored way of checking an import list is to comment out one import at a time and recompile: if the compiler reports no new errors, that import was unnecessary. Naturally, that is tedious.
Another solution is to stop using import lists altogether and reference the full class names at all times. Naturally, that too is tedious.
Several compilers emit lists of classes that they load as they compile
a program. For example, if you run the compiler in the Sun J2SE SDK 1.4 with
the -verbose
option, you get a list such as
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/Font.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/Graphics2D.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/Stroke.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/font/FontRenderContext.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/geom/Line2D.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/geom/Point2D.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/geom/Rectangle2D.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/util/ArrayList.class)]
[loading ./AbstractEdge.class]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/lang/Object.class)]
[loading ./Edge.class]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/io/Serializable.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/lang/Cloneable.class)]
[loading ./LineStyle.class]
[loading ./ArrowHead.class]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/lang/String.class)]
[checking SegmentedLineEdge]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/Graphics.class)]
[loading ./SerializableEnumeration.class]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/util/AbstractList.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/util/AbstractCollection.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/geom/Line2D$Double.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/Shape.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/text/CharacterIterator.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/lang/Comparable.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/lang/CharSequence.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/geom/Point2D$Double.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/awt/geom/RectangularShape.class)]
[loading /usr/local/j2sdk1.4.0/jre/lib/rt.jar(java/text/AttributedCharacterIterator.class)]
It would be an easy matter to write a script that transforms this output into a set of import statements. However, the output contains classes that don't actually need to be imported (such as CharSequence and AttributedCharacterIterator). These classes are loaded because some of the loaded classes depend on them. It is not clear (at least to me) how one can weed out the unneeded classes.
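Such a script might look like the following sketch, written in Java to match the rest of this article. The class name VerboseToImports and its method are made up for illustration, and the pattern assumes the J2SE 1.4 message format shown above; other compilers and later javac versions print different messages. It shares the weakness just described: it cannot tell which of the loaded classes the source file actually names.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VerboseToImports {
    // Matches "[loading <archive>(java/awt/Font.class)]" and captures "java/awt/Font".
    // Lines without parentheses, such as "[loading ./Edge.class]", do not match.
    private static final Pattern LOADED =
        Pattern.compile("\\[loading [^\\]]*\\(([^)]+)\\.class\\)\\]");

    static Set<String> toImports(String verboseOutput) {
        Set<String> imports = new TreeSet<>(); // sorted, no duplicates
        Matcher m = LOADED.matcher(verboseOutput);
        while (m.find()) {
            String name = m.group(1).replace('/', '.');
            if (name.indexOf('$') >= 0) {
                continue; // inner class, covered by its outer class
            }
            String pkg = name.substring(0, name.lastIndexOf('.'));
            if (pkg.equals("java.lang")) {
                continue; // java.lang is imported automatically
            }
            imports.add("import " + name + ";");
        }
        return imports;
    }
}
```

Feeding it the listing above would also produce imports for classes like java.text.AttributedCharacterIterator, which is exactly the problem.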
I used a different approach. I wrote a utility that harvests the class file. Unlike source files, class files never contain shortened class names. Even standard classes such as java.lang.String are referenced by their full names.
Class files contain the names of classes as well as field and method descriptors that contain class names (in an odd format, such as Ljava/lang/String;). To harvest the class names, one must know the layout of the constant pool and the field and method descriptors. The class file format is well documented (see the references at the end of this document) and only moderately complex.
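As a rough illustration of the harvesting step, the sketch below reads only the constant pool and collects the names held in its class entries. It is not ImportCleaner's actual code: the class name ConstantPoolHarvester is hypothetical, the entry sizes follow the published class file format, and field and method descriptors are ignored here.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Set;
import java.util.TreeSet;

class ConstantPoolHarvester {
    static Set<String> classNames(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];      // Utf8 entries by index
        int[] classNameIndex = new int[count];  // for Class entries: index of the name
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                    // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class
                case 8: case 16: case 19: case 20:
                    in.readUnsignedShort(); break;                        // 2-byte entries
                case 15:
                    in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readInt(); break;                                  // 4-byte entries
                case 5: case 6:
                    in.readLong(); i++; break;       // Long and Double occupy two slots
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int index : classNameIndex) {
            // Skip unused slots and array types such as [Ljava/lang/Object;
            if (index != 0 && !utf8[index].startsWith("[")) {
                names.add(utf8[index].replace('/', '.'));
            }
        }
        return names;
    }
}
```

Names are resolved only after the whole pool has been read, because a class entry may refer to a Utf8 entry that appears later in the pool.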
ImportCleaner Program
The ImportCleaner program parses one or more class files, harvests the class names, removes the unnecessary ones, sorts the remaining ones, and prints out a list of import statements to System.out.
Since ImportCleaner parses class files, your source file must first be compiled (presumably with a less-than-optimal import statement set). Then run ImportCleaner on the class file, capture the output, and paste the import lines into your source file.
For example, to run ImportCleaner on its own class files, you use
java -jar importcleaner.jar ImportCleaner
(You can find the ImportCleaner class files by unzipping importcleaner.jar.)
The result is this list of imports, printed to System.out:
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FilenameFilter;
import java.io.IOException;
import java.io.PrintStream;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;
Typically, your next step is to capture that list of imports and paste it into your source file.
If your source file contains multiple top-level classes, then you need to list all of them on the command line. For example,
java -jar importcleaner.jar MyClass MyHelperClassThatIsDefinedInTheSameFile
However, inner classes are located automatically.
You can supply the name of a class in any of the following forms:
Without a suffix: MyClass
With a .class suffix: MyClass.class
With a .java suffix: MyClass.java
The ImportCleaner program strips off the suffixes and then looks for the file MyClass.class and all files of the form MyClass$*.class (for inner classes).
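The lookup just described could be sketched as follows. This is an assumption about how such a lookup might be written, not the program's real source; ClassFileLocator and its method names are hypothetical.

```java
import java.io.File;

class ClassFileLocator {
    // Turns "MyClass.class" or "MyClass.java" into "MyClass".
    static String stripSuffix(String name) {
        for (String suffix : new String[] { ".class", ".java" }) {
            if (name.endsWith(suffix)) {
                return name.substring(0, name.length() - suffix.length());
            }
        }
        return name;
    }

    // True for MyClass.class and for inner class files such as MyClass$Inner.class.
    static boolean belongsTo(String baseName, String fileName) {
        return fileName.equals(baseName + ".class")
            || (fileName.startsWith(baseName + "$") && fileName.endsWith(".class"));
    }

    // Lists the matching class files in a directory (empty if the directory is unreadable).
    static File[] classFilesFor(File directory, String name) {
        String baseName = stripSuffix(name);
        File[] files = directory.listFiles((dir, fileName) -> belongsTo(baseName, fileName));
        return files != null ? files : new File[0];
    }
}
```

Requiring the $ separator keeps an unrelated file such as MyClassHelper.class from being picked up along with MyClass.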
If your class file is located in a package, you need to invoke ImportCleaner from the base directory. For example, if you use the package com.mycompany.myproject, invoke ImportCleaner from the directory that contains the com directory. You can supply the package name in either of the following forms:
com.mycompany.myproject.MyClass
com/mycompany/myproject/MyClass (or \ on Windows)
Capturing the output is very easy if you use the shell mode in Emacs. Other good programming editors have similar features. Alternatively, you can redirect the output to a file:
java -jar importcleaner.jar class(es) > outputfile
The program takes the following options:
-wildcard: This option produces an import list with wildcards instead of individual class names.
-keepall: This option keeps all imports, including the ones from java.lang and the current package. The default is to suppress the java.lang package and the current package.
-usecalls: This option harvests method calls, which is sometimes beneficial to guess local variable types (see the Limitations section below). On the other hand, method calls can lead to spurious imports. This happens when you supply a parameter of a subclass type or catch a return value in a variable of a superclass type. For example, your code may call the constructor java.io.DataInputStream(java.io.InputStream) and pass a parameter of type java.io.FileInputStream, like this:
DataInputStream in = new DataInputStream(new FileInputStream(file));
Harvesting the method call yields the spurious import java.io.InputStream. Such spurious imports are generally harmless but unsightly.
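To make the effect of -wildcard and -keepall concrete, here is a minimal sketch of how a set of harvested names might be turned into an import list. The class ImportListBuilder, its parameters, and the assumption that the names are fully qualified top-level class names are all mine, not taken from the ImportCleaner source.

```java
import java.util.Set;
import java.util.TreeSet;

class ImportListBuilder {
    static Set<String> build(Set<String> classNames, String currentPackage,
                             boolean wildcard, boolean keepAll) {
        Set<String> imports = new TreeSet<>(); // sorted, duplicates collapse
        for (String name : classNames) {
            int lastDot = name.lastIndexOf('.');
            if (lastDot < 0) {
                continue; // default package: nothing to import
            }
            String pkg = name.substring(0, lastDot);
            boolean implicit = pkg.equals("java.lang") || pkg.equals(currentPackage);
            if (implicit && !keepAll) {
                continue; // suppressed by default, kept with -keepall
            }
            imports.add("import " + (wildcard ? pkg + ".*" : name) + ";");
        }
        return imports;
    }
}
```

With the wildcard flag set, several classes from one package collapse into a single line because the set discards duplicates.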
Limitations
Unfortunately, constants are not included in the class files, so this utility will miss them. Typical examples are:
BorderLayout.NORTH
Color.red
Also, the types of local variables are not included in the class file. This sounds like a big problem, but fortunately, the same class or interface name is often used in a method or field descriptor as well. To illustrate the issue, consider this code:
public void draw(Graphics2D g2) {
    Stroke oldStroke = g2.getStroke();
    g2.setStroke(new BasicStroke());
    // . . .
    g2.setStroke(oldStroke);
}
ImportCleaner includes java.awt.Graphics2D because it appears in the method signature of the processed class. It includes java.awt.BasicStroke because of the constructor call. But by default it won't include java.awt.Stroke since there is no guaranteed reference for it in the class file.
A remedy is to recompile after pasting in the ImportCleaner output. Then look at the error messages and manually insert the missing imports. It sounds bad, but in actual practice it doesn't seem to be all that bothersome. Another remedy would be to use fully qualified class names in this situation.
You may also wish to use the -usecalls option. That option harvests method calls. In our example, that option finds the java.awt.Stroke class from the ()Ljava/awt/Stroke; method descriptor of the Graphics2D.getStroke method. Harvesting method descriptors is not the default because it can lead to spurious imports. (See the description of the -usecalls option for more information.)
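Pulling the class names out of a descriptor such as ()Ljava/awt/Stroke; takes only a short scan, since every class name is written as L, the name with slashes, and a closing semicolon. The following is a minimal sketch under that rule; DescriptorNames and classNamesIn are hypothetical names, not part of ImportCleaner.

```java
import java.util.ArrayList;
import java.util.List;

class DescriptorNames {
    // Returns the class names mentioned in a field or method descriptor,
    // in order of appearance, converted from java/awt/Stroke to java.awt.Stroke.
    static List<String> classNamesIn(String descriptor) {
        List<String> names = new ArrayList<>();
        int i = 0;
        while (i < descriptor.length()) {
            if (descriptor.charAt(i) == 'L') {
                int end = descriptor.indexOf(';', i);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class name in " + descriptor);
                }
                names.add(descriptor.substring(i + 1, end).replace('/', '.'));
                i = end + 1; // continue after the semicolon
            } else {
                i++; // primitive type code, array marker, or parenthesis
            }
        }
        return names;
    }
}
```

Jumping past the semicolon matters: a letter L inside a class name must not be mistaken for the start of another name.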
ImportScrubber is a tool with a similar purpose to ImportCleaner, and it has a GUI. BCEL (commons.apache.org/proper/commons-bcel/) is a library for reading and modifying class files; it is used by ImportScrubber.
That's it for this week.
Kind regards
Heinz
P.S. Even though my newsletter template says that Maximum Solutions, South Africa, has the copyright on this newsletter, the copyright of the content of the article belongs to Dr. Cay S. Horstmann. Please contact him if you would like to publish this article anywhere.