Abstract: What has a larger serializable size, ArrayList or LinkedList? In this newsletter we examine what the difference is and also why Vector is a poor candidate for a list in a serializable class.
Welcome to the 183rd edition of The Java(tm) Specialists' Newsletter, sent to you from Chania on the beautiful island of Crete.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
Hopefully we all have some idea of how the ArrayList and LinkedList are structured internally. The ArrayList holds a pointer to a single Object array, which grows as the number of elements exceed the size of the array. The ArrayList's underlying Object array grows by about 50% whenever we run out of space. The LinkedList has pointers to the link nodes at the front and end of the list. Each link node is an object and has pointers forward and backwards and to the object that is being held in the list. The memory requirements of the LinkedList is about 4x larger than an equivalently sized ArrayList. In addition, the ArrayList allows random access into the middle, whereas the LinkedList has a lookup complexity of O(n).
One of the questions I enjoy asking in my Java Specialist Master Course is this: Which list uses more bytes when serialized? LinkedList or ArrayList? Think about this for a bit before reading on, considering the internal structure of each list.
Before I show the answer, I'd like to go back about 8.5 years, when I described memory usage in Java. I had chosen an experimental approach to figuring out how many bytes are occupied by Java objects. At the time, one of my readers told me he measured object size by serializing Java objects to a byte array. I argued that the result would differ completely from the bytes in memory. This newsletter shows an example of how different the results may be.
Most programmers guess that the LinkedList would have a larger serialized size than the ArrayList. However, that is never the case:
import java.io.*; import java.util.*; public class ListWritingSize { public static void main(String[] args) throws IOException { test(new LinkedList<String>()); test(new ArrayList<String>()); } public static void test(List<String> list) throws IOException { for (int i = 0; i < 10; i++) { list.add("hello world"); } ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream out = new ObjectOutputStream(baos); out.writeObject(list); out.close(); System.out.println(list.getClass().getSimpleName() + " used " + baos.toByteArray().length + " bytes"); } }
When we run this, we see that the LinkedList uses 107 bytes,
whilst the ArrayList uses 117 bytes. Instead of just naively
writing the contents of the collections to the stream,
the lists both contain writeObject()
and
readObject()
methods. These write the contents
of the lists to the stream.
The ArrayList and LinkedList work similarly, in that they write out the number of elements and the actual objects contained in the list. However, ArrayList also writes out the size of the underlying array, used to recreate an identical ArrayList to what was serialized.
This additional int
is what makes up
the 10 byte difference in the serialization size.
At this point I need to also point out that the String "hello world" is only serialized once, after that only a reference number to the object is written.
The old Vector class implements serialization in a naive way.
They simply do the default serialization, which writes the
entire Object[]
as-is into the stream. Thus if
we insert a bunch of elements into the List, then clear it,
the difference between Vector and ArrayList is enormous.
import java.util.*; import java.io.*; public class VectorWritingSize { public static void main(String[] args) throws IOException { test(new LinkedList<String>()); test(new ArrayList<String>()); test(new Vector<String>()); } public static void test(List<String> list) throws IOException { insertJunk(list); for (int i = 0; i < 10; i++) { list.add("hello world"); } ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream out = new ObjectOutputStream(baos); out.writeObject(list); out.close(); System.out.println(list.getClass().getSimpleName() + " used " + baos.toByteArray().length + " bytes"); } private static void insertJunk(List<String> list) { for(int i = 0; i<1000 * 1000; i++) { list.add("junk"); } list.clear(); } }
When we run this code, we get the following output:
LinkedList used 107 bytes ArrayList used 117 bytes Vector used 1310926 bytes
Vector can use a staggering amount of bytes when being serialized. The lesson here? Don't ever use Vector as Lists in objects that are Serializable. The potential for disaster is too great.
Here is a question that I puzzled about for a bit. Why does Vector have a writeObject() method that simply does the defaultWriteObject()? It does not have a readObject().
Kind regards
Heinz
We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)
We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.