Running on Java 22-ea+27-2262 (Preview)
Home of The JavaSpecialists' Newsletter

183Serialization Size of Lists

Author: Dr. Heinz M. KabutzDate: 2010-05-31Java Version: 1.2Category: Performance
 

Abstract: What has a larger serializable size, ArrayList or LinkedList? In this newsletter we examine what the difference is and also why Vector is a poor candidate for a list in a serializable class.

 

Welcome to the 183rd edition of The Java(tm) Specialists' Newsletter, sent to you from Chania on the beautiful island of Crete.

javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.

Serialization Size of Lists

Hopefully we all have some idea of how the ArrayList and LinkedList are structured internally. The ArrayList holds a pointer to a single Object array, which grows as the number of elements exceed the size of the array. The ArrayList's underlying Object array grows by about 50% whenever we run out of space. The LinkedList has pointers to the link nodes at the front and end of the list. Each link node is an object and has pointers forward and backwards and to the object that is being held in the list. The memory requirements of the LinkedList is about 4x larger than an equivalently sized ArrayList. In addition, the ArrayList allows random access into the middle, whereas the LinkedList has a lookup complexity of O(n).

One of the questions I enjoy asking in my Java Specialist Master Course is this: Which list uses more bytes when serialized? LinkedList or ArrayList? Think about this for a bit before reading on, considering the internal structure of each list.

Before I show the answer, I'd like to go back about 8.5 years, when I described memory usage in Java. I had chosen an experimental approach to figuring out how many bytes are occupied by Java objects. At the time, one of my readers told me he measured object size by serializing Java objects to a byte array. I argued that the result would differ completely from the bytes in memory. This newsletter shows an example of how different the results may be.

Most programmers guess that the LinkedList would have a larger serialized size than the ArrayList. However, that is never the case:

import java.io.*;
import java.util.*;

public class ListWritingSize {
  public static void main(String[] args) throws IOException {
    test(new LinkedList<String>());
    test(new ArrayList<String>());
  }

  public static void test(List<String> list) throws IOException {
    for (int i = 0; i < 10; i++) {
      list.add("hello world");
    }
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(baos);
    out.writeObject(list);
    out.close();
    System.out.println(list.getClass().getSimpleName() +
        " used " + baos.toByteArray().length + " bytes");
  }
}

When we run this, we see that the LinkedList uses 107 bytes, whilst the ArrayList uses 117 bytes. Instead of just naively writing the contents of the collections to the stream, the lists both contain writeObject() and readObject() methods. These write the contents of the lists to the stream.

Why the 10 Byte Difference?

The ArrayList and LinkedList work similarly, in that they write out the number of elements and the actual objects contained in the list. However, ArrayList also writes out the size of the underlying array, used to recreate an identical ArrayList to what was serialized.

This additional int is what makes up the 10 byte difference in the serialization size.

At this point I need to also point out that the String "hello world" is only serialized once, after that only a reference number to the object is written.

What About Vector?

The old Vector class implements serialization in a naive way. They simply do the default serialization, which writes the entire Object[] as-is into the stream. Thus if we insert a bunch of elements into the List, then clear it, the difference between Vector and ArrayList is enormous.

import java.util.*;
import java.io.*;

public class VectorWritingSize {
  public static void main(String[] args) throws IOException {
    test(new LinkedList<String>());
    test(new ArrayList<String>());
    test(new Vector<String>());
  }

  public static void test(List<String> list) throws IOException {
    insertJunk(list);
    for (int i = 0; i < 10; i++) {
      list.add("hello world");
    }
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(baos);
    out.writeObject(list);
    out.close();
    System.out.println(list.getClass().getSimpleName() +
        " used " + baos.toByteArray().length + " bytes");
  }

  private static void insertJunk(List<String> list) {
    for(int i = 0; i<1000 * 1000; i++) {
      list.add("junk");
    }
    list.clear();
  }
}

When we run this code, we get the following output:

    LinkedList used 107 bytes
    ArrayList used 117 bytes
    Vector used 1310926 bytes

Vector can use a staggering amount of bytes when being serialized. The lesson here? Don't ever use Vector as Lists in objects that are Serializable. The potential for disaster is too great.

Last Puzzle

Here is a question that I puzzled about for a bit. Why does Vector have a writeObject() method that simply does the defaultWriteObject()? It does not have a readObject().

Kind regards

Heinz

 

Comments

We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)

When you load these comments, you'll be connected to Disqus. Privacy Statement.

Related Articles

Browse the Newsletter Archive

About the Author

Heinz Kabutz Java Conference Speaker

Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Superpack '23

Superpack '23 Our entire Java Specialists Training in one huge bundle more...

Free Java Book

Dynamic Proxies in Java Book
Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

We can help make your Java application run faster and trouble-shoot concurrency and performance bugs...