Abstract: When the pattern does not match the input, DateFormat can produce seemingly unpredictable results, such as parsing the date 2009-01-28-09:11:12 as "Sun Nov 30 22:07:51 CET 2008". In this newsletter we examine why and also show how DateFormat reacts to concurrent access.
Welcome to the 172nd issue of The Java(tm) Specialists' Newsletter. One of my pet peeves is when I am asked to predict the future of Java. As a Java Champion, I am expected to have a better idea than the average person. The truth is I do not have a clue what will happen to Java or any other technology. When cellular telephones were first invented, I dismissed them as something that would never become successful. Far too expensive and besides, who would want their boss to be able to contact them 24x7? I could not even predict the amazing popularity of my Java Specialist Master Course. My Design Patterns for Delphi course, which I was sure would fly, did not sell a single seat. Recently my Toronto buddy Jean suggested I read the book The Black Swan [ISBN 1400063515], which explains these outliers very nicely and at long last vindicates my "I don't know" answer about the future. It also explains that experts in a field, especially those with a reputation to protect, are notoriously bad at predicting the future as they are too conservative to expect the unexpected. In future, when someone asks me what will happen to Java in the next 5 years, I will take a wild guess and say that Java won't exist in 5 years' time.
A few weeks ago, one of my newsletter readers sent me the following code:
    DateFormat df = new SimpleDateFormat("yyyyMMddHHmmss");
    Date d = df.parse("2009-01-28-09:11:12");
    System.err.println(d);
Since the date format was different to the incoming text, she was getting the rather strange result of "Sun Nov 30 22:07:51 CET 2008".
The SimpleDateFormat is lenient by default and tries to fit our dates into the format as best it can. Whilst doing that, it can produce some strange results. Here is how I think the input gets interpreted:
    pattern: yyyyMMddHHmmss
    input:   2009-01-28-09:11:12

    year   = 2009
    month  = -0
    day    = 1
    hour   = -2
    minute = 8
    second = -09
The year is easy, just 2009. In our interpretation of month, there is no such month as 0. January would be 1. So it would be one month before January, in other words December 2008. The day is the 1st. Next comes the hour, which I would have imagined should have been set to 28, but was read as -2. Perhaps due to the confusing yyyyMMdd start, the time was offset by one character. Since hour is -2, minute is set to 8 and second to -09. If we subtract 2 hours from 1st Dec 2008, we come to 22:00:00 on the 30th Nov 2008. We then add 8 minutes and subtract 9 seconds, thus having 7 minutes and 51 seconds. The end result is 30th Nov 2008 22:07:51.
Similarly, when we have as input "2009-12-31-00:00:00", it will be parsed as:
    pattern: yyyyMMddHHmmss
    input:   2009-12-31-00:00:00

    year   = 2009
    month  = -1
    day    = 2
    hour   = -3
    minute = 1
    second = 0
So we get year 2009 and month -1, which is two months before January 2009, in other words November 2008. The day is the 2nd, but the hour is -3, which takes us back to the 1st of November 2008 at 21:00:00. The minutes are set to 1 and the seconds to 0, so we get the completely incorrect (by more than 12 months) answer of Sat Nov 01 21:01:00 CET 2008.
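The field arithmetic in both examples can be reproduced with a lenient GregorianCalendar. This is only a sketch of the calendar step, not of how SimpleDateFormat splits up the text: the class LenientFieldsDemo and its resolve() helper are made up for illustration, the field values are simply the ones from the two interpretations above, and the time zone is pinned to UTC (an assumption) so that the output does not depend on the machine it runs on.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper class, not part of the original newsletter code.
class LenientFieldsDemo {
  private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

  // monthNumber is 1-based as in the text (January = 1),
  // whereas Calendar.MONTH is 0-based, hence the "- 1" below.
  static String resolve(int year, int monthNumber, int day,
                        int hour, int minute, int second) {
    Calendar cal = new GregorianCalendar(UTC);
    cal.setLenient(true); // already the default, shown for emphasis
    cal.clear();
    // A lenient calendar accepts out-of-range values and rolls them over.
    cal.set(year, monthNumber - 1, day, hour, minute, second);

    SimpleDateFormat out =
        new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
    out.setTimeZone(UTC);
    return out.format(cal.getTime());
  }

  public static void main(String[] args) {
    // Fields read from "2009-01-28-09:11:12"
    System.out.println(resolve(2009, 0, 1, -2, 8, -9));  // 2008-11-30 22:07:51
    // Fields read from "2009-12-31-00:00:00"
    System.out.println(resolve(2009, -1, 2, -3, 1, 0));  // 2008-11-01 21:01:00
  }
}
```

Both results match the dates that the lenient DateFormat reported, which supports the field-by-field interpretation.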
We would not have had this problem if we had specified the DateFormat to be strict, with df.setLenient(false). In that case, parse() would have thrown a ParseException and we would have immediately seen the mistake, rather than have a date that is completely off.
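Here is a small sketch of strict parsing. Only setLenient(false) comes from the text; the StrictParseDemo class and its parseStrict() helper are hypothetical names chosen for this example, and returning null on failure is just a convenience for the demo.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical demo class, not part of the original newsletter code.
class StrictParseDemo {
  // Returns the parsed date, or null if strict parsing rejects the text.
  static Date parseStrict(String pattern, String text) {
    DateFormat df = new SimpleDateFormat(pattern);
    df.setLenient(false); // out-of-range fields are now an error
    try {
      return df.parse(text);
    } catch (ParseException e) {
      System.err.println("Rejected: " + e.getMessage());
      return null;
    }
  }

  public static void main(String[] args) {
    // Pattern and text disagree: rejected instead of silently mangled.
    System.out.println(parseStrict("yyyyMMddHHmmss", "2009-01-28-09:11:12"));
    // Text that matches the reader's pattern.
    System.out.println(parseStrict("yyyyMMddHHmmss", "20090128091112"));
    // Pattern that matches the reader's text.
    System.out.println(parseStrict("yyyy-MM-dd-HH:mm:ss", "2009-01-28-09:11:12"));
  }
}
```

With strictness switched on, the mismatch shows up at the call site rather than much later as a wrong date somewhere downstream.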
Another issue with DateFormat is that it is not thread safe. Since DateFormat is an expensive object to create, you might want to keep a copy available in a static final field. That means, however, that you can only use it from a single thread at a time, otherwise the results are unpredictable.
Take for example the DateConverter class:
    import java.text.*;
    import java.util.Date;

    public class DateConverter {
      private static final DateFormat df =
          new SimpleDateFormat("yyyy/MM/dd");

      public void testConvert(String date) {
        try {
          Date d = df.parse(date);
          String newDate = df.format(d);
          if (!date.equals(newDate)) {
            System.out.println(date + " converted to " + newDate);
          }
        } catch (Exception e) {
          System.out.println(e);
        }
      }
    }
When we call the testConvert() method, we would expect date to always equal newDate. However, I managed to get rather strange results in conversion, such as:
    1971/12/04 converted to 0000/09/-730498
    1971/12/04 converted to 100083/09/02
    1971/12/04 converted to 19711971/12/04
    2001/09/02 converted to 1971/02/04
    2001/09/02 converted to 1977/04/23
In other words, the results bore no relation to the input values. In production, the probability of calling the format() or parse() methods concurrently might be low, so you would see such mangled dates only rarely. However, that is what makes these "black swans" [ISBN 1400063515] even more dangerous, since the values are completely different to what you expected. Imagine trying to work out the interest due on a loan, based on the starting date parsed as "0000/09/-730498".
Here is my test code:
    import java.text.*;
    import java.util.Date;
    import java.util.concurrent.*;

    public class DateConverterTest {
      public static void main(String[] args) {
        ExecutorService pool = Executors.newCachedThreadPool();
        convert(pool, "1971/12/04");
        convert(pool, "2001/09/02");
      }

      private static void convert(ExecutorService pool,
                                  final String date) {
        pool.submit(new Runnable() {
          public void run() {
            DateConverter dc = new DateConverter();
            while (true) {
              dc.testConvert(date);
            }
          }
        });
      }
    }
We can fix the problem of concurrent access to the DateFormat either by synchronizing the testConvert() method or by having a separate DateFormat instance for each thread. Synchronizing introduces contention, so that is probably not the best approach. Instead, we should rather create a ThreadLocal that gives each thread its own DateFormat instance.
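For comparison, the synchronized alternative might look roughly like the sketch below. The class name SynchronizedDateConverter and the helpers roundTrip() and runConcurrently() are hypothetical, and locking on the formatter itself rather than marking the method synchronized is a choice made for this sketch. It counts mismatches over a fixed number of iterations instead of looping forever.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical variant that guards the shared formatter with a lock.
class SynchronizedDateConverter {
  private static final DateFormat df =
      new SimpleDateFormat("yyyy/MM/dd");

  // Parses and re-formats the date; the result should equal the input.
  static String roundTrip(String date) throws ParseException {
    synchronized (df) { // only one thread at a time may touch df
      Date d = df.parse(date);
      return df.format(d);
    }
  }

  // Hammers roundTrip() from two threads and counts wrong results.
  static int runConcurrently(final int iterations) {
    final AtomicInteger mismatches = new AtomicInteger();
    String[] dates = {"1971/12/04", "2001/09/02"};
    Thread[] threads = new Thread[dates.length];
    for (int i = 0; i < dates.length; i++) {
      final String date = dates[i];
      threads[i] = new Thread(() -> {
        for (int n = 0; n < iterations; n++) {
          try {
            if (!date.equals(roundTrip(date))) {
              mismatches.incrementAndGet();
            }
          } catch (ParseException e) {
            mismatches.incrementAndGet();
          }
        }
      });
      threads[i].start();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return mismatches.get();
  }

  public static void main(String[] args) {
    System.out.println("mismatches: " + runConcurrently(100_000));
  }
}
```

This is correct, but every call now queues up behind the same lock, which is the contention mentioned above.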
With ThreadLocal, we want to set the value the first time the thread requests it and then simply use that in future. The easiest way to do that is by overriding the initialValue() method, like so:
    import java.text.*;
    import java.util.Date;

    public class DateConverter {
      private static final ThreadLocal<DateFormat> tl =
          new ThreadLocal<DateFormat>() {
            protected DateFormat initialValue() {
              return new SimpleDateFormat("yyyy/MM/dd");
            }
          };

      public void testConvert(String date) {
        try {
          DateFormat formatter = tl.get();
          Date d = formatter.parse(date);
          String newDate = formatter.format(d);
          if (!date.equals(newDate)) {
            System.out.println(date + " converted to " + newDate);
          }
        } catch (Exception e) {
          System.out.println(e);
        }
      }
    }
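On Java 8 or later, the same per-thread setup can be written more compactly with ThreadLocal.withInitial(). The sketch below assumes such a JDK; the class name PerThreadDateConverter and the helpers roundTrip() and runConcurrently() are hypothetical. It uses a bounded loop so that it terminates and reports how many round trips went wrong.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java 8+ variant of the ThreadLocal approach.
class PerThreadDateConverter {
  // Each thread lazily gets its own SimpleDateFormat on first use.
  private static final ThreadLocal<DateFormat> tl =
      ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy/MM/dd"));

  // Parses and re-formats the date with the calling thread's formatter.
  static String roundTrip(String date) throws ParseException {
    DateFormat formatter = tl.get();
    Date d = formatter.parse(date);
    return formatter.format(d);
  }

  // Runs roundTrip() from two threads without any locking.
  static int runConcurrently(final int iterations) {
    final AtomicInteger mismatches = new AtomicInteger();
    String[] dates = {"1971/12/04", "2001/09/02"};
    Thread[] threads = new Thread[dates.length];
    for (int i = 0; i < dates.length; i++) {
      final String date = dates[i];
      threads[i] = new Thread(() -> {
        for (int n = 0; n < iterations; n++) {
          try {
            if (!date.equals(roundTrip(date))) {
              mismatches.incrementAndGet();
            }
          } catch (ParseException e) {
            mismatches.incrementAndGet();
          }
        }
      });
      threads[i].start();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
    return mismatches.get();
  }

  public static void main(String[] args) {
    System.out.println("mismatches: " + runConcurrently(100_000));
  }
}
```

Since no formatter is ever shared between threads, no lock is needed and the threads do not wait for each other.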
As long as the thread is alive, its thread local value stays set, even if the thread never uses the DateFormat again.
We could instead use a SoftReference as a value for the ThreadLocal, which allows the garbage collector to reclaim an idle DateFormat when memory runs low:
    import java.lang.ref.SoftReference;
    import java.text.*;
    import java.util.Date;

    public class DateConverter {
      private static final ThreadLocal<SoftReference<DateFormat>> tl =
          new ThreadLocal<SoftReference<DateFormat>>();

      private static DateFormat getDateFormat() {
        SoftReference<DateFormat> ref = tl.get();
        if (ref != null) {
          DateFormat result = ref.get();
          if (result != null) {
            return result;
          }
        }
        DateFormat result = new SimpleDateFormat("yyyy/MM/dd");
        ref = new SoftReference<DateFormat>(result);
        tl.set(ref);
        return result;
      }

      public void testConvert(String date) {
        try {
          DateFormat formatter = getDateFormat();
          Date d = formatter.parse(date);
          String newDate = formatter.format(d);
          if (!date.equals(newDate)) {
            System.out.println(date + " converted to " + newDate);
          }
        } catch (Exception e) {
          System.out.println(e);
        }
      }
    }
Now we can use the testConvert() method from as many threads as we want, without any fear of race conditions on the format() or parse() methods.
Kind regards
Heinz