Abstract: We look at the difference in bytecode between using ints and bytes. We also compare a += b and the longer a = a + b. They are not equivalent.
Welcome to the 69th edition of The Java(tm) Specialists' Newsletter. After my last newsletter, the listserver told me that quite a few of you did not receive the newsletter. If you are reading this on the internet, and you did not receive this newsletter, be warned - you have probably been deleted from the list! We now have approximately 5593 working email addresses.
Why can we not treat everyone equally? A few weeks ago, my four year old son Maximilian wanted to know the answer to that age-old question. Maxi's 3 year old cousin was visiting and was allowed to stay up and watch cartoons on TV, whilst he had to go to bed to sleep. "Sorry, Maxi, but as you will soon find out: Life is not fair! There is nothing I can do to change that, that is just the way it is."
A well known person (so well known that I have forgotten who it was) once said: "Everyone that I meet is my superior in some way"
This newsletter is about some operations who treat types equally (when perhaps they should not), and others, who due to their own feelings of inadequacy, do not. It all gets rather confusing, as you will soon find out. Remember, if you don't understand this newsletter: Life is not fair! Read the Java VM Spec. Write to the editor of this fine newsletter (editor@dev/null). But don't moan.
I want to thank all those who make the effort to write to me to correct errors in my newsletters. Please continue your good work. It makes the newsletter more useful when we remove the gremlins.
What is the purpose of a byte?
Let me ask that question a bit differently: Why would you use a byte, as opposed to an int?
To save memory? Only since JDK 1.4.x does a byte field take less memory than an int field. To increase computational speed? Maybe the opcodes for adding bytes are faster than for ints?
Perhaps byte should have been left out of the Java Programming Language? Not just byte, but also short and char?
The original Oak specification, on which Java is based, had provision for unsigned integral values, but they did not make it, so why do we have those other types?
Ahh, but a byte[] will take up less space than a char[]. That is true, so they do have a reason to exist! But what about computation?
"Most of the instructions in the Java virtual machine instruction set encode type information about the operations they perform. For instance, the iload instruction loads the contents of a local variable, which must be an int, onto the operand stack. The fload instruction does the same with a float value. The two instructions may have identical implementations, but have distinct opcodes." - VM Spec, The Structure of the Java Virtual Machine.
This brings along a slight problem: there are too many types in Java for the instruction set. Java has only one byte per opcode, so there are a maximum of 256 opcodes. There is therefore great pressure to have as few opcodes as possible.
The 18 opcodes that are defined for int, and not for short, char and byte, are: iconst, iload, istore, iinc, iadd, isub, imul, idiv, irem, ineg, ishl, ishr, iushr, iand, ior, ixor, if_icmpOP, ireturn. If these were also defined for the other three primitive types, we would require at least an additional 54 opcodes.
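Here is a small sketch of what the missing byte opcodes mean at the source level. The class and method names (PromotionDemo, addBytes) are made up for this illustration. Because there is no byte version of iadd, the sum of two bytes has type int and has to be narrowed explicitly:

```java
class PromotionDemo {
    // Adds two bytes. The addition itself is done on ints,
    // so the result has to be cast back to byte by hand.
    static byte addBytes(byte a, byte b) {
        // byte sum = a + b;    // would not compile: a + b has type int
        return (byte) (a + b);  // explicit narrowing back to byte
    }

    public static void main(String[] args) {
        byte a = 100;
        byte b = 100;
        int wide = a + b;             // 200, computed as an int
        byte narrow = addBytes(a, b); // 200 does not fit in a byte, wraps to -56
        System.out.println("int result:  " + wide);
        System.out.println("byte result: " + narrow);
    }
}
```

The same promotion applies to short and char operands.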
There is only one opcode that is marked as "unused". It is opcode 0xBA. Go figure. Probably the BSc Computer Science nerds having a dig at all the BA's that are unused ;-) Fries with that? (Since Java 7, opcode 0xBA has been assigned to invokedynamic.)
What does that mean for you and me? Let's look at a code snippet, sent to me by Jeremy Meyer. Jeremy helped me get my first job at a company called DataFusion Systems in South Africa, now called DataVoice, and part of a bigger company called Spescom. DataVoice are probably not hiring anyone at the moment, but they are one fine company to work for, so if ever you are offered a job there, take it at any price! Even working there for free would be a bargain, considering what you will learn there. Thanks Jeremy!
public class ByteFoolish {
  public static void main(String[] args) {
    int i = 128;
    byte b = 0;
    b |= i;
    System.out.println("Byte is " + b);
    i = 0;
    i |= b;
    System.out.println("Int is " + i);
  }
}
When we run this, we see the following:
Byte is -128
Int is -128
Here we start with a value bigger than 127 (the maximum positive byte value). We store it in an int, and then OR the bits into the byte.
The first System.out.println statement naturally reports the value of the byte as -128, which one would expect. The byte is, after all, signed and has range -128 to +127. Then we reset the int to 0, and OR the value of the byte back into the int. The int now has the value of -128!
How could we OR the int with just the bits that belong to the last byte? We could first bitwise AND the byte with a mask that keeps only the lowest eight bits, and then OR the result into the int.
public class ByteFoolish2 {
  public static void main(String[] args) {
    int i = 128;
    byte b = 0;
    b |= i;
    System.out.println("Byte is " + b);
    i = 0;
    i |= (b & 0x000000FF);
    System.out.println("Int is " + i);
  }
}
Now when we run the program, we see:
Byte is -128
Int is 128
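To see why the mask matters, it helps to print the bits. The sketch below uses made-up names (SignExtensionDemo, widenSigned, widenMasked). It also calls Byte.toUnsignedInt, which only exists from Java 8 onwards, so treat that line as an optional extra on newer JDKs:

```java
class SignExtensionDemo {
    // Plain widening: the sign bit of the byte is copied into the upper 24 bits.
    static int widenSigned(byte b) {
        return b;
    }

    // Masked widening: only the lowest eight bits survive.
    static int widenMasked(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 128; // bit pattern 1000 0000, value -128
        // prints 11111111111111111111111110000000
        System.out.println(Integer.toBinaryString(widenSigned(b)));
        // prints 10000000
        System.out.println(Integer.toBinaryString(widenMasked(b)));
        // Java 8+ only: same result as the mask, prints 128
        System.out.println(Byte.toUnsignedInt(b));
    }
}
```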
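Bitwise arithmetic with byte, short and char is challenging. Inside the JVM, these values are first widened to ints, worked on, and then narrowed back to the smaller type. A byte is sign-extended when it is widened, which is why the -128 survived the trip back into the int in the first example. Let's disassemble the class to make sure that this is what is happening: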
public class ByteFoolish3 {
  public ByteFoolish3() {
    int i = 128;
    byte b = 0;
    b |= i;
    i = 0;
    i |= b;
  }
}
We disassemble with javap -c ByteFoolish3 (you know how to do that by now):
   0 aload_0
   1 invokespecial #9 <Method java.lang.Object()>
   4 sipush 128     // push 128 onto stack
   7 istore_1       // store in int register 1
   8 iconst_0       // push constant "0" onto stack
   9 istore_2       // store in int register 2
  10 iload_2        // load register 2
  11 iload_1        // load register 1
  12 ior            // OR them together as ints
  13 i2b            // convert the int on the stack to a byte
  14 istore_2       // store the value in register 2
  15 iconst_0       // push constant "0" onto stack
  16 istore_1       // store this in int register 1
  17 iload_1        // load register 1
  18 iload_2        // load register 2
  19 ior            // OR them together
  20 istore_1       // store them in register 1
  21 return
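The i2b at offset 13 is the interesting part: nobody wrote a cast in the source, yet the bytecode narrows the result. As a rough sketch (ExplicitOrDemo and its two methods are names invented here), the compound form should behave like the hand-written version with an explicit cast:

```java
class ExplicitOrDemo {
    // Compound assignment: the compiler narrows the int result for us.
    static byte orCompound(byte b, int i) {
        b |= i;
        return b;
    }

    // The same operation spelled out with an explicit cast.
    static byte orExplicit(byte b, int i) {
        return (byte) (b | i);
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 127, 128, 255, 256, -1};
        for (int i : samples) {
            byte viaCompound = orCompound((byte) 0, i);
            byte viaExplicit = orExplicit((byte) 0, i);
            // Both columns show the same value for every sample.
            System.out.println(i + ": " + viaCompound + " " + viaExplicit);
        }
    }
}
```

Comparing javap -c output for the two methods is a quick way to check whether your compiler produces the same ior followed by i2b for both.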
I was sitting in a seminar on Refactoring by Martin Fowler a few years ago. The things Martin was saying sounded like music to my ears. I had refactored my code for many years, but had never heard such a thorough approach to the subject. The one thing that stuck in my mind was the difference between i += n and i = i + n.
Would the following compile?
public class Test1 {
  public static void main(String[] args) {
    int i = 128;
    double d = 3.3234123;
    i = i + d;
    System.out.println("i is " + i);
  }
}
The answer is that it would not compile. A double is 64 bits with a very big range. There is no way that it would fit into an int without losing precision. It is not safe to run such code, so when we attempt to compile it, we get the following message:
Test1.java:5: possible loss of precision
found   : double
required: int
    i = i + d;
          ^
1 error
This is good. We pick up errors before we run the program. Y'all know how to cast a double to an int if you definitely want to do that. You can either cast the values individually before you add, or you can cast the result:
public class Test2 {
  public static void main(String[] args) {
    int i = 128;
    double d = Integer.MAX_VALUE + 12345.33;
    i = i + (int)d;
    System.out.println("i1 is " + i);
    i = 128;
    i = (int)(i + d);
    System.out.println("i2 is " + i);
  }
}
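In a way, I would expect both i's to have the same value, but they are not equal. In the first case the double is narrowed to Integer.MAX_VALUE and the int addition then overflows; in the second case the addition is done as a double and only the final result is narrowed: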
i1 is -2147483521
i2 is 2147483647
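The two results can be picked apart step by step. In this sketch the names NarrowingDemo, castThenAdd and addThenCast are invented for the illustration; the values are the ones from Test2:

```java
class NarrowingDemo {
    // Narrow the double first, then add as ints (the int addition can overflow).
    static int castThenAdd(int i, double d) {
        return i + (int) d;
    }

    // Add as doubles first, then narrow the sum once.
    static int addThenCast(int i, double d) {
        return (int) (i + d);
    }

    public static void main(String[] args) {
        double d = Integer.MAX_VALUE + 12345.33; // too big for an int

        // Casting an out-of-range double to int clamps it to Integer.MAX_VALUE.
        System.out.println((int) d == Integer.MAX_VALUE); // prints true

        // 128 + Integer.MAX_VALUE wraps around as an int.
        System.out.println(castThenAdd(128, d)); // prints -2147483521

        // The double sum is clamped once, at the very end.
        System.out.println(addThenCast(128, d)); // prints 2147483647
    }
}
```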
Let's have a look at the next class, Test3:
public class Test3 {
  public static void main(String[] args) {
    int i = 128;
    double d = Integer.MAX_VALUE + 12345.33;
    i += d; // oops, forgot to cast!
    System.out.println("i is " + i);
  }
}
Oops, we forgot to cast! Does this compile? Yes, it does, and when we run it, we get:
i is 2147483647
Therefore, we can say that i += n is the same as i = (type_of_i)(i + n), with the cast inserted silently by the compiler and with i evaluated only once.
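A short sketch to check both halves of that statement. The names CompoundAssignDemo, compound, explicit, nextIndex and evaluationsOfIndex are invented for this example. The first pair of methods compares the compound form with the explicit cast; the last method counts how often the left-hand side is evaluated:

```java
class CompoundAssignDemo {
    static int calls = 0;

    // Has a side effect so that we can count how often it is evaluated.
    static int nextIndex() {
        return calls++;
    }

    // Compound form: compiles without a cast, narrows silently.
    static int compound(int i, double d) {
        i += d;
        return i;
    }

    // Explicit form of the same computation.
    static int explicit(int i, double d) {
        return (int) (i + d);
    }

    // Runs counts[nextIndex()] += 5 and reports how often nextIndex() was called.
    static int evaluationsOfIndex() {
        calls = 0;
        int[] counts = new int[2];
        counts[nextIndex()] += 5; // the index expression is evaluated only once
        return calls;
    }

    public static void main(String[] args) {
        double d = Integer.MAX_VALUE + 12345.33;
        System.out.println(compound(128, d) == explicit(128, d)); // prints true
        System.out.println(evaluationsOfIndex());                 // prints 1
    }
}
```

The silent cast is convenient for byte counters, but it also hides exactly the kind of narrowing that the compiler rejected in Test1.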
Kind regards, and thanks for the great feedback after the last newsletter!
Heinz