2. Some basics about Java bytecode
First of all, you need to learn some basics about Java bytecode. The best way to learn is to write some example Java code (start with a "hello world" program), compile it, and open the compiled jar in Recaf. Then find your class, right click it, and click Edit -> Edit class in assembler. Here you can see your Java bytecode. Compare it with the Java code you wrote and look for similarities, such as how method invocations, variable accesses, math operations, etc. are done. If you want to read more about bytecode itself and its instructions, there is great documentation covering all JVM instructions. I also highly recommend reading the JVM Manual, which explains most JVM concepts in simple and easy-to-understand terms.
In this section, we will cover some basics about Java bytecode.
In bytecode, there is a concept called "the stack". You might recognize the name from StackOverflowError or from the website stackoverflow.com. We will dig into what exactly the stack is and how bytecode uses it.
Consider this Java code example:
public static void main(String[] args) {
    System.out.println("Hello World!");
}
This is its bytecode:
getstatic java/lang/System.out Ljava/io/PrintStream;
ldc "Hello World!"
invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V
Let's break down these instructions.
getstatic java/lang/System.out Ljava/io/PrintStream; # Stack: (System.out)
ldc "Hello World!" # Stack: (System.out, "Hello World!")
# Stack operations:
# 1. Pop the top value - The argument for "println" method
# Stack before: (System.out, "Hello World!")
# Stack after: (System.out)
# 2. Pop the top value - The object the method is invoked from
# Stack before: (System.out)
# Stack after: ()
invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V
The name of an instruction is called an opcode. Here are the opcodes used in the example:
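- `getstatic` - Gets the value of a static field and pushes it onto the stack.
- `ldc` - Loads a constant value onto the stack.
- `invokevirtual` - Invokes an instance method on an object, popping the arguments and the object from the stack.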
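Here you can see that the stack is used to pass arguments to methods and to hold the object the method is invoked on. The stack also receives the return value of a method: when a method returns something, that value is pushed onto the stack. Since println returns void, nothing is pushed in this example.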
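To make the pushes and pops concrete, here is a minimal sketch that models the stack by hand with a Deque. It is only an illustration, not how the JVM is implemented; the class and method names (OperandStackSketch, printHello, lengthOfHello) are made up for this example. The second method uses String.length as an assumed extra example to show a return value being pushed.

```java
import java.io.PrintStream;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a hand-written model of the stack, not real JVM internals.
public class OperandStackSketch {

    // Mirrors: getstatic System.out, ldc "Hello World!", invokevirtual println.
    // Returns the stack depth after the call.
    static int printHello(PrintStream out) {
        Deque<Object> stack = new ArrayDeque<>(); // last element = top of the stack

        stack.addLast(out);            // getstatic: push the value of the static field
        stack.addLast("Hello World!"); // ldc: push the constant

        // invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V
        String argument = (String) stack.removeLast();           // 1. pop the argument
        PrintStream receiver = (PrintStream) stack.removeLast(); // 2. pop the object
        receiver.println(argument); // returns void, so nothing is pushed back

        return stack.size();
    }

    // Mirrors: ldc "Hello World!", invokevirtual java/lang/String.length ()I
    static int lengthOfHello() {
        Deque<Object> stack = new ArrayDeque<>();
        stack.addLast("Hello World!");                  // ldc: push the constant
        String receiver = (String) stack.removeLast();  // pop the object
        stack.addLast(receiver.length());               // push the int return value
        return (Integer) stack.peekLast();              // value now on top of the stack
    }

    public static void main(String[] args) {
        System.out.println("Depth after println: " + printHello(System.out));
        System.out.println("Top of stack after length(): " + lengthOfHello());
    }
}
```

Running main prints the greeting first, then shows that the stack is empty after println and that the result of length() ends up on top of the stack.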
Let's now break down the syntax of these instructions:
getstatic (class name).(field name) (field descriptor)
ldc (any constant value)
invokevirtual (class name).(method name) (method descriptor)
The class name, field name and method name are self-explanatory. But what are the field descriptor and the method descriptor?
- Field descriptor - Describes the type of the field. For example, `Ljava/lang/String;` is the descriptor for a field of type `String`. Equivalent to `public String someName;`
- Method descriptor - Describes the method signature (argument types and return type). For example, `(ILjava/lang/String;)Z` means that the method takes an `int` as its first argument and a `String` as its second argument, and returns a `boolean`. Equivalent to `public boolean someName(int arg1, String arg2)`.
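If you want to check a descriptor without reading it by hand, the standard java.lang.invoke.MethodType class can convert between Java types and method descriptor strings. The sketch below is just one way to experiment; the class name DescriptorSketch and its helper methods are made up for this example.

```java
import java.lang.invoke.MethodType;

// Illustrative helper for experimenting with method descriptors.
public class DescriptorSketch {

    // Builds the descriptor for: boolean someName(int arg1, String arg2)
    static String describeExample() {
        MethodType type = MethodType.methodType(boolean.class, int.class, String.class);
        return type.toMethodDescriptorString();
    }

    // Parses a descriptor string back into its argument and return types.
    static MethodType parse(String descriptor) {
        return MethodType.fromMethodDescriptorString(
                descriptor, DescriptorSketch.class.getClassLoader());
    }

    public static void main(String[] args) {
        System.out.println(describeExample());

        // The descriptor of println from the example above.
        MethodType println = parse("(Ljava/lang/String;)V");
        System.out.println(println.parameterType(0).getName()
                + " -> " + println.returnType().getName());
    }
}
```

Here describeExample() produces (ILjava/lang/String;)Z, and parsing the println descriptor gives one java.lang.String parameter and a void return type.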
Here you can find the list of all descriptors: https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.3. You need to scroll down a bit to find the table with descriptors. If you can't find it then hit CTRL+F and search for Table 4.3-A. Interpretation of field descriptors.