Hacking Java Bytecode for Programmers (Part3) – Yes, disassemble with Javap ALL OVER THE PLACE!

May 28, 2013

Introduction

In Part 2, I showed you, at a high level, what Java opcodes are, and I walked you through how to manipulate Strings inside compiled code. I’ve actually used the exact method discussed in that post to bypass some sanity checks in a Java application I was reverse engineering.

As always, refer to the previous posts if you need to catch up.

Our goal for this blog post is to learn how to use the disassembler javap to understand compiled code. We will then use that knowledge to manipulate the flow control of our program, changing the output from “Access Revoked!” to “Access Granted!” by modifying only the compiled class file, which simulates a privilege escalation.

Setting Up For Success

For this next example, modify your User.java file to mirror mine.

public class User {
 
        protected boolean authenticated = false;
 
        public void setAuthenticatedTrue() {
                this.authenticated = true;
        }
 
        public static void main(String[] args) {
                User user = new User();
                user.run();
        }
 
        public void run() {
                if (this.authenticated == false) {
                        System.out.println("Access Revoked!");
                }
                else {
                        System.out.println("Access Granted!");
                }
        }
}

Compile the file using javac.

$ javac User.java

Again, let’s pretend that we were never given the source file. For clarity, rename the source file to User.java.del.

$ mv User.java User.java.del
$ ls
User.class  User.java.del

Just like the last scenario, despite the fact that you do not have the source code, you still have the compiled class file that the JVM can execute. Let’s run it now.

$ java User 
Access Revoked!
$

Disassemble with Javap, Steph-a-nie

Javap is a disassembler. It analyzes a compiled class file and dumps out incredibly useful information, enabling us to reverse engineer unobfuscated Java code fairly easily. The output can be overwhelming for newbs, but don’t worry, we will step through it.

Before we begin, do a quick sanity check on the file to validate that it is binary data using file.

$ file -i User.class
User.class: application/x-java-applet; charset=binary
$
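If you would rather do the same sanity check from Java, here is a minimal sketch. It leans on two facts about the class file format: every class file starts with the four magic bytes CA FE BA BE, and the major version sits in bytes 6 and 7 (a class compiled by Java 7 reports 51). The class name ClassHeader and its methods are names I made up for illustration; they are not part of the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not a JDK API): inspects the fixed-size class file header.
final class ClassHeader {
    // True if the data starts with the class file magic number CA FE BA BE.
    static boolean isClassFile(byte[] b) {
        return b.length >= 8
                && (b[0] & 0xFF) == 0xCA && (b[1] & 0xFF) == 0xFE
                && (b[2] & 0xFF) == 0xBA && (b[3] & 0xFF) == 0xBE;
    }

    // Major version is a big-endian unsigned short stored at bytes 6 and 7.
    static int majorVersion(byte[] b) {
        return ((b[6] & 0xFF) << 8) | (b[7] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        String name = args.length > 0 ? args[0] : "User.class";
        byte[] bytes = Files.readAllBytes(Paths.get(name));
        System.out.println("class file: " + isClassFile(bytes));
        if (isClassFile(bytes)) {
            System.out.println("major version: " + majorVersion(bytes));
        }
    }
}
```

Run it from the directory that holds User.class, or pass a path as the first argument.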

After confirming that it is indeed binary, and with no flags specified, run javap against our User.class file.

$ javap User.class
Compiled from "User.java"
public class User {
  protected boolean authenticated;
  public User();
  public void setAuthenticatedTrue();
  public static void main(java.lang.String[]);
  public void run();
}
$

Well, that is cool, right? We can see that the binary file contains our public class User. We can also see that the class contains the protected field authenticated along with our public methods. But that still doesn’t do much for us. Let’s add some flags to see if we can get even more output.

Try adding the -c flag and run the command again.

$ javap -c User.class 
Compiled from "User.java"
public class User {
  protected boolean authenticated;
 
  public User();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: aload_0       
       5: iconst_0      
       6: putfield      #2                  // Field authenticated:Z
       9: return        
 
  public void setAuthenticatedTrue();
    Code:
       0: aload_0
       1: iconst_1      
       2: putfield      #2                  // Field authenticated:Z
       5: return        
 
  public static void main(java.lang.String[]);
    Code:
       0: new           #3                  // class User
       3: dup
       4: invokespecial #4                  // Method "<init>":()V
       7: astore_1      
       8: aload_1       
       9: invokevirtual #5                  // Method run:()V
      12: return        
 
  public void run();
    Code:
       0: aload_0
       1: getfield      #2                  // Field authenticated:Z
       4: ifne          18
       7: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return        
}
$

Now that looks much more useful!

I’ve highlighted our initial class, protected attribute, and public method definitions in yellow to give you a bit of context.

img

Next, let’s focus on the method blocks, which I’ve highlighted in red.

img

You can see that directly inside the block definitions is the instruction listing that makes up the method itself. These are the opcode instructions that, when read top to bottom, tell you exactly what is going on.

Given that our eventual goal is to escalate privileges in our application by printing “Access Granted!”, we need to focus on the run() method.

 public void run();
    Code:
       0: aload_0       
       1: getfield      #2                  // Field  authenticated:Z
       4: ifne          18
       7: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return

There are a total of five columns to be read, and I’ve separated them in blue.

img

The Offset, Mnemonic, Field, Type, and the ASCII representation.

img

MOAR DISASSEMBLE!

Lets break this down a bit more.

At offset 0 we find the aload_0 mnemonic, and it has nothing in its field column.

 public void run();
    Code:
       0: aload_0
       1: getfield      #2                  // Field  authenticated:Z
       4: ifne          18
       7: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return

If we check the Wikipedia Java bytecode instruction listing and search for aload_0, we find the following.

The opcode for aload_0 in hexadecimal is 2A, and the description of the opcode states that it will “load a reference onto the stack from local variable 0”.

What is stored at local variable 0? Simple, a reference to this. Basically a pointer to our User object.

Let’s take a look at the next instruction.

At Offset 1 we see that the mnemonic is getfield.

 public void run();
    Code:
       0: aload_0
       1: getfield      #2                  // Field  authenticated:Z
       4: ifne          18
       7: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return

Again, we need to refer to our reference, which says that getfield has a bytecode value of B4. The description states that it will “get a field value of an object objectref, where the field is identified by field reference in the constant pool index (index1 << 8 + index2)”.

What the description is trying to communicate is that getfield reads the index in its field column, #2, looks up entry #2 in the Constant Pool to find out which field is meant, and pushes that field’s value onto the stack for use by the next instruction.
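To make the index arithmetic concrete, here is a small sketch. In the class file, getfield (B4) is followed by two operand bytes, here 00 and 02, and combining them high byte first gives constant pool index 2. One thing to watch for: in Java, + binds tighter than <<, so the formula from the quote needs parentheses when you write it as code. The class and method names are purely illustrative.

```java
// Illustrative only: combines the two operand bytes that follow getfield (0xB4).
final class OperandIndex {
    static int constantPoolIndex(int index1, int index2) {
        // High byte first (big-endian); the masks treat each value as an unsigned byte.
        return ((index1 & 0xFF) << 8) | (index2 & 0xFF);
    }

    public static void main(String[] args) {
        byte[] code = {(byte) 0xB4, 0x00, 0x02}; // getfield #2
        System.out.println("#" + constantPoolIndex(code[1], code[2])); // prints "#2"
    }
}
```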

“What is the Constant Pool?” you ask.

Let’s talk about that in the next section.

Correlating the Constant Pool

Run javap with the -verbose flag over your User.class.

$ javap -verbose  User.class 
Classfile /home/thedude/hackingjavabytecode/UserA/User.class
  Last modified May 23, 2013; size 688 bytes
  MD5 checksum 3d29c83d00756657d7d14a9b016f51d8
  Compiled from "User.java"
public class User
  SourceFile: "User.java"
  minor version: 0
  major version: 51
  flags: ACC_PUBLIC, ACC_SUPER
Constant pool:
   #1 = Methodref          #10.#24        //  java/lang/Object."<init>":()V
   #2 = Fieldref           #3.#25         //  User.authenticated:Z
   #3 = Class              #26            //  User
   #4 = Methodref          #3.#24         //  User."<init>":()V
   #5 = Methodref          #3.#27         //  User.run:()V
   #6 = Fieldref           #28.#29        //  java/lang/System.out:Ljava/io/PrintStream;
   #7 = String             #30            //  Access Revoked!
   #8 = Methodref          #31.#32        //  java/io/PrintStream.println:(Ljava/lang/String;)V
   #9 = String             #33            //  Access Granted!
  #10 = Class              #34            //  java/lang/Object
  #11 = Utf8               authenticated
  #12 = Utf8               Z
  #13 = Utf8               <init>
  #14 = Utf8               ()V
  #15 = Utf8               Code
  #16 = Utf8               LineNumberTable
  #17 = Utf8               setAuthenticatedTrue
  #18 = Utf8               main
  #19 = Utf8               ([Ljava/lang/String;)V
  #20 = Utf8               run
  #21 = Utf8               StackMapTable
  #22 = Utf8               SourceFile
  #23 = Utf8               User.java
  #24 = NameAndType        #13:#14        //  "<init>":()V
  #25 = NameAndType        #11:#12        //  authenticated:Z
  #26 = Utf8               User
  #27 = NameAndType        #20:#14        //  run:()V
  #28 = Class              #35            //  java/lang/System
  #29 = NameAndType        #36:#37        //  out:Ljava/io/PrintStream;
  #30 = Utf8               Access Revoked!
  #31 = Class              #38            //  java/io/PrintStream
  #32 = NameAndType        #39:#40        //  println:(Ljava/lang/String;)V
  #33 = Utf8               Access Granted!
  #34 = Utf8               java/lang/Object
  #35 = Utf8               java/lang/System
  #36 = Utf8               out
  #37 = Utf8               Ljava/io/PrintStream;
  #38 = Utf8               java/io/PrintStream
  #39 = Utf8               println
  #40 = Utf8               (Ljava/lang/String;)V
{
  protected boolean authenticated;
    flags: ACC_PROTECTED
 
  public User();
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: aload_0
         5: iconst_0
         6: putfield      #2                  // Field authenticated:Z
         9: return
      LineNumberTable:
        line 1: 0
        line 3: 4
 
  public void setAuthenticatedTrue();
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: iconst_1
         2: putfield      #2                  // Field authenticated:Z
         5: return
      LineNumberTable:
        line 6: 0
        line 7: 5
 
  public static void main(java.lang.String[]);
    flags: ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=2, args_size=1
         0: new           #3                  // class User
         3: dup
         4: invokespecial #4                  // Method "<init>":()V
         7: astore_1
         8: aload_1
         9: invokevirtual #5                  // Method run:()V
        12: return
      LineNumberTable:
        line 10: 0
        line 11: 8
        line 12: 12
 
  public void run();
    flags: ACC_PUBLIC
    Code:
      stack=2, locals=1, args_size=1
         0: aload_0
         1: getfield      #2                  // Field authenticated:Z
         4: ifne          18
         7: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
        10: ldc           #7                  // String Access Revoked!
        12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        15: goto          26
        18: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
        21: ldc           #9                  // String Access Granted!
        23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
        26: return        
      LineNumberTable:
        line 15: 0
        line 16: 7
        line 19: 18
        line 21: 26
      StackMapTable: number_of_entries = 2
           frame_type = 18 /* same */
           frame_type = 7 /* same */
 
}
$

Don’t get overwhelmed. Instead of focusing on the entire javap dump, I want you just to concentrate on only this particular part. The Constant Pool.

Constant pool:
   #1 = Methodref          #10.#24        //  java/lang/Object."<init>":()V
   #2 = Fieldref           #3.#25         //  User.authenticated:Z
   #3 = Class              #26            //  User
   #4 = Methodref          #3.#24         //  User."<init>":()V
   #5 = Methodref          #3.#27         //  User.run:()V
   #6 = Fieldref           #28.#29        //  java/lang/System.out:Ljava/io/PrintStream;
   #7 = String             #30            //  Access Revoked!
   #8 = Methodref          #31.#32        //  java/io/PrintStream.println:(Ljava/lang/String;)V
   #9 = String             #33            //  Access Granted!
  #10 = Class              #34            //  java/lang/Object
  #11 = Utf8               authenticated
  #12 = Utf8               Z
  #13 = Utf8               <init>
  #14 = Utf8               ()V
  #15 = Utf8               Code
  #16 = Utf8               LineNumberTable
  #17 = Utf8               setAuthenticatedTrue
  #18 = Utf8               main
  #19 = Utf8               ([Ljava/lang/String;)V
  #20 = Utf8               run
  #21 = Utf8               StackMapTable
  #22 = Utf8               SourceFile
  #23 = Utf8               User.java
  #24 = NameAndType        #13:#14        //  "<init>":()V
  #25 = NameAndType        #11:#12        //  authenticated:Z
  #26 = Utf8               User
  #27 = NameAndType        #20:#14        //  run:()V
  #28 = Class              #35            //  java/lang/System
  #29 = NameAndType        #36:#37        //  out:Ljava/io/PrintStream;
  #30 = Utf8               Access Revoked!
  #31 = Class              #38            //  java/io/PrintStream
  #32 = NameAndType        #39:#40        //  println:(Ljava/lang/String;)V
  #33 = Utf8               Access Granted!
  #34 = Utf8               java/lang/Object
  #35 = Utf8               java/lang/System
  #36 = Utf8               out
  #37 = Utf8               Ljava/io/PrintStream;
  #38 = Utf8               java/io/PrintStream
  #39 = Utf8               println
  #40 = Utf8               (Ljava/lang/String;)V

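Upon compilation, the Java compiler does some cool stuff. One of the things it creates is a pool of constants (names, type descriptors, string literals, and references to classes, fields, and methods) that is stored once in the class file. The instructions then reference these values by index when they run.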

Personally, I try to visualize the Constant Pool as a multidimensional array of variables that the JVM can reference.

constant_pool = [
  ['#1','Methodref','#10.#24','java/lang/Object."<init>":()V'],
  ['#2','Fieldref','#3.#25','User.authenticated:Z'],
  ['#3','Class','#26','User'],
  ['#4','Methodref','#3.#24','User."<init>":()V'],
  ['#5','Methodref','#3.#27','User.run:()V'],
  ['#6','Fieldref','#28.#29','java/lang/System.out:Ljava/io/PrintStream;'],
  ['#7','String','#30','Access Revoked!'],
  ['#8','Methodref','#31.#32','java/io/PrintStream.println:(Ljava/lang/String;)V'],
  ['#9','String','#33','Access Granted!'],
  ['#10','Class','#34','java/lang/Object'],
  ['#11','Utf8','authenticated'],
  ['#12','Utf8','Z'],
  ['#13','Utf8','<init>'],
  ['#14','Utf8','()V'],
  ['#15','Utf8','Code'],
  ['#16','Utf8','LineNumberTable'],
  ['#17','Utf8','setAuthenticatedTrue'],
  ['#18','Utf8','main'],
  ['#19','Utf8','([Ljava/lang/String;)V'],
  ['#20','Utf8','run'],
  ['#21','Utf8','StackMapTable'],
  ['#22','Utf8','SourceFile'],
  ['#23','Utf8','User.java'],
  ['#24','NameAndType','#13:#14','"<init>":()V'],
  ['#25','NameAndType','#11:#12','authenticated:Z'],
  ['#26','Utf8','User'],
  ['#27','NameAndType','#20:#14','run:()V'],
  ['#28','Class','#35','java/lang/System'],
  ['#29','NameAndType','#36:#37','out:Ljava/io/PrintStream;'],
  ['#30','Utf8','Access Revoked!'],
  ['#31','Class','#38','java/io/PrintStream'],
  ['#32','NameAndType','#39:#40','println:(Ljava/lang/String;)V'],
  ['#33','Utf8','Access Granted!'],
  ['#34','Utf8','java/lang/Object'],
  ['#35','Utf8','java/lang/System'],
  ['#36','Utf8','out'],
  ['#37','Utf8','Ljava/io/PrintStream;'],
  ['#38','Utf8','java/io/PrintStream'],
  ['#39','Utf8','println'],
  ['#40','Utf8','(Ljava/lang/String;)V']
]

Please realize this Python-style example of the Constant Pool isn’t functionally accurate, and it is not meant to be. It is just meant to share an idea. By allowing this abstract concept to take the shape of an array, it will be way easier for you to visualize how to operate over these values as well as manipulate them.

Now we need to understand how the Constant Pool values correlate to the opcode instructions in our run() method.

Tracing Constant Pool Values using Javap Output

Be aware that we are going to feel like a pinball for a bit as we bounce around the references in the Constant Pool.

So where were we? Oh yeah, Offset 1 is getfield.

 public void run();
    Code:
       0: aload_0
       1: getfield      #2                  // Field  authenticated:Z
       4: ifne          18
       7: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return

Use this list along with the diagram gif I created, to trace the references between the Opcodes and the Constant Pool.

  1. Index #2 is itself a reference to #3.#25. Let’s trace index #3 first.
  2. Index #3 is a Class entry that references index #26.
  3. Index #26 is the Utf8 string User.
  4. Index #25 is a NameAndType entry that references #11:#12. (This is the latter half of #3.#25.)
  5. Index #11 is the Utf8 string authenticated.
  6. Index #12 is the Utf8 string Z, which denotes a boolean (see the prefix table below).

img

Prefix	Type

i	integer
l	long
s	short
b	byte
c	character
f	float
d	double
z	boolean
a	reference

Now, even though we walked through this using that gif, we didn’t really have to. Let’s look closely at index #2 again.

   #2 = Fieldref           #3.#25         //  User.authenticated:Z

You can see that right next to #2, written in ASCII, is the User class with the authenticated attribute. Following the attribute name, after the colon, is Z, the type descriptor that specifies a boolean.
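One caveat about the prefix table above: it lists the letters used in opcode mnemonics, while the strings stored in the Constant Pool are field descriptors, which use uppercase letters and differ in a couple of places. In descriptors, Z is boolean, J (not L) is long, L followed by a class name and a semicolon is an object reference, and a leading [ marks an array. Here is a rough sketch of a decoder; the class and method names are mine, and it only handles field descriptors, not method signatures like ()V.

```java
// Illustrative helper: maps a JVM field descriptor to a readable Java type name.
final class Descriptors {
    static String describe(String descriptor) {
        switch (descriptor.charAt(0)) {
            case 'Z': return "boolean";
            case 'B': return "byte";
            case 'C': return "char";
            case 'S': return "short";
            case 'I': return "int";
            case 'J': return "long";
            case 'F': return "float";
            case 'D': return "double";
            case 'L': // Lsome/pkg/Name; is an object reference
                return descriptor.substring(1, descriptor.length() - 1).replace('/', '.');
            case '[': // one leading [ per array dimension
                return describe(descriptor.substring(1)) + "[]";
            default:
                throw new IllegalArgumentException("Unknown descriptor: " + descriptor);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("Z"));                     // boolean
        System.out.println(describe("Ljava/io/PrintStream;")); // java.io.PrintStream
        System.out.println(describe("[Ljava/lang/String;"));   // java.lang.String[]
    }
}
```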

Why did we do this?

Tracing these values down was an incredibly useful exercise in understanding how they are actually stored.
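If it helps, the trace we just did by hand can be sketched in a few lines of Java. This is a toy model under loud assumptions: it contains only the six entries reachable from #2, typed in by hand from the javap output, and the Entry class and resolve method are names I invented. It is not a class file parser.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the constant pool entries reachable from #2 (hand-copied, hypothetical API).
final class ConstantPoolTrace {
    // An entry has a tag plus either literal text (Utf8) or indexes of other entries.
    static final class Entry {
        final String tag;
        final String text;
        final int[] refs;

        Entry(String tag, String text, int... refs) {
            this.tag = tag;
            this.text = text;
            this.refs = refs;
        }
    }

    static final Map<Integer, Entry> POOL = new HashMap<>();
    static {
        POOL.put(2, new Entry("Fieldref", null, 3, 25));
        POOL.put(3, new Entry("Class", null, 26));
        POOL.put(11, new Entry("Utf8", "authenticated"));
        POOL.put(12, new Entry("Utf8", "Z"));
        POOL.put(25, new Entry("NameAndType", null, 11, 12));
        POOL.put(26, new Entry("Utf8", "User"));
    }

    // Follows references until only Utf8 strings remain, like javap's comment column.
    static String resolve(int index) {
        Entry e = POOL.get(index);
        switch (e.tag) {
            case "Utf8":        return e.text;
            case "Class":       return resolve(e.refs[0]);
            case "NameAndType": return resolve(e.refs[0]) + ":" + resolve(e.refs[1]);
            case "Fieldref":    return resolve(e.refs[0]) + "." + resolve(e.refs[1]);
            default: throw new IllegalArgumentException("Unhandled tag: " + e.tag);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(2)); // User.authenticated:Z
    }
}
```

The recursion is the pinball: each call bounces to another index until it lands on a Utf8 string.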

Understanding the Flow

We are almost to the point where we can hack.

At Offset 4 we find the mnemonic ifne.

 public void run();
    Code:
       0: aload_0
       1: getfield      #2                  // Field  authenticated:Z
       4: ifne          18
       7: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field  java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return

You’ll notice that in the Field column next to ifne there is simply the number 18. This is NOT an index, as an index would be denoted with a hash symbol (#). Instead, it is a branch target: the offset to jump to. Once you are skilled at interpreting the output of javap, you’ll know immediately, just by looking at the method, that there are two code branches in it.
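One detail worth knowing before you go hunting in a hex editor: javap prints the absolute target (18), but per the JVM specification the class file stores a signed 16-bit offset relative to the address of the branch instruction itself. For the ifne at offset 4 that is 18 - 4 = 14, so I would expect the two bytes after 9a to be 00 0e. Here is a small sketch of that arithmetic, with names of my own choosing.

```java
// Illustrative only: turns a stored relative branch offset back into javap's absolute target.
final class BranchTarget {
    static int target(int opcodeOffset, int branchByte1, int branchByte2) {
        // The two operand bytes form a signed 16-bit value, hence the cast to short.
        short relative = (short) (((branchByte1 & 0xFF) << 8) | (branchByte2 & 0xFF));
        return opcodeOffset + relative;
    }

    public static void main(String[] args) {
        System.out.println(target(4, 0x00, 0x0e));  // ifne at offset 4  -> 18
        System.out.println(target(15, 0x00, 0x0b)); // goto at offset 15 -> 26
    }
}
```

Because the offset is signed, backward jumps (loops) show up as values like ff fb.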

We will walk through the flow control. Remember, these instructions operate from top to bottom.

Ifne operates on the value that getfield just pushed onto the stack: the authenticated field referenced by index #2. In plain English:

“If authenticated in index #2 is NOT EQUAL to false, branch the operation to offset 18.”

img

From that point, the JVM would execute the remaining instructions at offsets 18, 21, 23, and 26. But since we wrote the program, we know that the authenticated boolean is indeed false. It will fail the ifne gate and continue on to offset 7.
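To watch that flow happen, here is a toy interpreter for just the opcodes that run() uses. Treat it as a sketch under stated assumptions: the byte array is reconstructed by hand from the javap listing using the standard opcode values, the constant pool lookups are hard-coded, println is collected into a list instead of printed, and MiniJvm is a made-up name. It is nowhere near a real JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Toy interpreter for run(); hand-built and hypothetical, not a real JVM.
final class MiniJvm {
    // Bytes of run(), reconstructed by hand from the javap listing (an assumption).
    static final int[] RUN = {
        0x2a,             //  0: aload_0
        0xb4, 0x00, 0x02, //  1: getfield #2
        0x9a, 0x00, 0x0e, //  4: ifne 18 (stored as relative offset +14)
        0xb2, 0x00, 0x06, //  7: getstatic #6
        0x12, 0x07,       // 10: ldc #7
        0xb6, 0x00, 0x08, // 12: invokevirtual #8
        0xa7, 0x00, 0x0b, // 15: goto 26 (stored as relative offset +11)
        0xb2, 0x00, 0x06, // 18: getstatic #6
        0x12, 0x09,       // 21: ldc #9
        0xb6, 0x00, 0x08, // 23: invokevirtual #8
        0xb1              // 26: return
    };

    // Reads the signed 16-bit branch offset stored after the opcode at pc.
    static int branchOffset(int[] code, int pc) {
        return (short) ((code[pc + 1] << 8) | code[pc + 2]);
    }

    static List<String> execute(int[] code, boolean authenticated) {
        List<String> printed = new ArrayList<>();
        int top = 0;            // stand-in for the value on top of the operand stack
        String constant = null; // stand-in for the string pushed by ldc
        int pc = 0;
        while (true) {
            switch (code[pc]) {
                case 0x2a: pc += 1; break;                                        // aload_0
                case 0xb4: top = authenticated ? 1 : 0; pc += 3; break;           // getfield
                case 0x9a: pc += (top != 0) ? branchOffset(code, pc) : 3; break;  // ifne
                case 0xb2: pc += 3; break;                                        // getstatic
                case 0x12:                                                        // ldc #7 or #9
                    constant = (code[pc + 1] == 7) ? "Access Revoked!" : "Access Granted!";
                    pc += 2;
                    break;
                case 0xb6: printed.add(constant); pc += 3; break;                 // println
                case 0xa7: pc += branchOffset(code, pc); break;                   // goto
                case 0xb1: return printed;                                        // return
                default: throw new IllegalStateException("Unsupported opcode at offset " + pc);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(execute(RUN, false)); // [Access Revoked!]
        System.out.println(execute(RUN, true));  // [Access Granted!]
    }
}
```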

img

Looking at the ASCII display on the right, we can easily identify that we want the ifne to succeed and branch to the instructions located at offset 18 in order to get the “Access Granted!” string to print.

Let’s build out the hexadecimal bytecode we will be searching for using Bless.

Below are both the offsets and the bytecode, in hex, up to offset 4.

Offset:  0  1  2  3  4
Hex:    2a b4 00 02 9a

We know that at Offset 4, the mnemonic is ifne, and the hex value is 9a.

We also know that the ifne condition is false, so the branch is not taken. That is the cause of “Access Revoked!” printing to our screen.

What would happen if we replaced the ifne opcode with its counterpart, the ifeq opcode?

Let’s try it!

Open up the User.class file in Bless and search for the following hex sequence (2a b4 00 02 9a).

img

Replace the opcode 9a (ifne) with the opcode 99 (ifeq), save the file, and run javap -c User.class again.

 public void run();
    Code:
       0: aload_0
       1: getfield      #2                  // Field authenticated:Z
       4: ifeq          18
       7: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
      10: ldc           #7                  // String Access Revoked!
      12: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      15: goto          26
      18: getstatic     #6                  // Field java/lang/System.out:Ljava/io/PrintStream;
      21: ldc           #9                  // String Access Granted!
      23: invokevirtual #8                  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      26: return

You’ll notice that ifne has been changed to ifeq.

Now run your program again.

$ java User 
Access Granted!
$

Success! We have successfully manipulated the flow control.
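If you don’t have a hex editor handy, the same edit could be scripted. The sketch below is one way to do it, not the method I used above: it searches for the byte sequence 2a b4 00 02 9a and swaps the trailing 9a for 99. It only patches the first match, which I am assuming is fine for a class this tiny, and IfnePatcher is a hypothetical name. It overwrites the file in place, so keep a backup copy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch of the hex-editor edit as code. Class and method names are hypothetical.
final class IfnePatcher {
    // aload_0, getfield #2, ifne: the sequence searched for in the hex editor.
    static final byte[] PATTERN = {0x2a, (byte) 0xb4, 0x00, 0x02, (byte) 0x9a};
    static final byte IFEQ = (byte) 0x99;

    // Returns the index of the first occurrence of pattern in data, or -1 if absent.
    static int indexOf(byte[] data, byte[] pattern) {
        outer:
        for (int i = 0; i + pattern.length <= data.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) continue outer;
            }
            return i;
        }
        return -1;
    }

    // Returns a patched copy with ifne swapped for ifeq, or null if the pattern is absent.
    static byte[] patch(byte[] data) {
        int at = indexOf(data, PATTERN);
        if (at < 0) return null;
        byte[] copy = data.clone();
        copy[at + PATTERN.length - 1] = IFEQ; // the last pattern byte is the ifne opcode
        return copy;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "User.class");
        byte[] patched = patch(Files.readAllBytes(file));
        if (patched == null) {
            System.err.println("Pattern not found; nothing written.");
            return;
        }
        Files.write(file, patched); // overwrites in place
        System.out.println("Patched " + file);
    }
}
```

Only the opcode byte changes and the branch target stays at 18, which is likely why the JVM still accepts the class without any other edits.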

Conclusion

You should now be able to read and comprehend the Javap output. And you should be able to think through and disrupt the flow control by swapping opcodes at the bytecode level.

If you are finding this series valuable, please leave a comment on where you’d like to go from here. I’d like to wrap this up and avoid pulling a George R. R. Martin, where the series never feels like it is going to end.

In the next post, we are going to look at some reverse engineering tools and techniques. Stay tuned.