3.1. Frames and Stack Analysis

EpicPlayerA10 edited this page Sep 5, 2025 · 1 revision

❌ The Naive Approach (Don't Do This)

When beginners encounter obfuscated bytecode, they often try the simple approach of just looking at the previous instruction:

public class NaiveTransformer extends Transformer {
  @Override
  protected void transform() throws Exception {
    scopedClasses().forEach(classWrapper -> classWrapper.methods().forEach(methodNode -> {
      Arrays.stream(methodNode.instructions.toArray())
          .filter(insn -> insn.getOpcode() == INVOKEVIRTUAL)
          .forEach(insn -> {
            MethodInsnNode methodInsn = (MethodInsnNode) insn;
            
            if (methodInsn.owner.equals("java/io/PrintStream") && methodInsn.name.equals("println")) {
              // Naive approach - just look at previous instruction
              AbstractInsnNode prev = methodInsn.getPrevious();
              if (prev instanceof LdcInsnNode) {
                LdcInsnNode ldc = (LdcInsnNode) prev;
                ldc.cst = "Bye, World!";
              }
            }
          });
    }));
  }
}

This approach will likely fail with obfuscated code because the value production and consumption are often separated by other instructions. The correct approach requires frame analysis to track stack values properly.

📚 Working with Stack Values

Frame analysis lets you inspect the values that are on the stack just before an instruction executes. For example, to replace the string argument with "Bye, World!" only in System.out.println calls:

import org.objectweb.asm.tree.AbstractInsnNode;
import org.objectweb.asm.tree.LdcInsnNode;
import org.objectweb.asm.tree.MethodInsnNode;
import org.objectweb.asm.tree.analysis.Frame;
import org.objectweb.asm.tree.analysis.OriginalSourceValue;
import uwu.narumi.deobfuscator.api.asm.ClassWrapper;
import uwu.narumi.deobfuscator.api.asm.InstructionContext;
import uwu.narumi.deobfuscator.api.asm.MethodContext;
import uwu.narumi.deobfuscator.api.context.Context;
import uwu.narumi.deobfuscator.api.transformer.Transformer;

import java.util.Arrays;

public class SomeTransformer extends Transformer {
  @Override
  protected void transform() throws Exception {
    scopedClasses().forEach(classWrapper -> classWrapper.methods().forEach(methodNode -> {
      MethodContext methodContext = MethodContext.framed(classWrapper, methodNode);

      // Find all System.out.println calls and replace the string with "Bye, World!"
      Arrays.stream(methodNode.instructions.toArray())
          .filter(insn -> insn.getOpcode() == INVOKEVIRTUAL) // Match only INVOKEVIRTUAL instructions
          .forEach(insn -> {
            MethodInsnNode methodInsn = (MethodInsnNode) insn;

            // Find System.out.println call
            if (methodInsn.owner.equals("java/io/PrintStream") && methodInsn.name.equals("println") && methodInsn.desc.equals("(Ljava/lang/String;)V")) {
              // Create instruction context. Required for getting stack values.
              InstructionContext insnContext = methodContext.newInsnContext(methodInsn);
              Frame<OriginalSourceValue> frame = insnContext.frame();

              // Get top value from the stack
              OriginalSourceValue sourceValue = frame.getStack(frame.getStackSize() - 1);

              // Remove all instructions that produced the top stack value. We will replace them with our own instruction.
              for (AbstractInsnNode producer : sourceValue.insns) {
                methodNode.instructions.remove(producer);
              }

              // Replace the top stack value with the string "Bye, World!"
              methodNode.instructions.insertBefore(methodInsn, new LdcInsnNode("Bye, World!"));
            }
          });
    }));
  }
}

🔧 Using FramedInstructionsStream

You can achieve the same effect with the FramedInstructionsStream utility class, which cuts down on boilerplate:

import org.objectweb.asm.tree.AbstractInsnNode;
import org.objectweb.asm.tree.LdcInsnNode;
import org.objectweb.asm.tree.MethodInsnNode;
import org.objectweb.asm.tree.analysis.Frame;
import org.objectweb.asm.tree.analysis.OriginalSourceValue;
import uwu.narumi.deobfuscator.api.asm.ClassWrapper;
import uwu.narumi.deobfuscator.api.context.Context;
import uwu.narumi.deobfuscator.api.helper.FramedInstructionsStream;
import uwu.narumi.deobfuscator.api.transformer.Transformer;

public class SomeTransformer extends Transformer {
  @Override
  protected void transform() throws Exception {
    FramedInstructionsStream.of(this)
        .editInstructionsStream(stream -> stream.filter(insn -> insn.getOpcode() == INVOKEVIRTUAL)) // Match only INVOKEVIRTUAL instructions
        .forEach(insnContext -> {
          MethodInsnNode methodInsn = (MethodInsnNode) insnContext.insn();

          // Find System.out.println call
          if (methodInsn.owner.equals("java/io/PrintStream") && methodInsn.name.equals("println") && methodInsn.desc.equals("(Ljava/lang/String;)V")) {
            Frame<OriginalSourceValue> frame = insnContext.frame();

            // Get top value from the stack
            OriginalSourceValue sourceValue = frame.getStack(frame.getStackSize() - 1);

            // Remove all instructions that produced the top stack value. We will replace them with our own instruction.
            for (AbstractInsnNode producer : sourceValue.insns) {
              insnContext.methodNode().instructions.remove(producer);
            }

            // Replace the top stack value with the string "Bye, World!"
            insnContext.methodNode().instructions.insertBefore(methodInsn, new LdcInsnNode("Bye, World!"));
          }
        });
  }
}

💡 Why Stack Analysis Matters

But why do we need to get the stack value from a frame? Can't we just step back one instruction and replace it? Not always. The value may be produced much earlier in the method, with unrelated instructions sitting between the producer and the consumer.

Stack analysis is crucial for proper deobfuscation because obfuscated code often separates the production of values from their consumption. This separation makes it difficult to identify which instructions are related without proper frame analysis.

To see this issue in action, consider this example:

ldc "Hello World!"  # Stack: ("Hello World!")
dup  # Stack: ("Hello World!", "Hello World!")
dup  # Stack: ("Hello World!", "Hello World!", "Hello World!")
getstatic java/lang/System.out Ljava/io/PrintStream;  # Stack: ("Hello World!", "Hello World!", "Hello World!", System.out)
swap  # Stack: ("Hello World!", "Hello World!", System.out, "Hello World!")
invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V  # Stack: ("Hello World!", "Hello World!")
pop  # Stack: ("Hello World!")
pop  # Stack: ()

The example above shows that the value is produced by the ldc instruction but consumed much later. This is still valid JVM bytecode, yet ldc "Hello World!" does not directly precede invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V. This is a very common obfuscation technique. Fortunately, a universal transformer (UselessPopCleanTransformer) already removes these useless DUP and POP pairs, though only in their simple forms.
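The bytecode trace above can be reproduced with a toy model of what frame analysis does. This is only a sketch under simplifying assumptions: it uses neither ASM nor the deobfuscator API, and ProducerTrace, naivePreviousIsLdc, originOfTop, and the string opcodes are hypothetical names made up for illustration. Each stack slot remembers the index of the instruction that originally produced its value, so dup and swap copy or move that index instead of creating a new origin.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only: ProducerTrace and its opcode strings are hypothetical,
// not part of ASM or the deobfuscator API.
public class ProducerTrace {

  // Naive check: is the instruction directly before the consumer an ldc?
  public static boolean naivePreviousIsLdc(List<String> insns, int consumerIndex) {
    return consumerIndex > 0 && insns.get(consumerIndex - 1).equals("ldc");
  }

  // Simulates the stack up to (not including) consumerIndex. Each slot stores the
  // index of the instruction that originally produced its value. Returns the origin
  // of the top-of-stack value. Assumes straight-line, well-formed code.
  public static int originOfTop(List<String> insns, int consumerIndex) {
    Deque<Integer> stack = new ArrayDeque<>();
    for (int i = 0; i < consumerIndex; i++) {
      String op = insns.get(i);
      switch (op) {
        case "ldc":
        case "getstatic":
          stack.push(i); // a fresh value: this instruction is its origin
          break;
        case "dup":
          stack.push(stack.peek()); // the copy keeps the same origin
          break;
        case "swap": {
          Integer top = stack.pop();
          Integer below = stack.pop();
          stack.push(top);
          stack.push(below);
          break;
        }
        case "pop":
          stack.pop();
          break;
        case "invokevirtual":
          // Modeled as println(String): consumes argument and receiver, pushes nothing.
          stack.pop();
          stack.pop();
          break;
        default:
          throw new IllegalArgumentException("Unsupported opcode: " + op);
      }
    }
    return stack.peek();
  }

  public static void main(String[] args) {
    List<String> insns = Arrays.asList(
        "ldc", "dup", "dup", "getstatic", "swap", "invokevirtual", "pop", "pop");
    int consumer = insns.indexOf("invokevirtual"); // index 5

    System.out.println(naivePreviousIsLdc(insns, consumer)); // false: the previous instruction is swap
    System.out.println(originOfTop(insns, consumer));        // 0: the ldc at index 0
  }
}
```

The toy handles only straight-line code. Real frame analysis also has to merge frames at branch targets, where one stack slot can have several possible producers.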