-
Notifications
You must be signed in to change notification settings - Fork 162
fix: properly instrument nested records #1009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: properly instrument nested records #1009
Conversation
When passing ClassReader to ClassWriter, ASM can copy constant pool entries directly from the original bytecode. Attributes that reference the constant pool remain valid. Prior to this fix, a segfault was triggered for nested records that use Jazzer annotations when Jazzer was trying to access data on record components, when trying to create a record mutator.
| } | ||
|
|
||
| val writer = ClassWriter(ClassWriter.COMPUTE_MAXS) | ||
| val writer = ClassWriter(reader, ClassWriter.COMPUTE_MAXS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you figure out that this was causing the bug? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still unsure if this is the 1) right solution, or 2) if we should do this only for records; or 3) if it's a bug in asm.
My debugging approach was to get as far as possible with manual work and then ask AI (claude free vs) before I read any docs myself! Now that the segfault issue seems to be solved, I am actually reading the docs and the source code for ClassWriter to figure out which point above is true (1-3).
After initial analysis, I could minimize the reproducer to the code below. Removing the @NotNull annotation, or moving the record to the top level, or using a local annotation, or some other java annotation (e.g. we tried @Deprecated) works fine. It must be a Jazzer annotation:
package reproducer;
import com.code_intelligence.jazzer.mutation.annotation.NotNull;
import com.code_intelligence.jazzer.junit.FuzzTest;
public class ReproduceCrash {
record NestedRecordWithExternalAnnotation(@NotNull String fileName) {}
@FuzzTest
public void fuzz_test (boolean ignored) {
var components = NestedRecordWithExternalAnnotation.class.getRecordComponents();
}
}Here is the stack trace after a segfault that happens in fuzzing mode only, and not in regression mode:
V [libjvm.so+0x6b09c0] ConstantPool::symbol_at(int) const+0x5c (constantPool.hpp:442)
V [libjvm.so+0xd0ba3f] java_lang_reflect_RecordComponent::create(InstanceKlass*, RecordComponent*, JavaThread*)+0x245 (javaClasses.cpp:3398)
V [libjvm.so+0xe22b58] JVM_GetRecordComponents+0x2d4 (jvm.cpp:1828)
I narrowed down the cause to TraceDataFlowInstrumentor.instrument() by tracing the class with the nested record. Then I saw that simply reading and then writing back the bytecode already triggers the segfault (I removed the for-loop for methods):
val node = ClassNode()
val reader = ClassReader(bytecode)
reader.accept(node, 0)
random = DeterministicRandom("trace", node.name)
//for (method in node.methods) {
// if (shouldInstrument(method)) {
// addDataFlowInstrumentation(method)
// }
//}
val writer = ClassWriter(ClassWriter.COMPUTE_MAXS)
node.accept(writer)
return writer.toByteArray()At this point I asked AI, and its first suggestion was to add COMPUTE_FRAMES reader/writer options, which didn't solve the segfault.
The second try was to add reader to the writer, and here we are!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you save the transformed class file to a file and load it in a separate JVM process, does that also trigger the crash? This looks like a JVM bug first and foremost, but reporting one would probably require a reproducer that doesn't use as many unsafe tricks as Jazzer in fuzzing mode. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still debugging it, and my current guess is that ASM mangles the constant pool when writing without reusing the symbol table of the reader. It has something to do with the annotation, which doesn't even have to be a Jazzer annotation. For example, this also causes a segfault:
package com.example;
import com.code_intelligence.jazzer.junit.FuzzTest;
import java.lang.annotation.Retention;
import java.lang.annotation.Target;
import static java.lang.annotation.ElementType.TYPE_USE;
import static java.lang.annotation.RetentionPolicy.RUNTIME;
public class NestedRecordFuzzer {
record Address(@NotNull int a) {}
@FuzzTest
public void test(Address ignored) {
}
}
@Target(TYPE_USE)
@Retention(RUNTIME)
@interface NotNull {}I hacked my local jdk to print the value when it fails its assertion in constantPool.hpp:441 symbol_at(), and I get Constant pool tag at index 12 is 100. Maybe a 'd' in "java/lang/invoke/MethodHandles$Lookup"? I will find it out!
Why do you think it's a JVM bug and not ASM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is probably also an ASM bug (it generates a corrupted class file), but the JVM must not crash even on an invalid class file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I assumed that segfaulting after running into assertion errors is the standard JVM way:
Symbol* symbol_at(int which) const {
assert(tag_at(which).is_utf8(), "Corrupted constant pool");
return *symbol_at_addr(which);
}These are all over the place in constantPool.hpp, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My assumption has been that there is a verifier pass that should avoid all these asserts so that pure Java can't hit them, but maybe that's not correct.
When passing ClassReader to ClassWriter, ASM can copy constant pool entries directly from the original bytecode. Attributes that reference the constant pool remain valid.
Prior to this fix, a segfault was triggered for nested records that use Jazzer annotations when Jazzer was trying to access data on record components, when trying to create a record mutator.