Add support alloc-free #923

CapZTr · 2025-08-14T09:50:55Z

This PR adds support to detect heap memory safety issues in:

adding new event MemFree with new tag FREE
modifying event MemAlloc (renamed from Alloc), which has new tag ALLOC when it is a heap allocation
enhancing alias analysis to handle the two events above
adding new event Termination with tag TERMINATION which gets executed if and only if the program terminated normally
adding new base relation AllocPtr and AllocMem and derived ones into models c11 and rc11, NOTE: vmm is still WIP

* init alias analysis for alloc Signed-off-by: Tianrui Zheng <[email protected]> * add MemFree events to the test Signed-off-by: Tianrui Zheng <[email protected]> * add first test for alias analysis Signed-off-by: Tianrui Zheng <[email protected]> * add second test for alias analysis Signed-off-by: Tianrui Zheng <[email protected]> * improve field insensitive for alloc Signed-off-by: Tianrui Zheng <[email protected]> * add third test for alias analysis Signed-off-by: Tianrui Zheng <[email protected]> * add alias analysis for free events Signed-off-by: Tianrui Zheng <[email protected]> * use alias analysis for alloc and free in relation analysis Signed-off-by: Tianrui Zheng <[email protected]> * change native relation analysis Signed-off-by: Tianrui Zheng <[email protected]> * change logic of mustAlias Signed-off-by: Tianrui Zheng <[email protected]> * Using stack pointer in alloc tests * change exception type Signed-off-by: Tianrui Zheng <[email protected]> * Add full alias analysis Signed-off-by: Tianrui Zheng <[email protected]> * Add object alias for alloc -> memory event Signed-off-by: Tianrui Zheng <[email protected]> * Add alloc and free to alias graph Signed-off-by: Tianrui Zheng <[email protected]> --------- Signed-off-by: Tianrui Zheng <[email protected]> Co-authored-by: Tianrui Zheng <[email protected]> Co-authored-by: Natalia Gavrilenko <[email protected]>

Signed-off-by: Tianrui Zheng <[email protected]>

Type of pthread mutex is u1 instead of u32. Remove intrinsics for pthread_mutexattr_setpolicy_np.

ThomasHaas · 2025-08-14T11:09:40Z

Hmm, I haven't checked the details, but I think all of this can be done a lot easier.
We do support "generic memory events" and IMO we can make MemFree such an event.
It is not so straightforward to do the same for Alloc (it cannot be a MemoryEvent because it has no address), but a simple workaround is to use a marker-like AllocMarker memory event that takes an expression as input and marks it as allocated.

In IR, it would look like this:

bv64 r <- heapalloc(....)
AllocMarker(r)     // Memory event with address expression "r" and ALLOC tag

In fact, maybe we should just have two different alloc events altogether: "DynamicAlloc" and MemAlloc (naming is up to debate) where the latter one is a memory event. The former is later "compiled" to the latter. Similar to how we have DynamicThreadCreate and ThreadCreate.

Why do all this?
(1) It is conceptually more sane and more principled.
(2) You don't need to mess up the signature of AliasAnalysis.mayAlias etc.
(3) Allocptr becomes obsolete: you can do let allocptr = [ALLOC];loc;[FREE] since you have the standard loc relation between them. Also, I'm still advocating for a sameObj relation which would make allocmem obsolete, is more general, and more cleanly aligns with the new AliasAnalysis.mayObjectAlias methods.
(4) It is easier and requires less changes (e.g. EncodingContext and IRHelper changes need not be done).

Also, do we really need the Termination event? The notion of memory leak is independet of the memory model since it is independent of time and so we could easily encode is as either a separate property or as part of the program specification (maybe with a switch to enable/disable the leak checks). In other words, we move it out of the CAT file and into the program spec.
In fact, we could move more (or all?) time-independent notions into the program spec such as free-without-alloc.

natgavrilenko · 2025-08-18T17:04:58Z

Hmm, I haven't checked the details, but I think all of this can be done a lot easier. We do support "generic memory events" and IMO we can make MemFree such an event. It is not so straightforward to do the same for Alloc (it cannot be a MemoryEvent because it has no address), but a simple workaround is to use a marker-like AllocMarker memory event that takes an expression as input and marks it as allocated.

In IR, it would look like this:
bv64 r <- heapalloc(....)
AllocMarker(r)     // Memory event with address expression "r" and ALLOC tag
In fact, maybe we should just have two different alloc events altogether: "DynamicAlloc" and MemAlloc (naming is up to debate) where the latter one is a memory event. The former is later "compiled" to the latter. Similar to how we have DynamicThreadCreate and ThreadCreate.

Why do all this? (1) It is conceptually more sane and more principled. (2) You don't need to mess up the signature of AliasAnalysis.mayAlias etc. (3) Allocptr becomes obsolete: you can do let allocptr = [ALLOC];loc;[FREE] since you have the standard loc relation between them. Also, I'm still advocating for a sameObj relation which would make allocmem obsolete, is more general, and more cleanly aligns with the new AliasAnalysis.mayObjectAlias methods. (4) It is easier and requires less changes (e.g. EncodingContext and IRHelper changes need not be done).

It is a valid question whether we should make MemAlloc and MemFree memory events. I see two potential problems with this approach, but none of them should really prevent the implementation.

Firstly, if MemAlloc and MemFree inherit AbstractMemoryCoreEvent, then they will also inherit the 'M' tag. As far as I see, it shouldn't be a problem for existing models (rc11, c11, and vmm), but I can imagine models where this could cause surprising results that might be hard to troubleshoot. As a workaround, we can remove the 'M' tag in the constructors of MemAlloc and MemFree, though it is a bit ugly.

Secondly, with a straightforward implementation of 'loc', there will be a relation between MemAlloc (or MemFree) and a memory access to the beginning of the allocated region, but not to the other indexes of this region. This can cause very subtle bugs, and I am against defining 'loc' this way. But we can filter out MemAlloc and MemFree in the relation analysis of the loc relation, and keep them all together in alias analysis.

In short, I am fine to make MemAlloc and MemFree a GenericMemoryEvent (alias analysis will be indeed cleaner this way), as long as loc, allocmem, and allocptr are kept as they are.

Also, do we really need the Termination event? The notion of memory leak is independet of the memory model since it is independent of time and so we could easily encode is as either a separate property or as part of the program specification (maybe with a switch to enable/disable the leak checks). In other words, we move it out of the CAT file and into the program spec. In fact, we could move more (or all?) time-independent notions into the program spec such as free-without-alloc.

For the termination event, you are right that it can be implemented as a separate property, but it will make the interface of the end user even more convoluted. Actually, it is Hernan who wants to keep the interface clean, but I also like the idea of keeping all the six properties together in a CAT file.

natgavrilenko · 2025-08-18T17:16:25Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/analysis/LazyRelationAnalysis.java

+        }
+
+        @Override
+        // TODO: May and must sets of AllocMem can become very large for some programs.


I will do a detailed review a bit later this week, but just a quick comment for now.

This should use LazyEventGraph. ImmutableEventGraph is the same explicit representation as in the native relation analysis, just with an immutable collection. It could be more efficient to use it in derived relations, but it may still require a lot of memory to store a may set of allocmem. With a LazyEventGraph, it will compute a set each time on demand. This might be a bit slower, but will use less memory.

Also, if alias analysis is thread-safe now, could you also change visitSameLocation to use LazyEventGraph? It used to be one of the most problematic relations in terms of the size of a may set. Hopefully, it should get better with a lazy graph.

And could you also remove the TODO here?

ThomasHaas · 2025-08-18T19:05:23Z

It is a valid question whether we should make MemAlloc and MemFree memory events. I see two potential problems with this approach, but none of them should really prevent the implementation.

Firstly, if MemAlloc and MemFree inherit AbstractMemoryCoreEvent, then they will also inherit the 'M' tag. As far as I see, it shouldn't be a problem for existing models (rc11, c11, and vmm), but I can imagine models where this could cause surprising results that might be hard to troubleshoot. As a workaround, we can remove the 'M' tag in the constructors of MemAlloc and MemFree, though it is a bit ugly.

If M = R | W ought to hold, we should probably define it like that in the stdlib.cat we have (and get rid of the internal M tag). We could also do let M = M \ (Free | Alloc) without the need to internally get rid of the tag. That being said, seeing both of them as proper memory events seems quite reasonable to me since they behave as such in many ways.

Secondly, with a straightforward implementation of 'loc', there will be a relation between MemAlloc (or MemFree) and a memory access to the beginning of the allocated region, but not to the other indexes of this region. This can cause very subtle bugs, and I am against defining 'loc' this way. But we can filter out MemAlloc and MemFree in the relation analysis of the loc relation, and keep them all together in alias analysis.

I don't understand why this can cause subtle bugs. loc is defined to mean that addresses exactly match, even in standard literature. For example, two mixed-size accesses that do only partially overlap are typically not in loc relation. Only the overlapping parts are in loc if the events get split into multiple subevents.
This is important because even the CPU can out-of-order execute the non-overlapping parts which would violate SC-per-loc if loc was defined to relate overlapping events.

So I would explicitly advocate for having alloc and free in loc relation, at least if we consider them as memory events internally.
Again, stdlib.cat could define let loc = loc \ (Free | Alloc) * (Free | Alloc) or let loc = [M];loc;[M] (assuming M does not contain free/alloc) if for some reason those should not be considered in the standard case.
Indeed, all those new relations could be defined in stdlib.cat if a simple sameObj base relation was available (whose implementation would be identical to allocmem just without the restriction to alloc/free). The user wouldn't know and doesn't really need to care I think.
GenMC has a similar loc-overlap relation sc-genmc (Kater), although they only use it to determine races and double-free it seems.

For the termination event, you are right that it can be implemented as a separate property, but it will make the interface of the end user even more convoluted. Actually, it is Hernan who wants to keep the interface clean, but I also like the idea of keeping all the six properties together in a CAT file.

In #878 I have a dual "Nontermination" event, so I'm not against having those types of events in principle. The implementation needs to be different though.

Regarding the properties in CAT: I can also imagine that having them as CAT spec can be kinda annoying, for example, when checking for data races in test code that doesn't care about cleanup (such as all our existing unit tests). You will end up getting memory leak errors left and right. But we could also provide options to disable/enable certain CAT-defined properties in some other way.

natgavrilenko · 2025-08-18T19:19:53Z

It is a valid question whether we should make MemAlloc and MemFree memory events. I see two potential problems with this approach, but none of them should really prevent the implementation.
Firstly, if MemAlloc and MemFree inherit AbstractMemoryCoreEvent, then they will also inherit the 'M' tag. As far as I see, it shouldn't be a problem for existing models (rc11, c11, and vmm), but I can imagine models where this could cause surprising results that might be hard to troubleshoot. As a workaround, we can remove the 'M' tag in the constructors of MemAlloc and MemFree, though it is a bit ugly.

If M = R | W ought to hold, we should probably define it like that in the stdlib.cat we have (and get rid of the internal M tag). We could also do let M = M \ (Free | Alloc) without the need to internally get rid of the tag. That being said, seeing both of them as proper memory events seems quite reasonable to me since they behave as such in many ways.

Secondly, with a straightforward implementation of 'loc', there will be a relation between MemAlloc (or MemFree) and a memory access to the beginning of the allocated region, but not to the other indexes of this region. This can cause very subtle bugs, and I am against defining 'loc' this way. But we can filter out MemAlloc and MemFree in the relation analysis of the loc relation, and keep them all together in alias analysis.

I don't understand why this can cause subtle bugs. loc is defined to mean that addresses exactly match, even in standard literature. For example, two mixed-size accesses that do only partially overlap are typically not in loc relation. Only the overlapping parts are in loc if the events get split into multiple subevents. This is important because even the CPU can out-of-order execute the non-overlapping parts which would violate SC-per-loc if loc was defined to relate overlapping events.

So I would explicitly advocate for having alloc and free in loc relation, at least if we consider them as memory events internally. Again, stdlib.cat could define let loc = loc \ (Free | Alloc) * (Free | Alloc) or let loc = [M];loc;[M] (assuming M does not contain free/alloc) if for some reason those should not be considered in the standard case. Indeed, all those new relations could be defined in stdlib.cat if a simple sameObj base relation was available (whose implementation would be identical to allocmem just without the restriction to alloc/free). The user wouldn't know and doesn't really need to care I think. GenMC has a similar loc-overlap relation sc-genmc (Kater), although they only use it to determine races and double-free it seems.

Please read again to see why it shouldn't be in loc.

For the termination event, you are right that it can be implemented as a separate property, but it will make the interface of the end user even more convoluted. Actually, it is Hernan who wants to keep the interface clean, but I also like the idea of keeping all the six properties together in a CAT file.

In #878 I have a dual "Nontermination" event, so I'm not against having those types of events in principle. The implementation needs to be different though.

I don't see why it needs to be different, we cannot change the code back and forth just because you like it that way.

Regarding the properties in CAT: I can also imagine that having them as CAT spec can be kinda annoying, for example, when checking for data races in test code that doesn't care about cleanup (such as all our existing unit tests). You will end up getting memory leak errors left and right. But we could also provide options to disable/enable certain CAT-defined properties in some other way.

Again, please check the pr first, all the tests have been already taken care of.

ThomasHaas · 2025-08-18T19:34:21Z

Please read again to see why it shouldn't be in loc.

I don't get this. Your only argument was this:

Secondly, with a straightforward implementation of 'loc', there will be a relation between MemAlloc (or MemFree) and a memory access to the beginning of the allocated region, but not to the other indexes of this region.

And my argument is that this behavior is exactly how loc is typically defined, i.e., it behaves as one would expect.

In #878 I have a dual "Nontermination" event, so I'm not against having those types of events in principle. The implementation needs to be different though.

I don't see why it needs to be different, we cannot change the code back and forth just because you like it that way.

My point is rather that the Termination event should not define its semantics in the encodeExec method (me might even want to get rid of this method altogether at some point) and instead should be defined in the ProgramEncoder which already provides encodings for termination like "threadHasTerminatedNormally()". Putting a program-wide reasoning into the event-local encoding is ugly and duplicating the already existing encoding for termination is not nice either.

Regarding the properties in CAT: I can also imagine that having them as CAT spec can be kinda annoying, for example, when checking for data races in test code that doesn't care about cleanup (such as all our existing unit tests). You will end up getting memory leak errors left and right. But we could also provide options to disable/enable certain CAT-defined properties in some other way.

Again, please check the pr first, all the tests have been already taken care of.

Ok, that fixes at least the issue on our side.

natgavrilenko · 2025-08-18T19:46:00Z

Please read again to see why it shouldn't be in loc.

I don't get this. Your only argument was this:

Secondly, with a straightforward implementation of 'loc', there will be a relation between MemAlloc (or MemFree) and a memory access to the beginning of the allocated region, but not to the other indexes of this region.

And my argument is that this behavior is exactly how loc is typically defined, i.e., it behaves as one would expect.

Where is it 'typically defined'? My point is that loc relation defined this way will have edges to events accessing index 0 but not other indexes. Such definition would be very error-prone, because in reality the effect will be the same for any index within the allocated bounds (unless it is a MemFree event), i.e. either accesses to all indexes should have the relation or none.

In #878 I have a dual "Nontermination" event, so I'm not against having those types of events in principle. The implementation needs to be different though.

I don't see why it needs to be different, we cannot change the code back and forth just because you like it that way.

My point is rather that the Termination event should not define its semantics in the encodeExec method (me might even want to get rid of this method altogether at some point) and instead should be defined in the ProgramEncoder which already provides encodings for termination like "threadHasTerminatedNormally()". Putting a program-wide reasoning into the event-local encoding is ugly and duplicating the already existing encoding for termination is not nice either.

It is the existing encoding moved from PropertyEncoder to the event.

Regarding the properties in CAT: I can also imagine that having them as CAT spec can be kinda annoying, for example, when checking for data races in test code that doesn't care about cleanup (such as all our existing unit tests). You will end up getting memory leak errors left and right. But we could also provide options to disable/enable certain CAT-defined properties in some other way.

Again, please check the pr first, all the tests have been already taken care of.

Ok, that fixes at least the issue on our side.

ThomasHaas · 2025-08-18T20:19:04Z

Please read again to see why it shouldn't be in loc.

I don't get this. Your only argument was this:

Secondly, with a straightforward implementation of 'loc', there will be a relation between MemAlloc (or MemFree) and a memory access to the beginning of the allocated region, but not to the other indexes of this region.

And my argument is that this behavior is exactly how loc is typically defined, i.e., it behaves as one would expect.

Where is it 'typically defined'? My point is that loc relation defined this way will have edges to events accessing index 0 but not other indexes. Such definition would be very error-prone, because in reality the effect will be the same for any index within the allocated bounds (unless it is a MemFree event), i.e. either accesses to all indexes should have the relation or none.

That is why I gave the example of mixed-size accesses:

int x[2];
*((long*)&x) = 42; // 64 bit access
x[0]  = 1;             // in loc relation with above access (access to same base address)
x[1] = 2;             // not in loc relation with first access (overlapping, but access to different base address)

Putting the large access into loc relation with the access to x[1] would be wrong according to, e.g., ARM's cat model as then SC-per-loc would prohibit reorderings that are observable on actual hardware. If you treat a MemFree simply as a large memory access over the whole object range (which I think is very reasonable and intuitive), this behavior of loc shouldn't fundametally change: the MemFree is only loc-related to the first index.

If you want to relate accesses to the same region, you instead use the new sameObj (or sameRegion if you prefer that name) relation that I'm talking about.
Since alloc-free pairs need to match on the exact same address, you use loc to check those. But, e.g., a use-after-free does not need to match precisely, so you use the new sameObj to formulate it.
The only real argument against doing it like this is for memory-space reasons, i.e., sameObj might have a too large may-set. That being said. the encoded active set will still only contain events pairs involving frees, i.e., the encoding will likely be identical to what a hardcoded allocmem would produce.

In #878 I have a dual "Nontermination" event, so I'm not against having those types of events in principle. The implementation needs to be different though.

I don't see why it needs to be different, we cannot change the code back and forth just because you like it that way.

My point is rather that the Termination event should not define its semantics in the encodeExec method (me might even want to get rid of this method altogether at some point) and instead should be defined in the ProgramEncoder which already provides encodings for termination like "threadHasTerminatedNormally()". Putting a program-wide reasoning into the event-local encoding is ugly and duplicating the already existing encoding for termination is not nice either.

It is the existing encoding moved from PropertyEncoder to the event.

Well, then move it to the ProgramEncoder I guess. It still stands that program-wide reasoning better fits into the ProgramEncoder rather than the event-local semantics.
In the long term, I would get rid of encodeExec entirely I think. There are only like 2 or 3 usages, all of which the ProgramEncoder could encode instead of spreading the definition of the program semantics all over the place.

Signed-off-by: Tianrui Zheng <[email protected]>

ThomasHaas · 2025-08-21T14:58:06Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/event/core/MemAlloc.java

+    public MemAlloc(Register resultRegister, Type allocType, Expression arraySize, Expression alignment,
+            boolean isHeapAllocation, boolean doesZeroOutMemory) {
+        super(resultRegister, "alloc");


This is not going to work. The address expression (or any expression for that matter) is evaluated before the execution of the event, but at that point in time, resultRegister will not contain the correct address (it is most likely even uninitialized).

That is why I said you probably need to add a new event. Either a dummy event that can be added after the alloc so it can refer to the result register:

p <- alloc(...) // No memory event. Using "p" as address here is wrong. MemAlloc(p); // Visible allocation marker. Memory event with address "p".

Or, a new MemAlloc that replaces Alloc after the memory allocation pass ran, i.e., after memory objects have been allocated. Because then MemAlloc can use the allocated MemoryObject as address.

The first solution is certainly easier to implement for now (probably like 10 lines of code).
For the second solution, I can also imagine MemAlloc just inheriting from Alloc (but then you need to implement the MemoryCoreEvent interface) and adding the new address field. But I guess quite a bit of existing code would need to get updated.

Please give a minimal example of a program where it is not going to work

Here I use resultRegister in constructor and later address will be set to allocatedObject when setAllocatedObject is called.

Please give a minimal example of a program where it is not going to work

No frontend will generate such a program, but code like this would fail to get optimized

int64 p <- ... // alloc or a local assigment int64 p <- alloc(...)

The second alloc appears to make use of p which keeps the previous assignment alive and thus it cannot get deleted although it is dead. It will probably also generate some encoding for that, but I don't think it will result in soundness errors. Though I can also imagine it breaking the alias analysis if we ever want to run it before memory allocation (which is reasonable).
Apart from that, the conceptual error is important in itself IMO.

Here I use resultRegister in constructor and later address will be set to allocatedObject when setAllocatedObject is called.

I know, but that only partly addresses the issue. On construction, the alloc event is essentially broken until the memory allocation pass goes through. Creating a conceptually broken object and then adding a workaround will only cause problems in the long run. At the very least, it deserves comments that document the "hacks" and explain why the hack is not problematic in practice. Though I would advice against it, because then you kinda need to understand what the rest of Dartagnan does and anyone who writes passes in the future can easily run into the trap (e.g., alias analysis running before memory allocation).
In 99% of the cases though, the wrong address will only cause optimization obstacles if anything.

e.g., alias analysis running before memory allocation

I'm confused about it: how can the two solutions suggested by you help with this case?

Currently (and in both of my suggested solutions), the Alloc event is no memory event, so no pass/analysis can accidentally treat it as such. This also means that an alias analysis that uses getAddress() to determine the accessed address of each memory event is going to do the right thing.
In your version, getAddress() gives a wrong value before memory allocation, so the analyses have to special case on Alloc and if they don't, they may produce wrong results without knowing.
And there is no way that, a couple month down the line, everyone remembers that Alloc needs special treatment... Even worse, because it will likely work in many cases (e.g., whenever an analysis runs after memory allocation), it ends up being a time bomb for a very long time.

Btw. the "cheap" solution with the extra allocation marker is somewhat implemented here by a former student of mine. It also uses the sobj (same object) relation to formulate memory errors:

flag ~empty (([M];sobj;[MFR]) \ hb*) as possible-access-after-free let alloc-free-matches = [MAL];loc;[MFR] flag ~empty [MFR] \ [range(alloc-free-matches)] as invalid-free flag ~empty [MAL] \ [domain(alloc-free-matches)] as memory-leak

I have not checked any details of the code base nor the CAT files, so take them with a grain of salt.

Lastly, I think you shouldn't inherit from GenericMemoryEvent. If you want to use that class, you can just create instances of it with the ALLOC/FREE tag. If you create separate classes (which is totally fine), then they should probably extend AbstractMemoryCoreEvent instead.

Signed-off-by: Tianrui Zheng <[email protected]>

ThomasHaas · 2025-08-22T10:08:57Z

cat/c11.cat

+(*
+ * Heap memory safety.
+ * Base relations:
+ * allocptr - relates (ALLOC | FREE) -> (FREE) when both events use the same pointer
+ * allocmem - relates (ALLOC) -> (M) when the second event accesses the memory allocated by the first event
+ *)
+flag ~empty (((ALLOC * FREE) & allocptr) \ hb) as alloc-race
+flag ~empty (((FREE * FREE) \ id) & allocptr) as double-free
+flag ~empty ([FREE] \ [range(allocptr & (ALLOC * FREE))]) as free-without-alloc
+flag ~empty (allocmem \ hb) as use-before-alloc
+flag ~empty ((allocmem^-1 ; (allocptr & (ALLOC * FREE))) \ hb) as use-after-free
+flag ~empty [domain(ALLOC * TERMINATION)] \ [domain(allocptr)] as alloc-without-free


I think you could put this into a separate .cat file and just include it in the other models. I guess it makes sense for any (language-level) memory model that defines a hb relation.

ThomasHaas · 2025-08-22T10:18:17Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/event/EventFactory.java

+    public static Termination newTerminationEvent() {
+        return terminationEvent;
+    }


Don't return an existing instance; just create a new one.

ThomasHaas · 2025-08-22T10:26:06Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/event/core/MemFree.java

+        MemFree other = new MemFree(address);
+        other.setFunction(this.getFunction());
+        other.copyAllMetadataFrom(this);
+        return other;


I would prefer a proper copy constructor. Also, don't set the function of the newly created event: this breaks invariants we try to uphold.

ThomasHaas · 2025-08-22T10:31:35Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/event/core/Termination.java

+    public BooleanFormula encodeExec(EncodingContext ctx) {
+        Program program = getThread().getProgram();
+        final BooleanFormulaManager bmgr = ctx.getBooleanFormulaManager();
+        final BooleanFormula exitReached = bmgr.and(program.getThreads().stream()
+                .filter(t -> !t.equals(getThread()))
+                .map(t -> bmgr.equivalence(ctx.execution(t.getEntry()), ctx.execution(t.getExit())))
+                .toList());
+        final BooleanFormula terminated = program.getThreadEventsWithAllTags(Tag.NONTERMINATION).stream()
+                .map(CondJump.class::cast)
+                .map(jump -> bmgr.not(ctx.jumpTaken(jump)))
+                .reduce(bmgr.makeTrue(), bmgr::and);
+        return bmgr.implication(ctx.execution(this), bmgr.and(ctx.controlFlow(this), exitReached, terminated));
+    }


I would move this encoding into ProgramEncoder. To make it accessible from everywhere, I think it would be reasonable to have EncodingContext provide a hasProgramTerminated() function that returns a fresh boolean variable. ProgramEncoder can then encode hasProgramTerminated <=> "actual encoding" and execution(terminatorEvent) <=> hasProgramTerminated.

EDIT: If ctx.hasProgramTerminated() exists, then it is also fine to let Termination.encodeExec encode exec(TermationEvent) <=> ctx.hasProgramTerminated. Basically, the point is to get the definition of what it means for a program to be terminated outside of the Termination event and into a more central place.

ThomasHaas · 2025-08-22T10:35:28Z

dartagnan/src/main/java/com/dat3m/dartagnan/program/processing/NonterminationDetection.java

                    .forEach(this::instrumentLoop);
        }

+        program.addTerminationThread();


This shouldn't be here. Indeed, the code will not get reached if only spin loop detection is enabled.
If you want to put it into an existing pass, then ThreadCreation might be a better option. But that one might not run for Litmus code IIRC.
So a dedicated pass right at the end of the processing pipeline seems reasonable.

ThomasHaas · 2025-08-22T11:01:09Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/Wmm.java

+            case DATA -> intersection(r, getOrCreatePredefinedRelation(IDDTRANS),
+                    addDefinition(new CartesianProduct(newRelation(),
+                    Filter.union(Filter.byTag(Tag.MEMORY), Filter.byTag(Tag.ALLOC)),
+                    Filter.byTag(Tag.MEMORY))));


I'm really not a big fan of removing the memory tags from Alloc and Free. It messes with so many things.
Is there any imaginable case where treating Alloc/Free as proper memory events is really wrong?
If so, then it suggests that malloc and calloc are ought to be treated differently, since the latter zero-initializes the memory and so almost acts as a write (and hence a proper memory access).

If it is hard to imagine, I would go with the simpler solution and add the tags (or rather, not remove them in the first place) and see if we ever encounter an issue with it.

ThomasHaas · 2025-08-22T11:03:29Z

dartagnan/src/main/java/com/dat3m/dartagnan/wmm/definition/AllocPtr.java

+public class AllocPtr extends Definition {
+
+    public AllocPtr(Relation r) {
+        super(r, ALLOCPTR);
+    }
+
+    @Override
+    public <T> T accept(Visitor<? extends T> v) {
+        return v.visitAllocPtr(this);
+    }
+}


In your current iteration, Alloc and Free have loc-edges (which I think they should), so let allocPtr = [ALLOC];loc;[FREE] holds and you can save yourself this whole relation.

Signed-off-by: Tianrui Zheng <[email protected]>

natgavrilenko and others added 30 commits May 13, 2025 14:26

Alloc-free verification prototype

35ca9fc

Using stack pointer in alloc tests

3940bb8

Init setup

20e84e4

Signed-off-by: Tianrui Zheng <[email protected]>

Add Alloc and MemFree to witness for debugging

ddc0e11

Signed-off-by: Tianrui Zheng <[email protected]>

Modified some memory models and tests

8631ddc

Signed-off-by: Tianrui Zheng <[email protected]>

Fix event and relation extraction of alloc-free for witness

44ead61

Signed-off-by: Tianrui Zheng <[email protected]>

Modify some relation and constraint definitions

e5d6a28

Signed-off-by: Tianrui Zheng <[email protected]>

Merge development into this branch

8ff67e2

Signed-off-by: Tianrui Zheng <[email protected]>

Remove unnecessary definition

5f7440f

Signed-off-by: Tianrui Zheng <[email protected]>

Clean c11 and rc11

9803e52

Signed-off-by: Tianrui Zheng <[email protected]>

Modify as comments

0659c5e

Signed-off-by: Tianrui Zheng <[email protected]>

Clean code to prepare rebase

616dfda

Signed-off-by: Tianrui Zheng <[email protected]>

Merge remote-tracking branch 'upstream/development' into alloc-free-dev

ec9717e

Fix merging

4de382e

Signed-off-by: Tianrui Zheng <[email protected]>

Merge remote-tracking branch 'upstream/development' into alloc-free-dev

fd2593f

Fix CI tests: Attempt 1

d752cab

Signed-off-by: Tianrui Zheng <[email protected]>

By Natalia: Add Termination event

89a4219

Signed-off-by: Tianrui Zheng <[email protected]>

Fix CI tests

5d21aee

Signed-off-by: Tianrui Zheng <[email protected]>

Fix CI tests

cd4a5c1

Signed-off-by: Tianrui Zheng <[email protected]>

Merge remote-tracking branch 'upstream/development' into alloc-free-dev

702527a

Fix bugs from merge

f21d510

Signed-off-by: Tianrui Zheng <[email protected]>

Add AllocPtrGraph & AllocMemGraph, modify as comments

02998bb

Signed-off-by: Tianrui Zheng <[email protected]>

Add Intrinsics pthread_detach (hernanponcedeleon#905)

9143013

Reverse vmm.cat

1cdc4c3

Signed-off-by: Tianrui Zheng <[email protected]>

Merge remote-tracking branch 'upstream/development' into alloc-free-dev

5f715e1

Fix bugs from merge

01e6ac1

Signed-off-by: Tianrui Zheng <[email protected]>

Remove vmm from AllocFreeTest

e602036

Signed-off-by: Tianrui Zheng <[email protected]>

Add DI source location for frees

3a08b77

Signed-off-by: Tianrui Zheng <[email protected]>

Add MemAlloc & MemFree to witness

2b60b6c

Signed-off-by: Tianrui Zheng <[email protected]>

Tianrui Zheng and others added 3 commits August 8, 2025 16:19

Remove unused imports

f4ca35d

Signed-off-by: Tianrui Zheng <[email protected]>

Remove obsolete pthread mutex events. (hernanponcedeleon#921)

8eebe94

Type of pthread mutex is u1 instead of u32. Remove intrinsics for pthread_mutexattr_setpolicy_np.

Merge remote-tracking branch 'upstream/development' into alloc-free-dev

e48a53f

natgavrilenko reviewed Aug 18, 2025

View reviewed changes

Make MemAlloc and MemFree extend GenericMemoryEvent

8d8ca49

Signed-off-by: Tianrui Zheng <[email protected]>

ThomasHaas reviewed Aug 21, 2025

View reviewed changes

Remove M tag when copy alloc

efa54cc

Signed-off-by: Tianrui Zheng <[email protected]>

ThomasHaas reviewed Aug 22, 2025

View reviewed changes

Tianrui Zheng added 8 commits August 22, 2025 16:48

Add HeapAlloc

4c531d7

Signed-off-by: Tianrui Zheng <[email protected]>

Add HeapAlloc class

944045f

Signed-off-by: Tianrui Zheng <[email protected]>

Revert to efa54cc

946bf92

Signed-off-by: Tianrui Zheng <[email protected]>

Make MemAlloc & MemFree extend AbstractMemoryCoreEvent

f1a05c6

Signed-off-by: Tianrui Zheng <[email protected]>

Use LazyEventGraph for loc and allocMem in LazyRelationAnalysis

1d49035

Signed-off-by: Tianrui Zheng <[email protected]>

Fix EqualityAliasAnalysis

2404a04

Signed-off-by: Tianrui Zheng <[email protected]>

Fix EqualityAliasAnalysis

5d28e57

Signed-off-by: Tianrui Zheng <[email protected]>

Small change to LazyRelationAnalysis

d7b7b4e

Signed-off-by: Tianrui Zheng <[email protected]>

hernanponcedeleon force-pushed the development branch from 8eebe94 to dc21c35 Compare September 1, 2025 09:40

Add support alloc-free #923

Are you sure you want to change the base?

Add support alloc-free #923

Uh oh!

Conversation

CapZTr commented Aug 14, 2025

Uh oh!

ThomasHaas commented Aug 14, 2025

Uh oh!

natgavrilenko commented Aug 18, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ThomasHaas commented Aug 18, 2025

Uh oh!

natgavrilenko commented Aug 18, 2025

Uh oh!

ThomasHaas commented Aug 18, 2025

Uh oh!

natgavrilenko commented Aug 18, 2025

Uh oh!

ThomasHaas commented Aug 18, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ThomasHaas Aug 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ThomasHaas Aug 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ThomasHaas Aug 21, 2025 •

edited

Loading

ThomasHaas Aug 22, 2025 •

edited

Loading