Skip to content

Conversation

@ThomasHaas
Copy link
Collaborator

This PR adds a simple type-based alias analysis (TBAA) using LLVM TBAA metadata generated by Clang.
The analysis is enabled by default and runs in addition to the other core alias analysis.

Overall issues:

  • To generate TBAA metadata in the first place, Clang needs to run with at least -O1. Since we do not want to have -O1 optimizations, we can use clang -O1 -Xclang -disable-llvm-passes to disable the -O1 LLVM passes. However, Clang still seems to produce slightly different code and I'm not sure how to avoid that (-O3 seemed to work better???).
  • TBAA metadata is only generated for direct accesses in the C-code. In particular, r = atomic_load(&x) does not generate TBAA metadata whereas r = x does, even if both are translated to the same code (assuming x is marked as atomic)
  • TBAA metadata is quite weak: it currently does not distinguish between unions, arrays, different pointer types, and nested pointers. This is a long standing issue (for over 10 years already), but there seems to be some recent traction: here and here

As a result of the above, TBAA seems to add no more precision than the current analysis (only tested on EBR so far, since our internal benchmarks are generated without TBAA metadata).
A case where it could be more precise is if the code uses complex pointer tagging which confuses our standard analysis. But for code that does not do this, TBAA might be useless.

Side notes:
Running Clang with higher optimization level also generates other metadata, in particular, !range metadata that limits the outcome of some loads from memory. It seems to give bounds for indices that are loaded to access an array.
This could be valuable both for generating smaller encodings (less bitwidth) and for improving aliasing precision for dynamic indexing into arrays.

Added TBAA as combined analysis to other alias analyses (for testing)
More efficient TBAA alias check.
@hernanponcedeleon
Copy link
Owner

I still have not manage to find an example where this new alias pays off: while we have a more precise analysis, the extra optimization makes the program harder for the solver.

Here are the numbers for harris.c with B=2 ... the mayset of rf is better, but the one of co is worse and the overall solving time is terribly worse.

java -jar dartagnan/target/dartagnan.jar cat/aarch64.cat --target=arm8 --solver=yices2 harris-opt.ll --bound=2
...
#Events: 5108
...
        #must|may rf edges: 3|21121
        #must|may rf edges: 3|21121
        #must|may co edges: 46|3346
        #must|may co edges: 46|3346
...
 ======== Summary ========
Number of iterations: 111
Total native solving time(ms): 931171
   -- Bound check time(ms): 369335
Total CAAT solving time(ms): 4341
   -- Model extraction time(ms): 1118
   -- Population time(ms): 1291
   -- Consistency check time(ms): 19
   -- Reason computation time(ms): 1037
   -- Refining time(ms): 3615
   -- #Computed core reasons: 18996
   -- #Computed core reduced reasons: 12353
   -- Min model size (#events): 143
   -- Average model size (#events): 266
   -- Max model size (#events): 321

[26.02.2024] 11:51:09 [DEBUG] RefinementSolver.runInternal - 
 ===== SMT Statistics (after final iteration) ===== 

[26.02.2024] 11:51:09 [INFO] RefinementSolver.runInternal - Verification finished with result UNKNOWN
UNKNOWN
Total verification time(ms): 948712
java -jar dartagnan/target/dartagnan.jar cat/aarch64.cat --target=arm8 --solver=yices2 harris-O3-opt.ll --bound=2
...
#Events: 5330
...
        #must|may rf edges: 3|19946
        #must|may rf edges: 3|19946
        #must|may co edges: 67|3451
        #must|may co edges: 67|3451
...
 ======== Summary ========
Number of iterations: 199
Total native solving time(ms): 29629180
   -- Bound check time(ms): 9406406
Total CAAT solving time(ms): 11840
   -- Model extraction time(ms): 2796
   -- Population time(ms): 3364
   -- Consistency check time(ms): 42
   -- Reason computation time(ms): 3889
   -- Refining time(ms): 6040
   -- #Computed core reasons: 40707
   -- #Computed core reduced reasons: 26772
   -- Min model size (#events): 217
   -- Average model size (#events): 276
   -- Max model size (#events): 383

[26.02.2024] 10:27:52 [DEBUG] RefinementSolver.runInternal - 
 ===== SMT Statistics (after final iteration) ===== 

[26.02.2024] 10:27:52 [INFO] RefinementSolver.runInternal - Verification finished with result UNKNOWN
UNKNOWN
Total verification time(ms): 29669776

harris.zip

@ThomasHaas
Copy link
Collaborator Author

As long as we cannot get TBAA without adding further optimizations, I agree that this is pointless. We cannot do fair comparisons this way.

@hernanponcedeleon
Copy link
Owner

can this be closed?

@ThomasHaas ThomasHaas closed this Mar 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants