-
Notifications
You must be signed in to change notification settings - Fork 3
Implement std::intrinsics::raw_eq support in KMIR #665
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from all commits
0ed9d98
e7f3a31
a545912
eb64d3e
28467a0
8f03fd7
9f66b04
95a6b5d
54de2b2
7eac083
d05eaa5
296e95f
3fb15d5
6d3d53f
1ada2ae
105d2aa
e4a9bc0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,73 +1,195 @@ | ||
# Adding Intrinsics | ||
|
||
## Overview | ||
|
||
This guide explains how to add support for new intrinsic functions in KMIR. Intrinsics are compiler built-in functions that don't have regular MIR bodies and require special semantic rules. | ||
|
||
## Architecture | ||
|
||
Intrinsics are handled with direct operand passing: | ||
1. **Direct Operand Passing**: `#execIntrinsic(symbol("name"), ARGS, DEST)` receives operands directly | ||
2. **Pattern Matching**: Rules match on specific operand patterns for each intrinsic | ||
3. **Operand Evaluation**: Each intrinsic rule handles its own operand evaluation as needed | ||
|
||
See implementation in [kmir.md](../../kmir/src/kmir/kdist/mir-semantics/kmir.md) | ||
|
||
## Development Workflow | ||
|
||
### Step 1: Create Test File | ||
Create `tests/rust/intrinsic/your_intrinsic.rs`: | ||
|
||
```rust | ||
fn main() { | ||
let result = your_intrinsic(args); | ||
assert_eq!(result, expected); | ||
} | ||
Create a Rust test file in `kmir/src/tests/integration/data/exec-smir/intrinsic/` that uses your intrinsic and verifies its behavior with assertions. | ||
|
||
### Step 2: Add Test to Integration Suite | ||
|
||
Add entry to `EXEC_DATA` in [test_integration.py](../../kmir/src/tests/integration/test_integration.py): | ||
|
||
```python | ||
( | ||
'your_intrinsic', | ||
EXEC_DATA_DIR / 'intrinsic' / 'your_intrinsic.smir.json', | ||
EXEC_DATA_DIR / 'intrinsic' / 'your_intrinsic.state', | ||
65, # Start with small depth, increase if needed | ||
), | ||
``` | ||
|
||
### Step 2: Generate SMIR and Verify Intrinsic Detection | ||
### Step 3: Generate Initial State (Will Show Stuck Point) | ||
|
||
```bash | ||
# Generate SMIR JSON | ||
make generate-tests-smir | ||
# Generate initial state showing where execution gets stuck | ||
make test-integration TEST_ARGS="-k 'exec_smir and your_intrinsic' --update-expected-output" | ||
|
||
# This will create your_intrinsic.state showing execution stuck at: | ||
# #execIntrinsic(symbol("your_intrinsic"), DEST) | ||
|
||
# Update expected outputs and verify intrinsic is detected | ||
make test-unit TEST_ARGS="--update-expected-output" | ||
# Save this for comparison | ||
cp kmir/src/tests/integration/data/exec-smir/intrinsic/your_intrinsic.state \ | ||
your_intrinsic.state.initial | ||
``` | ||
|
||
Check `tests/expected/unit/test_smir/test_function_symbols/your_intrinsic.expected.json` to confirm the intrinsic appears as `IntrinsicSym`. | ||
### Step 4: Implement K Semantics Rule | ||
|
||
### Step 3: Run Initial Integration Test | ||
Add rules to [kmir.md](../../kmir/src/kmir/kdist/mir-semantics/kmir.md). | ||
|
||
To find implementation patterns for your intrinsic: | ||
```bash | ||
# Run test and update expected output (will show stuck at intrinsic call) | ||
make test-integration TEST_ARGS="-k your_intrinsic --update-expected-output" | ||
# Search for existing intrinsic implementations | ||
grep -A10 "#execIntrinsic(symbol" kmir/src/kmir/kdist/mir-semantics/kmir.md | ||
|
||
# Backup the initial state for comparison | ||
cp tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state \ | ||
tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state.backup | ||
# Look for helper functions and evaluation patterns | ||
grep -B2 -A5 "seqstrict" kmir/src/kmir/kdist/mir-semantics/kmir.md | ||
``` | ||
|
||
### Step 4: Implement K Rule | ||
Edit `kmir/src/kmir/kdist/mir-semantics/kmir.md`: | ||
Study existing intrinsics like `black_box` (simple value operation) and `raw_eq` (reference dereferencing with helper functions) as reference implementations. | ||
|
||
```k | ||
rule <k> #execIntrinsic(mirString("your_intrinsic"), ARGS, DEST) => | ||
/* your implementation */ | ||
... </k> | ||
``` | ||
### Step 5: Add Documentation | ||
|
||
Document your intrinsic in the K semantics file with its implementation. | ||
|
||
### Step 6: Rebuild and Test | ||
|
||
### Step 5: Rebuild and Test | ||
```bash | ||
# Rebuild K semantics | ||
make build | ||
|
||
# Run test again | ||
make test-integration TEST_ARGS="-k your_intrinsic --update-expected-output" | ||
# Run test again and update the state with working implementation | ||
make test-integration TEST_ARGS="-k 'exec_smir and your_intrinsic' --update-expected-output" | ||
|
||
# Compare results | ||
diff tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state.backup \ | ||
tests/expected/integration/test_exec_smir/intrinsic_your_intrinsic.state | ||
# Compare to see the progress | ||
diff your_intrinsic.state.initial \ | ||
kmir/src/tests/integration/data/exec-smir/intrinsic/your_intrinsic.state | ||
``` | ||
|
||
The diff should show progress past the intrinsic call if implementation is correct. | ||
### Step 7: Verify Both Backends | ||
|
||
### Step 6: Verify Results | ||
Ensure the test completes successfully and the intrinsic behaves as expected. | ||
|
||
## Example: black_box | ||
```bash | ||
# Test with both LLVM and Haskell backends | ||
make test-integration TEST_ARGS="-k 'exec_smir and your_intrinsic'" | ||
|
||
Initial state (before rule): | ||
# Both should pass with consistent results | ||
# If not, you may need backend-specific state files | ||
``` | ||
#setUpCalleeData ( IntrinsicFunction ( mirString ( "black_box" ) ) , ...) | ||
|
||
## Examples | ||
|
||
For implementation examples, see existing intrinsics in [kmir.md](../../kmir/src/kmir/kdist/mir-semantics/kmir.md). | ||
|
||
## Common Patterns | ||
|
||
### Pattern 1: Direct Operand Evaluation | ||
Use when the intrinsic needs to evaluate its operands directly. | ||
|
||
### Pattern 2: Reference Dereferencing with #withDeref | ||
Use the `#withDeref` helper function to add dereferencing to operands. | ||
|
||
### Pattern 3: seqstrict for Automatic Evaluation | ||
Use `[seqstrict(2,3)]` attribute to automatically evaluate operand arguments. | ||
|
||
### Pattern 4: Pattern Matching on Operands | ||
Match specific operand patterns directly in the `#execIntrinsic` rule. | ||
|
||
## Testing Best Practices | ||
|
||
1. **Start Simple**: Test with primitive types first | ||
2. **Save Initial State**: Keep the stuck state for comparison | ||
3. **Use Correct Test Filter**: Always use `exec_smir and your_intrinsic` to ensure correct test runs | ||
4. **Check Both Backends**: Ensure LLVM and Haskell produce same results | ||
5. **Document Limitations**: Note what cases aren't handled yet | ||
6. **Create Issue for Future Work**: Track enhancements needed (like #666 for `raw_eq`) | ||
|
||
## Debugging Tips | ||
|
||
### Check Execution State | ||
```bash | ||
# See where execution is stuck with verbose output | ||
make test-integration TEST_ARGS="-k 'exec_smir and your_intrinsic' -vv" | ||
|
||
# Look for the K cell content to see what values are present | ||
``` | ||
|
||
After implementing rule: | ||
### Understanding the State File | ||
The state file shows the complete execution state. Key sections to check: | ||
- `<k>`: Shows current execution point | ||
- `<locals>`: Shows local variable values | ||
- `<functions>`: Should contain `IntrinsicFunction(symbol("your_intrinsic"))` | ||
|
||
### Verify Intrinsic Recognition | ||
```bash | ||
# Check SMIR JSON to confirm intrinsic is recognized | ||
cat kmir/src/tests/integration/data/exec-smir/intrinsic/your_intrinsic.smir.json | grep -A5 your_intrinsic | ||
``` | ||
Program continues execution with value 11 passed through | ||
``` | ||
|
||
## Common Issues | ||
|
||
### Issue: Wrong test runs | ||
**Solution**: Use `-k 'exec_smir and your_intrinsic'` to ensure `test_exec_smir` runs, not other tests. | ||
|
||
### Issue: "Function not found" | ||
**Solution**: The intrinsic should be automatically recognized if it appears in SMIR. Check the SMIR JSON to confirm. | ||
|
||
### Issue: Execution stuck at `#execIntrinsic` | ||
**Solution**: Your rule pattern doesn't match. Check: | ||
- The exact intrinsic name in the symbol | ||
- Value types on K cell (use `ListItem(VAL:Value)` to match any value) | ||
- Number of arguments expected | ||
|
||
### Issue: Recursion/infinite loop | ||
**Solution**: Use helper functions to separate evaluation stages, avoid calling `#execIntrinsic` within itself. | ||
|
||
### Issue: Different backend results | ||
**Solution**: | ||
- Increase execution depth if needed | ||
- Check for backend-specific evaluation order issues | ||
- May need backend-specific expected files (`.llvm.state`, `.haskell.state`) | ||
|
||
### Issue: Test timeout | ||
**Solution**: | ||
- Start with smaller depth (e.g., 65 instead of 1000) | ||
- Optimize your rule to avoid unnecessary computation | ||
- Check for infinite loops in your implementation | ||
|
||
## Important Notes | ||
|
||
### Direct Operand Passing | ||
The current architecture passes operands directly to intrinsic rules: | ||
- **Direct Access**: Rules receive operands directly in `#execIntrinsic` | ||
- **Custom Evaluation**: Each intrinsic controls its own operand evaluation | ||
- **Better Indexing**: K can better index rules with explicit operand patterns | ||
|
||
### When to Use Helper Functions | ||
Always use a helper function when: | ||
- You need automatic operand evaluation (with `seqstrict`) | ||
- The logic is complex enough to benefit from separation | ||
- You want to transform operands before evaluation (like with `#withDeref`) | ||
|
||
### Testing Strategy | ||
1. Write the test with expected behavior first | ||
2. Generate initial state to see where it gets stuck | ||
3. Implement the minimal rule needed | ||
4. Update state to verify progress | ||
5. Iterate to handle edge cases | ||
6. Document limitations for future work | ||
|
||
## References | ||
|
||
- Recent intrinsic PRs in repository history | ||
- [Rust Intrinsics Documentation](https://doc.rust-lang.org/std/intrinsics/) |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -629,7 +629,51 @@ its argument to the destination without modification. | |||||
|
||||||
```k | ||||||
// Black box intrinsic implementation - identity function | ||||||
rule <k> #execIntrinsic(symbol("black_box"), ARG .Operands, DEST) => #setLocalValue(DEST, ARG) ... </k> | ||||||
rule <k> #execIntrinsic(symbol("black_box"), ARG:Operand .Operands, DEST) | ||||||
=> #setLocalValue(DEST, ARG) | ||||||
... </k> | ||||||
``` | ||||||
|
||||||
#### Raw Eq (`std::intrinsics::raw_eq`) | ||||||
|
||||||
The `raw_eq` intrinsic performs byte-by-byte equality comparison of the memory contents pointed to by two references. | ||||||
It returns a boolean value indicating whether the referenced values are equal. The implementation dereferences the | ||||||
provided References to access the underlying values, then compares them using K's built-in equality operator. | ||||||
This intrinsic is typically used for low-level memory comparison operations where type-specific equality methods | ||||||
are not suitable. | ||||||
|
||||||
**Current Limitations:** | ||||||
The current implementation only handles the simple case where References point to values of the same type. | ||||||
More complex scenarios require additional testing and implementation work: | ||||||
- References to different types with the same memory representation | ||||||
- References to composite types (structs, arrays, enums) | ||||||
- References with different alignments or padding | ||||||
|
||||||
Handling different types may require converting values to their byte representations before comparison, | ||||||
which will need to be addressed when such use cases are encountered. | ||||||
|
||||||
```k | ||||||
// Raw eq intrinsic - byte-by-byte equality comparison of referenced values | ||||||
|
||||||
// Helper function to append projectionElemDeref to an operand | ||||||
syntax Operand ::= #withDeref(Operand) [function, total] | ||||||
rule #withDeref(operandCopy(place(LOCAL, PROJ))) | ||||||
=> operandCopy(place(LOCAL, projectionElemDeref PROJ)) | ||||||
rule #withDeref(operandMove(place(LOCAL, PROJ))) | ||||||
=> operandMove(place(LOCAL, projectionElemDeref PROJ)) | ||||||
rule #withDeref(OP) => OP [owise] | ||||||
|
||||||
// Handle raw_eq intrinsic by dereferencing operands | ||||||
rule <k> #execIntrinsic(symbol("raw_eq"), ARG1:Operand ARG2:Operand .Operands, PLACE) | ||||||
=> #execRawEq(PLACE, #withDeref(ARG1), #withDeref(ARG2)) | ||||||
... </k> | ||||||
|
||||||
// Execute raw_eq with operand evaluation via seqstrict | ||||||
syntax KItem ::= #execRawEq(Place, KItem, KItem) [seqstrict(2,3)] | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
|
||||||
rule <k> #execRawEq(DEST, VAL1:Value, VAL2:Value) | ||||||
=> #setLocalValue(DEST, BoolVal(VAL1 ==K VAL2)) | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a subtlety here that we have to address at some point: The enum MyEnum {
A(u64),
B(i64),
} and struct MyStruct(u64); then There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We could compare the layouts (stable-mir-json is extracting them but they are not currently in the MIR-AST in K), but that will become complex. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, what There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To distinguish this case, we need to store the original type of the As I mentioned this issue #666, should we tackle all the features in this PR? What kind of case can be moved to the furture work? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (PS. Apologies. I'm trying to type in English directly without AI rephrasing for faster response. But I may use some wrong grammar and words.) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Maybe we should have a way to provide python hooks for K? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this feature will be used in p-token, @dkcumming? If not, maybe we should move this to future work. And for this case, I'm curious about the real ouput of this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As a first version, we could only implement There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems like this issue touches the deep about what our architecture should look like, including:
BTW, it looks weird with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the code in |
||||||
... </k> | ||||||
|
||||||
``` | ||||||
|
||||||
### Stopping on Program Errors | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#![feature(core_intrinsics)] | ||
|
||
use std::intrinsics::raw_eq; | ||
|
||
// NOTE: This test demonstrates the raw_eq intrinsic implementation. | ||
// Known issue: Haskell backend produces more verbose symbolic execution output | ||
// compared to LLVM backend due to different handling of symbolic branches. | ||
// Both backends correctly implement the intrinsic but output format differs. | ||
Comment on lines
+5
to
+8
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this true? Then the test should not pass because both HS backend and LLVM backend are tested. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I just changed the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This means the LLVM backend is taking one branch and the HS backend explores more than one branch (non-deterministic branching). That should be investigated. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I found that is a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It could be a problem with one of the new rules. Maybe related to using |
||
|
||
fn main() { | ||
let a = 42; | ||
let b = 42; | ||
|
||
// Test raw_eq intrinsic with identical values | ||
let result = unsafe { raw_eq(&a, &b) }; | ||
|
||
// This assertion should pass as both values are identical | ||
assert!(result); | ||
} |
Large diffs are not rendered by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should append, not prepend... it has to use the
appendP
helper function (defined somewhere in data.md)