Skip to content

Avoid redundant upcasts in UInt8/UInt16 comparisons#715

Open
elefthei wants to merge 2 commits intomasterfrom
lef/fix-694-eurydice
Open

Avoid redundant upcasts in UInt8/UInt16 comparisons#715
elefthei wants to merge 2 commits intomasterfrom
lef/fix-694-eurydice

Conversation

@elefthei
Copy link
Copy Markdown

This should fix the Eurydice regression from #694 @protz

When both operands of a comparison are atomic (variables, field accesses, function results — not arithmetic expressions), strip the unnecessary Cast(_, UInt32) wrappers and compare at native width.

Before: (uint32_t)a == (uint32_t)b (redundant casts)
After: a == b (clean, no casts)

Arithmetic comparisons are unaffected — they still widen and mask:
((uint32_t)a + (uint32_t)b & 0xFFU) == (uint32_t)c

@elefthei elefthei requested a review from msprotz March 31, 2026 18:16
@elefthei
Copy link
Copy Markdown
Author

Maybe let's add Eurydice to CI and merge this afterwards?

When both operands of a comparison are atomic (variables, field accesses,
function results — not arithmetic expressions), strip the unnecessary
Cast(_, UInt32) wrappers and compare at native width.

Before: (uint32_t)a == (uint32_t)b   (redundant casts)
After:  a == b                       (clean, no casts)

Arithmetic comparisons are unaffected — they still widen and mask:
  ((uint32_t)a + (uint32_t)b & 0xFFU) == (uint32_t)c

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@elefthei elefthei force-pushed the lef/fix-694-eurydice branch from 07b6e4b to aeb42bd Compare March 31, 2026 18:51
@protz
Copy link
Copy Markdown
Collaborator

protz commented Mar 31, 2026

Ok just catching up on this. Thanks. This is an acceptable fix for #716 -- I'll let you decide if you prefer this or my suggestion from #703. I'm talking to @Nadrieril tomorrow morning so we'll see about Eurydice CI then, unless @tahina-pro understands enough to copy/paste the nix build description from Eurydice to here.

@elefthei
Copy link
Copy Markdown
Author

Let me take a stab at it, I will add eurydice NIX on this PR.

@protz
Copy link
Copy Markdown
Collaborator

protz commented Mar 31, 2026

ok interesting diff -- we'd have to review this carefully

@protz
Copy link
Copy Markdown
Collaborator

protz commented Mar 31, 2026

tests> -    outi = ((uint32_t)lhs.ptr[i0] & (uint32_t)mask) | ((uint32_t)rhs.ptr[i0] & (uint32_t)~mask);
tests> +    outi =
tests> +      ((uint32_t)lhs.ptr[i0] & (uint32_t)mask) | ((uint32_t)rhs.ptr[i0] & (~(uint32_t)mask & 0xFFU));

@protz
Copy link
Copy Markdown
Collaborator

protz commented Mar 31, 2026

I think this looks ok but I'd like a second careful review.

Comment on lines +307 to +319
test-eurydice:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master

- name: Install Nix
uses: DeterminateSystems/nix-installer-action@main

- name: Build and test Eurydice against this Karamel
run: |
nix build -L --no-link \
"github:AeneasVerif/eurydice#checks.x86_64-linux.default" \
--override-input karamel "github:FStarLang/karamel/$GITHUB_SHA"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! The only issue is that this will take ages because it recompiles Charon and all Eurydice dependencies from scratch every time. You should be able to pull paths directly from our CI system by using the https://hacl.cachix.org cachix path. Maybe this is enough:

- uses: cachix/cachix-action@v15
  with:
    name: hacl

test-eurydice:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
Copy link
Copy Markdown

@Nadrieril Nadrieril Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need checkout actually, the flake will fetch it

Builds and tests Eurydice against the current Karamel commit using nix.
Uses --override-input to point Eurydice's karamel flake input at
the PR branch, catching extraction regressions in downstream projects.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@elefthei elefthei force-pushed the lef/fix-694-eurydice branch from 418e86c to b0d1b70 Compare April 1, 2026 23:37
@elefthei
Copy link
Copy Markdown
Author

elefthei commented Apr 1, 2026

Indeed, this is not equivalent when variables are 32-bits. Here is an SMT-LIB2 that shows this.

; Equivalence check: bitwise-select regression
; Models C integer promotion and (uint32_t) casts with bitvectors.
;
; Old (pre-patch):
;   outi = ((uint32_t)lhs.ptr[i0] & (uint32_t)mask)
;        | ((uint32_t)rhs.ptr[i0] & (uint32_t)~mask);
;
; New (post-patch):
;   outi = ((uint32_t)lhs.ptr[i0] & (uint32_t)mask)
;        | ((uint32_t)rhs.ptr[i0] & (~(uint32_t)mask & 0xFFU));

(set-logic QF_BV)
(set-option :produce-models true)

; =====================================================================
; Check 1: All source values are uint8_t
;   lhs.ptr[i0], rhs.ptr[i0], mask : uint8_t
; =====================================================================
(echo "=== Check 1: All sources uint8_t ===")
(push)

(declare-const lhs_u8  (_ BitVec 8))
(declare-const rhs_u8  (_ BitVec 8))
(declare-const mask_u8 (_ BitVec 8))

; --- (uint32_t) casts: zero-extend 8 -> 32 ---
(define-fun lhs32 () (_ BitVec 32) ((_ zero_extend 24) lhs_u8))
(define-fun rhs32 () (_ BitVec 32) ((_ zero_extend 24) rhs_u8))
(define-fun mask32 () (_ BitVec 32) ((_ zero_extend 24) mask_u8))

; --- Old expression ---
; C semantics of (uint32_t)~mask when mask is uint8_t:
;   1. mask_u8 undergoes integer promotion to int (32-bit signed)
;      i.e. zero_extend to 32 bits (value fits in non-negative int)
;   2. ~ applied to the promoted int  -> bvnot of 32-bit value
;   3. cast to uint32_t               -> reinterpret (no-op on bits)
; Result: bvnot(zero_extend_24(mask_u8))
(define-fun old_mask_compl () (_ BitVec 32)
  (bvnot ((_ zero_extend 24) mask_u8)))   ; (uint32_t)~mask

(define-fun old_result () (_ BitVec 32)
  (bvor (bvand lhs32 mask32)
        (bvand rhs32 old_mask_compl)))

; --- New expression ---
; C semantics of ~(uint32_t)mask & 0xFFU:
;   1. (uint32_t)mask  -> zero_extend to 32 bits
;   2. ~               -> bvnot of 32-bit value
;   3. & 0xFFU         -> bvand with 0x000000FF (truncate to low byte)
(define-fun new_mask_compl () (_ BitVec 32)
  (bvand (bvnot ((_ zero_extend 24) mask_u8))  ; ~(uint32_t)mask
         #x000000FF))                           ; & 0xFFU

(define-fun new_result () (_ BitVec 32)
  (bvor (bvand lhs32 mask32)
        (bvand rhs32 new_mask_compl)))

(assert (not (= old_result new_result)))
(check-sat)   ; expected: unsat (equivalent for uint8_t inputs)
(pop)

; =====================================================================
; Check 2: lhs, rhs are uint8_t; mask is uint32_t
;   Models a wider mask (e.g. computed or passed as uint32_t).
; =====================================================================
(echo "=== Check 2: lhs/rhs uint8_t, mask uint32_t ===")
(push)

(declare-const lhs_u8_2  (_ BitVec 8))
(declare-const rhs_u8_2  (_ BitVec 8))
(declare-const mask_u32   (_ BitVec 32))

(define-fun lhs32_2 () (_ BitVec 32) ((_ zero_extend 24) lhs_u8_2))
(define-fun rhs32_2 () (_ BitVec 32) ((_ zero_extend 24) rhs_u8_2))

; Old: (uint32_t)~mask -- mask is already uint32_t, ~ is 32-bit NOT
(define-fun old_mask_compl_2 () (_ BitVec 32)
  (bvnot mask_u32))

(define-fun old_result_2 () (_ BitVec 32)
  (bvor (bvand lhs32_2 mask_u32)
        (bvand rhs32_2 old_mask_compl_2)))

; New: ~(uint32_t)mask & 0xFFU -- mask already uint32_t
(define-fun new_mask_compl_2 () (_ BitVec 32)
  (bvand (bvnot mask_u32) #x000000FF))

(define-fun new_result_2 () (_ BitVec 32)
  (bvor (bvand lhs32_2 mask_u32)
        (bvand rhs32_2 new_mask_compl_2)))

(assert (not (= old_result_2 new_result_2)))
(check-sat)   ; result: unsat — equivalent! rhs is 8-bit so upper bits are 0 anyway
(pop)

; =====================================================================
; Check 3: All sources uint32_t (worst case)
; =====================================================================
(echo "=== Check 3: All sources uint32_t ===")
(push)

(declare-const lhs_u32    (_ BitVec 32))
(declare-const rhs_u32    (_ BitVec 32))
(declare-const mask_u32_3 (_ BitVec 32))

(define-fun old_result_3 () (_ BitVec 32)
  (bvor (bvand lhs_u32 mask_u32_3)
        (bvand rhs_u32 (bvnot mask_u32_3))))

(define-fun new_result_3 () (_ BitVec 32)
  (bvor (bvand lhs_u32 mask_u32_3)
        (bvand rhs_u32 (bvand (bvnot mask_u32_3) #x000000FF))))

(assert (not (= old_result_3 new_result_3)))
(check-sat)   ; expected: sat (NOT equivalent)
(get-model)
(pop)

@protz
Copy link
Copy Markdown
Collaborator

protz commented Apr 2, 2026

How do I read the SMT2 file?
What is the counterexample?
Is the new behavior correct or the old behavior correct?

@Nadrieril
Copy link
Copy Markdown

Nix part LGTM

@elefthei
Copy link
Copy Markdown
Author

elefthei commented Apr 2, 2026

If lhs.ptr[i0], rhs.ptr[i0] and mask are u32, there is a counter-example given by running the SMT query. If not, this is sound. I am not sure what their type is from looking at the CI logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants