|
| 1 | +--- |
| 2 | +authors: phisch |
| 3 | +tags: [publications] |
| 4 | +--- |
| 5 | + |
| 6 | +# Foundations for Verification of Continuous Programs with Caesar |
| 7 | + |
| 8 | +The paper [_"Foundations for Deductive Verification of Continuous Probabilistic Programs: From Lebesgue to Riemann and Back"_](https://dl.acm.org/doi/10.1145/3720429) by Kevin Batz, Joost-Pieter Katoen, Francesca Randone, and Tobias Winkler was recently published at [OOPSLA 2025](https://2025.splashcon.org/track/OOPSLA). |
| 9 | + |
| 10 | +Before this work, Caesar was able to only verify simple *discrete* probabilistic programs, i.e. programs that only sample from discrete distributions. |
| 11 | +This paper lays out the formal foundations for us to verify probabilistic programs that sample from continuous distributions, with support for loops, and conditioning. |
| 12 | +One challenge is to integrate the integrals for the expected values that arise from continuous distributions into the deductive verification framework of Caesar. |
| 13 | +The key idea is to soundly under- or over-approximate these integrals via [Riemann sums](https://en.wikipedia.org/wiki/Riemann_sum). |
| 14 | +In addition to theoretical results such as convergence and completeness of the approach, the paper also provides case studies of continuous probabilistic programs that are encoded in HeyVL and verified with Caesar. |
| 15 | + |
| 16 | +**In this post:** |
| 17 | +1. [Encoding Riemann Sums in HeyVL](./2025-04-11-foundations-continuous.md#encoding-riemann-sums-in-heyvl) |
| 18 | +2. [Tortoise-Hare Race Example](./2025-04-11-foundations-continuous.md#tortoise-hare-race-example) |
| 19 | +3. [Beyond The Continuous Uniform Distribution](./2025-04-11-foundations-continuous.md#beyond-the-continuous-uniform-distribution) |
| 20 | + |
| 21 | + |
| 22 | +<!-- truncate --> |
| 23 | + |
| 24 | +## Encoding Riemann Sums in HeyVL |
| 25 | + |
| 26 | +The HeyVL encoding of the Riemann sum approximation is simple. |
| 27 | +Consider sampling the variable `x` from the *continuous* uniform distribution over the $[0,1]$. |
| 28 | + |
| 29 | +The original sampling interval $[0,1]$ is divided into $N+1$ subintervals, where $N$ has to be a chosen integer literal. |
| 30 | +Then, the encoding looks as follows: |
| 31 | +```heyvl |
| 32 | +var x: UReal |
| 33 | +var j: UInt = unif(0, N) |
| 34 | +cohavoc x |
| 35 | +coassume ?!(j / N <= x && x <= (j+1) / N) |
| 36 | +``` |
| 37 | + |
| 38 | +We sample a random integer $j$ from the *discrete* uniform distribution over $[0,N)$ using the [built-in `unif` distribution](/docs/stdlib/distributions.md). |
| 39 | +The sampled integer $j$ is then used to select the subinterval $I = [j/N, (j+1)/N)$. |
| 40 | +To overapproximate the expected value on the subinterval $I$, we *nondeterministically* select a value $x \in I$ such that the expected value is maximized. |
| 41 | +This is done using the `cohavoc` and `coassume` statements. |
| 42 | + |
| 43 | +Increasing the parameter $N$ leads to a more precise approximation of the expected value, but incurs a slowdown in the verification time. |
| 44 | + |
| 45 | +The above encodes the Riemann sum over-approximation of the expected value of a function $f$ when sampling uniformly from the continuous interval $[0,1]$. |
| 46 | +Formally: |
| 47 | + |
| 48 | +$$ |
| 49 | +\int_0^1 f(x) \, dx \quad\leq\quad \sum_{j=0}^N \frac{1}{N+1} \cdot \sup_{ j \in [j/N, (j+1)/N) } f\left(\frac{j}{N+1}\right) |
| 50 | +$$ |
| 51 | + |
| 52 | +A dual version of the encoding can be used to obtain an *under*-approximation of the expected value. |
| 53 | +Simply use `havoc` and `assume` statements instead of `cohavoc` and `coassume`, and use `?(...)` instead of `!?(...)`. |
| 54 | + |
| 55 | +A more detailed explanation can be found in [Section 9.1 of the paper](https://dl.acm.org/doi/pdf/10.1145/3720429#page=21). |
| 56 | + |
| 57 | + |
| 58 | +## Tortoise-Hare Race Example |
| 59 | + |
| 60 | +[Example 9.2.3 from the paper](https://dl.acm.org/doi/pdf/10.1145/3720429#page=23) models a race between a tortoise and a hare. |
| 61 | +As long as the hare did not overtake the tortoise, the hare flips a fair coin to decide whether to move or not. |
| 62 | +If the hare decides to move, it samples a distance uniformly at random from $[0, 10]$. |
| 63 | +The tortoise always moves exactly one step. |
| 64 | +The following HeyVL code encodes a proof that $$\mathrm{wp}\llbracket\texttt{tortoise\_hare}\rrbracket(\texttt{count}) \leq 3.38 \cdot (t - h + 2)~,$$ where $h$ and $t$ are the initial positions of the tortoise and hare, respectively. |
| 65 | +Here, we chose the parameter $N = 8$ for the Riemann sum approximation. |
| 66 | + |
| 67 | +```heyvl |
| 68 | +coproc tortoise_hare(init_h: UReal, init_t: UReal) -> (count: UReal) |
| 69 | + pre 3.38*((init_t - init_h) + 2) |
| 70 | + post count |
| 71 | +{ |
| 72 | + var h: UReal = init_h |
| 73 | + var t: UReal = init_t |
| 74 | +
|
| 75 | + count = 0 |
| 76 | +
|
| 77 | + @invariant(ite(h <= t, count + 3.38*((t-h) + 2), count)) |
| 78 | + while h <= t { |
| 79 | + var choice: Bool = flip(0.5) |
| 80 | + if choice { |
| 81 | +
|
| 82 | + // ------------------------------------------- |
| 83 | + // Over-approximating the continuous sampling: |
| 84 | + // inc = unif[0,1] |
| 85 | + // |
| 86 | + var inc: UReal |
| 87 | + var N: UInt = 8 |
| 88 | + var j: UInt = unif(0, 7) // discrete sampling |
| 89 | + cohavoc inc |
| 90 | + coassume ?!(j / N <= inc && inc <= (j+1) / N) |
| 91 | + // ------------------------------------------- |
| 92 | +
|
| 93 | + inc = 10 * inc |
| 94 | + h = h + inc |
| 95 | + } else {} |
| 96 | +
|
| 97 | + t = t + 1 |
| 98 | + count = count + 1 |
| 99 | + } |
| 100 | +} |
| 101 | +``` |
| 102 | + |
| 103 | +[Section 9.3 of the paper](https://dl.acm.org/doi/pdf/10.1145/3720429#page=24) contains a detailed experimental evaluation with more examples. |
| 104 | +[An associated artifact is available on Zenodo](https://doi.org/10.5281/zenodo.15175355). |
| 105 | + |
| 106 | +## Beyond The Continuous Uniform Distribution |
| 107 | + |
| 108 | +In the paper, only a statement to sample from the continuous uniform distribution is provided. |
| 109 | +However, as their [*Remark 1*](https://dl.acm.org/doi/pdf/10.1145/3720429#page=10) states, this does not limit expressiveness. |
| 110 | +By applying the [inverse transform sampling method](https://en.wikipedia.org/wiki/Inverse_transform_sampling), any continuous distribution can be sampled. |
| 111 | +[This paper](https://link.springer.com/chapter/10.1007/978-3-030-55089-9_3) by Marcin Szymczak and Joost-Pieter Katoen contains some examples of how to sample from e.g. the normal distribution. |
0 commit comments