Delving Simplicity Part Ⅲ: Building Data Types #310
roconnor-blockstream
announced in
Delving Simplicity
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
This post is cross-posted in https://delvingbitcoin.org/t/delving-simplicity-part-building-data-types/1956.
In Part Ⅱ of this series, we introduced Simplicity’s type system and its core set of computational combinators. However, with only three kinds of type formers and nine core combinators, the language may seem too meager for practical computations. Can we really build any computation using only the three fundamental forms of composition we described in Part Ⅰ of this series? In the same way that computers are built out of logic gates, we start with the humble bit and build up abstractions from it. Let’s delve in.
Boolean Logic
In Part Ⅱ, we introduced the Boolean type, which is
𝟙 + 𝟙
. We use the notation𝟚
for this type. It has two values:σᴸ⟨⟩
andσᴿ⟨⟩
. By convention, we use the notation0
orfalse
forσᴸ⟨⟩
, and the notation1
ortrue
forσᴿ⟨⟩
. Using our computational combinators, we can build Boolean logic operators.And operation
The logical
and : 𝟚 × 𝟚 ⊢ 𝟚
operation takes two bits as input and returns one bit as output. Since Simplicity expressions have only a single input type, we use the product of the two input arguments as the input type. There are several ways of implementing this function. One way is to branch on the first bit: if the first bit isfalse
, we returnfalse
; otherwise, we return the second bit.We can check this definition on the input
⟨false, false⟩
by using equational reasoning:We get the expected result of
false
.Let’s try it again on the input
⟨true, true⟩
.Again, we get the expected result of
true
. We will leave trying the remaining inputs for the curious reader.Other logic operations
The definition of the
not
operation requires a helper combinator:The initial
iden ▵ unit : A ⊢ A × 𝟙
adds an empty “environment” to the input, enabling thecase
combinator to apply. The use oftake
in the two branches drops this empty environment to executef
org
.Now we can define all other Boolean logical operations:
or ≔ case (drop iden) (injr unit) : 𝟚 × 𝟚 ⊢ 𝟚
not ≔ copair (injr unit) (injl unit) : 𝟚 ⊢ 𝟚
xor ≔ case (drop iden) (drop not) : 𝟚 × 𝟚 ⊢ 𝟚
Bit Adders
In digital logic, a “half-adder” is a circuit that takes two bits and adds them, producing a two-bit output: a carry bit and the sum bit. We can implement this in Simplicity by pairing up the
and
andxor
operations:This function has two output bits, so we return a pair of the two output types.
A “full-adder” takes an additional third “carry-in” bit and adds them all together, producing the same two-bit output. We must use a nested tuple to represent three inputs. We have two choices:
𝟚 × (𝟚 × 𝟚)
or(𝟚 × 𝟚) × 𝟚
. These types are not quite the same, and we must choose one. For this example, we will choose the latter.Types in Simplicity are often nested tuples, representing an “environment” of parameters. It is common to use sequences of
take
,drop
, andiden
to access specific parameters from such an environment. For sequences of these combinators, a compact notation will be used:O f
is notation fortake f
.I f
is notation fordrop f
.H
is notation foriden
.For example,
I O H
meansdrop (take iden) : A × (B × C) ⊢ B
, which extracts the middle value out of its input.🛈 The use of
O
andI
fortake
anddrop
is meant to evoke a sense of binary digits. If we think of a nested tuple as a binary tree and label the tree’s positions with positive binary numbers appropriately, then our notation becomes the reversed binary digits (thinking ofH
as the leading 1 digit) of a location within that tree from which we are retrieving data. Such expressions form a kind of De Bruijn indices for Simplicity⚠ The
I
,O
, andH
notation is only used for subexpressions that consist only oftake
,drop
, andiden
. They are not used for other occurrences oftake
ordrop
.A full-adder is the composition of two half-adders, taking the logical
or
of the two carry bits:In the first line,
take half-adder ▵ I H : (𝟚 × 𝟚) × 𝟚 ⊢ (𝟚 × 𝟚) × 𝟚
, we run the half-adder on the first two bits, and we save the last bit. In the second line,O O H ▵ (O I H ▵ I H ⨾ half-adder) : (𝟚 × 𝟚) × 𝟚 ⊢ 𝟚 × (𝟚 × 𝟚)
, we save the first bit (which is the carry-out bit of the first half-adder) and run the half-adder on the last two bits. In the last line,(O H ▵ I O H ⨾ or) ▵ I I H: 𝟚 × (𝟚 × 𝟚) ⊢ 𝟚 × 𝟚
, we take the logical or of the first two bits (which are the carry-outs of the two half-adders) and also return the sum-out bit of the second half-adder.This code illustrates how one programs in Simplicity. We use the
I
,O
, andH
notation to reference bits of data from the subexpression’s input type, which is then formed into a suitable “environment” that is used to “call” other functions using sequential composition.Simplicity users do not have to define such low-level operations themselves. Later in this series, we will discuss our standard library of jets that implement common functions.
Furthermore, end users are not expected to directly program in Simplicity, in the same way that end users are not expected to directly program with Bitcoin Script. Instead, we expect end users will use higher-level languages, such as SimplicityHL, to generate Simplicity. For example, SimplicityHL manages the “environment” of each subexpression, letting one use named variables, which are compiled into appropriate sequence of
take
s anddrop
s.Vectors
Given a Simplicity type
A
, we can define fixed-length vectors by forming iterated products ofA
:A² ≔ A × A
A⁴ ≔ A² × A²
A⁸ ≔ A⁴ × A⁴
…
These types may be alternatively written as
A^2
,A^4
,A^8
, etc.We will define vectors only for lengths that are powers of two. One can define notation for other powers, but a convention needs to be chosen for how to bracket the product types.
Given an expression
f : A ⊢ B
, we can repeatedly pair it with itself in order to “map” it over a vector of any fixed length:f² ≔ f ▵ f : A² ⊢ B²
f⁴ ≔ f² ▵ f² : A⁴ ⊢ B⁴
f⁸ ≔ f⁴ ▵ f⁴ : A⁸ ⊢ B⁸
Given a function
f : A × B ⊢ B
, we can iterate or “fold” it over a vector of any fixed length:fold-right-2 f ≔ O O H ▵ (O I H ▵ I H ⨾ f) ⨾ f : A² × B ⊢ B
fold-right-4 f ≔ fold-right-2 (fold-right-2 f) : A⁴ × B ⊢ B
fold-right-8 f ≔ fold-right-2 (fold-right-4 f) : A⁸ × B ⊢ B
There are many variations of these constructions we can implement. Given
f : A × B ⊢ C
, we can “zip” it over a pair of vectors withzip-n f : (Aⁿ × Bⁿ) ⊢ Cⁿ
. Givenf : (A × B) × C ⊢ C
, we can fold it over a pair of vectors withbifold-right-n f : (Aⁿ × Bⁿ) ⊢ C
. We can combinemap
andfold-right
into an accumulating combinator that takesf : A × C ⊢ C × B
and results inmap-accum-right-n f : Aⁿ × C ⊢ C × Bⁿ
. Many more variants are possible.Multi-bit Words
A vector of bits gives us multi-bit integers. For example,
𝟚³²
is a 32-bit word type.𝟚²⁵⁶
is a 256-bit word type, which is suitable for hashes and other cryptographic operations.Given our “full-adder” on bits, we can use a variant of the vector operations above to define a “ripple carry adder” over multi-bit words:
full-adder-n
takes two n-bit binary numbers and a one-bit carry-input value and returns a one-bit carry-out flag and an n-bit sum.SHA-256
In this manner, we can recursively define all our arithmetic operations on multi-bit words: subtraction, multiplication, division, etc. We can define bit-wise logical operations such as logical and, logical or, logical xor, etc. By repeatedly combining these operations, we can even build SHA-256’s block compression function:
sha256-hash-block ≔ … : 𝟚²⁵⁶ × 𝟚⁵¹² ⊢ 𝟚²⁵⁶
But you don’t have to trust us. The SHA-256 compression is formally defined using Simplicity within the Rocq proof assistant (formerly named the Coq proof assistant), and it comes with a formal proof that the
sha256-hash-block
implementation is correct.Of course, the compression runs too slowly to be practical when executing it as raw Simplicity. We will introduce jets later in this series, which are used to execute common functions such as the SHA-256 compression function natively. However, we will still make use of such pure Simplicity implementations as formal specifications for our jets.
Option Types
Option types are made by taking a sum with the unit type:
Option A ≔ 𝟙 + A
The type
Option A
may also be written asA?
or𝕊 A
(where𝕊
stands for “successor”). As with vectors, functions can be “mapped” over the option type:We can also define monadic combinators such as bind:
Variable Length Buffers
We can also define what we call “buffers,” which are types for partially filled vectors.
Aᑉ² ≔ A?
Aᑉ⁴ ≔ A²? × Aᑉ²
Aᑉ⁸ ≔ A⁴? × Aᑉ⁴
…
🛈 The type
Xᑉ⁸
expands to(1 + X⁴) × ((1 + X²) × (1 + X))
. If we pretend this is a polynomial and expand it, we get1 + X + X² + X³ + X⁴ + X⁵ + X⁶ + X⁷
. Interpreting this polynomial as a type, it is the sum of all possible tuples of X up to 7, including the empty tuple. In other words, it is exactly the type of lists whose length is strictly less than 8.As with vectors, we can define various mapping and folding operations over buffers. We can also define stack operations such as
push-<n : Aᑉⁿ × A ⊢ Aⁿ + Aᑉⁿ
andpop-<n : Aᑉⁿ ⊢ (Aᑉⁿ × A)?
.push-<n
takes a buffer and a new item and appends it to the buffer, returning a full vector in case the buffer overflows.pop-<n
takes a buffer and removes an item, returning the smaller buffer and the item removed, optionally returning nothing in case the original buffer was empty.For example, we can define
push-<n
recursively:…
As you can see, raw Simplicity becomes inscrutable once one reaches a certain level of complexity. Again, we expect end users to utilize higher-level languages, such as SimplicityHL, that can generate these idiomatic expressions.
Conclusion
In this part, we illustrated how to build logical operations starting from bits. From these logic operations, we were able to build bit-level arithmetic and illustrate how to reason about their execution. After developing vector types, we showed how to iterate over multi-bit words to define arithmetic. Continuing in this way, we can define cryptographic operations, such as SHA-256 and Schnorr signature validation, using just our computational Simplicity combinators. All of which we have actually defined using Simplicity.
This part is not meant as a comprehensive guide to the various possible data types and operations that we can build in Simplicity, but instead, it serves to illustrate how it is possible to build practical functionality within Simplicity’s constraints. Even though Simplicity’s types are all finitely bounded, we can still define useful vectors and buffer types and operations that iterate over values within these structures.
The actual specifications of operations in our standard library are a little different from the definitions used here. For instance, the full-adder in our standard library uses a 3-way xor and a “majority” logic function in its implementation, rather than using two half-adders as shown here.
Later, we will see that, in practice, Simplicity programs use jets for arithmetic and cryptographic operations. However, jets are limited to replacing expressions. The combinators for iterating over buffers and vectors we have defined cannot be replaced with jets. These defined combinators will appear in actual Simplicity programs. Although, rather than directly using these combinators, we expect end users will use higher-level languages, such as SimplicityHL, to generate such Simplicity expressions.
A keen-eyed reader may have noticed that our recursively defined combinators appear to grow exponentially in expression size. However, this is not a problem. Later, when we talk about serialization, we will see that expressions are encoded as a DAG (directed acyclic graph) rather than as a tree. So the actual presentation of these expressions will only grow linearly in size.
So far, we have only been considering pure computations. However, we need some way to interact with transaction data for tasks like signing transactions, and we need to be able to have the program fail somehow if the signature is not valid. In the next part, we will discuss side-effects in Simplicity.
Beta Was this translation helpful? Give feedback.
All reactions