Commit 206b731

committed
added new chapter "value holders"
1 parent 3e55ab4 commit 206b731

6 files changed

+286
-1
lines changed

Chapters/5-valueholders.md

Lines changed: 278 additions & 0 deletions
@@ -0,0 +1,278 @@
1+
# Pointer arithmetic and value holders
2+
3+
One of the reasons for the power of C is that, despite the appearance of having a type system, it effectively lacks one: every value is ultimately reduced to a memory position.
4+
5+
An integer value, for example `x = 42`, is nothing more than a sequence of bytes stored somewhere in memory. The exact number of bytes depends on the platform and the C type, but in memory it will look something like this:
6+
`x = | 2A | 00 | 00 | 00 |`
7+
(depending on the byte order of the platform, but that is another story).
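
To make the byte layout concrete, here is a minimal C sketch that copies the bytes of a 32-bit integer into a buffer and back. The helper names `int32_to_bytes` and `int32_from_bytes` are made up for this illustration, and the order in which the bytes appear depends on the platform.

```c
#include <stdint.h>
#include <string.h>

/* Copy the raw bytes of `value` into `out`, in the platform's own order. */
void int32_to_bytes(int32_t value, unsigned char out[4]) {
    memcpy(out, &value, sizeof value);
}

/* Rebuild the integer from the same four bytes. */
int32_t int32_from_bytes(const unsigned char in[4]) {
    int32_t value;
    memcpy(&value, in, sizeof value);
    return value;
}
```

For 42, one of the four bytes is `0x2A` and the other three are `0x00`. On a little-endian machine `0x2A` comes first, as in the layout shown above.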
8+
9+
What matters here is that `x` exists at a **position in memory**, inside a chunk of memory assigned by the program, and the C compiler is able to manipulate it by performing operations on that memory location.
10+
11+
C exposes this fact explicitly through a fundamental operator: `&`.
12+
The `&` operator gives you the **address of** a variable --- that is, the position in memory where the variable is stored. Instead of giving you the value stored at that position, it gives you the position itself.
13+
14+
This is fundamental in C for constructing arrays and other data structures, **but it is also fundamental for passing data between functions**. If I can obtain the address of a variable, I can pass that address to another function, allowing it to modify the original variable without having direct access to it.
15+
16+
For example:
17+
18+
```language=c
#include <stdio.h>

void store_value(int *value) {
    *value = 42;
}

int main(void) {
    int x;
    store_value(&x);
    printf("x = %d\n", x);
    /* It will print "x = 42" */
    return 0;
}
```
29+
30+
Here, `store_value` receives not the value of `x`, but its address. By dereferencing that address, the function modifies the memory where `x` is stored.
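
The difference between receiving a value and receiving an address can be seen side by side. The following is a small sketch; `set_by_value` and `set_by_pointer` are illustrative names, not part of any library.

```c
/* Receives a copy of the caller's value: the assignment is lost on return. */
void set_by_value(int value) {
    value = 42;
    (void)value; /* silence "set but not used" warnings */
}

/* Receives the caller's address: the assignment is visible to the caller. */
void set_by_pointer(int *value) {
    *value = 42;
}
```

Calling `set_by_value(x)` leaves `x` untouched, while `set_by_pointer(&x)` changes it.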
31+
32+
## What does this mean in uFFI?
33+
34+
Historically, this has been handled in uFFI by passing a `ByteArray` to the C function. A `ByteArray` is essentially a reification of a chunk of memory, which allows C to operate on it directly.
35+
36+
Using the low-level mechanisms provided by uFFI, this looks like:
37+
38+
```language=smalltalk
| buffer x |

buffer := ByteArray new: 4.
self store_value: buffer.
"store_value defined as: <ffiCall: #(void store_value(int *x))>"
x := buffer signedLongAt: 1.
```
46+
47+
In short, we pass a `ByteArray` to a C function that expects a pointer to an integer, then extract the value from that memory location using a primitive.
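
Seen from the C side, the same idea looks roughly like the sketch below. It is only an analogy: `read_through_raw_buffer` is a hypothetical helper, and `store_value` is declared with `int32_t` here so that the 4-byte size is explicit.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

void store_value(int32_t *value) {
    *value = 42;
}

/* Returns the value C wrote into the raw buffer, or -1 if allocation fails. */
int32_t read_through_raw_buffer(void) {
    void *buffer = malloc(4);          /* plays the role of `ByteArray new: 4` */
    int32_t x = -1;
    if (buffer != NULL) {
        store_value(buffer);           /* C only sees an address */
        memcpy(&x, buffer, sizeof x);  /* plays the role of `signedLongAt: 1` */
        free(buffer);
    }
    return x;
}
```

Nothing in the buffer says "this is an int": the size and the way to read it back are knowledge the caller has to carry around.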
48+
49+
**This is straightforward, but it is low-level and error-prone.**
50+
It requires detailed knowledge of memory layout, sizes, and access primitives.
51+
52+
## ... enter value holders
53+
54+
To simplify this complexity and make code easier to write and understand, we introduce **value holders**.
55+
56+
Value holders are not magic. They are simply a more expressive, *Pharoish* way of representing the same low-level mechanism: "a place in memory where C will store a value".
57+
58+
Using value holders, the same example becomes:
59+
60+
```language=smalltalk
| xHolder x |
xHolder := FFIInt32 newValueHolder.
self store_value: xHolder.
x := xHolder value.
```
66+
This is still more verbose than the equivalent C code, but the *meaning* of what is happening is much clearer. Let's break it down.
67+
68+
##### 1. Value holder creation
69+
70+
```language=smalltalk
xHolder := FFIInt32 newValueHolder.
```
73+
74+
Here we create a container --- a place where an `int32` value will be stored by C.
75+
76+
Yes, this requires knowing the C type and its corresponding uFFI type. But if you are calling a C function, we assume you know what you are doing 😜.
77+
78+
##### 2. Call the C function
79+
80+
```language=smalltalk
self store_value: xHolder.
```
83+
84+
The C function is called exactly as before. The only difference is that we pass a value holder instead of a raw memory buffer. No changes to the function declaration are required.
85+
86+
##### 3. Retrieve the value
87+
88+
```language=smalltalk
x := xHolder value.
```
91+
92+
You retrieve the value by simply asking the holder for it. There is no need to remember how to read an `int32` from a pointer or a byte array using low-level primitives.
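
As a mental model, the three steps can be pictured in C as a typed slot with two accessors. This is a conceptual sketch only: the names `int32_holder`, `int32_holder_address` and `int32_holder_value` are invented here, and it does not describe how uFFI implements value holders.

```c
#include <stdint.h>

/* Step 1: a named slot with room for exactly one int32. */
typedef struct {
    int32_t storage;
} int32_holder;

/* Step 2: what the C function receives is the address of the slot. */
int32_t *int32_holder_address(int32_holder *holder) {
    return &holder->storage;
}

/* Step 3: asking for the value returns the content of the slot. */
int32_t int32_holder_value(const int32_holder *holder) {
    return holder->storage;
}

void store_value(int32_t *value) {
    *value = 42;
}
```

The type of the slot is fixed when the holder is created, so the caller no longer has to remember sizes or read primitives.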
93+
94+
This mechanism works with all basic C types defined in uFFI, including:
95+
96+
`FFIBool`, `FFIExternalString`, `FFIOop`, `FFIBoolean32`, `FFIFloat128`, `FFIFloat16`, `FFIFloat32`, `FFIFloat64`, `FFISizeT`,
97+
`FFIUInt8`, `FFIUInt16`, `FFIUInt32`, `FFIUInt64`,
98+
`FFIInt8`, `FFIInt16`, `FFIInt32`, `FFIInt64`, `FFILong`, `FFIULong`.
99+
100+
## What happens with structures, unions, and external objects?
101+
102+
Value holders work naturally for basic C types, but what about more complex ones?
103+
104+
### Structures (and unions)
105+
106+
When you pass a structure *by value* in C, it is always copied. The function receives its own copy of the structure's contents: it can read them, and even change them, but the caller never sees those changes.
107+
108+
For example:
109+
110+
```language=c
typedef struct MyStruct {
    int value1;
    int value2;
} mystructtype;

int sum(mystructtype t) {
    return t.value1 + t.value2;
}
```
120+
121+
```language=smalltalk
v := MyStruct new.
v
    value1: 10;
    value2: 10.
result := aFFILibrary sum: v.
```
128+
129+
This works correctly.
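
The copy semantics can be checked with a small C sketch. `sum_then_clear` is a hypothetical variant of `sum` that deliberately overwrites its argument to show that the caller's structure is unaffected.

```c
typedef struct MyStruct {
    int value1;
    int value2;
} mystructtype;

/* `t` is a private copy: zeroing it does not touch the caller's structure. */
int sum_then_clear(mystructtype t) {
    int result = t.value1 + t.value2;
    t.value1 = 0;
    t.value2 = 0;
    return result;
}
```

After the call, the structure owned by the caller still holds its original values.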
130+
131+
However, if you need the C function to **modify** the structure and observe those changes later, you must pass a *reference* --- that is, the address of the structure.
132+
133+
For example:
134+
135+
```language=c
typedef struct MyStruct {
    int value1;
    int value2;
} mystructtype;

void fill_values(mystructtype *t) {
    t->value1 = 10;
    t->value2 = 10;
}
```
146+
147+
This example is intentionally simplified and not realistic, but it captures the essence: modifying a structure through a pointer.
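
For comparison, this is roughly what a caller written in C would do. `filled_sum` is a hypothetical caller added for illustration; the structure and `fill_values` are repeated so the sketch compiles on its own.

```c
typedef struct MyStruct {
    int value1;
    int value2;
} mystructtype;

void fill_values(mystructtype *t) {
    t->value1 = 10;
    t->value2 = 10;
}

/* Hypothetical caller: passes the address of a local structure. */
int filled_sum(void) {
    mystructtype v = {0, 0};
    fill_values(&v);   /* the callee writes into `v` itself, not a copy */
    return v.value1 + v.value2;
}
```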
148+
149+
### Passing structure value holders
150+
151+
Using value holders, this is straightforward:
152+
153+
```language=smalltalk
structHolder := MyStruct newValueHolder.
aFFILibrary fill_values: structHolder.
v := structHolder value.
Transcript show: ('{1} + {2} = {3}' format: {
    v value1.
    v value2.
    v value1 + v value2 })
```
162+
163+
This ensures uniform access to structures and unions, just like any other type.
164+
165+
### Passing a reference to a structure
166+
167+
Sometimes you already have a structure instance and need to pass it by reference. This is common when a structure must be initialized first and then modified by a C function.
168+
169+
For this case, structures and unions provide the `referenceTo` message:
170+
171+
```language=smalltalk
v := MyStruct new.
aFFILibrary fill_values: v referenceTo.
Transcript show: ('{1} + {2} = {3}' format: {
    v value1.
    v value2.
    v value1 + v value2 })
```
179+
180+
**Note:**
181+
Before this mechanism existed, uFFI relied on mangling magic for single indirection (pointer depth = 1). Because structures are internally stored in byte arrays, passing the structure itself also worked as a reference.
182+
183+
This behavior is subtle and relies on internal implementation details. Now that an explicit mechanism exists, **we do not recommend relying on that behavior**.
184+
## Multiple pointer indirection (pointer depth > 1)
185+
186+
In C, it is common --- especially when dealing with lists --- to encounter arguments with more than one level of indirection.
187+
Of these cases, we will focus on arrays, since the others follow the same pattern.
188+
### Passing arrays
189+
In C, arrays are just pointer arithmetic. A function declared with `int *` or `char **` often expects a list of values.
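
A short sketch of what "just pointer arithmetic" means in practice. The helper names `element_at` and `element_stride` are made up for this example.

```c
#include <stddef.h>

/* Same result as list[index]: step `index` elements past the start. */
int element_at(const int *list, size_t index) {
    return *(list + index);
}

/* Distance in bytes between two consecutive elements. */
size_t element_stride(const int *list) {
    return (size_t)((const char *)(list + 1) - (const char *)list);
}
```

Adding 1 to an `int *` moves the address forward by `sizeof(int)` bytes, which is why the element type must be known to walk a list.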
190+
191+
uFFI provides an abstraction for this through the `FFIArray` class. `FFIArray` can be used both to define array types and to create instances that manage storing and retrieving data through C pointers.
192+
193+
#### Sending arrays as part of a callout
194+
195+
Consider this C function:
196+
197+
```language=c
int sum_list(const int *list, size_t size) {
    int sum = 0;
    for (size_t i = 0; i < size; i++) {
        sum += list[i];
    }
    return sum;
}
```
206+
207+
The corresponding Pharo binding:
208+
209+
```language=smalltalk
sumList: list size: size
    ^ self ffiCall: #(int sum_list(const int *list, size_t size))
```
213+
214+
Usage:
215+
216+
```language=smalltalk
arrayOfIntegers := FFIArray newType: FFIInt32 size: 5.
1 to: 5 do: [ :i | arrayOfIntegers at: i put: i factorial ].
result := self sumList: arrayOfIntegers size: 5.
```
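
For reference, the same call written entirely in C might look like the sketch below. `sum_of_factorials` is a hypothetical caller; `sum_list` is repeated so the example is self-contained.

```c
#include <stddef.h>

int sum_list(const int *list, size_t size) {
    int sum = 0;
    for (size_t i = 0; i < size; i++) {
        sum += list[i];
    }
    return sum;
}

/* Hypothetical caller: fills five slots with 1! .. 5! and sums them. */
int sum_of_factorials(void) {
    int list[5];
    int factorial = 1;
    for (int i = 1; i <= 5; i++) {
        factorial *= i;
        list[i - 1] = factorial;
    }
    return sum_list(list, 5);
}
```

With the values 1, 2, 6, 24 and 120 the result is 153.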
219+
220+
#### Retrieving arrays
221+
222+
Retrieving arrays is more complex, because memory allocation is often done by the C function itself.
223+
224+
##### Case 1: list of structures with known size
225+
226+
```language=c
#include <time.h>

/* The caller allocates `times`; the function only fills it. */
void collect_times(time_t *times, int samples) {
    for (int i = 0; i < samples; i++) {
        times[i] = time(NULL);
    }
}
```
236+
237+
Since the size is known:
238+
239+
```language=smalltalk
times := FFIArray newType: TimeT size: 3.
aFFILibrary collect_times: times samples: 3.
time1 := times at: 1.
time2 := times at: 2.
time3 := times at: 3.
```
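
The ownership rule in this case is easier to see from a C caller. The sketch below assumes that `collect_times` writes directly into the buffer it is given; `count_valid_samples` is a hypothetical caller added for illustration.

```c
#include <time.h>

void collect_times(time_t *times, int samples) {
    for (int i = 0; i < samples; i++) {
        times[i] = time(NULL);
    }
}

/* Hypothetical caller: owns the storage, so there is nothing to free. */
int count_valid_samples(void) {
    time_t times[3];
    int valid = 0;
    collect_times(times, 3);
    for (int i = 0; i < 3; i++) {
        if (times[i] != (time_t)-1) {   /* time() reports failure as -1 */
            valid++;
        }
    }
    return valid;
}
```

Because the caller provides the memory, the caller also decides its lifetime.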
246+
247+
##### Case 2: list of structures with unknown size
248+
249+
```language=c
#include <stdlib.h>
#include <time.h>

int collect_times(time_t **times) {
    int samples = 3;
    time_t *t = malloc(sizeof(time_t) * samples);
    for (int i = 0; i < samples; i++) {
        t[i] = time(NULL);
    }
    *times = t;
    return samples;
}
```
260+
261+
Here the C function allocates the memory and returns the size:
262+
263+
```language=smalltalk
timesHolder := FFIOop newValueHolder.
samples := aFFILibrary collect_times: timesHolder.
times := FFIArray
    fromHandle: timesHolder value
    type: TimeT
    size: samples.
time1 := times at: 1.
time2 := times at: 2.
time3 := times at: 3.
"Notice that in this case it is your responsibility to release the allocated memory"
```
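
A C caller makes the release step explicit. This is a sketch under the assumption that the list was allocated with `malloc`, so `free` is the matching release; `collect_and_release` is a hypothetical caller, and the allocation check is added here for safety.

```c
#include <stdlib.h>
#include <time.h>

/* Allocates the list itself; returns the count, or 0 if allocation fails. */
int collect_times(time_t **times) {
    int samples = 3;
    time_t *t = malloc(sizeof(time_t) * samples);
    if (t == NULL) {
        *times = NULL;
        return 0;
    }
    for (int i = 0; i < samples; i++) {
        t[i] = time(NULL);
    }
    *times = t;
    return samples;
}

/* Hypothetical caller: receives the list, uses it, then releases it. */
int collect_and_release(void) {
    time_t *times = NULL;
    int samples = collect_times(&times);
    /* ... read times[0] .. times[samples - 1] here ... */
    free(times);   /* the callee allocated, the caller must release */
    return samples;
}
```

The extra level of indirection exists only so that the function can hand back a pointer it created.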
275+
276+
#### Other cases
277+
278+
Even though we focused on arrays, the same pattern applies to all cases involving pointer depth greater than one: you pass a `FFIOop` value holder and interpret the result accordingly.
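
As one more illustration of the pattern, here is a hypothetical C function with a `const char **` out-parameter. The name `get_greeting` and its behaviour are invented for this sketch.

```c
#include <string.h>

/* Hypothetical function: stores a pointer to a string in `*out`
   and returns the string's length. */
int get_greeting(const char **out) {
    static const char greeting[] = "hello";
    *out = greeting;
    return (int)strlen(greeting);
}
```

The function does not write the characters into the caller's memory; it writes a pointer to them. The caller therefore has to provide room for a pointer and then decide how to interpret what it points to.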
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.

index.md

Lines changed: 8 additions & 1 deletion
@@ -1 +1,8 @@
1-
<!inputFile|path=Chapters/0-introduction.md!><!inputFile|path=Chapters/1-callouts.md!><!inputFile|path=Chapters/2-marshalling.md!><!inputFile|path=Chapters/3-complextypes.md!><!inputFile|path=Chapters/4-objectsandderivedtypes.md!><!inputFile|path=Chapters/6-memorymanagement.md!><!inputFile|path=Chapters/7-threadedffi.md!>
1+
<!inputFile|path=Chapters/0-introduction.md!>
2+
<!inputFile|path=Chapters/1-callouts.md!>
3+
<!inputFile|path=Chapters/2-marshalling.md!>
4+
<!inputFile|path=Chapters/3-complextypes.md!>
5+
<!inputFile|path=Chapters/4-objectsandderivedtypes.md!>
6+
<!inputFile|path=Chapters/5-valueholders.md!>
7+
<!inputFile|path=Chapters/7-memorymanagement.md!>
8+
<!inputFile|path=Chapters/8-threadedffi.md!>
