Commit e63190d

Copilot and jpfeuffer committed
Add implementation summary and documentation
Co-authored-by: jpfeuffer <8102638+jpfeuffer@users.noreply.github.com>
1 parent ff85f31 commit e63190d

IMPLEMENTATION_SUMMARY.md

Lines changed: 217 additions & 0 deletions
@@ -0,0 +1,217 @@
# Array Wrapper Implementation Summary

## Overview

This implementation adds generic array wrapper classes with Python buffer protocol support to autowrap, enabling zero-copy integration between C++ `std::vector` and NumPy arrays.

## Key Design Decisions

### 1. **No C++ Wrapper Layer**
- **Decision**: Cython classes directly hold a `libcpp_vector` or a raw pointer
- **Rationale**: Simpler, no extra indirection; Cython can manage C++ types directly
- **Result**: Less code, easier to maintain

### 2. **Bool Member for Constness**
- **Decision**: Use a `readonly` bool flag instead of separate `ConstArrayView` classes
- **Rationale**: Reduces code duplication, simpler API
- **Implementation**: `ArrayView` has a `readonly` member that controls whether the buffer protocol exposes the data as writable

### 3. **Factory Functions for Views**
- **Decision**: Create views using factory functions (`_create_view_*`) instead of `__cinit__`
- **Rationale**: `__cinit__` receives its arguments as Python objects, so generated code cannot pass a raw C pointer to the constructor
- **Result**: Factory functions can be called at the C level from generated wrappers

### 4. **Owning Wrappers for Value Returns**
- **Decision**: Use `ArrayWrapper` + `swap()` for value returns instead of memcpy
- **Rationale**: The returned vector is already a copy, so ownership can simply be transferred
- **Benefit**: Zero extra copies, efficient memory transfer

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│  Python/NumPy Layer                                         │
│  - numpy.ndarray                                            │
│  - Uses buffer protocol                                     │
└──────────────────────┬──────────────────────────────────────┘
                       │ buffer protocol
┌──────────────────────┴──────────────────────────────────────┐
│  Cython Wrapper Layer (ArrayWrappers.pyx)                   │
│                                                             │
│  ┌────────────────────────┐    ┌─────────────────────────┐  │
│  │ ArrayWrapper[T]        │    │ ArrayView[T]            │  │
│  │ - libcpp_vector[T] vec │    │ - T* ptr                │  │
│  │ - Owns data            │    │ - size_t _size          │  │
│  │                        │    │ - object owner          │  │
│  │                        │    │ - bool readonly         │  │
│  │                        │    │ - Does NOT own data     │  │
│  └────────────────────────┘    └─────────────────────────┘  │
│                                                             │
│  Factory functions: _create_view_*()                        │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────┴──────────────────────────────────────┐
│  C++ Layer                                                  │
│  - std::vector<T>                                           │
│  - Raw memory                                               │
└─────────────────────────────────────────────────────────────┘
```

## Type Coverage

The following numeric element types are supported:
- **Floating point**: `float`, `double`
- **Signed integers**: `int8_t`, `int16_t`, `int32_t`, `int64_t`
- **Unsigned integers**: `uint8_t`, `uint16_t`, `uint32_t`, `uint64_t`

Each type has:
- An owning wrapper class (e.g., `ArrayWrapperDouble`)
- A view class (e.g., `ArrayViewDouble`)
- A factory function (e.g., `_create_view_double()`)
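
The element types above each need a buffer-protocol format code (the summary later mentions `'f'`, `'d'`, `'i'`). The mapping below is a minimal sketch that assumes the standard `struct` module codes; the table name `FORMAT_CODES` and the helper `itemsize` are hypothetical, and the actual implementation may choose different codes (for example, a platform-dependent code for 64-bit integers).

```python
import struct

# Assumed mapping from C element type to struct-style format code.
# These are the standard codes from Python's struct module, not taken
# from ArrayWrappers.pyx.
FORMAT_CODES = {
    "float": "f",
    "double": "d",
    "int8_t": "b",
    "int16_t": "h",
    "int32_t": "i",
    "int64_t": "q",
    "uint8_t": "B",
    "uint16_t": "H",
    "uint32_t": "I",
    "uint64_t": "Q",
}


def itemsize(ctype: str) -> int:
    """Return the element size in bytes for a supported C type name."""
    # "=" selects standard sizes, so the result does not depend on the platform.
    return struct.calcsize("=" + FORMAT_CODES[ctype])
```

The item size is what a one-dimensional contiguous buffer reports as its single stride, so a table like this is enough to fill in both `format` and `strides` for each element type.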

## Integration with ConversionProvider

The `StdVectorAsNumpyConverter` in `ConversionProvider.py` uses these wrappers:

### For Reference Returns (`const T&` or `T&`)
```cython
# Zero-copy view
cdef double* _ptr = vec.data()
cdef size_t _size = vec.size()
# readonly=True for `const T&`, readonly=False for `T&`
cdef ArrayViewDouble view = _create_view_double(_ptr, _size, owner=self, readonly=True)
cdef object arr = numpy.asarray(view)
# Caveat: `ndarray.base` is read-only from Python-level code, so this
# assignment may need a C-API call such as PyArray_SetBaseObject instead.
arr.base = view  # Keep view (and owner) alive
```

### For Value Returns (`T`)
```cython
# Owning wrapper (swap, no extra copy)
cdef ArrayWrapperDouble wrapper = ArrayWrapperDouble()
wrapper.set_data(vec)  # Swaps data, O(1)
cdef object arr = numpy.asarray(wrapper)
arr.base = wrapper  # Keep wrapper alive
```

## Memory Management

### Owning Wrappers
- **Lifetime**: Wrapper owns the data
- **Safety**: Must keep wrapper alive while the NumPy array is in use (via `.base`)
- **Copies**: One copy when C++ returns by value, then swap (no extra copy)

### Views
- **Lifetime**: View does NOT own data, relies on owner
- **Safety**: Must keep both view AND owner alive (`view.owner` reference + `arr.base`)
- **Copies**: Zero copies, direct access to C++ memory

### Lifetime Chain
```
numpy array --> .base --> ArrayView --> .owner --> C++ object
                          (no data)                (has data)
```
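
The lifetime chain can be illustrated in plain Python. This is a sketch under stated assumptions, not the Cython implementation: `Owner` and `View` are hypothetical stand-ins for the C++ object and `ArrayView`, an `array.array` plays the role of the C++ memory, and a `memoryview` plays the role of the exported buffer.

```python
import array
import gc
import weakref


class Owner:
    """Hypothetical stand-in for the wrapped C++ object that holds the data."""

    def __init__(self, values):
        self.data = array.array("d", values)


class View:
    """Hypothetical non-owning view: shares the owner's memory and keeps it alive."""

    def __init__(self, owner, readonly):
        self.owner = owner  # strong reference, mirrors the `owner` member
        buf = memoryview(owner.data)
        self.buffer = buf.toreadonly() if readonly else buf


owner = Owner([1.0, 2.0, 3.0])
owner_ref = weakref.ref(owner)
view = View(owner, readonly=True)

del owner  # drop the caller's own reference
gc.collect()
still_alive = owner_ref() is not None  # the view still references the owner
first_value = view.buffer[0]

try:
    view.buffer[0] = 9.0  # a read-only buffer rejects writes
    write_rejected = False
except TypeError:
    write_rejected = True

del view  # dropping the view drops the last reference to the owner
gc.collect()
owner_freed = owner_ref() is None
```

The owner is released only after the view is gone, which is the property the `.owner` reference is meant to guarantee for the real views.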

## Buffer Protocol Implementation

Both `ArrayWrapper` and `ArrayView` implement:

```cython
def __getbuffer__(self, Py_buffer *buffer, int flags):
    # Set up buffer with:
    # - buf: pointer to data
    # - len: total bytes
    # - shape: [size]
    # - strides: [itemsize]
    # - format: 'f', 'd', 'i', etc.
    # - readonly: 0 or 1
    ...  # Field assignments omitted in this summary

def __releasebuffer__(self, Py_buffer *buffer):
    pass  # No cleanup needed
```
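
The fields listed above are the ones any buffer consumer sees. As an illustration only, the sketch below inspects them through a `memoryview` over an `array.array` of doubles, which is a stand-in exporter and not one of the wrapper classes; the resize guard it demonstrates is a property of `array.array`, not a claim about the wrappers.

```python
import array

values = array.array("d", [1.0, 2.0, 3.0])

# Entering the block acquires the buffer; leaving it releases the buffer.
with memoryview(values) as buf:
    fields = {
        "format": buf.format,      # element type code
        "itemsize": buf.itemsize,  # bytes per element
        "ndim": buf.ndim,          # number of dimensions
        "shape": buf.shape,        # (size,)
        "strides": buf.strides,    # (itemsize,) for contiguous 1-D data
        "nbytes": buf.nbytes,      # total bytes
        "readonly": buf.readonly,  # False for a writable export
    }
    try:
        values.append(4.0)  # resizing would invalidate the exported pointer
        resize_blocked = False
    except BufferError:
        resize_blocked = True

values.append(4.0)  # allowed again once the buffer has been released
```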

## Usage Patterns Generated by autowrap

### Pattern 1: Value Return
```cython
def get_data(self):
    _r = self.inst.get().getData()  # Returns by value
    # Use owning wrapper
    cdef ArrayWrapperDouble _wrapper_py_result = ArrayWrapperDouble()
    _wrapper_py_result.set_data(_r)
    cdef object py_result = numpy.asarray(_wrapper_py_result)
    py_result.base = _wrapper_py_result
    return py_result
```

### Pattern 2: Const Reference Return
```cython
def get_const_ref(self):
    _r = self.inst.get().getConstRef()  # Returns const &
    # Use readonly view
    cdef double* _ptr_py_result = _r.data()
    cdef size_t _size_py_result = _r.size()
    cdef ArrayViewDouble _view_py_result = _create_view_double(
        _ptr_py_result, _size_py_result, owner=self, readonly=True
    )
    cdef object py_result = numpy.asarray(_view_py_result)
    py_result.base = _view_py_result
    return py_result
```

### Pattern 3: Non-Const Reference Return
```cython
def get_mutable_ref(self):
    _r = self.inst.get().getMutableRef()  # Returns &
    # Use writable view
    cdef double* _ptr_py_result = _r.data()
    cdef size_t _size_py_result = _r.size()
    cdef ArrayViewDouble _view_py_result = _create_view_double(
        _ptr_py_result, _size_py_result, owner=self, readonly=False
    )
    cdef object py_result = numpy.asarray(_view_py_result)
    py_result.base = _view_py_result
    return py_result
```

## Files Modified/Created

### Created
- `autowrap/data_files/autowrap/ArrayWrappers.pyx` - Main implementation (1300+ lines)
- `autowrap/data_files/autowrap/ArrayWrappers.pxd` - Cython declarations
- `autowrap/data_files/autowrap/README_ARRAY_WRAPPERS.md` - Documentation
- `tests/test_array_wrappers.py` - Test suite
- `tests/test_files/array_wrappers/` - Test examples

### Modified
- `autowrap/ConversionProvider.py` - Updated `StdVectorAsNumpyConverter`
- `autowrap/CodeGenerator.py` - Added ArrayWrapper imports when numpy enabled

### Removed
- `ArrayWrapper.hpp` - Not needed (Cython handles C++ directly)
- `ArrayWrapper.pxd` - Not needed (functionality in ArrayWrappers.pxd)

## Performance Characteristics

| Operation | Old (memcpy) | New (wrapper) | New (view) |
|-----------|--------------|---------------|------------|
| Value return | By-value return, then 1 memcpy copy | By-value return, then O(1) swap (no extra copy) | N/A |
| Const ref return | 1 copy | N/A | 0 copies |
| Non-const ref return | 1 copy | N/A | 0 copies |
| Memory safety | Safe | Safe (with .base) | Safe (with .base) |
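
The difference between the copy and view columns is observable from Python. The sketch below uses only the standard library as an analogy: copying an `array.array` stands in for the old memcpy path and a `memoryview` stands in for the view path. It does not measure the actual wrappers.

```python
import array

source = array.array("d", [1.0, 2.0, 3.0])

copied = array.array("d", source)  # copy: separate storage
viewed = memoryview(source)        # view: shares the source's storage

source[0] = 42.0  # mutate the original after both were created

copy_sees_change = copied[0] == 42.0  # a copy is a snapshot
view_sees_change = viewed[0] == 42.0  # a view tracks the source

viewed[1] = 7.0  # writes through a writable view land in the source
```

This is also why a writable view needs the lifetime guarantees described under Memory Management: the view and the source are the same memory.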

## Key Benefits

1. **Zero-copy for references**: Views provide direct access to C++ memory
2. **Efficient value returns**: Swap instead of second copy
3. **Type safety**: Dedicated wrapper and view classes for every supported numeric type
4. **Memory safety**: Proper lifetime management via Python references
5. **Simple implementation**: No C++ layer, all in Cython
6. **Flexible**: Support for both readonly and writable buffers

## Future Enhancements

Potential improvements:
- Multi-dimensional array support (2D, 3D, etc.)
- Strided array support for non-contiguous data
- Support for more types (complex numbers, bool)
- Integration with other array protocols (e.g., `__array_interface__`)
- Optional bounds checking for debug builds
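
To give a sense of what multi-dimensional support would add to the exported buffer, the sketch below reshapes a flat array of doubles into a 2x3 view with `memoryview.cast`. It is purely illustrative and not part of the implementation: it only shows the `shape` and `strides` values a row-major 2D export would carry.

```python
import array

flat = array.array("d", range(6))  # 0.0 .. 5.0 in one contiguous block

# cast() can only reshape to or from a byte view, so go through "B" first.
grid = memoryview(flat).cast("B").cast("d", shape=[2, 3])

shape = grid.shape      # rows, columns
strides = grid.strides  # bytes to step one row, bytes to step one column
last = grid[1, 2]       # element in the last row and last column
```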
