Skip to content

Commit 169db1b

Browse files
authored
In vlen, define and use const HEADER_LENGTH (#723)
* In `vlen`, define and use `const` `HEADER_LENGTH` Previously `HEADER_LENGTH` was defined in `vlen`, but not as `const`. So it could concievably be changed. While this didn't happen in practice, it does mean the compiler needs to generate code to check and use this value throughout. However `HEADER_LENGTH` is intended to be `const`. So define it as such. That way the compiler can generate more efficient code when using it. Further define it in terms of the type whose size it references (namely `uint32_t`). Also make sure that type is `cimport`ed for referencing. Finally consistently use `HEADER_LENGTH` when it applies throughout `vlen`. Previously `HEADER_LENGTH` was sometimes used and in other cases the value `4` was used. So consistently use `HEADER_LENGTH` throughout for improved code clarity. Now that `HEADER_LENGTH` is marked `const` this is equally efficient.
1 parent 96086c3 commit 169db1b

File tree

2 files changed

+22
-15
lines changed

2 files changed

+22
-15
lines changed

docs/release.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -23,6 +23,9 @@ Enhancements
2323
Improvements
2424
~~~~~~~~~~~~
2525

26+
* In ``vlen``, define and use ``const`` ``HEADER_LENGTH``.
27+
By :user:`John Kirkham <jakirkham>`, :issue:`723`
28+
2629
Fixes
2730
~~~~~
2831

numcodecs/vlen.pyx

Lines changed: 19 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@
77

88
cimport cython
99

10-
from libc.stdint cimport uint8_t
10+
from libc.stdint cimport uint8_t, uint32_t
1111
from libc.string cimport memcpy
1212

1313
from cpython.buffer cimport PyBuffer_IsContiguous
@@ -39,8 +39,12 @@ from .abc import Codec
3939
from .compat import ensure_contiguous_ndarray
4040

4141

42-
# 4 bytes to store number of items
43-
cdef Py_ssize_t HEADER_LENGTH = 4
42+
# Define header size used to store number of items that follow.
43+
cdef extern from *:
44+
"""
45+
const Py_ssize_t HEADER_LENGTH = sizeof(uint32_t);
46+
"""
47+
const Py_ssize_t HEADER_LENGTH
4448

4549

4650
def check_out_param(out, n_items):
@@ -116,7 +120,7 @@ class VLenUTF8(Codec):
116120
b = PyUnicode_AsUTF8String(u)
117121
l = PyBytes_GET_SIZE(b)
118122
encoded_values[i] = b
119-
data_length += l + 4 # 4 bytes to store item length
123+
data_length += l + HEADER_LENGTH
120124
encoded_lengths[i] = l
121125

122126
# setup output
@@ -132,7 +136,7 @@ class VLenUTF8(Codec):
132136
for i in range(n_items):
133137
l = encoded_lengths[i]
134138
store_le32(<uint8_t*>data, l)
135-
data += 4
139+
data += HEADER_LENGTH
136140
encv = PyBytes_AS_STRING(encoded_values[i])
137141
memcpy(data, encv, l)
138142
data += l
@@ -178,10 +182,10 @@ class VLenUTF8(Codec):
178182
# https://github.com/cython/cython/issues/1608
179183
data += HEADER_LENGTH
180184
for i in range(n_items):
181-
if data + 4 > data_end:
185+
if data + HEADER_LENGTH > data_end:
182186
raise ValueError('corrupt buffer, data seem truncated')
183187
l = load_le32(<uint8_t*>data)
184-
data += 4
188+
data += HEADER_LENGTH
185189
if data + l > data_end:
186190
raise ValueError('corrupt buffer, data seem truncated')
187191
out[i] = PyUnicode_FromStringAndSize(data, l)
@@ -247,7 +251,7 @@ class VLenBytes(Codec):
247251
elif not PyBytes_Check(b):
248252
raise TypeError('expected byte string, found %r' % b)
249253
l = PyBytes_GET_SIZE(b)
250-
data_length += l + 4 # 4 bytes to store item length
254+
data_length += l + HEADER_LENGTH
251255
lengths[i] = l
252256

253257
# setup output
@@ -263,7 +267,7 @@ class VLenBytes(Codec):
263267
for i in range(n_items):
264268
l = lengths[i]
265269
store_le32(<uint8_t*>data, l)
266-
data += 4
270+
data += HEADER_LENGTH
267271
encv = PyBytes_AS_STRING(values[i])
268272
memcpy(data, encv, l)
269273
data += l
@@ -309,10 +313,10 @@ class VLenBytes(Codec):
309313
# https://github.com/cython/cython/issues/1608
310314
data += HEADER_LENGTH
311315
for i in range(n_items):
312-
if data + 4 > data_end:
316+
if data + HEADER_LENGTH > data_end:
313317
raise ValueError('corrupt buffer, data seem truncated')
314318
l = load_le32(<uint8_t*>data)
315-
data += 4
319+
data += HEADER_LENGTH
316320
if data + l > data_end:
317321
raise ValueError('corrupt buffer, data seem truncated')
318322
out[i] = PyBytes_FromStringAndSize(data, l)
@@ -399,7 +403,7 @@ class VLenArray(Codec):
399403
raise ValueError('only 1-dimensional arrays are supported')
400404
l = v.nbytes
401405
normed_values[i] = v
402-
data_length += l + 4 # 4 bytes to store item length
406+
data_length += l + HEADER_LENGTH
403407
lengths[i] = l
404408

405409
# setup output
@@ -415,7 +419,7 @@ class VLenArray(Codec):
415419
for i in range(n_items):
416420
l = lengths[i]
417421
store_le32(<uint8_t*>data, l)
418-
data += 4
422+
data += HEADER_LENGTH
419423

420424
value_mv = ensure_continguous_memoryview(normed_values[i])
421425
value_pb = PyMemoryView_GET_BUFFER(value_mv)
@@ -468,10 +472,10 @@ class VLenArray(Codec):
468472
# https://github.com/cython/cython/issues/1608
469473
data += HEADER_LENGTH
470474
for i in range(n_items):
471-
if data + 4 > data_end:
475+
if data + HEADER_LENGTH > data_end:
472476
raise ValueError('corrupt buffer, data seem truncated')
473477
l = load_le32(<uint8_t*>data)
474-
data += 4
478+
data += HEADER_LENGTH
475479
if data + l > data_end:
476480
raise ValueError('corrupt buffer, data seem truncated')
477481

0 commit comments

Comments
 (0)