Commit 52c93c3

jakirkham authored and alimanfoo committed
Use (new) buffer protocol in Pickle.decode on Python 2 (#150)
* Use (new) buffer protocol in `Pickle.decode`

  On Python 3, `pickle.loads` accepts anything that conforms to the (new) buffer protocol, so passing it an `ndarray` has never been an issue. On Python 2, however, `pickle.loads` requires a `bytes` object specifically and cannot take other types that implement the (new) buffer protocol, so we have been coercing everything to `bytes` there.

  It turns out that `cStringIO`'s `StringIO` on Python 2 does support the buffer protocol, so a `StringIO` object can be created on Python 2 without copying the data. While this still cannot be used with `pickle.loads`, it can be used with `pickle.load`, which special-cases reading from `StringIO` by leveraging its read function. That amounts to sharing a pointer between `StringIO` and `pickle.load`, which gives a no-copy unpickler for Python 2.

  ref: http://www.hydrogen18.com/blog/unpickling-buffers.html
  ref: https://github.com/python/cpython/blob/2.7/Modules/cStringIO.c#L716
  ref: https://github.com/python/cpython/blob/2.7/Modules/cStringIO.c#L681
  ref: https://github.com/python/cpython/blob/2.7/Modules/cPickle.c#L614
  ref: https://github.com/python/cpython/blob/2.7/Modules/cStringIO.c#L160

* Link this PR to `Pickle.decode` release note
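The Python 3 behaviour described in the commit message can be checked with the standard library alone. The following is a minimal sketch, not part of the commit: the sample payload and the variable names are made up for illustration, and `bytearray`, `memoryview` and `array.array` stand in for an `ndarray` as buffer-protocol objects.

```python
import pickle
from array import array

# Hypothetical sample payload; any picklable object would do.
original = {"shape": (2, 3), "dtype": "f8"}
payload = pickle.dumps(original, protocol=2)

# Several objects exposing the same pickled bytes via the buffer protocol.
as_bytearray = bytearray(payload)
as_memoryview = memoryview(as_bytearray)  # a view onto as_bytearray, no copy
as_array = array("B", payload)

# On Python 3, pickle.loads accepts each of these bytes-like objects directly.
results = [
    pickle.loads(obj)
    for obj in (payload, as_bytearray, as_memoryview, as_array)
]
```

Every entry in `results` should compare equal to `original`. On Python 2 the non-`bytes` inputs are exactly what the commit message says `pickle.loads` cannot handle, which is why the change routes them through `pickle.load` instead.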
1 parent 0c264a1 commit 52c93c3

2 files changed, +7 -5 lines changed

docs/release.rst

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ Release notes
 -----

 * Handle (new) buffer protocol conforming types in ``Pickle.decode``.
-  By :user:`John Kirkham <jakirkham>`, :issue:`143`.
+  By :user:`John Kirkham <jakirkham>`, :issue:`143`, :issue:`150`.

 * Fix other ``VLen*`` encode() methods to return numpy arrays as well.
   By :user:`John Kirkham <jakirkham>`, :issue:`144`.

numcodecs/pickles.py

Lines changed: 6 additions & 4 deletions
@@ -6,11 +6,12 @@


 from .abc import Codec
-from .compat import PY2, ensure_bytes, ensure_contiguous_ndarray
+from .compat import PY2, ensure_contiguous_ndarray


 if PY2:  # pragma: py3 no cover
     import cPickle as pickle
+    from cStringIO import StringIO
 else:  # pragma: py2 no cover
     import pickle

@@ -48,12 +49,13 @@ def encode(self, buf):
         return pickle.dumps(buf, protocol=self.protocol)

     def decode(self, buf, out=None):
+        buf = ensure_contiguous_ndarray(buf)
+
         if PY2:  # pragma: py3 no cover
-            buf = ensure_bytes(buf)
+            dec = pickle.load(StringIO(buf))
         else:  # pragma: py2 no cover
-            buf = ensure_contiguous_ndarray(buf)
+            dec = pickle.loads(buf)

-        dec = pickle.loads(buf)
         if out is not None:
             np.copyto(out, dec)
             return out
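The control flow of the new `decode` can be tried outside the codec class. The sketch below is a simplified stand-in, not the numcodecs implementation: it drops `ensure_contiguous_ndarray`, replaces `np.copyto(out, dec)` with slice assignment so that it needs only the standard library, and assumes the method falls through to returning `dec` when `out` is not given (that tail is outside the hunk shown). The free function `decode` and the sample data are hypothetical.

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    # Python 2 only: the C pickler and C StringIO named in the commit.
    import cPickle as pickle
    from cStringIO import StringIO
else:
    import pickle


def decode(buf, out=None):
    """Unpickle `buf` and optionally copy the result into `out`."""
    if PY2:
        # Read through a file-like object wrapping the buffer.
        dec = pickle.load(StringIO(buf))
    else:
        # Python 3 accepts buffer-protocol objects directly.
        dec = pickle.loads(buf)
    if out is not None:
        out[:] = dec  # stand-in for np.copyto(out, dec)
        return out
    return dec  # assumed fall-through, not visible in the diff


# Hypothetical usage: decode from a memoryview rather than from bytes.
encoded = bytearray(pickle.dumps([1, 2, 3], protocol=2))
target = [0, 0, 0]
result = decode(memoryview(encoded), out=target)
```

Here `result` is the same list object as `target`, now holding the decoded values, which mirrors how the real method fills a caller-supplied array and returns it.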
