"The newest and coolest way to store data in python-blosc2 is through a SChunk (super-chunk) object. Here the data is split into chunks of the same size. So in the past, the only way of working with it was chunk by chunk (see tutorials-basics.ipynb). But now, python-blosc2 can retrieve, update or append data all at once (i.e. avoiding doing it chunk by chunk). To see how this works, let's first create our SChunk."
" schunk.decompress_chunk(i, out[200 * 1000 * i : 200 * 1000 * (i + 1)])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"But instead of the code above, we can simply use the `__getitem__` or the `get_slice` methods. Let's begin with `__getitem__`:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"b'\\x00\\x00\\x00\\x00'\n"
]
}
],
"source": [
"out_slice = schunk[:]\n",
"print(out_slice[:4])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As you can see, the data is returned as a bytestring. If we want to better visualize the data, we will use `get_slice`. You can pass any Python object (supporting the Buffer Protocol) as the `out` param to fill it with the data."
"So now, we are able to get or set data all at once. But what if we would like to add data? Well, you can still do it with `__setitem__`. Indeed, this method can update and append data at the same time. To do so, `stop` will be the new SChunk nitems:"
"In this case we set the `copy` param to `True`. If you do not want to copy the buffer,\n",
218
+
"be mindful that you will have to keep its reference until you do not\n",
219
+
"want the SChunk anymore.\n",
220
+
"\n",
221
+
"## Compressing NumPy arrays\n",
222
+
"\n",
223
+
"If the object you want to get as a compressed buffer is a NumPy array, you can use the newer and faster functions to store it in-memory or on-disk.\n",
224
+
"\n",
225
+
"### In-memory\n",
226
+
"\n",
227
+
"To store it in-memory you can use `pack_array2`. In comparison with its former version, it is faster (see `pack_compress.py` bench) and does not have the 2 GB size limitation."
"Now python-blosc2 has an easy way of creating, getting, setting, deleting and expanding data in a SChunk. Moreover, you can get a contiguous compressed representation (aka [cframe](https://github.com/Blosc/c-blosc2/blob/main/README_CFRAME_FORMAT.rst)) of it and create it again latter. And you can do the same with NumPy arrays faster than with the former functions.\n"