|
59 | 59 | "metadata": {}, |
60 | 60 | "outputs": [ |
61 | 61 | { |
62 | | - "name": "stdout", |
63 | | - "output_type": "stream", |
64 | | - "text": [ |
65 | | - "b'\\x00\\x00\\x00\\x00'\n" |
66 | | - ] |
| 62 | + "data": { |
| 63 | + "text/plain": "bytes" |
| 64 | + }, |
| 65 | + "execution_count": 3, |
| 66 | + "metadata": {}, |
| 67 | + "output_type": "execute_result" |
67 | 68 | } |
68 | 69 | ], |
69 | 70 | "source": [ |
70 | 71 | "out_slice = schunk[:]\n", |
71 | | - "print(out_slice[:4])" |
| 72 | + "type(out_slice)" |
72 | 73 | ] |
73 | 74 | }, |
74 | 75 | { |
75 | 76 | "cell_type": "markdown", |
76 | 77 | "metadata": {}, |
77 | 78 | "source": [ |
78 | | - "As you can see, the data is returned as a bytestring. If we want to better visualize the data, we will use `get_slice`. You can pass any Python object (supporting the Buffer Protocol) as the `out` param to fill it with the data." |
| 79 | + "As you can see, the data is returned as a bytes object. To get a more meaningful container instead, we can use `get_slice`, passing any Python object that supports the Buffer Protocol as the `out` param to fill it with the data. In this case we will use a NumPy array as the container." |
79 | 80 | ] |
80 | 81 | }, |
81 | 82 | { |
|
102 | 103 | "cell_type": "markdown", |
103 | 104 | "metadata": {}, |
104 | 105 | "source": [ |
105 | | - "That looks better!\n", |
| 106 | + "That's the expected data indeed!\n", |
106 | 107 | "\n", |
107 | 108 | "## Setting data in a SChunk\n", |
108 | 109 | "\n", |
109 | | - "We can also set the data of an area to any python object supporting the Buffer Protocol. Let's see a quick example:" |
| 110 | + "We can also set the data of a `SChunk` area from any Python object supporting the Buffer Protocol. Let's see a quick example:" |
110 | 111 | ] |
111 | 112 | }, |
112 | 113 | { |
|
125 | 126 | "cell_type": "markdown", |
126 | 127 | "metadata": {}, |
127 | 128 | "source": [ |
128 | | - "So now, we are able to get or set data all at once. But what if we would like to add data? Well, you can still do it with `__setitem__`. Indeed, this method can update and append data at the same time. To do so, `stop` will be the new SChunk nitems:" |
| 129 | + "We have seen how to get or set data. But what if we would like to add data? Well, you can still do that with `__setitem__`." |
129 | 130 | ] |
130 | 131 | }, |
131 | 132 | { |
|
142 | 143 | "schunk[start:new_nitems] = new_value" |
143 | 144 | ] |
144 | 145 | }, |
| 146 | + { |
| 147 | + "cell_type": "markdown", |
| 148 | + "source": [ |
| 149 | + "Here, `start` is less than the number of elements in the `SChunk` and `new_nitems` is larger than it; that means that `__setitem__` can update and append data at the same time, and you don't have to worry about whether you are exceeding the limits of the `SChunk`." |
| 150 | + ], |
| 151 | + "metadata": { |
| 152 | + "collapsed": false |
| 153 | + } |
| 154 | + }, |
145 | 155 | { |
146 | 156 | "cell_type": "markdown", |
147 | 157 | "metadata": {}, |
|
155 | 165 | "cell_type": "code", |
156 | 166 | "execution_count": 7, |
157 | 167 | "metadata": {}, |
158 | | - "outputs": [ |
159 | | - { |
160 | | - "data": { |
161 | | - "text/plain": "b'\\x9e\\xa8b2f'" |
162 | | - }, |
163 | | - "execution_count": 7, |
164 | | - "metadata": {}, |
165 | | - "output_type": "execute_result" |
166 | | - } |
167 | | - ], |
| 168 | + "outputs": [], |
168 | 169 | "source": [ |
169 | | - "buf = schunk.to_cframe()\n", |
170 | | - "buf[:5]" |
| 170 | + "buf = schunk.to_cframe()" |
171 | 171 | ] |
172 | 172 | }, |
173 | 173 | { |
|
190 | 190 | "cell_type": "markdown", |
191 | 191 | "metadata": {}, |
192 | 192 | "source": [ |
193 | | - "In this case we set the `copy` param to `True`. If you do not want to copy the buffer,\n", |
194 | | - "be mindful that you will have to keep a reference to it until you do not\n", |
195 | | - "want the SChunk anymore.\n", |
| 193 | + "In this case we set the `copy` param to `True`. If you do not want to copy the buffer, be mindful that you will have to keep a reference to it for as long as you need the SChunk.\n", |
196 | 194 | "\n", |
197 | 195 | "\n", |
198 | 196 | "## Serializing NumPy arrays\n", |
|
201 | 199 | "\n", |
202 | 200 | "### In-memory\n", |
203 | 201 | "\n", |
204 | | - "For obtaining an in-memory representation, you can use `pack_array2`. In comparison with its former version (`pack_array`), it is way faster and does not have the 2 GB size limitation:" |
| 202 | + "For obtaining an in-memory representation, you can use `pack_tensor`. In comparison with its former version (`pack_array`), it is way faster and does not have the 2 GB size limitation:" |
205 | 203 | ] |
206 | 204 | }, |
207 | 205 | { |
|
210 | 208 | "metadata": {}, |
211 | 209 | "outputs": [], |
212 | 210 | "source": [ |
213 | | - "np_array = np.arange(2**30 + 1, dtype=np.int32) # 2 GB (+4) array\n", |
| 211 | + "np_array = np.arange(2**30, dtype=np.int32) # 4 GB array\n", |
214 | 212 | "\n", |
215 | | - "packed_arr2 = blosc2.pack_array2(np_array)\n", |
216 | | - "unpacked_arr2 = blosc2.unpack_array2(packed_arr2)" |
| 213 | + "packed_arr2 = blosc2.pack_tensor(np_array)\n", |
| 214 | + "unpacked_arr2 = blosc2.unpack_tensor(packed_arr2)" |
217 | 215 | ] |
218 | 216 | }, |
219 | 217 | { |
|
222 | 220 | "source": [ |
223 | 221 | "### On-disk\n", |
224 | 222 | "\n", |
225 | | - "To store the serialized buffer on-disk you want to use `save_array` and `load_array`:" |
| 223 | + "To store the serialized buffer on-disk you want to use `save_tensor` and `load_tensor`:" |
226 | 224 | ] |
227 | 225 | }, |
228 | 226 | { |
229 | 227 | "cell_type": "code", |
230 | 228 | "execution_count": 10, |
231 | 229 | "metadata": {}, |
232 | | - "outputs": [], |
| 230 | + "outputs": [ |
| 231 | + { |
| 232 | + "data": { |
| 233 | + "text/plain": "True" |
| 234 | + }, |
| 235 | + "execution_count": 10, |
| 236 | + "metadata": {}, |
| 237 | + "output_type": "execute_result" |
| 238 | + } |
| 239 | + ], |
233 | 240 | "source": [ |
234 | | - "blosc2.save_array(np_array, urlpath=\"ondisk_array.b2frame\", mode=\"w\")\n", |
235 | | - "np_array2 = blosc2.load_array(\"ondisk_array.b2frame\")\n", |
| 241 | + "blosc2.save_tensor(np_array, urlpath=\"ondisk_array.b2frame\", mode=\"w\")\n", |
| 242 | + "np_array2 = blosc2.load_tensor(\"ondisk_array.b2frame\")\n", |
236 | 243 | "np.array_equal(np_array, np_array2)" |
237 | 244 | ] |
238 | 245 | }, |
|