examples/slicing_and_beyond.ipynb: 22 additions & 62 deletions
@@ -6,7 +6,7 @@
    "source": [
     "# Slicing chunks and beyond\n",
     "\n",
-    "The newest and coolest way to store data in python-blosc2 is through a SChunk (super-chunk) object. Here the data is split into chunks of the same size. So in the past, the only way of working with it was chunk by chunk (see tutorials-basics.ipynb). But now, python-blosc2 can retrieve, update or append data all at once (i.e. avoiding doing it chunk by chunk). To see how this works, let's first create our SChunk."
+    "The newest and coolest way to store data in python-blosc2 is through a `SChunk` (super-chunk) object. Here the data is split into chunks of the same size. In the past, the only way of working with it was chunk by chunk (see tutorials-basics.ipynb), but now, python-blosc2 can retrieve, update or append data at item level (i.e. avoiding doing it chunk by chunk). To see how this works, let's first create our SChunk."
    ]
   },
   {
@@ -56,11 +56,7 @@
   {
    "cell_type": "code",
    "execution_count": 3,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
    "outputs": [
    {
     "name": "stdout",
@@ -85,11 +81,7 @@
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
    "outputs": [
    {
     "name": "stdout",
@@ -120,11 +112,7 @@
   {
    "cell_type": "code",
    "execution_count": 5,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "start = 34\n",
@@ -143,11 +131,7 @@
   {
    "cell_type": "code",
    "execution_count": 6,
-   "metadata": {
-    "pycharm": {
-     "name": "#%%\n"
-    }
-   },
+   "metadata": {},
    "outputs": [],
    "source": [
     "schunk_nelems = 1000 * 200 * nchunks\n",
@@ -162,9 +146,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Getting a SChunk from/as a contiguous buffer\n",
+    "## Building a SChunk from/as a contiguous buffer\n",
     "\n",
-    "Furthermore, you can pass from a SChunk to a contiguousbuffer and viceversa. Let's get that buffer:"
+    "Furthermore, you can convert a SChunk to a contiguous, serialized buffer and vice-versa. Let's get that buffer (aka `cframe`) first:"
@@ … @@
     "In this case we set the `copy` param to `True`. If you do not want to copy the buffer,\n",
-    "be mindful that you will have to keep its reference until you do not\n",
+    "be mindful that you will have to keep a reference to it until you do not\n",
     "want the SChunk anymore.\n",
     "\n",
-    "## Compressing NumPy arrays\n",
+    "## Serializing NumPy arrays\n",
     "\n",
-    "If the object you want to get as a compressed buffer is a NumPy array, you can use the newer and faster functions to store it in-memory or on-disk.\n",
+    "If what you want is to create a serialized, compressed version of a NumPy array, you can use the newer (and faster) functions to store it either in-memory or on-disk. The specification of such a contiguous compressed representation, aka **cframe**, can be seen at: https://github.com/Blosc/c-blosc2/blob/main/README_CFRAME_FORMAT.rst.\n",
     "\n",
     "### In-memory\n",
     "\n",
-    "To store it in-memory you can use `pack_array2`. In comparison with its former version, it is faster (see `pack_compress.py` bench) and does not have the 2 GB size limitation."
+    "For obtaining an in-memory representation, you can use `pack_array2`. In comparison with its former version (`pack_array`), it is way faster and does not have the 2 GB size limitation:"
@@ … @@
-    "Now python-blosc2 has an easyway of creating, getting, setting, deleting and expanding data in a SChunk. Moreover, you can get a contiguous compressed representation (aka [cframe](https://github.com/Blosc/c-blosc2/blob/main/README_CFRAME_FORMAT.rst)) of it and create it again latter. And you can do the same with NumPy arrays faster than with the former functions.\n"
+    "Now python-blosc2 offers an easy, yet fast way of creating, getting, setting and expanding data via the `SChunk` class. Moreover, you can get a contiguous compressed representation (aka [cframe](https://github.com/Blosc/c-blosc2/blob/main/README_CFRAME_FORMAT.rst)) of it and re-create it later with no sweat.\n"