
Commit 1442d61

remove dask and unneeded variables (#49)
* remove dask and unneeded variables
* clear nb output
1 parent 8cdb6af commit 1442d61

File tree

1 file changed (+28, -16 lines)

1 file changed

+28
-16
lines changed

book/itslive/nbs/4_exploratory_data_analysis_single.ipynb

Lines changed: 28 additions & 16 deletions
@@ -143,19 +143,18 @@
 "outputs": [],
 "source": [
 "# Read raster\n",
-"single_glacier_raster = xr.open_zarr(\"../data/raster_data/single_glacier_itslive.zarr\", decode_coords=\"all\")\n",
+"single_glacier_raster = xr.open_zarr(\n",
+"    \"../data/raster_data/single_glacier_itslive.zarr\", decode_coords=\"all\", chunks=None\n",
+")\n",
 "# Read vector\n",
 "single_glacier_vector = gpd.read_file(\"../data/vector_data/single_glacier_vec.json\")"
 ]
 },
 {
-"cell_type": "code",
-"execution_count": null,
+"cell_type": "markdown",
 "metadata": {},
-"outputs": [],
 "source": [
-"# Take a look at raster data object\n",
-"single_glacier_raster"
+"We can also drop variables we won't be using:"
 ]
 },
 {
@@ -164,7 +163,19 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"single_glacier_raster.nbytes / 1e9"
+"vars_to_keep = [\n",
+"    \"v\",\n",
+"    \"vx\",\n",
+"    \"vy\",\n",
+"    \"v_error\",\n",
+"    \"vy_error\",\n",
+"    \"vx_error\",\n",
+"    \"acquisition_date_img1\",\n",
+"    \"acquisition_date_img2\",\n",
+"    \"satellite_img1\",\n",
+"    \"satellite_img2\",\n",
+"]\n",
+"single_glacier_raster = single_glacier_raster[vars_to_keep]"
 ]
 },
 {
@@ -173,7 +184,8 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"np.unique(single_glacier_raster.satellite_img1)"
+"# Take a look at raster data object\n",
+"single_glacier_raster"
 ]
 },
 {
@@ -182,16 +194,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"np.unique(single_glacier_raster.satellite_img2)"
+"single_glacier_raster.nbytes / 1e9"
 ]
 },
 {
-"cell_type": "markdown",
+"cell_type": "code",
+"execution_count": null,
 "metadata": {},
+"outputs": [],
 "source": [
-"The above code cells show us that this dataset contains observations from Sentinel 1 & 2 and Landsat 4,5,6,7,8 & 9 satellite sensors. The dataset is 3.3 GB. \n",
-"\n",
-"Next, we want to perform computations that require us to load this object into memory. To do this, we use the [dask .compute()](https://docs.dask.org/en/stable/generated/dask.dataframe.DataFrame.compute.html) method, which turns a 'lazy' object into an in-memory object. If you try to run compute on too large of an object, your computer may run out of RAM and the kernel being used in this python session will die (if this happens, click 'restart kernel' from the kernel drop down menu above). "
+"np.unique(single_glacier_raster.satellite_img1)"
 ]
 },
 {
@@ -200,14 +212,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"single_glacier_raster = single_glacier_raster.compute()"
+"np.unique(single_glacier_raster.satellite_img2)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Now, if you expand the data object to look at the variables, you will see that they no long hold `dask.array` objects."
+"The above code cells show us that this dataset contains observations from Sentinel 1 & 2 and Landsat 4,5,6,7,8 & 9 satellite sensors. The dataset is 1.1 GB. "
 ]
 },
 {
@@ -1326,7 +1338,7 @@
 "metadata": {
 "celltoolbar": "Tags",
 "kernelspec": {
-"display_name": "geospatial_datacube_book_env",
+"display_name": "Python 3 (ipykernel)",
 "language": "python",
 "name": "python3"
 },
