Commit c2f5ace

A few small updates (#183)
1 parent 8cea1b3 commit c2f5ace

File tree

3 files changed: +12 -7 lines changed


00_overview.ipynb

Lines changed: 3 additions & 1 deletion
```diff
@@ -73,6 +73,8 @@
 "\n",
 "    conda env create -f binder/environment.yml\n",
 "    conda activate dask-tutorial\n",
+"    jupyter labextension install @jupyter-widgets/jupyterlab-manager\n",
+"    jupyter labextension install @bokeh/jupyter_bokeh\n",
 "    \n",
 "Do this *before* running this notebook."
 ]
@@ -180,7 +182,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.7.6"
+"version": "3.8.3"
 }
 },
 "nbformat": 4,
```

01_dask.delayed.ipynb

Lines changed: 5 additions & 2 deletions
```diff
@@ -274,7 +274,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"How do the graph visualizations compare with the given solution, compared to a version with the `sum` function used directly rather than wrapped with `delay`? Can you explain the latter version? You might find the result of the following expression illuminating\n",
+"How do the graph visualizations compare with the given solution, compared to a version with the `sum` function used directly rather than wrapped with `delayed`? Can you explain the latter version? You might find the result of the following expression illuminating\n",
 "```python\n",
 "delayed(inc)(1) + delayed(inc)(2)\n",
 "```"
```
```diff
@@ -564,6 +564,7 @@
 "2. Calling the `.compute()` method works well when you have a single output. When you have multiple outputs you might want to use the `dask.compute` function:\n",
 "\n",
 "    ```python\n",
+"    >>> from dask import compute\n",
 "    >>> x = delayed(np.arange)(10)\n",
 "    >>> y = x ** 2\n",
 "    >>> min_, max_ = compute(y.min(), y.max())\n",
```
```diff
@@ -656,6 +657,8 @@
 },
 "outputs": [],
 "source": [
+"%%time\n",
+"\n",
 "# This is just one possible solution, there are\n",
 "# several ways to do this using `delayed`\n",
 "\n",
@@ -748,7 +751,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.7.6"
+"version": "3.8.3"
 }
 },
 "nbformat": 4,
```

05_distributed.ipynb

Lines changed: 4 additions & 4 deletions
```diff
@@ -39,7 +39,7 @@
 "or set the current default, either temporarily or globally\n",
 "```python\n",
 "with dask.config.set(scheduler='processes'):\n",
-"    # set temporarily fo this block only\n",
+"    # set temporarily for this block only\n",
 "    myvalue.compute()\n",
 "\n",
 "dask.config.set(scheduler='processes')\n",
```
```diff
@@ -236,14 +236,14 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"In this particular case, this should be as fast or faster than the best case, threading, above. Why do you suppose this is? You should start your reading [here](https://distributed.dask.org/en/latest/index.html#architecture), and in particular note that the distributed scheduler was a complete rewrite with more intelligence around sharing of intermediate results and which tasks run on which worker. This will result in better performance in *some* cases, but still larger latency and overhead compared to the threaded scheduler, so there will be rare cases where it performs worse. Fortunately, the dashboard now gives us a lot more [diagnostic information](https://distributed.dask.org/en/latest/diagnosing-performance.html). Look at the Profile page of the dashboard to fund out what takes the biggest fraction of CPU time for the computation we just performed?"
+"In this particular case, this should be as fast or faster than the best case, threading, above. Why do you suppose this is? You should start your reading [here](https://distributed.dask.org/en/latest/index.html#architecture), and in particular note that the distributed scheduler was a complete rewrite with more intelligence around sharing of intermediate results and which tasks run on which worker. This will result in better performance in *some* cases, but still larger latency and overhead compared to the threaded scheduler, so there will be rare cases where it performs worse. Fortunately, the dashboard now gives us a lot more [diagnostic information](https://distributed.dask.org/en/latest/diagnosing-performance.html). Look at the Profile page of the dashboard to find out what takes the biggest fraction of CPU time for the computation we just performed?"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"If all you want to do is execute computations created using delayed, or run calculations based on the higher-level data collections (see the coming sections), then that is about all you need to know to scale your work up to cluster scale. However, there is more detail to know about the distributed scheduler that will help with efficient usage. See the chapter Distributed, Advanced."
+"If all you want to do is execute computations created using delayed, or run calculations based on the higher-level data collections, then that is about all you need to know to scale your work up to cluster scale. However, there is more detail to know about the distributed scheduler that will help with efficient usage. See the chapter Distributed, Advanced."
 ]
 },
 {
```
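
To follow the dashboard pointer in that cell, one way to get a dashboard link from a local cluster (the worker counts here are arbitrary):

```python
from dask.distributed import Client

# Starting a Client with no address spins up a local cluster; the
# printed link opens the dashboard, whose Profile page breaks down
# CPU time for subsequent computations.
client = Client(n_workers=4, threads_per_worker=1)
print(client.dashboard_link)
```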
```diff
@@ -332,7 +332,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.7.6"
+"version": "3.8.3"
 }
 },
 "nbformat": 4,
```
