Commit b1929fa

scripts: misc updates for clarity
1 parent c72700c commit b1929fa

5 files changed, +44 -28 lines changed

content/scripts.rst

Lines changed: 31 additions & 15 deletions
@@ -45,7 +45,7 @@ Jupyter notebooks can be parameterized for instance using `papermill <https://pa

 Within JupyterLab, you can export any Jupyter notebook to a Python script:

-.. figure:: https://jupyterlab.readthedocs.io/en/stable/_images/exporting_menu.png
+.. figure:: https://jupyterlab.readthedocs.io/en/stable/_images/exporting-menu.png

 Select File (top menu bar) → Export Notebook as → **Export notebook to Executable Script**.

@@ -69,9 +69,13 @@ Exercises 1

 1. Download the :download:`weather_observations.ipynb <../resources/code/scripts/weather_observations.ipynb>` and the weather_data file and upload them to your Jupyterlab. The script plots the temperature data for Tapiola in Espoo. The data is originally from `rp5.kz <https://rp5.kz>`_ and was slightly adjusted for this lecture.

-**Note:** If you haven't downloaded the file directly to your Jupyterlab folder, it will be located in your **Downloads** folder or the folder you selected. In Jupyterlab click on the 'upload file' button, navigate to the folder containing the file and select it to load it into your Jupyterlab folder.
+**Hint:** Copy the URL above (right-click) and in JupyterLab, use
+File → Open from URL → Paste the URL. It will both download it to
+the directory JupyterLab is in and open it for you.

-2. Open a terminal in Jupyter (File → New → Terminal).
+2. Open a terminal in Jupyter: File → New Launcher, then click
+"Terminal" there. (if you do it this way, it will be in the right
+directory. File → New → Terminal might not be.)

 3. Convert the Jupyter script to a Python script by calling::

@@ -81,6 +85,8 @@ Exercises 1

 $ python weather_observations.py

+
+
 Command line arguments with :data:`sys.argv`
 --------------------------------------------

@@ -100,29 +106,31 @@ and any further argument (separated by space) is appended to this list, like suc
 $ # sys.argv[2] is 'B'

 Lets see how it works: We modify the **weather_observations.py** script such that we allow start
-and end times as well as the output file to be passed in as arguments to the function:
+and end times as well as the output file to be passed in as arguments
+to the function. Open it (find the ``.py`` file from the JupyterLab
+file browser) and make these edits:

 .. code-block:: python
-:emphasize-lines: 1,5-6,8,16
+:emphasize-lines: 1,5-6,8,14-15

 import sys
 import pandas as pd

-# set start and end time
-start_date = pd.to_datetime(sys.argv[1],dayfirst=True)
-end_date = pd.to_datetime(sys.argv[2],dayfirst=True)
-
-output_file_name = sys.argv[3]
-
+# define the start and end time for the plot
+start_date = pd.to_datetime(sys.argv[1], dayfirst=True)
+end_date = pd.to_datetime(sys.argv[2], dayfirst=True)
 ...

 # select the data
 weather = weather[weather['Local time'].between(start_date,end_date)]
 ...

+# save the figure
+output_file_name = sys.argv[3]
 fig.savefig(output_file_name)

-We can try it out:
+We can try it out (see the file ``spring_in_tapiola.png`` made in the
+file browser):

 .. code-block:: console
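As a rough illustration of the edit in the hunk above, the following sketch reads the three positional values from an argv-style list in the same order the modified script reads them from `sys.argv`. The helper name `read_args`, the explicit length check, and the example dates are assumptions added for illustration and are not part of the commit. Dates are kept as plain strings here instead of being passed to pandas.

```python
def read_args(argv):
    """Return (start, end, output_file_name) from an argv-style list.

    Hypothetical helper, not part of the commit. argv[0] is the script
    name, so the user-supplied values start at index 1.
    """
    if len(argv) != 4:
        # Fail early with a usage message instead of an IndexError later.
        raise SystemExit("usage: weather_observations.py START END OUTPUT")
    start, end, output_file_name = argv[1:4]
    return start, end, output_file_name


# Example values are made up; a real script would pass sys.argv instead.
example = ["weather_observations.py", "01/03/2021", "31/05/2021", "spring_in_tapiola.png"]
print(read_args(example))
```

Taking the list as a parameter, rather than reading `sys.argv` directly inside the function, makes the parsing step easy to try out from a notebook cell.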
@@ -185,6 +193,7 @@ would show the following message:

 .. code-block:: console

+$ python birthday.py --help
 usage: birthday.py [-h] [-d DATE] N

 positional arguments:
@@ -201,7 +210,7 @@ Exercises 2
 .. challenge:: Scripts-2

 1. Take the Python script (``weather_observations.py``) we have written in the preceding exercise and use
-:py:mod:`argparse` to specify the input and output files and allow the start and end dates to be set.
+:py:mod:`argparse` to specify the input (URL) and output files and allow the start and end dates to be set.

 * Hint: try not to do it all at once, but add one or two arguments, test, then add more, and so on.
 * Hint: The input and output filenames make sense as positional arguments, since they must always be given. Input is usually first, then output.
@@ -236,6 +245,7 @@ Exercises 2

 - We can now process different input files without changing the script.
 - We can select multiple time ranges without modifying the script.
+- We can easily save these commands to know what we did.
 - This way we can also loop over file patterns (using shell loops or similar) or use
 the script in a workflow management system and process many files in parallel.
 - By changing from :data:`sys.argv` to :mod:`argparse` we made the script more robust against
@@ -287,9 +297,9 @@ Exercises 3 (optional)
 .. challenge:: Scripts-3

 1. Download the :download:`optionsparser.py <https://raw.githubusercontent.com/AaltoSciComp/python-for-scicomp/master/resources/code/scripts/optionsparser.py>`
-function and load it into your working folder in Jupyterlab.
+function and load it into your working folder in Jupyterlab (Hint: in JupyterLab, File → Open from URL).
 Modify the previous script to use a config file parser to read all arguments. The config file is passed in as a single argument on the command line
-(using e.g. argparse or sys.argv) still needs to be read from the command line.
+(using e.g. :mod:`argparse` or :data:`sys.argv`) still needs to be read from the command line.


 2. Run your script with different config files.
@@ -303,6 +313,12 @@ Exercises 3 (optional)
 :language: python
 :emphasize-lines: 5,9-12,15-27,30,33,36-37,58

+What did this config file parser get us? Now, we have separated the
+code from the configuration. We could save all the configuration in
+version control - separately and have one script that runs them. If
+done right, our work could be much more reproducible and
+understandable.
+

 .. admonition:: Further reading
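The paragraph added in the last hunk describes separating code from configuration. A minimal sketch of that idea, using the standard-library `configparser` rather than the course's `optionsparser.py` (whose interface is not shown in this diff); the section name `weather`, the key names, and the example values are assumptions made for illustration:

```python
import configparser

# Hypothetical config contents; in practice this would live in its own
# file under version control and be loaded with config.read(path).
CONFIG_TEXT = """
[weather]
input = weather_data.csv
output = spring_in_tapiola.png
start = 01/03/2021
end = 31/05/2021
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# Collect the four settings the plotting script needs into a plain dict.
section = config["weather"]
parameters = {key: section[key] for key in ("input", "output", "start", "end")}
print(parameters)
```

With this layout, one script can be run against several config files, and each config file records exactly which inputs and date range produced a given plot.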

resources/code/scripts/weather_observations.ipynb

Lines changed: 3 additions & 3 deletions
@@ -12,12 +12,12 @@
 "weather = pd.read_csv(url,comment='#')\n",
 "\n",
 "# define the start and end time for the plot \n",
-"start_date=pd.to_datetime('01/06/2021',dayfirst=True)\n",
-"end_date=pd.to_datetime('01/10/2021',dayfirst=True)\n",
+"start_date=pd.to_datetime('01/06/2021', dayfirst=True)\n",
+"end_date=pd.to_datetime('01/10/2021', dayfirst=True)\n",
 "\n",
 "# The date format in the file is in a day-first format, which matplotlib does nto understand.\n",
 "# so we need to convert it.\n",
-"weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)\n",
+"weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)\n",
 "# select the data\n",
 "weather = weather[weather['Local time'].between(start_date,end_date)]\n"
 ]

resources/code/scripts/weather_observations.py

Lines changed: 3 additions & 3 deletions
@@ -8,10 +8,10 @@
 weather = pd.read_csv(url,comment='#')

 # define the start and end time for the plot
-start_date=pd.to_datetime('01/06/2021',dayfirst=True)
-end_date=pd.to_datetime('01/10/2021',dayfirst=True)
+start_date=pd.to_datetime('01/06/2021', dayfirst=True)
+end_date=pd.to_datetime('01/10/2021', dayfirst=True)
 #Preprocess the data
-weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)
+weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)
 # select the data
 weather = weather[weather['Local time'].between(start_date,end_date)]
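The `dayfirst=True` argument in these lines matters because a string such as `01/06/2021` can be read two ways. A small sketch of both readings, using the standard-library `datetime.strptime` as a stand-in for `pd.to_datetime` (the explicit format strings are an assumption about how the dates are written):

```python
from datetime import datetime

text = "01/06/2021"

# Day-first reading, which is what dayfirst=True asks pandas for.
day_first = datetime.strptime(text, "%d/%m/%Y")

# Month-first reading of the same string, shown for contrast.
month_first = datetime.strptime(text, "%m/%d/%Y")

print(day_first.date(), month_first.date())
```

The two results are five months apart, so picking the wrong convention would silently select a different slice of the weather data.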

resources/code/scripts/weather_observations_argparse.py

Lines changed: 4 additions & 4 deletions
@@ -5,19 +5,19 @@
 parser.add_argument("input", type=str, help="Input data file")
 parser.add_argument("output", type=str, help="Output plot file")
 parser.add_argument("-s", "--start", default="01/01/2019", type=str, help="Start date in DD/MM/YYYY format")
-parser.add_argument("-e", "--end", default="16/10/2021", type=str, help="End date in DD/MM/YYYY format")
+parser.add_argument("-e", "--end", default="16/10/2021", type=str, help="End date in DD/MM/YYYY format")

 args = parser.parse_args()

 # load the data
 weather = pd.read_csv(args.input,comment='#')

 # define the start and end time for the plot
-start_date=pd.to_datetime(args.start,dayfirst=True)
-end_date=pd.to_datetime(args.end,dayfirst=True)
+start_date=pd.to_datetime(args.start, dayfirst=True)
+end_date=pd.to_datetime(args.end, dayfirst=True)

 # preprocess the data
-weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)
+weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)
 # select the data
 weather = weather[weather['Local time'].between(start_date,end_date)]
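To see how the four arguments declared in this file behave, the sketch below rebuilds the same parser and feeds it an explicit argument list instead of the real command line. The file names `data.csv` and `plot.png` are placeholders chosen for illustration, not files from the repository.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("input", type=str, help="Input data file")
parser.add_argument("output", type=str, help="Output plot file")
parser.add_argument("-s", "--start", default="01/01/2019", type=str,
                    help="Start date in DD/MM/YYYY format")
parser.add_argument("-e", "--end", default="16/10/2021", type=str,
                    help="End date in DD/MM/YYYY format")

# Passing a list to parse_args() avoids touching the real command line.
# --end is omitted on purpose, so it falls back to its default.
args = parser.parse_args(["data.csv", "plot.png", "--start", "01/06/2021"])
print(args.input, args.output, args.start, args.end)
```

The positional arguments are required and order-dependent, while the two optional dates can be given in any order or left out entirely.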

resources/code/scripts/weather_observations_config.py

Lines changed: 3 additions & 3 deletions
@@ -33,11 +33,11 @@
 weather = pd.read_csv(parameters.input,comment='#')

 # obtain start and end date
-start_date=pd.to_datetime(parameters.start,dayfirst=True)
-end_date=pd.to_datetime(parameters.end,dayfirst=True)
+start_date=pd.to_datetime(parameters.start, dayfirst=True)
+end_date=pd.to_datetime(parameters.end, dayfirst=True)

 # Data preprocessing
-weather['Local time'] = pd.to_datetime(weather['Local time'],dayfirst=True)
+weather['Local time'] = pd.to_datetime(weather['Local time'], dayfirst=True)
 # select the data
 weather = weather[weather['Local time'].between(start_date,end_date)]
