Commit 4a21c54

cleaned and final update of type hints
1 parent 1c32867 commit 4a21c54

2 files changed

+101
-59
lines changed

.gitignore

Lines changed: 0 additions & 3 deletions
@@ -5,8 +5,5 @@ tmp/
55
.DS_Store
66
.nox
77
__pycache__
8-
<<<<<<< Updated upstream
98
*notes-from-review.md
10-
=======
119
*.idea*
12-
>>>>>>> Stashed changes

documentation/write-user-documentation/document-your-code-api-docstrings.md

Lines changed: 101 additions & 56 deletions
@@ -211,15 +211,15 @@ def add_me(aNum, aNum2):
211211

212212
## Beyond docstrings: type hints
213213

214-
We can use docstrings to describe data types that we pass into functions as parameters or
215-
into classes as attributes. We do it with package users in mind.
214+
We use docstrings to describe data types that we pass into functions as parameters or
215+
into classes as attributes. *We do it with our users in mind.*
216216

217-
What with us – developers? We can think of ourselves and the new contributors
217+
**What about us, the developers?** We can think of ourselves and the new contributors,
218218
and start using *type hinting* to make our journey safer!
219219

220220
There are solid reasons to use type hints:
221221

222-
- Development and debugging is faster,
222+
- Development and debugging are faster,
223223
- We clearly see data flow and its transformations,
224224
- We can use tools like `mypy` or integrated tools of Python IDEs for static type checking and code debugging.
225225

@@ -230,22 +230,22 @@ The icing on the cake is that the code in our package will be aligned with the b
230230
But there are reasons to *skip* type hinting:
231231

232232
- Type hints may make code unreadable, especially when a parameter’s input takes multiple data types and we list them all,
233-
- It doesn’t make sense to write type hints for simple scripts and functions that perform obvious operations.
233+
- Writing type hints for simple scripts and functions that perform obvious operations doesn't make sense.
234234

235235
Fortunately for us, type hinting is not all black and white.
236236
We can gradually describe the parameters and outputs of some functions but leave others as they are.
237-
Type hinting can be an introductory task for new contributors in seasoned packages,
238-
that way their learning curve about data flow and dependencies between API endpoints will be smoother.
237+
Type hinting can be a task for new contributors to get them used to the package structure.
238+
That way, their learning curve about data flow and dependencies between API endpoints will be smoother.
239239

240240
## Type hints in practice
241241

242242
Type hinting was introduced with Python 3.5 and is described in [PEP 484](https://peps.python.org/pep-0484/).
243-
**PEP 484** defines the scope of type hinting. Is Python drifting towards compiled languages with type hinting?
244-
It is not. Type hints are optional and static and they will work like that in the future where Python is Python.
245-
The power of type hints lies somewhere between docstrings and unit tests, and with it we can avoid many bugs
243+
**PEP 484** defines the scope of type hinting. Is Python drifting towards compiled languages with this feature?
244+
It is not. Type hints are optional and static. They will work like that for as long as Python is Python.
245+
The power of type hints lies somewhere between docstrings and unit tests, and with it, we can avoid many bugs
246246
throughout development.
247247

248-
We've seen type hints in the simple example earlier, and we will change it slightly:
248+
We've seen type hints in the simple example earlier. Let's come back to it and change it slightly:
249249

250250

251251
```python
@@ -254,23 +254,28 @@ from typing import Dict, List
254254

255255
def extent_to_json(ext_obj: List) -> Dict:
256256
    """Convert bounds to a shapely geojson like spatial object."""
257-
    pass
257+
    ...
258+
258259
```
259260

260-
Here we focus on the new syntax. First, we have described the parameter `ext_obj` as the `List` class. How do we do it? By adding a colon after parameter and the name of a class that is passed into a function. It’s not over and we see that the function definition after closing parenthesis is expanded. If we want to inform type checker what type function returns, then we create the arrow sign `->` that points to a returned type and after it we have function’s colon. Our function returns Python dictonray (`Dict`).
261+
Here we focus on the new syntax. First, we described the parameter `ext_obj` as the `List` class. How do we do it?
262+
Add a colon after the parameter name, followed by the class of the object that is passed into the function.
263+
That's not all. Notice that the function definition is extended after the closing parenthesis.
264+
If we want to inform the type checker what the function returns, then we create the arrow sign `->` that points to a returned type,
265+
and after it, we put the function’s colon. Our function returns a Python dictionary (`Dict`).
261266

262267
```{note}
263-
We have exported classes `List` and `Dict` from `typing` module but we may use
268+
We have imported the classes `List` and `Dict` from the `typing` module, but we may use
264269
`list` or `dict` keywords instead. We will achieve the same result.
265270
Capitalized keywords are required when our package uses Python versions that are lower than
266-
Python 3.9. Python 3.7 will be deprecated in June 2023, Python 3.8 in October 2024.
267-
Thus, if your package supports the whole ecosystem, it should use `typing` module syntax.
271+
Python 3.9. Python 3.7 reaches its end of life in June 2023, and Python 3.8 in October 2024.
272+
Thus, if your package supports the whole ecosystem, it should use the `typing` module syntax.
268273
```
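
As a minimal sketch of the note above, assuming Python 3.9 or newer: the built-in `list` and `dict` can be subscripted directly, without importing anything from `typing`. The function name `bounds_to_dict` and its keys are hypothetical and only serve the illustration.

```python
# Assumes Python 3.9+, where built-in containers accept type parameters.
def bounds_to_dict(bounds: list[float]) -> dict[str, float]:
    """Map a [xmin, ymin, xmax, ymax] list to named keys (illustrative only)."""
    keys = ["xmin", "ymin", "xmax", "ymax"]
    return dict(zip(keys, bounds))


result = bounds_to_dict([0.0, 1.0, 2.0, 3.0])
print(result)
```

On Python 3.8 and older, the same annotations would have to be written as `List[float]` and `Dict[str, float]` from `typing`.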
269274

270275
### Type hints: basic example
271276

272277
The best way to learn is by example. We will use the [pystiche](https://github.com/pystiche/pystiche/tree/main) package.
273-
To avoid confusion, we start from a mathematical operation with basic data types:
278+
To avoid confusion, we start with a mathematical operation:
274279

275280
```python
276281
import torch
@@ -283,7 +288,7 @@ def _norm(x: torch.Tensor, dim: int = 1, eps: float = 1e-8) -> torch.Tensor:
283288

284289
The function has three parameters:
285290

286-
- `x` that is required and its type is `torch.Tensor`,
291+
- `x` that is required, and its type is `torch.Tensor`,
287292
- `dim`, optional `int` with a default value equal to `1`,
288293
- `eps`, optional `float` with a default value equal to `1e-8`.
289294

@@ -295,7 +300,7 @@ As we see, we can use basic data types to mark simple variables. The basic set o
295300
- `bool`
296301
- `complex`.
297302

298-
Most frequently we will use those types within a simple functions that are *close to data*.
303+
We will most frequently use those types within simple functions that are *close to data*.
299304
However, sometimes our variable will be a data structure that isn't built-in within Python itself
300305
but comes from other packages:
301306

@@ -304,14 +309,14 @@ but comes from other packages:
304309
- `DataFrame` from `pandas`,
305310
- `Session` from `requests`.
306311

307-
To perform type checking we must import those classes, then we can set those as a parameter's type.
312+
To perform type checking, we must import those classes. Then we can set those as a parameter's type.
308313
The same is true if we want to use classes from within our package (but we should avoid **circular imports**,
309-
the topic that we will uncover later).
314+
the topic we will uncover later).
310315

311316
### Type hints: complex data types
312317

313318
We can use type hints to describe other objects available in Python.
314-
The little sample of those objects are:
319+
A small sample of those objects:
315320

316321
- `List` (= `list`)
317322
- `Dict` (= `dict`)
@@ -330,13 +335,13 @@ def _extract_prev(self, idx: int, idcs: List[int]) -> Optional[str]:
330335

331336
```
332337

333-
The function has two parameters. Parameter `idcs` is a list of integers. We may write it as `List[int]` or `List` without
338+
The function has two parameters. The parameter `idcs` is a list of integers. We may write it as `List[int]` or `List` without
334339
square brackets and data type that is within a list.
335340

336-
The `_extract_prev` function returns `Optional` type. It is a special type that is used to describe inputs and output
341+
The `_extract_prev` function returns the `Optional` type. It is a special type that describes inputs and output
337342
that can be `None`. There are more interesting types that we can use in our code:
338343

339-
- `Union` – we can use it to describe a variable that can be of multiple types, the common example could be:
344+
- `Union` – we can use it to describe a variable that can be one of several types. An example could be:
340345

341346
```python
342347
from typing import List, Union
@@ -349,8 +354,8 @@ def process_data(data: Union[np.ndarray, pd.DataFrame, List]) -> np.ndarray:
349354

350355
```
351356

352-
What's the problem with the example above? With more data types that can be passed into parameter `data`, the function definition
353-
becomes unreadable. We have two solutions for this issue. The first one is to use `Any` type that is a wildcard type:
357+
What's the problem with the example above? The function definition becomes unreadable with more data types passed into the parameter `data`.
358+
We have two solutions for this issue. The first one is to use the `Any` type, which is a wildcard that is equal to not passing any type.
354359

355360
```python
356361
from typing import Any
@@ -361,15 +366,15 @@ def process_data(data: Any) -> np.ndarray:
361366

362367
```
363368

364-
The second solution is to think what is a high level representation of passed data types. The examples are:
369+
The second solution is to think about a high-level abstraction that covers the passed data types. The examples are:
365370

366-
- `Sequence` – we can use it to describe a variable that is a sequence of elements. Sequential are `list`, `tuple`, `range` and `str`.
367-
- `Iterable` – we can use it to describe a variable that is iterable. Iterables are `list`, `tuple`, `range`, `str`, `dict` and `set`.
371+
- `Sequence` – we can use it to describe a variable that is a sequence of elements. Sequences are `list`, `tuple`, `range` and `str`.
372+
- `Iterable` – we can use it to describe an iterable variable. Iterables are `list`, `tuple`, `range`, `str`, `dict` and `set`.
368373
- `Mapping` – we can use it to describe a variable that is a mapping. Mappings are `dict` and `defaultdict`.
369-
- `Hashable` – we can use it to describe a variable that is hashable. Hashables are `int`, `float`, `str`, `tuple` and `frozenset`.
370-
- `Collection` - we can use it to describe a variable that is a collection. Collections are `list`, `tuple`, `range`, `str`, `dict`, `set` and `frozenset`.
374+
- `Hashable` – we can use it to describe a hashable variable. Hashables are `int`, `float`, `str`, `tuple` and `frozenset`.
375+
- `Collection` – we can use it to describe a collection variable. Collections are `list`, `tuple`, `range`, `str`, `dict`, `set` and `frozenset`.
371376

372-
Thus, the function could look like:
377+
Thus, the function could look like this:
373378

374379
```python
375380
from typing import Iterable
@@ -380,11 +385,11 @@ def process_data(data: Iterable) -> np.ndarray:
380385

381386
```
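
To see why an abstract type such as `Iterable` is convenient, here is a small sketch that uses only the standard library. The function `to_floats` is hypothetical; it stands in for `process_data` so that the example runs without `numpy` or `pandas`.

```python
from typing import Iterable, List


def to_floats(data: Iterable) -> List[float]:
    """Convert any iterable of numbers to a list of floats."""
    return [float(value) for value in data]


# One annotation covers several concrete container types.
from_list = to_floats([1, 2, 3])
from_tuple = to_floats((1, 2, 3))
from_range = to_floats(range(1, 4))
from_set = sorted(to_floats({1, 2, 3}))  # sets are unordered, so sort for a stable result
```

The single `Iterable` hint replaces a `Union` of every container we might accept, at the cost of being less specific about what is inside.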
382387

383-
### Type hints: special typing objects
388+
### Type hints: unique objects and interesting cases
384389

385390
The `typing` module provides us with more objects that we can use to describe our variables.
386-
Interesting object is `Callable` that we can use to describe a variable that is a function. Usually,
387-
when we write decorators or wrappers, we use `Callable` type. The example in the context of `pystiche` package:
391+
An interesting object is `Callable`, which we can use to describe a variable that is a function. Usually,
392+
when we write decorators or wrappers, we use the `Callable` type. The example in the context of the `pystiche` package:
388393

389394
```python
390395
from typing import Callable
@@ -393,14 +398,13 @@ from typing import Callable
393398
def _deprecate(fn: Callable) -> Callable:
394399
    ...
395400

396-
397401
```
398402

399-
The `Callable`can be used as a single word or as a word with square brackets that has two parameters: `Callable[[arg1, arg2], return_type]`.
400-
The first parameter is a list of arguments, the second one is a return type.
403+
`Callable` can be used on its own or with square brackets that take two parameters: `Callable[[arg1, arg2], return_type]`.
404+
The first parameter is a list of argument types, and the second is the function's return type.
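
The bracketed form can be sketched as follows. The names `apply_twice` and `add_three` are hypothetical and are not part of `pystiche`.

```python
from typing import Callable


def apply_twice(fn: Callable[[int], int], value: int) -> int:
    """Call `fn` on `value`, then call it again on the result."""
    return fn(fn(value))


def add_three(x: int) -> int:
    return x + 3


result = apply_twice(add_three, 10)
print(result)
```

Here `Callable[[int], int]` tells the type checker that `fn` takes one `int` and returns an `int`, so passing a function with a different signature would be reported.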
401405

402-
There is one more important case around type hints. Sometimes we want to describe a variable that comes from within
403-
our package. Usually we can do it without any problems:
406+
There is an important case around type hints. Sometimes we want to describe a variable that comes from within
407+
our package. Usually, we can do it without problems:
404408

405409
```python
406410
from my_package import my_data_class
@@ -411,10 +415,10 @@ def my_function(data: my_data_class) -> None:
411415

412416
```
413417

414-
and it will work fine. But we may encounter *circual imports* that are a problem. What is a *circular import*?
415-
It is a case when we want to import module B into module A but module A is already imported into module B.
416-
It seems like we are importing the same module twice into itself. The issue is rare when we program without type
417-
hinting. However, with type hints it could be tedious.
418+
And it will work fine. But we may encounter *circular imports* that need to be fixed. What is a *circular import*?
419+
It is a case when we want to import module B into module A, but module A is already imported into module B.
420+
We are importing the same module into itself. The issue is rare when we program without type
421+
hinting. However, with type hints, it could be tedious.
418422

419423
Thus, if you encounter this error:
420424

@@ -431,12 +435,13 @@ def my_function(data: my_data_class) -> None:
431435
ImportError: cannot import name 'my_data_class' from partially initialized module 'my_package' (most likely due to a circular import) (/home/user/my_package/__init__.py)
432436
```
433437

434-
Then you should use `typing.TYPE_CHECKING` clause to avoid circular imports. The example:
438+
Then you can use the `typing.TYPE_CHECKING` constant to avoid circular imports. For example:
435439

436440
```python
437441
from __future__ import annotations
438442
from typing import TYPE_CHECKING
439443

444+
440445
if TYPE_CHECKING:
441446
    from my_package import my_data_class
442447

@@ -446,18 +451,58 @@ def my_function(data: my_data_class) -> None:
446451

447452
```
448453

449-
Unfortunately, the solution is dirty because we have to
450-
use `if TYPE_CHECKING` clause and `from __future__ import annotations` import to make it work! Type hinting
451-
is not only roses and butterflies!
454+
Unfortunately, the solution is *dirty* because we have to
455+
use the `if TYPE_CHECKING` clause and the `from __future__ import annotations` import to make it work. It makes our
456+
script messier! Type hinting is not all roses and butterflies!
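
One possible alternative, offered as a sketch rather than as the approach used in this guide: writing the annotation as a string (a forward reference) also keeps the name from being evaluated at runtime. In the example below, the standard-library `decimal.Decimal` stands in for `my_data_class`, and `describe` is a hypothetical function.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Static type checkers treat this branch as taken; at runtime
    # TYPE_CHECKING is False, so the import never runs.
    from decimal import Decimal


def describe(value: "Decimal") -> str:
    """The quoted annotation is stored as a plain string and is not evaluated at runtime."""
    return f"value={value}"


print(describe(3))
print(describe.__annotations__["value"])
```

The trade-off is that every annotation using the guarded import has to be quoted, which is easy to forget in larger modules.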
457+
458+
A nice feature of type hinting is that we can define a variable's type within a function:
459+
460+
```python
461+
from typing import Dict
462+
import numpy as np
463+
464+
465+
def validate_model_input(data: np.ndarray) -> Dict:
466+
    """
467+
    Function checks if dataset has enough records to perform modeling.
468+
469+
    Parameters
470+
    ----------
471+
    data : np.ndarray
472+
        Input data.
473+
474+
    Returns
475+
    -------
476+
    : Dict
477+
        Dictionary with `data`, `info` and `status` to decide if pipeline can proceed with modeling.
478+
    """
479+
480+
    output: Dict = None # type hinting
481+
482+
    # Probably we don't have the lines below yet
483+
484+
    # if data.shape[0] > 50:
485+
    #     output = {"data": data, "info": "Dataset is big enough for statistical tests.", "status": True}
486+
    # else:
487+
    #     output = {"data": data, "info": "Dataset is too small for statistical tests.", "status": False}
488+
489+
    return output
490+
491+
```
492+
493+
We will use this feature rarely. The most probable scenario is when we start defining a function and its output, but
494+
we don't know how we will process data. In this context, we can still run type checking to be sure that the
495+
function behaves as we expect within the newly designed pipeline.
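
A caveat, stated as an assumption about the checker's default settings: `mypy` will most likely report an assignment such as `output: Dict = None` as incompatible, because `None` is not a `Dict`. One way around it, shown here as a minimal sketch, is to declare the placeholder as `Optional[Dict]`. The function `build_report` and its `minimum` parameter are hypothetical.

```python
from typing import Dict, Optional


def build_report(n_records: int, minimum: int = 50) -> Optional[Dict]:
    """Return a status dictionary, or None when there is nothing to report."""
    output: Optional[Dict] = None  # annotated placeholder that may stay None

    if n_records > 0:
        output = {"status": n_records > minimum, "records": n_records}

    return output


print(build_report(100))
print(build_report(0))
```

Callers then have to handle the `None` case explicitly, which the type checker will remind them about.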
452496

453-
### Type hinting: final remarks and tools
497+
(Another scenario: we will be forced to add type hints to silence dynamic type checkers from some IDEs ;) ).
454498

455-
There are few tools designed for static type checking. The most popular one is [`mypy`](https://mypy.readthedocs.io/en/stable/).
456-
It's a good idea to add it to your Continuous Integration (CI) pipeline.
457-
Other tools are integrated with popular IDEs like `PyCharm` or `VSCode`, most of them are based on `mypy` logic.
458499

459-
At this point, we have a good understanding of type hints and how to use them in our code. There is one last thing to
460-
remember. **Type hints are not required in all our functions and we can introduce those gradually, it won't damage our code**.
461-
It is very convenient way of using this extraordinary feature!
500+
### Type hinting: final remarks
462501

502+
There are tools designed for static type checking. The most popular one is [`mypy`](https://mypy.readthedocs.io/en/stable/).
503+
Adding it to your Continuous Integration (CI) pipeline is a good idea.
504+
Other type checkers are integrated with popular IDEs like `PyCharm` or `VSCode`; they apply the same typing rules, although not all of them are built on `mypy`.
463505

506+
The last thing to remember is that **type hints are optional in all our functions, and we can introduce them gradually
507+
without breaking our code or the output of CI type-checking tools**.
508+
It is a very convenient way of using this extraordinary feature!
