Commit d9cc5f2

Generate html doc for notebooks (#572)
1 parent 505e404 commit d9cc5f2

18 files changed

+127
-68
lines changed

.readthedocs.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
+version: 2
+formats: all
+conda:
+  environment: docs/environment.yml

awswrangler/_config.py

Lines changed: 2 additions & 2 deletions
@@ -364,8 +364,8 @@ def _inject_config_doc(doc: Optional[str], available_configs: Tuple[str, ...]) -
     if "\n Parameters" not in doc:
         return doc
     header: str = (
-        "\n Note\n ----"
-        "\n This functions has arguments that can has default values configured globally through "
+        "\n\n Note\n ----"
+        "\n This function has arguments which can be configured globally through "
         "*wr.config* or environment variables:\n\n"
     )
     args: Tuple[str, ...] = tuple(f" - {x}\n" for x in available_configs)
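
Review note: the hunk above only shows the header being built, not where it ends up. A minimal sketch of the idea follows, assuming the note is spliced in immediately before the `Parameters` section and that docstrings use four-space indentation. The function name `inject_config_doc` is a hypothetical stand-in, not the library's private helper, and the real insertion point may differ.

```python
from typing import Optional, Tuple

# Assumption: the docstring section marker uses four-space indentation.
PARAMETERS_MARKER = "\n    Parameters"


def inject_config_doc(doc: Optional[str], available_configs: Tuple[str, ...]) -> Optional[str]:
    """Hypothetical, simplified stand-in for the helper touched by this commit."""
    # Leave docstrings without a Parameters section untouched.
    if doc is None or PARAMETERS_MARKER not in doc:
        return doc
    header = (
        "\n\n    Note\n    ----"
        "\n    This function has arguments which can be configured globally through "
        "*wr.config* or environment variables:\n\n"
    )
    # One bullet per globally configurable argument.
    args = "".join(f"    - {name}\n" for name in available_configs)
    # Assumption: the note goes right before the first Parameters section.
    return doc.replace(PARAMETERS_MARKER, header + args + PARAMETERS_MARKER, 1)


example_doc = "Read a table.\n\n    Parameters\n    ----------\n    database : str\n"
patched_doc = inject_config_doc(example_doc, ("database", "ctas_approach"))
```

The extra leading `\n` added by the commit matters for rendering: a blank line before the `Note` heading keeps Sphinx from merging it into the preceding paragraph.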

awswrangler/athena/_read.py

Lines changed: 16 additions & 16 deletions
@@ -590,12 +590,12 @@ def read_sql_query(

  **Related tutorial:**

- - `Amazon Athena <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/006%20-%20Amazon%20Athena.ipynb>`_
- - `Athena Cache <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/019%20-%20Athena%20Cache.ipynb>`_
- - `Global Configurations <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/021%20-%20Global%20Configurations.ipynb>`_
+ - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/006%20-%20Amazon%20Athena.html>`_
+ - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/019%20-%20Athena%20Cache.html>`_
+ - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/021%20-%20Global%20Configurations.html>`_

  **There are two approaches to be defined through ctas_approach parameter:**

@@ -642,8 +642,8 @@ def read_sql_query(
  /athena.html#Athena.Client.get_query_execution>`_ .

  For a practical example check out the
- `related tutorial <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/024%20-%20Athena%20Query%20Metadata.ipynb>`_!
+ `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!


  Note
@@ -853,12 +853,12 @@ def read_sql_table(

  **Related tutorial:**

- - `Amazon Athena <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/006%20-%20Amazon%20Athena.ipynb>`_
- - `Athena Cache <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/019%20-%20Athena%20Cache.ipynb>`_
- - `Global Configurations <https://github.com/awslabs/aws-data-wrangler/blob/
- main/tutorials/021%20-%20Global%20Configurations.ipynb>`_
+ - `Amazon Athena <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/006%20-%20Amazon%20Athena.html>`_
+ - `Athena Cache <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/019%20-%20Athena%20Cache.html>`_
+ - `Global Configurations <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/021%20-%20Global%20Configurations.html>`_

  **There are two approaches to be defined through ctas_approach parameter:**

@@ -902,8 +902,8 @@ def read_sql_table(
  /athena.html#Athena.Client.get_query_execution>`_ .

  For a practical example check out the
- `related tutorial <https://github.com/awslabs/aws-data-wrangler/blob/main/
- tutorials/024%20-%20Athena%20Query%20Metadata.ipynb>`_!
+ `related tutorial <https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/
+ tutorials/024%20-%20Athena%20Query%20Metadata.html>`_!


  Note
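
Review note: every link change in this file follows the same pattern, swapping the GitHub notebook URL for the rendered page on Read the Docs and `.ipynb` for `.html`. A small sketch of that mapping is below. The helper `to_rendered_url` is hypothetical and not part of the commit; the two prefixes are copied from the diff.

```python
# Prefixes taken from the removed and added lines of the diff.
GITHUB_PREFIX = "https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/"
RTD_PREFIX = "https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/"


def to_rendered_url(url: str) -> str:
    """Hypothetical helper: map a GitHub notebook link to its rendered HTML page."""
    # Only rewrite links that point at a tutorial notebook; return others unchanged.
    if not (url.startswith(GITHUB_PREFIX) and url.endswith(".ipynb")):
        return url
    notebook_name = url[len(GITHUB_PREFIX):-len(".ipynb")]
    return f"{RTD_PREFIX}{notebook_name}.html"


old_link = GITHUB_PREFIX + "006%20-%20Amazon%20Athena.ipynb"
new_link = to_rendered_url(old_link)
```

Note that the new URLs pin the `2.4.0-docs` version segment, so they would need the same kind of rewrite whenever the documented version changes.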

awswrangler/s3/_read_parquet.py

Lines changed: 1 addition & 1 deletion
@@ -684,7 +684,7 @@ def read_parquet_table(
  This function MUST return a bool, True to read the partition or False to ignore it.
  Ignored if `dataset=False`.
  E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/023%20-%20Flexible%20Partitions%20Filter.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
  columns : List[str], optional
  Names of columns to read from the file(s).
  validate_schema:

awswrangler/s3/_read_text.py

Lines changed: 3 additions & 3 deletions
@@ -217,7 +217,7 @@ def read_csv(
  This function MUST return a bool, True to read the partition or False to ignore it.
  Ignored if `dataset=False`.
  E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/023%20-%20Flexible%20Partitions%20Filter.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
  pandas_kwargs :
  KEYWORD arguments forwarded to pandas.read_csv(). You can NOT pass `pandas_kwargs` explicit, just add valid
  Pandas arguments in the function call and Wrangler will accept it.
@@ -359,7 +359,7 @@ def read_fwf(
  This function MUST return a bool, True to read the partition or False to ignore it.
  Ignored if `dataset=False`.
  E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/023%20-%20Flexible%20Partitions%20Filter.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
  pandas_kwargs:
  KEYWORD arguments forwarded to pandas.read_fwf(). You can NOT pass `pandas_kwargs` explicit, just add valid
  Pandas arguments in the function call and Wrangler will accept it.
@@ -505,7 +505,7 @@ def read_json(
  This function MUST return a bool, True to read the partition or False to ignore it.
  Ignored if `dataset=False`.
  E.g ``lambda x: True if x["year"] == "2020" and x["month"] == "1" else False``
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/023%20-%20Flexible%20Partitions%20Filter.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/023%20-%20Flexible%20Partitions%20Filter.html
  pandas_kwargs:
  KEYWORD arguments forwarded to pandas.read_json(). You can NOT pass `pandas_kwargs` explicit, just add valid
  Pandas arguments in the function call and Wrangler will accept it.
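
Review note: the partition filter described in these docstrings is a plain callable that receives one partition's values and returns a bool. The sketch below shows only that contract in pure Python, with no S3 access. `select_partitions` and the sample partition list are hypothetical, and the values are strings because the docstring example compares against `"2020"` and `"1"`.

```python
from typing import Callable, Dict, List

Partition = Dict[str, str]


def keep_january_2020(partition: Partition) -> bool:
    # Same predicate as the docstring example, written without the
    # redundant "True if ... else False".
    return partition["year"] == "2020" and partition["month"] == "1"


def select_partitions(
    partitions: List[Partition], partition_filter: Callable[[Partition], bool]
) -> List[Partition]:
    """Hypothetical stand-in for how a reader might apply the filter."""
    # Keep only the partitions for which the filter returns True.
    return [p for p in partitions if partition_filter(p)]


sample_partitions = [
    {"year": "2019", "month": "12"},
    {"year": "2020", "month": "1"},
    {"year": "2020", "month": "2"},
]
selected = select_partitions(sample_partitions, keep_january_2020)
```

Comparing against the integer `2020` instead of the string would silently select nothing, which is the most common mistake with filters of this shape.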

awswrangler/s3/_write_excel.py

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ def to_excel(
  s3_additional_kwargs : Optional[Dict[str, Any]]
  Forward to botocore requests. Valid parameters: "ACL", "Metadata", "ServerSideEncryption", "StorageClass",
  "SSECustomerAlgorithm", "SSECustomerKey", "SSEKMSKeyId", "SSEKMSEncryptionContext", "Tagging",
- "RequestPayer", "ExpectedBucketOwner".
+ "RequestPayer", "ExpectedBucketOwner".
  e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
  use_threads : bool
  True to enable concurrent requests, False to disable multiple threads.
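
Review note: the docstring lists the keys accepted in `s3_additional_kwargs`. A caller-side check against that list can be sketched as follows. `check_s3_additional_kwargs` is a hypothetical helper, and whether or how the library itself validates these keys is not shown in this diff.

```python
from typing import Any, Dict

# Key names copied from the docstring in the diff.
VALID_S3_KWARGS = frozenset(
    {
        "ACL",
        "Metadata",
        "ServerSideEncryption",
        "StorageClass",
        "SSECustomerAlgorithm",
        "SSECustomerKey",
        "SSEKMSKeyId",
        "SSEKMSEncryptionContext",
        "Tagging",
        "RequestPayer",
        "ExpectedBucketOwner",
    }
)


def check_s3_additional_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical pre-flight check: reject keys the docstring does not list."""
    unknown = sorted(set(kwargs) - VALID_S3_KWARGS)
    if unknown:
        raise ValueError(f"Unsupported s3_additional_kwargs keys: {unknown}")
    return dict(kwargs)


# Placeholder value from the docstring example, not a real key ARN.
checked = check_s3_additional_kwargs(
    {"ServerSideEncryption": "aws:kms", "SSEKMSKeyId": "YOUR_KMS_KEY_ARN"}
)
```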

awswrangler/s3/_write_parquet.py

Lines changed: 4 additions & 4 deletions
@@ -270,7 +270,7 @@ def to_parquet( # pylint: disable=too-many-arguments,too-many-locals
  s3_additional_kwargs : Optional[Dict[str, Any]]
  Forward to botocore requests. Valid parameters: "ACL", "Metadata", "ServerSideEncryption", "StorageClass",
  "SSECustomerAlgorithm", "SSECustomerKey", "SSEKMSKeyId", "SSEKMSEncryptionContext", "Tagging",
- "RequestPayer", "ExpectedBucketOwner".
+ "RequestPayer", "ExpectedBucketOwner".
  e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
  sanitize_columns : bool
  True to sanitize columns names (using `wr.catalog.sanitize_table_name` and `wr.catalog.sanitize_column_name`)
@@ -291,7 +291,7 @@ def to_parquet( # pylint: disable=too-many-arguments,too-many-locals
  concurrent_partitioning: bool
  If True will increase the parallelism level during the partitions writing. It will decrease the
  writing time and increase the memory usage.
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/022%20-%20Writing%20Partitions%20Concurrently.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
  mode: str, optional
  ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
  For details check the related tutorial:
@@ -302,7 +302,7 @@ def to_parquet( # pylint: disable=too-many-arguments,too-many-locals
  If True allows schema evolution (new or missing columns), otherwise a exception will be raised.
  (Only considered if dataset=True and mode in ("append", "overwrite_partitions"))
  Related tutorial:
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/014%20-%20Schema%20Evolution.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/014%20-%20Schema%20Evolution.html
  database : str, optional
  Glue/Athena catalog: Database name.
  table : str, optional
@@ -740,7 +740,7 @@ def store_parquet_metadata( # pylint: disable=too-many-arguments
  s3_additional_kwargs : Optional[Dict[str, Any]]
  Forward to botocore requests. Valid parameters: "ACL", "Metadata", "ServerSideEncryption", "StorageClass",
  "SSECustomerAlgorithm", "SSECustomerKey", "SSEKMSKeyId", "SSEKMSEncryptionContext", "Tagging",
- "RequestPayer", "ExpectedBucketOwner".
+ "RequestPayer", "ExpectedBucketOwner".
  e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
  boto3_session : boto3.Session(), optional
  Boto3 Session. The default boto3 session will be used if boto3_session receive None.

awswrangler/s3/_write_text.py

Lines changed: 3 additions & 3 deletions
@@ -153,7 +153,7 @@ def to_csv( # pylint: disable=too-many-arguments,too-many-locals,too-many-state
  s3_additional_kwargs : Optional[Dict[str, Any]]
  Forward to botocore requests. Valid parameters: "ACL", "Metadata", "ServerSideEncryption", "StorageClass",
  "SSECustomerAlgorithm", "SSECustomerKey", "SSEKMSKeyId", "SSEKMSEncryptionContext", "Tagging",
- "RequestPayer", "ExpectedBucketOwner".
+ "RequestPayer", "ExpectedBucketOwner".
  e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
  sanitize_columns : bool
  True to sanitize columns names or False to keep it as is.
@@ -173,7 +173,7 @@ def to_csv( # pylint: disable=too-many-arguments,too-many-locals,too-many-state
  concurrent_partitioning: bool
  If True will increase the parallelism level during the partitions writing. It will decrease the
  writing time and increase the memory usage.
- https://github.com/awslabs/aws-data-wrangler/blob/main/tutorials/022%20-%20Writing%20Partitions%20Concurrently.ipynb
+ https://aws-data-wrangler.readthedocs.io/en/2.4.0-docs/tutorials/022%20-%20Writing%20Partitions%20Concurrently.html
  mode : str, optional
  ``append`` (Default), ``overwrite``, ``overwrite_partitions``. Only takes effect if dataset=True.
  For details check the related tutorial:
@@ -563,7 +563,7 @@ def to_json(
  s3_additional_kwargs : Optional[Dict[str, Any]]
  Forward to botocore requests. Valid parameters: "ACL", "Metadata", "ServerSideEncryption", "StorageClass",
  "SSECustomerAlgorithm", "SSECustomerKey", "SSEKMSKeyId", "SSEKMSEncryptionContext", "Tagging",
- "RequestPayer", "ExpectedBucketOwner".
+ "RequestPayer", "ExpectedBucketOwner".
  e.g. s3_additional_kwargs={'ServerSideEncryption': 'aws:kms', 'SSEKMSKeyId': 'YOUR_KMS_KEY_ARN'}
  use_threads : bool
  True to enable concurrent requests, False to disable multiple threads.

docs/environment.yml

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+channels:
+  - conda-forge
+dependencies:
+  - python>=3
+  - pandoc
+  - ipykernel
+  - pip
+  - pip:
+    - nbsphinx
+    - nbsphinx-link
+    - sphinx
+    - sphinx_bootstrap_theme
+    - IPython
+    - -e ..

docs/source/_ext/copy_tutorials.py

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+import json
+from pathlib import Path
+
+
+def setup(app):
+    file_dir = Path(__file__).parent
+    for f in file_dir.joinpath("../../../tutorials").glob("*.ipynb"):
+        with open(file_dir.joinpath(f"../tutorials/{f.stem}.nblink"), "w") as output_file:
+            nb_link = {"path": f"../../../tutorials/{f.name}", "extra-media": ["../../../tutorials/_static"]}
+            json.dump(nb_link, output_file)
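
Review note: the extension above writes one `.nblink` JSON file per tutorial notebook so that `nbsphinx-link` can render notebooks that live outside the docs source tree. The sketch below reproduces that loop against a temporary directory so it can be run without Sphinx. `write_nblinks` and the notebook name `001 - Introduction.ipynb` are hypothetical; unlike the original, this version creates the output directory if it is missing.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def write_nblinks(tutorials_dir: Path, output_dir: Path) -> List[Path]:
    """Hypothetical standalone version of the loop in copy_tutorials.setup()."""
    # Assumption: create the target directory; the original expects it to exist.
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for notebook in sorted(tutorials_dir.glob("*.ipynb")):
        target = output_dir / f"{notebook.stem}.nblink"
        # Same payload as the commit: a relative path to the notebook plus
        # the static media directory.
        nb_link = {
            "path": f"../../../tutorials/{notebook.name}",
            "extra-media": ["../../../tutorials/_static"],
        }
        with open(target, "w") as output_file:
            json.dump(nb_link, output_file)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    tutorials = root / "tutorials"
    tutorials.mkdir()
    # Placeholder notebook; the name is made up for this demo.
    (tutorials / "001 - Introduction.ipynb").write_text("{}")
    links = write_nblinks(tutorials, root / "docs" / "source" / "tutorials")
    demo_name = links[0].name
    demo_payload = json.loads(links[0].read_text())
```

Because the original runs inside `setup(app)`, the link files are regenerated on every Sphinx build, which keeps them in sync with the `tutorials` directory at the cost of writing into the source tree.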
