Commit 5736894

Merge branch 'master' into dev
2 parents bbdf425 + 7a5e1f0 commit 5736894

2 files changed: +28 -7 lines changed

README.md

Lines changed: 2 additions & 1 deletion
@@ -71,7 +71,8 @@ wr.db.to_sql(df, engine, schema="test", name="my_table")
 - [PyPi (pip)](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#pypi-pip)
 - [Conda](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#conda)
 - [AWS Lambda Layer](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#aws-lambda-layer)
-- [AWS Glue Wheel](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#aws-glue-wheel)
+- [AWS Glue Python Shell Jobs](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#aws-glue-python-shell-jobs)
+- [AWS Glue PySpark Jobs](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#aws-glue-pyspark-jobs)
 - [Amazon SageMaker Notebook](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#amazon-sagemaker-notebook)
 - [Amazon SageMaker Notebook Lifecycle](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#amazon-sagemaker-notebook-lifecycle)
 - [EMR](https://aws-data-wrangler.readthedocs.io/en/latest/install.html#emr)

docs/source/install.rst

Lines changed: 26 additions & 6 deletions
@@ -33,16 +33,36 @@ and press **create** to create the layer.
 
 4 - Go to your Lambda and select your new layer!
 
-AWS Glue Wheel
---------------
+AWS Glue Python Shell Jobs
+--------------------------
 
-.. note:: AWS Data Wrangler has compiled dependencies (C/C++) so there is only support for ``Glue Python Shell``, **not** for ``Glue PySpark``.
+1 - Go to `GitHub's release page <https://github.com/awslabs/aws-data-wrangler/releases>`_ and download the wheel file
+(.whl) related to the desired version.
 
-1 - Go to `GitHub's release page <https://github.com/awslabs/aws-data-wrangler/releases>`_ and download the wheel file (.whl) related to the desired version.
+2 - Upload the wheel file to any Amazon S3 location.
+
+3 - Go to your Glue Python Shell job and point to the wheel file on S3 in
+the *Python library path* field.
+
+
+`Official Glue Python Shell Reference <https://docs.aws.amazon.com/glue/latest/dg/add-job-python.html#create-python-extra-library>`_
+
+AWS Glue Python PySpark Jobs
+----------------------------
+
+.. note:: AWS Data Wrangler has compiled dependencies (C/C++) so there is only support for ``Glue PySpark Jobs >= 2.0``.
+
+1 - Go to `GitHub's release page <https://github.com/awslabs/aws-data-wrangler/releases>`_ and download the wheel file
+(.whl) related to the desired version.
 
 2 - Upload the wheel file to any Amazon S3 location.
 
-3 - Go to your Glue Python Shell job and point to the new file on S3.
+3 - Go to your Glue PySpark job and create a new *Job parameters* key/value:
+
+* Key: ``--additional-python-modules``
+* Value: ``s3://{BUCKET_NAME}/awswrangler-{VERSION}-py3-none-any.whl``
+
+`Official Glue PySpark Reference <https://docs.aws.amazon.com/glue/latest/dg/reduced-start-times-spark-etl-jobs.html#reduced-start-times-new-features>`_
 
 Amazon SageMaker Notebook
 -------------------------
@@ -123,7 +143,7 @@ complement Big Data pipelines.
 sudo pip-3.6 install awswrangler
 
 .. note:: Make sure to freeze the Wrangler version in the bootstrap for productive
-environments (e.g. awswrangler==1.0.0)
+environments (e.g. awswrangler==1.8.1)
 
 From Source
 -----------
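
Note on the new PySpark instructions: the *Job parameters* key/value added in ``install.rst`` can be sketched in Python as a small helper that fills in the ``{BUCKET_NAME}`` and ``{VERSION}`` placeholders. This is only an illustration, not part of the commit. The function name ``glue_wheel_job_argument`` and the bucket ``my-bucket`` are hypothetical, and the sketch assumes the wheel was uploaded to the bucket root under its release file name.

```python
def glue_wheel_job_argument(bucket: str, version: str) -> dict:
    """Build the Glue PySpark job parameter pointing at the uploaded wheel.

    Hypothetical helper: it mirrors the Key/Value pair from install.rst and
    assumes the wheel sits at the root of the given S3 bucket.
    """
    if not bucket or not version:
        raise ValueError("bucket and version must be non-empty")
    wheel_name = f"awswrangler-{version}-py3-none-any.whl"
    return {"--additional-python-modules": f"s3://{bucket}/{wheel_name}"}


# Example with a placeholder bucket and the version pinned in this commit.
job_parameters = glue_wheel_job_argument("my-bucket", "1.8.1")
for key, value in job_parameters.items():
    print(f"Key: {key}")
    print(f"Value: {value}")
```

The resulting dictionary has the same shape as the job parameters entered in the Glue console, so it could also be supplied as default arguments when a job is defined programmatically.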
