Commit 4f291ed

Update tutorials for 2.0.0
1 parent 0c11247 commit 4f291ed

7 files changed: +1049 −405 lines

tutorials/001 - Introduction.ipynb

Lines changed: 8 additions & 8 deletions
@@ -15,9 +15,9 @@
 "source": [
 "## What is AWS Data Wrangler?\n",
 "\n",
-"An [open-source](https://github.com/awslabs/aws-data-wrangler>) Python package that extends the power of [Pandas](https://github.com/pandas-dev/pandas>) library to AWS connecting **DataFrames** and AWS data related services (**Amazon Redshift**, **AWS Glue**, **Amazon Athena**, **Amazon EMR**, etc).\n",
+"An [open-source](https://github.com/awslabs/aws-data-wrangler) Python package that extends the power of the [Pandas](https://github.com/pandas-dev/pandas) library to AWS, connecting **DataFrames** and AWS data related services (**Amazon Redshift**, **AWS Glue**, **Amazon Athena**, **Amazon Timestream**, **Amazon EMR**, etc.).\n",
 "\n",
-"Built on top of other open-source projects like [Pandas](https://github.com/pandas-dev/pandas), [Apache Arrow](https://github.com/apache/arrow), [Boto3](https://github.com/boto/boto3), [SQLAlchemy](https://github.com/sqlalchemy/sqlalchemy), [Psycopg2](https://github.com/psycopg/psycopg2) and [PyMySQL](https://github.com/PyMySQL/PyMySQL), it offers abstracted functions to execute usual ETL tasks like load/unload data from **Data Lakes**, **Data Warehouses** and **Databases**.\n",
+"Built on top of other open-source projects like [Pandas](https://github.com/pandas-dev/pandas), [Apache Arrow](https://github.com/apache/arrow) and [Boto3](https://github.com/boto/boto3), it offers abstracted functions to execute usual ETL tasks like loading/unloading data from **Data Lakes**, **Data Warehouses** and **Databases**.\n",
 "\n",
 "Check our [list of functionalities](https://aws-data-wrangler.readthedocs.io/en/stable/api.html)."
 ]
@@ -70,16 +70,16 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 1,
 "metadata": {},
 "outputs": [
 {
 "data": {
 "text/plain": [
-"'1.9.0'"
+"'2.0.0'"
 ]
 },
-"execution_count": 2,
+"execution_count": 1,
 "metadata": {},
 "output_type": "execute_result"
 }
@@ -93,9 +93,9 @@
 ],
 "metadata": {
 "kernelspec": {
-"display_name": "Python 3",
+"display_name": "conda_python3",
 "language": "python",
-"name": "python3"
+"name": "conda_python3"
 },
 "language_info": {
 "codemirror_mode": {
@@ -107,7 +107,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.6.11"
+"version": "3.6.10"
 }
 },
 "nbformat": 4,

tutorials/007 - Redshift, MySQL, PostgreSQL.ipynb

Lines changed: 35 additions & 25 deletions
@@ -6,13 +6,16 @@
 "source": [
 "[![AWS Data Wrangler](_static/logo.png \"AWS Data Wrangler\")](https://github.com/awslabs/aws-data-wrangler)\n",
 "\n",
-"# 7 - Databases (Redshift, MySQL and PostgreSQL)\n",
+"# 7 - Redshift, MySQL and PostgreSQL\n",
 "\n",
-"[Wrangler](https://github.com/awslabs/aws-data-wrangler)'s Database module (`wr.db.*`) has two main functions that try to follow the Pandas conventions, but add more data type consistency.\n",
+"[Wrangler](https://github.com/awslabs/aws-data-wrangler)'s Redshift, MySQL and PostgreSQL modules have two basic functions in common that try to follow the Pandas conventions, but add more data type consistency.\n",
 "\n",
-"- [wr.db.to_sql()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.db.to_sql.html#awswrangler.db.to_sql)\n",
-"\n",
-"- [wr.db.read_sql_query()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.db.read_sql_query.html#awswrangler.db.read_sql_query)"
+"- [wr.redshift.to_sql()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.redshift.to_sql.html)\n",
+"- [wr.redshift.read_sql_query()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.redshift.read_sql_query.html)\n",
+"- [wr.mysql.to_sql()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.mysql.to_sql.html)\n",
+"- [wr.mysql.read_sql_query()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.mysql.read_sql_query.html)\n",
+"- [wr.postgresql.to_sql()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.postgresql.to_sql.html)\n",
+"- [wr.postgresql.read_sql_query()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.postgresql.read_sql_query.html)"
 ]
 },
 {
@@ -34,15 +37,11 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Creating an engine (SQLAlchemy Engine)\n",
-"\n",
-"The Wrangler offers basically three different ways to create a SQLAlchemy engine.\n",
-"\n",
-"1 - [wr.catalog.get_engine()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.catalog.get_engine.html#awswrangler.catalog.get_engine): Get the engine from a Glue Catalog Connection.\n",
+"## Connect through Glue Catalog Connections\n",
 "\n",
-"2 - [wr.db.get_engine()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.db.get_engine.html#awswrangler.db.get_engine): Get the engine from primitive values (host, user, password, etc).\n",
-"\n",
-"3 - [wr.db.get_redshift_temp_engine()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.db.get_redshift_temp_engine.html#awswrangler.db.get_redshift_temp_engine): Get a Redshift engine with temporary credentials."
+"- [wr.redshift.connect()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.redshift.connect.html)\n",
+"- [wr.mysql.connect()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.mysql.connect.html)\n",
+"- [wr.postgresql.connect()](https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.postgresql.connect.html)"
 ]
 },
 {
@@ -51,9 +50,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"eng_postgresql = wr.catalog.get_engine(\"aws-data-wrangler-postgresql\")\n",
-"eng_mysql = wr.catalog.get_engine(\"aws-data-wrangler-mysql\")\n",
-"eng_redshift = wr.catalog.get_engine(\"aws-data-wrangler-redshift\")"
+"con_redshift = wr.redshift.connect(\"aws-data-wrangler-redshift\")\n",
+"con_mysql = wr.mysql.connect(\"aws-data-wrangler-mysql\")\n",
+"con_postgresql = wr.postgresql.connect(\"aws-data-wrangler-postgresql\")"
 ]
 },
 {
@@ -72,13 +71,13 @@
 "name": "stdout",
 "output_type": "stream",
 "text": [
-"(1,)\n"
+"[1]\n"
 ]
 }
 ],
 "source": [
-"with eng_postgresql.connect() as con:\n",
-"    for row in con.execute(\"SELECT 1\"):\n",
+"with con_redshift.cursor() as cursor:\n",
+"    for row in cursor.execute(\"SELECT 1\"):\n",
 "        print(row)"
 ]
 },
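The hunk above shows the visible API shift of 2.0.0: SQLAlchemy engines are replaced by PEP 249 (DB-API) style connections, and the printed row changes from the tuple `(1,)` to the list `[1]` because the new Redshift driver represents rows as lists. The connect/cursor/iterate pattern itself is standard DB-API; a minimal sketch of it using the stdlib `sqlite3` driver, so it runs without AWS:

```python
# DB-API (PEP 249) connect/cursor/iterate pattern, sketched with sqlite3.
# Note: sqlite3 returns each row as a tuple, while the Redshift driver in
# the diff above prints it as a list ([1] instead of (1,)).
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()  # sqlite3 cursors are not context managers, unlike the one above
for row in cur.execute("SELECT 1"):  # execute() returns the cursor, which is iterable
    print(row)  # (1,)
cur.close()
con.close()
```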
@@ -95,9 +94,9 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"wr.db.to_sql(df, eng_postgresql, schema=\"public\", name=\"tutorial\", if_exists=\"replace\", index=False) # PostgreSQL\n",
-"wr.db.to_sql(df, eng_mysql, schema=\"test\", name=\"tutorial\", if_exists=\"replace\", index=False) # MySQL\n",
-"wr.db.to_sql(df, eng_redshift, schema=\"public\", name=\"tutorial\", if_exists=\"replace\", index=False) # Redshift"
+"wr.redshift.to_sql(df, con_redshift, schema=\"public\", table=\"tutorial\", mode=\"overwrite\")\n",
+"wr.mysql.to_sql(df, con_mysql, schema=\"test\", table=\"tutorial\", mode=\"overwrite\")\n",
+"wr.postgresql.to_sql(df, con_postgresql, schema=\"public\", table=\"tutorial\", mode=\"overwrite\")"
 ]
 },
 {
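In the hunk above, `mode="overwrite"` takes over the role of the old `if_exists="replace"`: any existing table is replaced before the load. A hedged sketch of that overwrite semantic in plain SQL with the stdlib `sqlite3` (the `to_sql_overwrite` helper, table name and columns are illustrative, not part of the tutorial):

```python
# "overwrite" mode sketched as DROP + CREATE + INSERT (hypothetical helper).
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

def to_sql_overwrite(rows, table):
    # overwrite: drop any existing table, recreate it, then load the rows
    cur.execute(f"DROP TABLE IF EXISTS {table}")
    cur.execute(f"CREATE TABLE {table} (id INTEGER, name TEXT)")
    cur.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    con.commit()

to_sql_overwrite([(1, "foo"), (2, "boo")], "tutorial")
to_sql_overwrite([(3, "bar")], "tutorial")  # replaces the previous contents
print(list(cur.execute("SELECT * FROM tutorial")))  # [(3, 'bar')]
```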
@@ -164,9 +163,20 @@
 }
 ],
 "source": [
-"wr.db.read_sql_query(\"SELECT * FROM public.tutorial\", con=eng_postgresql) # PostgreSQL\n",
-"wr.db.read_sql_query(\"SELECT * FROM test.tutorial\", con=eng_mysql) # MySQL\n",
-"wr.db.read_sql_query(\"SELECT * FROM public.tutorial\", con=eng_redshift) # Redshift"
+"wr.redshift.read_sql_query(\"SELECT * FROM public.tutorial\", con=con_redshift)\n",
+"wr.mysql.read_sql_query(\"SELECT * FROM test.tutorial\", con=con_mysql)\n",
+"wr.postgresql.read_sql_query(\"SELECT * FROM public.tutorial\", con=con_postgresql)"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 6,
+"metadata": {},
+"outputs": [],
+"source": [
+"con_redshift.close()\n",
+"con_mysql.close()\n",
+"con_postgresql.close()"
 ]
 }
 ],
