Releases: aws/aws-sdk-pandas
AWS Data Wrangler 1.9.0
Breaking changes
- Global configuration `s3fs_block_size` was replaced by `s3_block_size`. #370
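For illustration, a minimal sketch of adjusting the renamed option, assuming it is exposed on `wr.config` like the other global settings (the value below is purely illustrative, not a recommendation):

```python
import awswrangler as wr

# Tune the S3 block size (in bytes) used by the new built-in S3 I/O layer.
# 33_554_432 (32 MiB) is an illustrative value only.
wr.config.s3_block_size = 33_554_432
```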
New Functionalities
- Automatic recovery of Pandas indexes from Parquet files. #366
- Automatic recovery of Pandas time zones from Parquet files. #366
- Schema evolution can now optionally be disabled through the new `schema_evolution` argument. #353
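A minimal sketch of the new argument (the bucket, database and table names below are hypothetical):

```python
import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "value": ["foo", "boo"]})

# With schema_evolution=False, appending a DataFrame whose schema differs
# from the existing Glue table should fail instead of evolving the schema.
wr.s3.to_parquet(
    df=df,
    path="s3://my-bucket/my-dataset/",  # hypothetical path
    dataset=True,
    mode="append",
    database="my_db",                   # hypothetical Glue database
    table="my_table",                   # hypothetical Glue table
    schema_evolution=False,
)
```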
Enhancements
- `s3fs` dependency was replaced by built-in code. #370
- Significant Amazon S3 I/O speed-up for high-latency environments (e.g. local, on-premises). #370
Bug Fix
Docs
- A few updates.
Thanks
We thank the following contributors/users for their work on this release:
@isrsal, @bppont, @weishao-aws, @alexifm, @Digma, @samcon, @TerrellV, @msantino, @alvaropc, @luigift, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload them and run!
AWS Data Wrangler 1.8.1
Bug Fix
- Fix handling of NaN values for `wr.athena.read_sql_*()`. #351
Docs
- Instructions for installation in AWS Glue PySpark Jobs. #46
Thanks
We thank the following contributors/users for their work on this release:
@czagoni, @josecw, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel file are available below. Just upload them and run!
AWS Data Wrangler 1.8.0
New Functionalities
- `wr.s3.to_parquet()` now has a `max_rows_by_file` argument. #283 (example below)
- Support for Unix path pattern matching (`*`, `?`, `[seq]`, `[!seq]`) for any list/read/delete/copy function on S3. #322
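A minimal sketch combining both features (the bucket name and row counts are hypothetical):

```python
import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"id": range(1_000_000)})

# Split the output into multiple Parquet files of at most 100,000 rows each.
wr.s3.to_parquet(
    df=df,
    path="s3://my-bucket/dataset/",  # hypothetical path
    dataset=True,
    max_rows_by_file=100_000,
)

# Unix-style patterns are now accepted by the list/read/delete/copy functions.
df2 = wr.s3.read_parquet("s3://my-bucket/dataset/*.parquet")
```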
Enhancements
- Mypy applied in strict mode.
Bug Fix
- Fix unnecessary table version creation (Glue Catalog) for `wr.s3.to_parquet()` during appends. #342
- Fix lack of sanitisation in index names for `wr.s3.to_parquet()`/`wr.s3.to_csv()`. #343
Docs
- New "Who uses AWS Data Wrangler?" section!!!
Thanks
We thank the following contributors/users for their work on this release:
@Thiago-Dantas, @andre-marcos-perez, @ericct, @marcelo-vilela, @edvorkin, @nicholas-miles, @chrispruitt, @rparthas, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel file are available below. Just upload them and run!
AWS Data Wrangler 1.7.0
Breaking changes
- Partitioned Parquet reading now takes a different approach to push-down filters. For details, check the tutorial.
New Functionalities
- Global configuration module - TUTORIAL
- Concurrent partitions write - TUTORIAL
- Flexible partition filters (push-down) - TUTORIAL (example below)
- Add Athena query metadata to Pandas DataFrames returned by `wr.athena.read_sql_*()` - TUTORIAL #331
- `wr.athena.describe_table()` #329
- `wr.athena.show_create_table()` #334
- Add `path_ignore_suffix` argument to all read functions. #326
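For example, the push-down partition filter referenced above might look like this (a minimal sketch; the path, partition names and the query_metadata attribute access are assumptions based on the linked tutorials):

```python
import awswrangler as wr

# The callable receives each partition as a dict of string values and
# returns True only for the partitions that should actually be read.
df = wr.s3.read_parquet(
    path="s3://my-bucket/dataset/",  # hypothetical path
    dataset=True,
    partition_filter=lambda x: x["year"] == "2020" and x["month"] in ("01", "02"),
)

# Athena query metadata now travels with the returned DataFrame
# (attribute name assumed from the release notes/tutorial).
df2 = wr.athena.read_sql_query("SELECT 1 AS col", database="my_db")  # hypothetical database
print(df2.query_metadata)
```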
Enhancements
- Support for PyArrow 1.0.0. #337
- Support for Pandas 1.1.0.
- Support writing encrypted Redshift COPY manifests to S3. #327
- `wr.athena.read_sql_*()` now accepts empty results. #299
- Allow `connect_args` to be passed when creating an SQL engine from a Glue connection. #309
- Add `skip_header_line_count` argument to `wr.catalog.create_csv_table()`. #338
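A minimal sketch of the new `skip_header_line_count` argument (database, table, path and column names are hypothetical):

```python
import awswrangler as wr

# Register an external CSV table whose files start with one header line,
# so Athena skips that line when querying.
wr.catalog.create_csv_table(
    database="my_db",                    # hypothetical
    table="my_csv_table",                # hypothetical
    path="s3://my-bucket/csv-dataset/",  # hypothetical
    columns_types={"id": "int", "name": "string"},
    skip_header_line_count=1,
)
```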
Bug Fix
- Add missing type annotations and fix types in docstrings. #321
- Fix `KeyError: 'StatementType'` with Athena when using `max_cache_seconds`. #323
- Fix `wr.s3.read_csv()` being slow with `chunksize`. #324
- Fix `wr.s3.read_csv()` with `chunksize` not forwarding the `pandas_kwargs` `encoding`. #330
- Ensure DataFrame mutability for `wr.athena.read_sql_*()` with `ctas_approach=True`. #335
Docs
- Several small updates.
Thanks
We thank the following contributors/users for their work on this release:
@kylepierce, @davidszotten, @meganburger, @erikcw, @JPFrancoia, @zacharycarter, @DavideBossoli88, @c-line, @anand086, @jasadams, @mrtns, @schot, @koiker, @flaviomax, @bryanyang0528, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel file are available below. Just upload them and run!
AWS Data Wrangler 1.6.3
New Functionalities
- Add `wr.catalog.get_partitions()`. #305
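A minimal sketch (database and table names are hypothetical); the function appears to return a mapping from each partition's S3 location to its partition values:

```python
import awswrangler as wr

partitions = wr.catalog.get_partitions(database="my_db", table="my_table")
# Expected shape (assumption): {"s3://bucket/path/year=2020/": ["2020"], ...}
for location, values in partitions.items():
    print(location, values)
```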
Enhancements
- Improved Decimal casting.
Bug Fix
- Fix support for boto3 >= 1.14.18. 🐞 #315
Docs
- Add Spark Table Interoperability tutorial.
- General small updates.
Thanks
We thank the following contributors/users for their work on this release:
@jasadams, @bryanyang0528, @qemtek, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel file are available below. Just upload them and run!
AWS Data Wrangler 1.6.2
Enhancements
- Now casting columns before appending to an existing table only when necessary (`wr.s3.to_parquet()`).
- Add a retry mechanism for InternalError on S3 object deletion.
- Add handling of immutable NumPy arrays (`flags.writeable == False`).
P.S. Lambda Layer's zip-file and Glue's wheel/egg are available below. Just upload them and run!
P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no Glue PySpark support for now (only Glue Python Shell).
AWS Data Wrangler 1.6.1
Enhancements
- Support casting any column type to string using the `dtype` argument on `wr.s3.to_parquet()`.
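A minimal sketch of casting on write (the path and column names are hypothetical):

```python
import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "created_at": pd.to_datetime(["2020-08-01"] * 3)})

# Force both columns to be stored as strings, regardless of their Pandas dtypes.
wr.s3.to_parquet(
    df=df,
    path="s3://my-bucket/dataset/",  # hypothetical path
    dataset=True,
    dtype={"id": "string", "created_at": "string"},
)
```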
Bug Fix
- Fix general bugs related to the Athena cache. 🐞
Docs
- General small updates.
P.S. Lambda Layer's zip-file and Glue's wheel/egg are available below. Just upload them and run!
P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no Glue PySpark support for now (only Glue Python Shell).
AWS Data Wrangler 1.6.0
New Functionalities
- Amazon Athena CACHE 🚀 #285 (example below)
- Initial AWS STS module
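For example, the Athena cache mentioned above can be enabled per call (a minimal sketch; the query and database name are hypothetical):

```python
import awswrangler as wr

# If the exact same query succeeded within the last 900 seconds,
# its results are reused instead of re-running the query on Athena.
df = wr.athena.read_sql_query(
    sql="SELECT * FROM my_table LIMIT 10",
    database="my_db",  # hypothetical
    max_cache_seconds=900,
)
```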
Enhancements
- Support for NumPy 1.19.0.
- Add `auto_create` and `db_groups` arguments to `get_redshift_temp_engine()`. #288
- Add `validate_schema` argument to `wr.s3.read_parquet_table()`.
- Add `safe` argument to `read_parquet()`. #296
- Refactor naming of Pandas kwargs. #291
- Allow providing a suffix to `s3.store_parquet_metadata()`. #295
- Add `last_modified_begin` and `last_modified_end` arguments to `list_objects`, `read_csv`, `read_json`, `read_fwf` and `read_parquet`.
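A minimal sketch of filtering by object modification time (path and dates are hypothetical; the bounds should be timezone-aware datetimes):

```python
from datetime import datetime, timezone

import awswrangler as wr

# Only read objects last modified inside the given window.
df = wr.s3.read_csv(
    path="s3://my-bucket/csv-dataset/",  # hypothetical path
    last_modified_begin=datetime(2020, 6, 1, tzinfo=timezone.utc),
    last_modified_end=datetime(2020, 7, 1, tzinfo=timezone.utc),
)
```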
Bug Fix
- Fix bug in `get_table_description()` for tables without a description. #294
Docs
- Add Athena cache tutorial.
Thanks
We thank the following contributors/users for their work on this release:
@koiker, @patrick-muller, @flaviomax, @acere, @jarretg, @bryanyang0528, @schrobot, @kinghuang, @igorborgest.
P.S. Lambda Layer's zip-file and Glue's wheel/egg are available below. Just upload them and run!
P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no Glue PySpark support for now (only Glue Python Shell).
AWS Data Wrangler 1.5.0
New Functionalities
- Amazon QuickSight support! 🎉
- Add create/delete database operations for the Glue Catalog.
Enhancements
- General improvements in the tutorials
- New Amazon S3 path check
- Add `sanitize_columns` argument to `s3.to_parquet` and `s3.to_csv`. #278 #279 (example below)
- Avoid an in-memory copy of the DataFrame in `to_parquet` and `to_csv`.
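A minimal sketch of the new `sanitize_columns` argument (path and column name are hypothetical, and the exact normalization rules may vary between releases):

```python
import awswrangler as wr
import pandas as pd

df = pd.DataFrame({"Camel Case Column": [1, 2]})

# With sanitize_columns=True the column name is normalized
# (e.g. to something like "camel_case_column") before writing.
wr.s3.to_csv(
    df=df,
    path="s3://my-bucket/dataset/file.csv",  # hypothetical path
    sanitize_columns=True,
)
```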
Bug Fix
- Force `index=False` for `wr.db.to_sql()` with Redshift.
Thanks
We thank the following contributors/users for their work on this release:
@ywang103, @patrick-muller, @tuliocasagrande, @sarojdongol, @sdknij, @ilyanoskov, @igorborgest.
P.S. Lambda Layer's zip-file and Glue's wheel/egg are available below. Just upload them and run!
P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no Glue PySpark support for now (only Glue Python Shell).
AWS Data Wrangler 1.4.0
New Functionalities
- Add support for reading CSV, JSON and FWF partitions. #265
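A minimal sketch of reading a partitioned CSV dataset (path is hypothetical); with `dataset=True` the partition columns encoded in the key prefixes are recovered into the DataFrame:

```python
import awswrangler as wr

# Hive-style partitions (e.g. .../year=2020/month=01/file.csv)
# come back as regular "year"/"month" columns.
df = wr.s3.read_csv(
    path="s3://my-bucket/csv-dataset/",  # hypothetical path
    dataset=True,
)
```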
Enhancements
- General improvement of moto tests
Bug Fix
- Fix `encoding` argument support for reading CSV, JSON and FWF. #271
Thanks
We thank the following contributors/users for their work on this release:
@bryanyang0528, @dwbelliston, @patrick-muller, @sdknij, @igorborgest.
P.S. Lambda Layer's zip-file and Glue's wheel/egg are available below. Just upload them and run!
P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no Glue PySpark support for now (only Glue Python Shell).