v0.2.0

@alxmrs alxmrs released this 28 Jan 22:29
· 236 commits to main since this release
2d3df59

New version of weather-sp. Fixes and improvements to weather-dl and weather-mv.

Thanks to our volunteer open source contributors and Google 20%ers!

Current State

All three tools are still in their alpha and beta stages. This release especially improves the stability of weather-mv: we've been able to execute streaming ingestion of Grib data into BigQuery. Users of weather-sp now have greater control over the output location of split files through a file pattern template.

weather-dl: Minor fixes

  • Fixed the GCS timeout issues that pipelines experienced intermittently.
  • Fixed an issue with mandatory partition keys.

weather-mv: Major fixes for tool stability

  • Grib support added.
  • Row extraction is faster because weather data is now loaded into memory.
  • Log messages were improved.
  • Writes to BigQuery will use the most efficient method (streaming vs file upload).
  • The XArray open step was made generic: users can pass keyword arguments to xarray.open_dataset.
  • Several fixes were introduced.
    • JSON serialization fixes.
    • The Dataflow environment now has ecCodes installed so we can run cfgrib.
    • Tarballs are smaller / faster to upload to Dataflow (or another Beam runner).
    • BigQuery write errors were fixed.
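
To illustrate the kind of JSON serialization fix mentioned above, here is a minimal sketch (not the project's actual code) of coercing numpy scalar types and timedeltas into JSON-friendly values. The function name `to_json_serializable` and the sample row are hypothetical, and expressing timedeltas as seconds is an assumption made for the example.

```python
import json

import numpy as np


def to_json_serializable(value):
    """Fallback for json.dumps: convert numpy values to plain Python types.

    Hypothetical helper for illustration; not taken from weather-mv.
    """
    # Timedeltas are checked first, since np.timedelta64 is also an np.generic.
    # Assumption: a timedelta is represented as a float number of seconds.
    if isinstance(value, np.timedelta64):
        return float(value / np.timedelta64(1, "s"))
    # Covers numpy scalars such as np.float32 and np.int64.
    if isinstance(value, np.generic):
        return value.item()
    if isinstance(value, np.ndarray):
        return value.tolist()
    raise TypeError(f"Cannot serialize value of type {type(value).__name__}")


# A made-up row, similar in spirit to one extracted from a weather dataset.
row = {
    "temperature": np.float32(1.5),
    "level": np.int64(3),
    "step": np.timedelta64(90, "m"),
}
encoded = json.dumps(row, default=to_json_serializable)
```

Without the `default` hook, `json.dumps` raises a `TypeError` for values like `np.float32` and `np.int64`, because they are not subclasses of Python's built-in numeric types.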

weather-sp: New version

The splitter now supports flexible specification of output files.
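
As a rough illustration of what a file pattern template can look like, the sketch below fills placeholders in an output path using Python's `str.format`. The placeholder names (`level`, `variable`), the bucket path, and the helper `render_output_path` are all hypothetical; the template syntax that weather-sp actually accepts is described in its own documentation.

```python
def render_output_path(template: str, **split_keys: str) -> str:
    """Fill a path template with the keys a file was split on.

    Hypothetical helper for illustration; not the weather-sp implementation.
    """
    return template.format(**split_keys)


# Assumed template and keys, chosen only to show the idea.
path = render_output_path(
    "gs://example-bucket/splits/{level}/{variable}.grib",
    level="500",
    variable="temperature",
)
```

A template like this lets the caller decide how split files are grouped into directories, instead of relying on a fixed naming scheme.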

General project improvements

  • Documentation was groomed.
  • Windows developer pathway was documented.
  • Fixed the developer scripts (we can now better dev-test different branches of the project) and slow CI test runs.
  • Announced open developer meetings.

What's Changed

  • weather-dl: Fix GCS timeout issues the pipelines intermittently experiences. by @alxmrs in #72
  • Improve grib file processing speed by @pramodg in #74
  • Default behavior is better by @lakshmanok in #77
  • Updating script to use new package name by @CillianFn in #79
  • Better progress logs for weather-mv. by @alxmrs in #82
  • weather-mv fix: Serializing all numpy float and int types to JSON. by @alxmrs in #83
  • Documented windows workaround. by @alxmrs in #85
  • Updated weather-mv install process to setup ecCodes on worker machine. by @alxmrs in #86
  • Groomed documentation by @alxmrs in #88
  • Coercing timedelta to float by @alxmrs in #89
  • weather-mv: Allow users to pass in keyword arguments to xarray.open_dataset by @alxmrs in #87
  • weather-splitter: allow for more flexible output files by @uhager in #65
  • Fix slow test runs by @CillianFn in #92
  • Add check for partition_keys when using append_date_dirs by @CillianFn in #90
  • Exclude test data from tarball by @CillianFn in #93
  • weather-mv – Fixed error writing to BigQuery: Excluding non-coordinate indexes if they don't appear in the Schema by @alxmrs in #95
  • Updating tool versions in prep for release. by @alxmrs in #97
  • Announcing open developer meetings. by @alxmrs in #96

New Contributors

Full Changelog: v0.1.1...v0.2.0