- Fixed CI/CD release workflow.
- Restored the `pip-tools` dependency in `requirements/base.txt`, because it is required for the `bf build` command to work properly.
- Limited the `wheel` version, as 0.46.1 is broken.
- Upgraded test dockerfiles to Python 3.8.
- Made the fallbacks for failed DAG imports more robust.
- Support for composer-2.10.1-airflow-2.9.3
- Support for development on Apple M3 processors
- Support for Python 3.10
- BigQuery query job labeling for collect and write operations. Labels are passed via the `job_labels` dict argument in `DatasetConfiguration` and `DatasetManager`.
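BigQuery restricts label keys and values to lowercase letters, digits, underscores, and hyphens. The sketch below only illustrates assembling and validating such a labels dict before handing it to a dataset configuration; the label names and the validator are hypothetical, not BigFlow's API:

```python
import re

# BigQuery label keys/values: lowercase letters, digits, '_' and '-', max 63 chars.
_LABEL_RE = re.compile(r"^[a-z][a-z0-9_-]{0,62}$")

def validate_job_labels(labels: dict) -> dict:
    """Return the labels dict unchanged if every key and value is BigQuery-safe."""
    for key, value in labels.items():
        if not _LABEL_RE.match(key) or not _LABEL_RE.match(value):
            raise ValueError(f"invalid BigQuery label: {key}={value}")
    return labels

# Hypothetical labels identifying the workflow that issued the query.
job_labels = validate_job_labels({"team": "data-platform", "workflow": "daily-report"})
```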
- Switched from Google Container Registry to Artifact Registry. Made `-r`/`--docker-repository` common to all deploy commands. Build and deploy commands authenticate to the Docker repository taken from `deployment_config.py` or CLI arguments, instead of the hardcoded `https://eu.gcr.io`.
- Bumped basic dependencies: Apache Beam 2.48.0, google-cloud-bigtable 2.17.0, google-cloud-language 2.10.0, google-cloud-storage 2.11.2, among others (#374).
- Added the `env_variable` argument to `bigflow.Workflow`, which allows changing the name of the variable used to obtain the environment name (#365).
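As an illustration of the mechanism only (not BigFlow's actual implementation), resolving the environment name from a configurable variable can look like this; the variable names here are hypothetical:

```python
import os

DEFAULT_ENV_VARIABLE = "bf_env"  # hypothetical default variable name

def resolve_environment(env_variable: str = DEFAULT_ENV_VARIABLE) -> str:
    """Read the environment name from the variable named by `env_variable`."""
    try:
        return os.environ[env_variable]
    except KeyError:
        raise ValueError(f"environment variable {env_variable!r} is not set")

# A workflow configured with a custom variable name reads from that variable:
os.environ["MY_PROJECT_ENV"] = "dev"
print(resolve_environment("MY_PROJECT_ENV"))  # -> dev
```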
- Fixed compatibility issues with Cloud Composer 2.x and Airflow 2.x
- Cloud Composer 2.0.x is not properly supported; please use either 1.x or 2.1+
- BigFlow CLI commands won't fail on additional unknown parameters. This allows passing additional parameters to BigFlow jobs.
- Bumped dependencies of main libraries (e.g. Apache Beam to version 2.45 and BigQuery to version 3.6.0). This enables compatibility with MacBooks with M1 processors.
- Requires Python version = 3.8
- Enabled vault endpoint TLS certificate verification by default for the `bf build` and `bf deploy` commands. This fixes a MITM vulnerability. Kudos to Konstantin Weddige for reporting.
- Default vault endpoint TLS certificate verification for `bf build` and `bf deploy` may fail in some environments. Use the `-vev`/`--vault-endpoint-verify` option to disable verification or to provide a path to custom trusted certificates or CA certificates. Disabling verification makes execution vulnerable to MITM attacks and is discouraged; do it only when justified and in trusted environments. See https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification for details.
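The option maps onto the usual `verify` setting of HTTPS clients: `True` for default CA verification, `False` to disable, or a path to a CA bundle. A stdlib sketch of the three modes (the function and its name are illustrative, not BigFlow's code):

```python
import ssl
from typing import Union

def make_ssl_context(verify: Union[bool, str] = True) -> ssl.SSLContext:
    """Build an SSL context mirroring a --vault-endpoint-verify style option."""
    if verify is False:
        # Disabled verification: vulnerable to MITM, use only in trusted environments.
        ctx = ssl.create_default_context()
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    if isinstance(verify, str):
        # Path to custom trusted certificates / CA bundle.
        return ssl.create_default_context(cafile=verify)
    return ssl.create_default_context()  # default: full verification
```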
- Added two more parameters to `KubernetesPodOperator` that are required since Composer 2.1.0
- MarkupSafe bumped to >2.1.0 (avoiding the broken 2.1.0 version)
- Jinja version bumped to >=3,<4
- Fixing the DAG builder issue introduced in 1.5.1 – now it produces DAGs compatible with (airflow 1.x + composer 1.x) or (airflow 2.x + composer 2.x)
Broken! – DAG builder produces DAGs incompatible with (airflow 1.x + composer 1.x). Fixed in 1.5.2.
- Composer 2.0 support – using the `composer-user-workloads` namespace inside generated DAGs if running on Composer 2.X, to fix the problem with inheriting the Composer service account
- Setting grpcio-status to <=1.48.2 to avoid pip-compile problems with the protobuf package
- Changing the docker image caching implementation – using BUILDKIT_INLINE_CACHE=1 only if cache properties are set
- Always installing typing-extensions>=3.7 to avoid clashes
- Deprecated the `log` and `dataproc` extras
- The `base_frozen` extras with frozen base requirements
- More type hints
- Making exporting the image to tar optional
- `bf build` arguments validation
- Fixed the broken MarkupSafe package version
- Optional support for 'pytest' as testing framework
- Labels support for datasets and tables in `DatasetConfig` and `DatasetManager`
- Check if docker image was pushed before deploying airflow dags
- Propagate 'env' to bigflow jobs
- Automatically create and push the `:latest` docker tag
- Changes in the build toolchain – optimizations, better logging, optional `setup.py`
- Tool for syncing requirements with Dataflow's preinstalled Python dependencies.
- Schema for 'dirty' versions (not on a git tag, or with local changes) was changed. It now includes the git commit and a git workdir hash instead of a random suffix.
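The exact format BigFlow emits is not reproduced here; the sketch below only illustrates the idea of a deterministic dirty suffix built from the commit id and a hash of the working-tree changes (the function, separators, and hash choice are all hypothetical):

```python
import hashlib

def dirty_version(base: str, commit: str, workdir_diff: bytes) -> str:
    """Deterministic version for a dirty checkout: base + commit + diff hash."""
    workdir_hash = hashlib.sha1(workdir_diff).hexdigest()[:8]
    return f"{base}+{commit[:8]}.dirty{workdir_hash}"

# The same local changes always yield the same version (unlike a random suffix):
version = dirty_version("1.2.0", "9fceb02d0ae598e95dc970b74767f19372d61af8", b"diff --git ...")
```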
- Don't delete intermediate docker layers after build.
- Dockerfile template was changed (does not affect existing projects).
- Integration with `pip-tools`
- Configurable timeout for jobs: `Job.execution_timeout`
- Flag `Job.depends_on_past` to ignore errors from the previous job
- Tools/base classes to simplify writing e2e tests for BigQuery
- `pyproject.toml` added to new projects
- Project configuration moved to `setup.py`
- The same `setup.py` is used by Beam to build tarballs
- Deprecated some functions in `bigflow.resources` and `bigflow.commons`
- BigFlow uses version ranges for dependencies
- Dataflow jobs start much faster
- Airflow task waits until Dataflow job is finished
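Waiting for a Dataflow job amounts to polling its state until it reaches a terminal status. A minimal generic sketch of that loop (the poller and its wiring are hypothetical; only the terminal-state idea is taken from Dataflow):

```python
import time
from typing import Callable

# Dataflow-style terminal states after which polling can stop.
TERMINAL_STATES = {"JOB_STATE_DONE", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}

def wait_for_job(get_state: Callable[[], str], poll_interval: float = 10.0) -> str:
    """Poll `get_state` until the job reaches a terminal state, then return it."""
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_interval)

# Simulated job that finishes on the third poll:
states = iter(["JOB_STATE_PENDING", "JOB_STATE_RUNNING", "JOB_STATE_DONE"])
final = wait_for_job(lambda: next(states), poll_interval=0.0)
```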
- Fixed `Job.retry_count` and `Job.retry_pause_sec`
- Initial release