This repository was archived by the owner on Aug 20, 2025. It is now read-only.
51 changes: 51 additions & 0 deletions proposals/20220919-airflow_orchestration.md
@@ -0,0 +1,51 @@
#### SIG TFX-Addons
# Project Proposal

**Your name:** Varun Murthy

**Your email:** murthyvs@google.com

**Your company/organization:** Google Core ML TFX Team (MTV + Seoul)

**Project name:** Airflow Orchestration

## Project Description
Moving Airflow to TFX add-ons due to decreasing native support.
Contributor

@rcrowe-google rcrowe-google Sep 20, 2022

Please describe what the results of the project will be, and a short introduction to what Airflow is and how it's used. You can also discuss the transition from having Airflow support included in TFX directly, versus having it as a separate install.

Author

It can be installed and used as a separate package under tfxa/.
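
During the transition discussed in this thread, downstream code could use an import shim that prefers the separately installed package and falls back to the in-tree module. The sketch below is only an illustration under stated assumptions: the add-on module path `tfx_addons.airflow_orchestration.airflow_dag_runner` is hypothetical (the proposal does not settle on a package name), and `first_importable` is a helper invented for this example.

```python
import importlib
from types import ModuleType
from typing import Optional, Sequence


def first_importable(candidates: Sequence[str]) -> Optional[ModuleType]:
    """Return the first module in `candidates` that imports cleanly, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Not installed, or one of its own imports is missing; try the next.
            continue
    return None


# Assumed add-on location first, then the current in-tree location.
AIRFLOW_RUNNER_CANDIDATES = (
    "tfx_addons.airflow_orchestration.airflow_dag_runner",  # hypothetical name
    "tfx.orchestration.airflow.airflow_dag_runner",
)

if __name__ == "__main__":
    runner_module = first_importable(AIRFLOW_RUNNER_CANDIDATES)
    print(runner_module.__name__ if runner_module else "no Airflow runner installed")
```

A shim like this lets pipeline definitions keep working across the move, at the cost of hiding which copy is in use, so it would likely be a temporary measure.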


## Project Category
Orchestrator

## Project Use-Case(s)
We are moving Airflow from tfx/orchestration to tfx-addons. Native support for Airflow in TFX will be dropped in the near future.
Contributor

A use-case describes how the project will be used by developers. In this case it should be something more like:

Apache Airflow is a widely used orchestrator for ML workflows. Developers can use Airflow with TFX to orchestrate TFX pipelines. This project continues support for Airflow, including the web console and CLI tooling that developers can use to monitor and control their TFX pipelines.

Contributor

Imagine that you're a developer who is new to TFX. What do you need to know about Airflow, and how it fits with TFX? Why should you consider installing this module?


## Project Implementation
1. Copy tfx/orchestration/airflow to tfx-addons/airflow.
Contributor

@rcrowe-google rcrowe-google Sep 21, 2022

  1. Please describe how the install will work
  2. Please discuss any issues with testing in CI, given that this code will be independent of TFX
  3. Please describe any restructuring of the code, such as adding a setup script and tests

Author

@murthyvs murthyvs Sep 21, 2022

So ... this is where it gets a bit interesting because the tutorials on the TF website need to reflect the deprecation too. It's a multi-part effort in which we need to first move the orchestrator to TFXA and THEN update the documentation, delete/remove dependencies from tfx/. We can't provide an accurate, line-by-line description as of today ... but we need to create a project in TFXA to get started.

Contributor

That's ok, just describe the issues and some options for dealing with them, and note that we haven't settled on a particular option yet. At some point however, we will need a solid plan to do this so that we can be successful.

2. Mark Airflow as deprecated in tfx/ and indicate that support will be dropped in 1-2 releases.
3. Update the TFX tutorials on www.tensorflow.org to indicate that Airflow support is deprecated and is moving to TFXA.
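
The deprecation step in the list above could be sketched with Python's standard `warnings` module. This is a generic illustration, not TFX's actual deprecation mechanism: the helper name `warn_airflow_deprecated` and the message wording are assumptions, and the "1-2 releases" default simply echoes the proposal text.

```python
import warnings


def warn_airflow_deprecated(removal_hint: str = "1-2 releases") -> None:
    """Emit a DeprecationWarning that points users at TFX-Addons."""
    warnings.warn(
        "Airflow orchestration in tfx/ is deprecated and is moving to "
        f"TFX-Addons; in-tree support is expected to be dropped in {removal_hint}.",
        category=DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this helper
    )


if __name__ == "__main__":
    # Show the warning even though DeprecationWarning is often filtered out.
    warnings.simplefilter("always", DeprecationWarning)
    warn_airflow_deprecated()
```

A call like this would typically sit at import time or in the runner's constructor so that every user of the in-tree module sees it once per call site.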

## Project Dependencies
1. tfx/dsl/components/base:base_node
2. tfx/dsl/components/base:base_component
3. tfx/dsl/components/base:base_executor
4. tfx/dsl/components/base:executor_spec
5. tfx/orchestration:data_types
6. tfx/orchestration:metadata
7. tfx/orchestration/config:base_component_config
8. tfx/orchestration/launcher:base_component_launcher
9. tfx/orchestration:pipeline
10. tfx/orchestration:tfx_runner
11. tfx/orchestration/config:config_utils
12. tfx/orchestration/config:pipeline_config
13. tfx/utils:json_utils
14. tfx/utils:telemetry_utils
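
Because the add-on will be tested independently of TFX, a CI job might first confirm that each dependency listed above still resolves in the installed TFX version. The sketch below assumes that each `path:target` entry maps to the dotted Python module `path.target`; that mapping and the helper names are assumptions made for illustration, not part of the proposal.

```python
import importlib.util
from typing import Dict, Iterable

TFX_DEPENDENCY_LABELS = (
    "tfx/dsl/components/base:base_node",
    "tfx/dsl/components/base:base_component",
    "tfx/dsl/components/base:base_executor",
    "tfx/dsl/components/base:executor_spec",
    "tfx/orchestration:data_types",
    "tfx/orchestration:metadata",
    "tfx/orchestration/config:base_component_config",
    "tfx/orchestration/launcher:base_component_launcher",
    "tfx/orchestration:pipeline",
    "tfx/orchestration:tfx_runner",
    "tfx/orchestration/config:config_utils",
    "tfx/orchestration/config:pipeline_config",
    "tfx/utils:json_utils",
    "tfx/utils:telemetry_utils",
)


def label_to_module(label: str) -> str:
    """Convert 'pkg/path:target' into the dotted name 'pkg.path.target'."""
    package_path, _, target = label.partition(":")
    parts = [part for part in package_path.split("/") if part]
    if target:
        parts.append(target)
    return ".".join(parts)


def is_importable(module_name: str) -> bool:
    """Report whether the module can be located without importing it fully."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages; a missing parent lands here.
        return False


def check_dependencies(labels: Iterable[str]) -> Dict[str, bool]:
    """Map each dotted module name to whether it resolves in this environment."""
    modules = [label_to_module(label) for label in labels]
    return {name: is_importable(name) for name in modules}


if __name__ == "__main__":
    for module_name, found in check_dependencies(TFX_DEPENDENCY_LABELS).items():
        print(f"{'ok' if found else 'MISSING':8s}{module_name}")
```

Running the check against each supported TFX release would flag a renamed or removed module before the add-on's own tests fail in less obvious ways.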

## Project Team
Varun Murthy (murthyvs@google.com)
Contributor

Thanks, and we also need your GitHub ID.

Google Core ML TFX Team (tfx-team@google.com)

# Note
Please be aware of the processes and requirements which are outlined here:

* [SIG-TFX-Addons](https://github.com/tensorflow/tfx-addons)
* [Contributing Guidelines](https://github.com/tensorflow/tfx-addons/blob/main/CONTRIBUTING.md)
* [TensorFlow Code of Conduct](https://github.com/tensorflow/tfx-addons/blob/main/CODE_OF_CONDUCT.md)