What would you like to happen?
Full context here - https://docs.google.com/document/d/1JcVFJsPbVvtvaYdGi-DzWy9PIIYJhL7LwWGEXt2NZMk/edit?tab=t.0 - we should add containers for running Beam ML jobs.
Steps:
- Create requirements files for these containers. These should be auto-generated like we do for other containers today - https://github.com/apache/beam/blob/master/sdks/python/container/run_generate_requirements.sh - Add logic for generating ml requirements #35484
- Automatically stage and publish CPU ML containers alongside our existing containers. This should be able to mostly reuse the existing Dockerfile with a different set of requirements - Start pushing ml containers #35595
- Automatically stage and publish GPU ML containers alongside our existing containers. This will require a different Dockerfile and likely will require building in a GPU environment (real or simulated)
- Allow users to specify --sdk_container_image=ml or --sdk_container_image=gpu and auto-resolve to the correct container
- Add tests to run continuously against these built containers
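The alias resolution in the step above could be sketched as follows. This is a minimal illustration, not Beam's implementation: the alias names ("ml", "gpu"), the image repository paths, and the helper name are all assumptions; only the --sdk_container_image flag name comes from the issue.

```python
# Hypothetical sketch of auto-resolving short aliases passed to
# --sdk_container_image into full container image URLs.
# The alias keys and image paths below are assumed, not confirmed.
BEAM_VERSION = "2.60.0"  # placeholder; the real value would come from the SDK

CONTAINER_ALIASES = {
    "ml": f"apache/beam_python3.11_sdk_ml:{BEAM_VERSION}",
    "gpu": f"apache/beam_python3.11_sdk_ml_gpu:{BEAM_VERSION}",
}

def resolve_sdk_container_image(value: str) -> str:
    """Expand a known alias to a full image URL; pass anything else through."""
    return CONTAINER_ALIASES.get(value, value)
```

A full image reference such as gcr.io/my-project/my-image:tag would pass through unchanged, so existing users of the flag are unaffected.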
Issue Priority
Priority: 2 (default / most feature requests should be filed as P2)
Issue Components
- Component: Python SDK
- Component: Java SDK
- Component: Go SDK
- Component: Typescript SDK
- Component: IO connector
- Component: Beam YAML
- Component: Beam examples
- Component: Beam playground
- Component: Beam katas
- Component: Website
- Component: Infrastructure
- Component: Spark Runner
- Component: Flink Runner
- Component: Samza Runner
- Component: Twister2 Runner
- Component: Hazelcast Jet Runner
- Component: Google Cloud Dataflow Runner