Commit d4b841c

claudevdm and Claude authored
Update changes.md with pickler changes. (#36558)
* Update changes.md with pickler changes.
* Remove trailing whitespace.

---------

Co-authored-by: Claude <[email protected]>
1 parent 2b666da commit d4b841c

1 file changed: +16 -0 lines changed
CHANGES.md

Lines changed: 16 additions & 0 deletions
@@ -116,6 +116,22 @@
 ## Breaking Changes

 * X behavior was changed ([#X](https://github.com/apache/beam/issues/X)).
+* (Python) `dill` is no longer a required, default dependency for Apache Beam ([#21298](https://github.com/apache/beam/issues/21298)).
+  - This change only affects pipelines that explicitly use the `pickle_library=dill` pipeline option.
+  - While `dill==0.3.1.1` is still pre-installed on the official Beam SDK base images, it is no longer a direct dependency of the apache-beam Python package. This means it can be overridden by other dependencies in your environment.
+  - If your pipeline uses `pickle_library=dill`, you must manually ensure `dill==0.3.1.1` is installed in both your submission and runtime environments.
+    - Submission environment: Install the dill extra in your local environment: `pip install apache-beam[gcp,dill]`.
+    - Runtime (worker) environment: Your action depends on how you manage your worker's environment.
+      - If using default containers or custom containers with the official Beam base image, e.g. `FROM apache/beam_python3.10_sdk:2.69`:
+        - Add `dill==0.3.1.1` to your worker's requirements file (e.g., requirements.txt).
+        - Pass this file to your pipeline using the `--requirements_file requirements.txt` pipeline option (for more details, see [managing Dataflow dependencies](https://cloud.google.com/dataflow/docs/guides/manage-dependencies#py-custom-containers)).
+      - If using custom containers with a non-Beam base image, e.g. `FROM python:3.9-slim`:
+        - Install apache-beam with the dill extra in your Dockerfile, e.g. `RUN pip install --no-cache-dir apache-beam[gcp,dill]`.
+  - If there is a dill version mismatch between the submission and runtime environments, you might encounter unpickling errors like `Can't get attribute '_create_code' on <module 'dill._dill' from...`.
+  - If dill is not installed in the runtime environment, you will see the error `ImportError: Pipeline option pickle_library=dill is set, but dill is not installed...`
+  - Report any issues you encounter when using `pickle_library=dill` to the GitHub issue ([#21298](https://github.com/apache/beam/issues/21298)).
+* (Python) Added a `pickle_library=dill_unsafe` pipeline option. This allows overriding `dill==0.3.1.1` when using dill as the pickle_library. Use with extreme caution. Other versions of dill have not been tested with Apache Beam ([#21298](https://github.com/apache/beam/issues/21298)).
+* (Python) The deterministic fallback coder for complex types like NamedTuple, Enum, and dataclasses now normalizes filepaths for better determinism guarantees. This affects streaming pipelines updating from 2.68 to 2.69 that utilize this fallback coder. If your pipeline is affected, you may see a warning like: "Using fallback deterministic coder for type X...". To update safely, specify the pipeline option `--update_compatibility_version=2.68.0` ([#36345](https://github.com/apache/beam/pull/36345)).
 * (Python) Fixed transform naming conflict when executing DataTransform on a dictionary of PColls ([#30445](https://github.com/apache/beam/issues/30445)).
   This may break update compatibility if you don't provide a `--transform_name_mapping`.
 * Removed deprecated Hadoop versions (2.10.2 and 3.2.4) that are no longer supported for [Iceberg](https://github.com/apache/iceberg/issues/10940) from IcebergIO ([#36282](https://github.com/apache/beam/issues/36282)).
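
The changelog entry above requires `dill==0.3.1.1` in both the submission and runtime environments when `pickle_library=dill` is set. One way to catch a missing or mismatched install before launching a pipeline is a small preflight check. The following is a minimal sketch, not part of this commit or of Apache Beam: `check_pinned` and `REQUIRED_DILL_VERSION` are hypothetical names, and the check uses only the standard library's `importlib.metadata`.

```python
from importlib import metadata

# Version named in the changelog entry; the constant name is hypothetical.
REQUIRED_DILL_VERSION = "0.3.1.1"


def check_pinned(package, required, get_version=metadata.version):
    """Return (ok, message) saying whether `package` is installed at `required`.

    `get_version` is injectable so the check can be exercised without
    modifying the real environment.
    """
    try:
        installed = get_version(package)
    except metadata.PackageNotFoundError:
        return False, f"{package} is not installed; expected {package}=={required}"
    if installed != required:
        return False, f"{package}=={installed} is installed; expected {package}=={required}"
    return True, f"{package}=={required} is installed"


if __name__ == "__main__":
    # Report the state of the current environment.
    ok, message = check_pinned("dill", REQUIRED_DILL_VERSION)
    print(message)
```

Running the same check inside the worker image, for example as a build step of a custom container, may surface a version mismatch earlier than the `Can't get attribute '_create_code'` unpickling error described in the entry.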
