[FEATURE] RFC 225 follow-up: finalize function-based transformations & deprecation path for Koheesio 0.11 #235

@dannymeijer

Background

RFC 225: Function-Based Transformations for Koheesio 0.11 introduces function-based and decorator-based transformation APIs (Transformation.from_func, ColumnsTransformation.from_func / from_multi_column_func, and the new decorators).
The feature/function-based-transformations-rfc-0.11 branch delivers the initial implementation plus deprecations, but there is still follow-up work to fully finish the rollout for 0.11 and to clearly stage the later 1.0 removals.

This issue tracks that follow-up.
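
For readers who have not seen RFC 225, here is a rough, library-free sketch of the idea behind a `from_func`-style factory. It is not Koheesio's implementation and does not import Koheesio: `SketchColumnsTransformation`, `Row`, and the `transform` method are hypothetical stand-ins, and plain dicts stand in for Spark rows. The real APIs operate on Spark columns and their exact signatures may differ.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]  # hypothetical stand-in for a Spark row


class SketchColumnsTransformation:
    """Illustrative only: NOT Koheesio's ColumnsTransformation."""

    def __init__(self, func: Callable, columns: List[str],
                 target_column: Optional[str] = None):
        self.func = func
        self.columns = columns
        self.target_column = target_column

    @classmethod
    def from_func(cls, func: Callable):
        """Wrap a plain function; columns are bound later, at call time."""
        def factory(columns: List[str], target_column: Optional[str] = None):
            return cls(func, columns, target_column)
        return factory

    def transform(self, rows: List[Row]) -> List[Row]:
        out = []
        for row in rows:
            new_row = dict(row)  # copy, so the input rows stay untouched
            for col in self.columns:
                # Write in place unless an explicit target column is given.
                new_row[self.target_column or col] = self.func(row[col])
            out.append(new_row)
        return out


# No subclass needed: a one-line function becomes a reusable transformation.
Upper = SketchColumnsTransformation.from_func(str.upper)
rows = [{"name": "ada", "city": "london"}]
result = Upper(columns=["name"]).transform(rows)
print(result)  # [{'name': 'ADA', 'city': 'london'}]
```

The point of the pattern is that simple, stateless logic can be defined once as a function and bound to columns at the call site, while subclassing stays available for cases that need more structure.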

Scope

  • In scope for 0.11 prep
    • Polishing and consolidating the new function-/decorator-based APIs.
    • Migrating “obvious” built-in simple transformations where it makes sense for 0.11.
    • Tightening and centralizing documentation/migration guidance around RFC 225.
  • In scope for planning (not implementing) 1.0
    • Clearly documenting what will be removed in 1.0 (and how to migrate), based on the now-implemented design.

Goals

  1. Ensure the most frequently used simple transformations are available in the new style, so teams can adopt RFC 225 patterns immediately.
  2. Clearly document the deprecation path and 1.0 breakage plan, without actually implementing 1.0 yet.

Tasks

  • API & implementation follow-ups for 0.11

    • Inventory built-in transformations that are “simple enough” to reasonably migrate in 0.11 (string casing, trimming, basic numeric transforms, etc.).
    • For each selected transformation, add a function-based or decorator-based equivalent:
      - Prefer ColumnsTransformation.from_func() or @column_transformation / @multi_column_transformation.
      - Keep subclass-based versions in place for now; this is additive.
    • Ensure new versions fully respect ColumnConfig/ListOfColumns behavior (type filtering, run_for_all, strict mode).
  • Documentation & migration guidance (0.11)

    • Update docs/reference/spark/transformations.md so examples and sections lead with:
      - Transformation.from_func() and ColumnsTransformation.from_func() / from_multi_column_func().
      - @transformation, @column_transformation, @multi_column_transformation.
    • Move subclassing examples into an “Advanced / when to subclass” section and explicitly call out when subclassing is still the right tool.
    • Expand docs/releases/0.11.md with a short “How to start using RFC 225 now” section (linking to the reference for details).
  • Deprecation path & 1.0 planning (docs only)

    • Clearly document that the following are deprecated in 0.11 and targeted for removal in 1.0:
      - Transform
      - ColumnsTransformationWithTarget
      - Nested ColumnConfig classes
    • Add a concise migration guide in the docs describing how to replace each deprecated pattern with the new APIs:
      - Transform → Transformation.from_func()
      - ColumnsTransformationWithTarget → ColumnsTransformation.from_func() + target_column
      - Nested ColumnConfig → field-based config on ColumnsTransformation
    • Outline (but not implement) the expected 1.0 cleanup in a dedicated “Future changes (1.0)” docs section so users can plan ahead.
  • Validation & feedback loop

    • Run the full test suite after adding any 0.11 migrations, ensuring coverage for new function-/decorator-based built-ins.
    • Capture early adopter feedback on:
      - The ergonomics of the new APIs (especially decorators and pattern detection).
      - Whether additional built-ins should be migrated before cutting 0.11.
      - Any confusing aspects in the docs or migration guidance.
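
As a discussion aid for the deprecation path above, this is one way a "deprecated in 0.11, targeted for removal in 1.0" warning could be emitted, using only the standard-library `warnings` module. It is a sketch under assumptions, not what the feature branch does: the `deprecated_since_0_11` decorator and the `LegacyTransform` class are hypothetical names made up for illustration.

```python
import functools
import warnings


def deprecated_since_0_11(replacement: str, removed_in: str = "1.0"):
    """Hypothetical class decorator: warn whenever the class is instantiated."""
    def wrap(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def __init__(self, *args, **kwargs):
            warnings.warn(
                f"{cls.__name__} is deprecated since 0.11 and targeted for "
                f"removal in {removed_in}; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = __init__
        return cls
    return wrap


# Hypothetical stand-in for a deprecated class such as Transform.
@deprecated_since_0_11(replacement="Transformation.from_func()")
class LegacyTransform:
    def __init__(self, func):
        self.func = func


# The old class keeps working in 0.11, but callers see where to migrate.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy = LegacyTransform(len)

print(caught[0].message)
```

Naming the replacement in the warning text lets users see the same mapping as the migration guide without leaving their logs, and tests can assert on the warning to guard against accidental early removal.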
