Target naming migration #2864

Draft
jshearer wants to merge 2 commits into master from jshearer/target_naming_migration

@jshearer jshearer commented Apr 15, 2026

This is the migration script for target naming that I talk about in #2780. I decided to frame it as a new, temporary flowctl raw subcommand that analyzes existing materializations and determines the correct TargetNamingStrategy for each one, based on its current source.targetNaming, endpoint configuration, and built resource paths. By default it prints a dry-run report. With --execute, it publishes the changes one materialization at a time.

Migration classification

Every materialization is classified into one of:

  • MIGRATE: target_naming and per-binding x-schema-name can be set automatically without causing unintended backfilling.
  • MANUAL: either the correct settings can be determined but applying them would change resource paths from 1-element ([table]) to 2-element ([schema, table]), or the correct schema can't be determined automatically (no endpoint config schema, no consistent schema in resource paths).
  • SKIP_NO_SCHEMA: connector doesn't support x-schema-name (no schema pointer in its resource spec).
  • SKIP_NOT_CONNECTOR: the materialization is not connector-backed (Dekaf).
  • SKIP_ALREADY_SET / SKIP_DISABLED_NO_BUILT_SPEC: already migrated, or disabled with no built spec to analyze.

Strategy selection rules

TargetNamingStrategy is derived from source.targetNaming:

| source.targetNaming | Proposed strategy |
| --- | --- |
| WithSchema | MatchSourceStructure |
| PrefixSchema | PrefixTableNames { schema, skip_common_defaults: false } |
| PrefixNonDefaultSchema | PrefixTableNames { schema, skip_common_defaults: true } |
| NoSchema | SingleSchema { schema } |
| No source capture | MatchSourceStructure, falling back to SingleSchema if existing bindings conflict |

The no-source-capture case tries MatchSourceStructure because that was the de facto behavior before targetNaming existed (update_materialization_resource_spec unconditionally derived x-schema-name from collection names). This is a slight deviation from the original migration proposal, which grouped no-source-capture with NoSchema -> SingleSchema. The fallback to SingleSchema means the end result is the same when collection names don't match path schemas.
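As a rough illustration of the mapping above, here is a minimal sketch in Rust. The type and function names (`LegacyTargetNaming`, `propose_strategy`, the `bindings_conflict` flag) are hypothetical and chosen for this example; they are not necessarily the ones used in the PR. `None` stands in for the no-source-capture row.

```rust
/// Legacy `source.targetNaming` values, as listed in the mapping above.
/// (Hypothetical type name, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LegacyTargetNaming {
    WithSchema,
    PrefixSchema,
    PrefixNonDefaultSchema,
    NoSchema,
}

/// The three strategies named in this PR, modeled as a plain enum.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TargetNamingStrategy {
    MatchSourceStructure,
    SingleSchema { schema: String },
    PrefixTableNames { schema: String, skip_common_defaults: bool },
}

/// Proposes a strategy from the legacy setting.
///
/// `legacy` is `None` for the no-source-capture case. `bindings_conflict`
/// is an assumed input: true when collection-derived schemas disagree with
/// the schemas found in existing resource paths.
pub fn propose_strategy(
    legacy: Option<LegacyTargetNaming>,
    schema: &str,
    bindings_conflict: bool,
) -> TargetNamingStrategy {
    match legacy {
        Some(LegacyTargetNaming::WithSchema) => TargetNamingStrategy::MatchSourceStructure,
        Some(LegacyTargetNaming::PrefixSchema) => TargetNamingStrategy::PrefixTableNames {
            schema: schema.to_string(),
            skip_common_defaults: false,
        },
        Some(LegacyTargetNaming::PrefixNonDefaultSchema) => {
            TargetNamingStrategy::PrefixTableNames {
                schema: schema.to_string(),
                skip_common_defaults: true,
            }
        }
        Some(LegacyTargetNaming::NoSchema) => TargetNamingStrategy::SingleSchema {
            schema: schema.to_string(),
        },
        // No source capture: try MatchSourceStructure first, and fall back
        // to SingleSchema only when existing bindings conflict with it.
        None if bindings_conflict => TargetNamingStrategy::SingleSchema {
            schema: schema.to_string(),
        },
        None => TargetNamingStrategy::MatchSourceStructure,
    }
}
```

In this sketch the `schema` argument is ignored for `MatchSourceStructure`, which is why the fallback yields the same end result as grouping no-source-capture with NoSchema whenever the bindings conflict.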

For strategies that require a schema value, the tool resolves it from (in order):

  • the endpoint config's schema/dataset/namespace field
  • a unanimous schema detected from existing built resource paths
  • the connector's well-known default (Snowflake: PUBLIC).
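The resolution order above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `resolve_schema` and `unanimous_path_schema` are hypothetical names, the schema is assumed to be the second-to-last element of a resource path, and any path with fewer than two elements is assumed to rule out a unanimous schema.

```rust
/// Resolves a schema value by trying, in order: the endpoint config's
/// schema, a schema shared by every existing resource path, and the
/// connector's well-known default. (Hypothetical helper, for illustration.)
pub fn resolve_schema(
    endpoint_schema: Option<&str>,
    resource_paths: &[Vec<String>],
    connector_default: Option<&str>,
) -> Option<String> {
    if let Some(schema) = endpoint_schema {
        return Some(schema.to_string());
    }
    if let Some(schema) = unanimous_path_schema(resource_paths) {
        return Some(schema);
    }
    connector_default.map(str::to_string)
}

/// Returns the schema shared by all paths, if there is exactly one.
/// Assumption: the schema is the second-to-last path element, and a path
/// without a schema element means no unanimous answer exists.
fn unanimous_path_schema(paths: &[Vec<String>]) -> Option<String> {
    let mut found: Option<&str> = None;
    for path in paths {
        if path.len() < 2 {
            return None;
        }
        let schema = path[path.len() - 2].as_str();
        match found {
            None => found = Some(schema),
            Some(prev) if prev == schema => {}
            Some(_) => return None, // two different schemas: not unanimous
        }
    }
    found.map(str::to_string)
}
```

When all three sources come up empty the function returns `None`, which corresponds to the "can't determine the correct schema automatically" branch of MANUAL.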

Filling x-schema-name on existing bindings

targetNaming controls how future bindings (from auto-discover) get their schema and table names. Existing bindings need x-schema-name filled in separately to match where their data actually lives.

For many existing bindings, the strategy-derived schema matches the actual schema in the resource path, and x-schema-name is simply set to that value. But for some bindings, the two diverge: a binding created before x-schema-name existed, or via a code path that didn't populate it, would have been placed in whatever schema the endpoint config specified, which may not match what the strategy would derive from the collection name.

When the customer explicitly set source.targetNaming, the tool preserves their strategy for future bindings but fills in the actual schema (from the built resource path) on existing bindings where the strategy-derived value would conflict. The report flags these as (actual; strategy would produce "..." for new bindings).

When no source.targetNaming was set and a binding's collection-derived schema doesn't match its resource path schema, the tool falls back to SingleSchema with the resolved endpoint schema. If the endpoint schema also doesn't match the resource path schemas, the task is marked as MANUAL.
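To make the two cases above concrete, here is a simplified sketch of the decision for a single existing binding. The names (`BindingOutcome`, `decide_binding`) are hypothetical, and the real tool makes the SingleSchema fallback and MANUAL decisions per task rather than per binding, so treat this as an approximation of the rules rather than the implementation.

```rust
/// Possible outcomes for one existing binding. (Hypothetical type.)
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BindingOutcome {
    /// Set x-schema-name to this value. `differs_from_strategy` marks the
    /// case the report flags as "(actual; strategy would produce ...)".
    Fill { x_schema_name: String, differs_from_strategy: bool },
    /// Switch the proposed strategy to SingleSchema with this schema.
    FallBackToSingleSchema { schema: String },
    /// Neither the strategy nor the endpoint schema matches the path.
    Manual,
}

/// Decides how to fill x-schema-name for a binding whose data lives in
/// `path_schema` (taken from the built resource path).
pub fn decide_binding(
    explicit_target_naming: bool,
    strategy_schema: &str,
    path_schema: &str,
    endpoint_schema: Option<&str>,
) -> BindingOutcome {
    if strategy_schema == path_schema {
        // Strategy and reality agree: simply record the schema.
        return BindingOutcome::Fill {
            x_schema_name: path_schema.to_string(),
            differs_from_strategy: false,
        };
    }
    if explicit_target_naming {
        // Keep the customer's strategy for future bindings, but record
        // where this binding's data actually lives.
        return BindingOutcome::Fill {
            x_schema_name: path_schema.to_string(),
            differs_from_strategy: true,
        };
    }
    match endpoint_schema {
        Some(endpoint) if endpoint == path_schema => BindingOutcome::FallBackToSingleSchema {
            schema: endpoint.to_string(),
        },
        _ => BindingOutcome::Manual,
    }
}
```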

Snowflake compatibility mode handling

materialize-snowflake uniquely produces 1-element resource paths ([table]) when the binding's schema matches the endpoint-config's default, and 2-element paths ([schema, table]) otherwise. The migration tool mirrors the connector's logic to determine whether setting x-schema-name would preserve or change the resource path. When the endpoint config has no explicit schema, the tool assumes Snowflake's default of PUBLIC.
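A minimal sketch of that path rule, assuming exact string comparison of schema names (the connector's actual identifier normalization may differ). The function names are hypothetical and not taken from the connector or the PR.

```rust
/// Snowflake's default schema, assumed when the endpoint config has none.
pub const SNOWFLAKE_DEFAULT_SCHEMA: &str = "PUBLIC";

/// Computes the resource path for a binding: 1-element when the binding's
/// schema equals the endpoint default, 2-element otherwise.
/// Assumption: schema names are compared as exact strings.
pub fn snowflake_resource_path(
    endpoint_schema: Option<&str>,
    binding_schema: &str,
    table: &str,
) -> Vec<String> {
    let default_schema = endpoint_schema.unwrap_or(SNOWFLAKE_DEFAULT_SCHEMA);
    if binding_schema == default_schema {
        vec![table.to_string()]
    } else {
        vec![binding_schema.to_string(), table.to_string()]
    }
}

/// Returns true when setting x-schema-name to `proposed_schema` would
/// produce a path different from the existing built resource path.
pub fn would_change_path(
    existing_path: &[String],
    endpoint_schema: Option<&str>,
    proposed_schema: &str,
    table: &str,
) -> bool {
    snowflake_resource_path(endpoint_schema, proposed_schema, table).as_slice() != existing_path
}
```

Under these assumptions, a binding with existing path `[orders]` and no endpoint schema keeps its path when x-schema-name is set to `PUBLIC`, but would move to `[SALES, orders]` if it were set to `SALES`, which is the kind of change that lands a task in MANUAL.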

Disabled materializations

Disabled materializations with a built spec are analyzed normally. Disabled materializations without a built spec are skipped entirely, as they're old enough that re-enabling them at this point would almost certainly require a backfill anyway.

Execute mode

With --execute, the tool publishes each MIGRATE materialization individually:

  • Re-fetches the spec at publish time for the latest last_pub_id (optimistic concurrency)
  • Sets targetNaming on the materialization
  • Fills in x-schema-name on bindings that are missing it
  • Publishes via draft_specs

@jshearer jshearer force-pushed the jshearer/target_naming_migration branch from 19e75b0 to e244ce7 Compare April 15, 2026 21:34
…ource configs

Previously, `generate_missing_materialization_configs` delegated resource config generation to the generic `stub_config` path, which always derived x-schema-name from the 2nd-to-last collection name component regardless of the materialization's configured strategy.

Now resource stubs are created via `update_materialization_resource_spec`, which populates x-schema-name and x-collection-name according to the materialization's `target_naming` and `source` settings. This means `flowctl generate` produces resource configs that match what the runtime and auto-discover would produce for the same materialization.
@jshearer jshearer force-pushed the jshearer/target_naming_migration branch 8 times, most recently from 7ca065d to d8ae42e Compare April 16, 2026 00:46
Adds `flowctl raw migrate-target-naming` to analyze all materializations and determine the appropriate `TargetNamingStrategy` for each, based on the legacy `source.targetNaming` field and endpoint configuration.

For each materialization, the tool:
* Looks up x-schema-name support from `connector_tags.resource_spec_schema`
* Maps the legacy `TargetNaming` enum to the new `TargetNamingStrategy` (`MatchSourceStructure`, `SingleSchema`, `PrefixTableNames`)
* Detects the endpoint schema from connector config, falling back to the common schema across existing resource paths
* Analyzes each binding to determine whether filling in x-schema-name would change the resource path (requiring manual intervention) or target a different database schema
* Falls back from `MatchSourceStructure` to `SingleSchema` when collection names don't match existing resource path schemas
* Handles Snowflake's backwards-compat behavior where 1-element paths are preserved when the schema matches the endpoint default

The report classifies each materialization as MIGRATE (safe to auto-migrate), MANUAL (needs human intervention due to resource path changes or ambiguous schema), or various SKIP reasons. Disabled tasks with synthetic binding-N resource paths are classified as MIGRATE since they'll backfill on re-enable. Disabled materializations without a built spec are skipped entirely.
@jshearer jshearer force-pushed the jshearer/target_naming_migration branch from d8ae42e to 48e7f93 Compare April 16, 2026 00:53
@jshearer jshearer self-assigned this Apr 16, 2026