@devin-ai-integration devin-ai-integration bot commented Nov 19, 2025

This PR targets the following PR:


What

This PR expands the Snowflake destination V4 migration guide by filling in six placeholder `<!-- Devin:` comments with detailed, actionable instructions for users migrating from version 3 to version 4.

The expansion addresses:

  • Step-by-step UI instructions for enabling "Disable Final Tables" in v3
  • Detailed 7-step process for users who interact with both raw and final tables, including concrete schema configuration examples
  • Documentation of raw table naming patterns for v2/v3 and v4
  • SQL queries for finding and dropping legacy raw tables
  • Bug fixes: typo "Snowfalke" → "Snowflake" and critical terminology fix "database" → "schema" for airbyte_internal

How

The changes were made by:

  1. Reviewing Snowflake connector source code (SnowflakeSpecification.kt, SnowflakeNameGenerators.kt, TypingDedupingUtil.kt)
  2. Examining historical v3 source code (commit 2ea4544^) to verify UI placement of "Disable Final Tables"
  3. Reviewing existing Typing and Deduping and Direct Loading documentation
  4. Analyzing test code to understand raw table naming patterns
  5. Iterating on PR feedback from @ian-at-airbyte to ensure accuracy and clarity

Review guide

Critical items requiring verification:

  1. UI steps (lines 27-31): Verify these steps match the actual v3 UI flow:

    • Open the Advanced section (verified in v3 spec with "group": "advanced")
    • Turn on Disable Final Tables
    • Click Test and save

    ⚠️ These steps were verified through code review but not tested in a live v3 environment.

  2. Schema configuration guidance (lines 49-55): Confirm the field names and examples match what users see in the UI:

    • "Schema" field for final tables destination
    • "Airbyte Internal Table Dataset Name" field for raw-only destination
    • Example values (ANALYTICS_V4, AIRBYTE_INTERNAL_RAW)

  3. Raw table naming patterns (lines 73-80): Validate that these patterns are accurate:

    • V2/V3: raw_{namespace}__{stream}
    • V4: {namespace}_raw__stream_{stream} (the number of underscores between raw and stream varies with the longest underscore run in the namespace and stream names)

  4. SQL queries (lines 88-108): Test that these Snowflake SQL commands work correctly:

    • SHOW TABLES IN SCHEMA <DATABASE>.<INTERNAL_SCHEMA> LIKE 'RAW\_%';
    • DROP TABLE IF EXISTS <DATABASE>.<INTERNAL_SCHEMA>.<TABLE_NAME>;

  5. Terminology fix (line 77): Confirm airbyte_internal is correctly referred to as a "schema" not a "database" - this is critical for user safety.

  6. Migration flow completeness (lines 47-65): Review the 7-step process for users with both raw and final tables to ensure it's logical, complete, and uses only v3 terminology (no V4 references before upgrade).
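
For reviewers checking item 3, here is a minimal Python sketch of the V4 naming pattern as described above. It assumes the separator between `raw` and `stream` is one underscore longer than the longest underscore run in the concatenated namespace and stream name, and never shorter than two. That rule is inferred from the description in this PR, not copied from `TypingDedupingUtil.kt`, and the function name `v4_raw_table_name` is hypothetical.

```python
import re


def v4_raw_table_name(namespace: str, stream: str) -> str:
    """Build a raw table name following the V4 pattern described in this PR.

    Assumption: the separator after `raw` is one underscore longer than the
    longest run of underscores in namespace + stream, and at least two long.
    Verify against the connector source before relying on this.
    """
    runs = re.findall(r"_+", namespace + stream)
    longest_run = max((len(run) for run in runs), default=0)
    separator = "_" * (max(longest_run, 1) + 1)
    return f"{namespace}_raw{separator}stream_{stream}"


print(v4_raw_table_name("public", "users"))      # public_raw__stream_users
print(v4_raw_table_name("my__schema", "users"))  # my__schema_raw___stream_users
```

The second call shows why the separator length varies: a namespace that already contains a double underscore pushes the separator to three.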

User Impact

Positive:

  • Users migrating from Snowflake v3 to v4 now have clear, step-by-step instructions for all migration scenarios
  • Concrete examples for configuring distinct schemas to avoid table name collisions
  • Accurate information about where to find and how to clean up legacy raw tables
  • SQL queries they can copy-paste to safely remove old tables
  • Corrected safety warnings about the airbyte_internal schema

Potential risks:

  • If UI steps don't match actual v3 UI, users may get confused
  • If SQL queries have syntax errors, users may encounter errors when cleaning up tables
  • If schema configuration examples don't match UI labels, users may struggle to configure destinations correctly

Can this PR be safely reverted and rolled back?

  • YES 💚

This is purely documentation - no code changes. Reverting would restore the placeholder comments.


Link to Devin run: https://app.devin.ai/sessions/72110afd12ce49dc9203f7763a3e9962
Requested by: [email protected] (@ian-at-airbyte)

Note: The UI steps and SQL queries were verified through code review and historical source analysis but have not been tested in live v3/Snowflake environments. Reviewer verification is important, especially for the UI flow and SQL syntax.
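
To make the cleanup step easier to test, here is a rough Python sketch that turns a list of table names (for example, copied from the `SHOW TABLES` output) into the `DROP TABLE IF EXISTS` statements quoted in the review guide. It only builds strings and doesn't connect to Snowflake. The function name, the `RAW_` prefix filter, and the identifier check are assumptions for illustration and are not part of the migration guide.

```python
import re

# Unquoted Snowflake identifiers: a letter or underscore first, then letters,
# digits, underscores, or dollar signs. Anything else is rejected rather than quoted.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def build_drop_statements(database, schema, table_names, prefix="RAW_"):
    """Return DROP statements for tables whose names start with `prefix`.

    Hypothetical helper: it assumes legacy raw tables are the ones whose
    names begin with RAW_ (case-insensitive), as in the V2/V3 pattern above.
    """
    names = list(table_names)
    for identifier in (database, schema, *names):
        if not _IDENTIFIER.match(identifier):
            raise ValueError(f"unexpected identifier: {identifier!r}")
    return [
        f"DROP TABLE IF EXISTS {database}.{schema}.{name};"
        for name in names
        if name.upper().startswith(prefix.upper())
    ]


statements = build_drop_statements(
    "MY_DB",
    "AIRBYTE_INTERNAL",
    ["RAW_PUBLIC__USERS", "PUBLIC_RAW__STREAM_USERS"],
)
print("\n".join(statements))
```

Only the first table matches the legacy prefix, so the V4-style table is left alone. Review the generated statements before running them against a real schema.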

- Add step-by-step UI instructions for enabling Disable Final Tables in v3
- Provide detailed guidance for users with both raw and final tables
- Document raw table naming patterns for v2/v3 and v4
- Add SQL queries for finding and dropping legacy raw tables
- Fix typo: Snowfalke -> Snowflake
- Fix terminology: database -> schema for airbyte_internal

Co-Authored-By: [email protected] <[email protected]>
@devin-ai-integration devin-ai-integration bot requested a review from a team as a code owner November 19, 2025 00:54
@devin-ai-integration
Contributor Author

Original prompt from [email protected]
@Devin I've started a PR to expand the Snowflake destination V4 migration guide. <https://github.com/airbytehq/airbyte/pull/69728> I've left some comments in that migration guide itself with requests for you. They look like `<!-- Devin:`.

1. Review the connector code for Snowflake.
2. Review the third-party API documentation for Snowflake.
3. Review the existing documentation for Typing and Deduping and Direct Loading.
4. Based on that knowledge, branch off my PR and PR back into it. Your PR should expand the areas of the documentation where I've asked for your assistance. Don't get too creative. Stay on task for now.
Thread URL: https://airbytehq-team.slack.com/archives/D08FX8EC9L0/p1763513205257389

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Contributor

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • /format-fix - Fixes most formatting issues.
  • /bump-version - Bumps connector versions.
    • You can specify a custom changelog by passing changelog. Example: /bump-version changelog="My cool update"
    • Leaving the changelog arg blank will auto-populate the changelog from the PR title.
  • /run-cat-tests - Runs legacy CAT tests (Connector Acceptance Tests)
  • /build-connector-images - Builds and publishes a pre-release docker image for the modified connector(s).
  • JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
    • /bump-bulk-cdk-version bump=patch changelog='foo' - Bump the Bulk CDK's version. bump can be major/minor/patch.
  • Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.

2. For each Snowflake destination you have, add an identical second Snowflake destination.

-3. Ensure each pair of Snowflake connectors have opposite settings for <!-- Devin: fill out the rest of this step to use Disable Final Tables from Snowflake v3 -->
+3. Ensure each pair of Snowflake connectors have opposite settings for **Disable Final Tables** (in version 3) or **Legacy raw tables** (in version 4). One connector should have this setting enabled, and the other should have it disabled.
Contributor

⚠️ [vale] reported by reviewdog 🐶
[Google.WordList] Use 'turn off' or 'off' instead of 'disabled'.


5. Update your connections to point to the appropriate destination:
- Connections that need raw tables only should target the destination with **Disable Final Tables** or **Legacy raw tables** enabled.
- Connections that need final tables should target the destination with this setting disabled.
Contributor

⚠️ [vale] reported by reviewdog 🐶
[Google.WordList] Use 'turn off' or 'off' instead of 'disabled'.


The table names match these patterns depending on which version created them:

- **Version 2/3 (Typing and Deduping)**: `raw_{namespace}__{stream}` (for example, `airbyte_internal.raw_public__users`)
Contributor

🚫 [vale] reported by reviewdog 🐶
[Vale.Spelling] Did you really mean 'Deduping'?

- **Version 2/3 (Typing and Deduping)**: `raw_{namespace}__{stream}` (for example, `airbyte_internal.raw_public__users`)
- **Version 4 (Legacy raw tables mode)**: `{namespace}_raw__stream_{stream}` (for example, `airbyte_internal.public_raw__stream_users`)

Note: The number of underscores between `raw` and `stream` may vary depending on the longest underscore sequence in your namespace and stream names.
Contributor

⚠️ [vale] reported by reviewdog 🐶
[Google.Colons] ': T' should be in lowercase.


:::note
-Version 4 of the Snowfalke destination uses the `airbyte_internal` database for temporary scratch space (for example, streams running in dedup mode, truncate refreshes, and overwrite syncs). Dropping the entire `airbyte_internal database` can interrupt active syncs and cause data loss. Only drop the specific raw tables you no longer need.
+Version 4 of the Snowflake destination uses the `airbyte_internal` schema for temporary scratch space (for example, streams running in dedup mode, truncate refreshes, and overwrite syncs). Dropping the entire `airbyte_internal` schema can interrupt active syncs and cause data loss. Only drop the specific raw tables you no longer need.
Contributor

🚫 [vale] reported by reviewdog 🐶
[Vale.Spelling] Did you really mean 'dedup'?

@github-actions
Contributor

github-actions bot commented Nov 19, 2025

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-kkwc2a8c8-airbyte-growth.vercel.app

Built with commit 0735277.
This pull request is being automatically deployed with vercel-action

Comment on lines 27 to 33
3. Click **Edit configuration**.

4. Open the **Advanced** section.

5. Turn on **Disable Final Tables**.

6. Click **Save**.
Contributor

Can you check these UI steps? They don't appear to be correct to me.

Contributor Author

Thanks for catching this! I based the UI steps on the v4 spec placement, but I don't have access to verify the exact v3 UI flow. Could you confirm:

  1. Is "Edit configuration" the correct button text, or should it be something else (e.g., "Edit", "Settings")?
  2. Is "Disable Final Tables" located under an "Advanced" section in v3, or is it in a different location?
  3. Are there any other UI differences I should be aware of?

I can update the steps once I know the correct v3 UI flow, or I can make them more location-agnostic if the exact path varies.

Contributor

  1. It should be enough to open the Connector. It defaults to the "Settings" tab and you don't need to click "Edit Configuration". I don't even think that label exists.
  2. Unclear, but you should be able to determine this by examining the connector's historical source code for a commit prior to version 4.0.
  3. The "Save" button is always "Test and save" in the UI in Airbyte.

Contributor Author

Thanks for the clarification! I've updated the steps to:

  1. Remove "Click Edit configuration" (since it defaults to Settings tab)
  2. Keep "Open the Advanced section" (verified by checking the v3 spec at commit 2ea4544^ - the disable_type_dedupe field had "group": "advanced")
  3. Change "Save" to "Test and save"

The updated steps are now:

  1. In the navigation bar, click Destinations.
  2. Open your Snowflake connector.
  3. Open the Advanced section.
  4. Turn on Disable Final Tables.
  5. Click Test and save.

Pushing this change now.

6. Click **Save**.

:::note
After upgrading to version 4, this setting appears as **Legacy raw tables** and should remain enabled.
Contributor

Don't use "should". Just say it remains enabled and assume success.

2. For each Snowflake destination you have, add an identical second Snowflake destination.

-3. Ensure each pair of Snowflake connectors have opposite settings for <!-- Devin: fill out the rest of this step to use Disable Final Tables from Snowflake v3 -->
+3. Ensure each pair of Snowflake connectors have opposite settings for **Disable Final Tables** (in version 3) or **Legacy raw tables** (in version 4). One connector should have this setting enabled, and the other should have it disabled.
Contributor

We should assume at this point they haven't upgraded and use the V3 instructions only.

4. Configure distinct default schemas for each destination to avoid table name collisions.

5. Update your connections to point to the appropriate destination:
- Connections that need raw tables only should target the destination with **Disable Final Tables** or **Legacy raw tables** enabled.
Contributor

Ditto about not using V4 instructions before they upgrade

-3. Ensure each pair of Snowflake connectors have opposite settings for <!-- Devin: fill out the rest of this step to use Disable Final Tables from Snowflake v3 -->
+3. Ensure each pair of Snowflake connectors have opposite settings for **Disable Final Tables** (in version 3) or **Legacy raw tables** (in version 4). One connector should have this setting enabled, and the other should have it disabled.

4. Configure distinct default schemas for each destination to avoid table name collisions.
Contributor

Can you elaborate on this a bit more? I'm not convinced everyone knows how to do this.

- Remove 'should' wording, assume success (comment 2543815075)
- Use V3 terminology only, remove V4 references before upgrade (comments 2543817615, 2543818089)
- Elaborate on schema configuration with concrete examples (comment 2543819122)
- Fix markdownlint trailing spaces in SQL code blocks
- Change 'enabled/disabled' to 'turned on/off' per Vale style guide

Co-Authored-By: [email protected] <[email protected]>
@devin-ai-integration
Contributor Author

I've addressed most of your feedback in commit 2e8bdd5:

Completed changes:

  • Comment 2543815075: Removed "should" wording - now says "remains enabled" assuming success
  • Comments 2543817615 and 2543818089: Removed all V4 terminology from the "both raw and final tables" section - now uses only V3 "Disable Final Tables" terminology
  • Comment 2543819122: Elaborated on schema configuration with concrete examples showing how to set distinct schemas for both destinations
  • ✅ Fixed markdownlint trailing spaces in SQL code blocks
  • ✅ Changed "enabled/disabled" to "turned on/off" per Vale style guide

Pending:

  • Comment 2543814410: Waiting for your guidance on the correct UI steps for v3 (see my reply on that comment)

Once you confirm the correct UI flow for v3, I'll update those steps and the PR will be ready for final review.

You can preview the updated documentation here: https://airbyte-docs-aokexb71k-airbyte-growth.vercel.app/integrations/destinations/snowflake-migrations

3. Ensure each pair of Snowflake connectors have opposite settings for **Disable Final Tables**. One connector should have this setting turned on, and the other should have it turned off.

4. Configure distinct default schemas for each destination to avoid table name collisions:
- For the destination that will create final tables, set a distinct **Schema** in the Snowflake destination configuration (for example, `ANALYTICS_V4`). This is where final tables will be written.
Contributor

⚠️ [vale] reported by reviewdog 🐶
[Google.Will] Avoid using 'will'.

3. Ensure each pair of Snowflake connectors have opposite settings for **Disable Final Tables**. One connector should have this setting turned on, and the other should have it turned off.

4. Configure distinct default schemas for each destination to avoid table name collisions:
- For the destination that will create final tables, set a distinct **Schema** in the Snowflake destination configuration (for example, `ANALYTICS_V4`). This is where final tables will be written.
Contributor

⚠️ [vale] reported by reviewdog 🐶
[Google.Will] Avoid using 'will'.


4. Configure distinct default schemas for each destination to avoid table name collisions:
- For the destination that will create final tables, set a distinct **Schema** in the Snowflake destination configuration (for example, `ANALYTICS_V4`). This is where final tables will be written.
- For the raw-only destination (with **Disable Final Tables** turned on), set a distinct **Airbyte Internal Table Dataset Name** under the **Advanced** section (for example, `AIRBYTE_INTERNAL_RAW`). This is where raw tables will be written.
Contributor

⚠️ [vale] reported by reviewdog 🐶
[Google.Will] Avoid using 'will'.

- Remove 'Edit configuration' step (defaults to Settings tab)
- Keep 'Advanced' section (verified in v3 spec)
- Change 'Save' to 'Test and save' (correct button label)

Addresses comment 2543835428

Co-Authored-By: [email protected] <[email protected]>
@ian-at-airbyte ian-at-airbyte merged commit 1c67397 into docs-snowflake-4-detailed-migration-guide Nov 19, 2025
33 checks passed
@ian-at-airbyte ian-at-airbyte deleted the devin/1763513537-expand-snowflake-migration-guide branch November 19, 2025 23:17