Data Integration: add optional truncate table for FULL_TABLE sync (Postgres & MySQL)#6076
Open
topdev998 wants to merge 1 commit into mage-ai:master from
Member
Have you tested this with a real Mage data integration pipeline?
Author
Hi, thanks for the question! I also attempted to run a full Mage pipeline using the Docker dev environment, but encountered a transient dependency issue during the frontend build (Yarn registry error), so I wasn’t able to complete that end-to-end run.
Description
Add an optional truncate_full_table configuration flag for SQL destinations to support truncating the destination table during FULL_TABLE replication.
When enabled, the destination table is truncated before loading a fresh snapshot, ensuring idempotent full refresh behavior. This logic is implemented in the shared SQL base class and applied to PostgreSQL and MySQL destinations.
Motivation
Currently, FULL_TABLE replication appends data unless the table is manually cleared. This enhancement allows users to perform a clean overwrite of the destination table automatically, which is a common requirement for full refresh pipelines.
Implementation details
- Added KEY_TRUNCATE_FULL_TABLE = "truncate_full_table" in destinations/constants.py
- Added in destinations/sql/base.py:
  - truncate_full_table property
  - build_truncate_table_commands helper
- Updated Destination.build_query_strings: when replication_method == FULL_TABLE and truncate_full_table == True, it generates a TRUNCATE TABLE statement during the initial batch (batch 0), before insert queries
- Updated the config templates destinations/postgresql/templates/config.json and destinations/mysql/templates/config.json with the new option (default is false to preserve existing behavior)
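The conditions listed above can be sketched roughly as follows. This is a simplified, standalone illustration and not the code from the PR: the function names carry a `_sketch` suffix, and the parameters (`batch_index`, `full_table_name`, `insert_queries`) are assumptions made for the example rather than the actual signature of `Destination.build_query_strings`.

```python
from typing import List

# Assumed constant value; the real constant lives in the Mage codebase.
FULL_TABLE = "FULL_TABLE"


def build_truncate_table_commands_sketch(full_table_name: str) -> List[str]:
    """Hypothetical helper: return the statement(s) that empty the table."""
    return [f"TRUNCATE TABLE {full_table_name}"]


def build_query_strings_sketch(
    replication_method: str,
    truncate_full_table: bool,
    batch_index: int,
    full_table_name: str,
    insert_queries: List[str],
) -> List[str]:
    """Hypothetical sketch: place TRUNCATE before the inserts of batch 0 only."""
    queries: List[str] = []
    should_truncate = (
        replication_method == FULL_TABLE
        and truncate_full_table
        and batch_index == 0
    )
    if should_truncate:
        queries.extend(build_truncate_table_commands_sketch(full_table_name))
    # Insert queries always follow, so the truncate (if any) runs first.
    queries.extend(insert_queries)
    return queries
```

Restricting the truncate to batch 0 matters because later batches of the same sync would otherwise wipe rows loaded by earlier batches.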
How to use
Example (PostgreSQL):
{
"database": "your_db",
"host": "your_host",
"password": "your_password",
"port": 5432,
"schema": "public",
"table": "your_table",
"username": "your_user",
"truncate_full_table": true
}
Example (MySQL):
{
"database": "your_db",
"host": "your_host",
"password": "your_password",
"port": 3306,
"table": "your_table",
"username": "your_user",
"use_lowercase": true,
"truncate_full_table": true
}
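Since the flag defaults to false, configs that omit it should behave as before. A minimal sketch of that default handling, assuming the config is available as JSON text (`read_truncate_flag` is a hypothetical helper for illustration, not part of the PR):

```python
import json

# Key name taken from the PR description.
KEY_TRUNCATE_FULL_TABLE = "truncate_full_table"


def read_truncate_flag(config_text: str) -> bool:
    """Hypothetical helper: read the flag, treating a missing key as False."""
    config = json.loads(config_text)
    return bool(config.get(KEY_TRUNCATE_FULL_TABLE, False))
```

With this handling, an existing config file without the key keeps the current append behavior.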
How Has This Been Tested?
Unit tests
Added unit tests for both destinations:
- PostgreSQLDestinationTests.test_build_query_strings_truncate_full_table
- MySQLDestinationTests.test_build_query_strings_truncate_full_table
These tests verify that:
- TRUNCATE TABLE is generated when:
  - replication_method = FULL_TABLE
  - truncate_full_table = True
- TRUNCATE TABLE is not generated when the replication method is not FULL_TABLE
Run tests:
Manual validation
Checked the output of build_query_strings:
- truncate_full_table=true → TRUNCATE TABLE present
- truncate_full_table=false → no truncate
- non-FULL_TABLE replication → no truncate
Checklist