Commit 726746c

Migrated Tinybird payload from JSON field to String field (#24484)
ref https://linear.app/ghost/issue/PROD-2271/figure-out-a-path-forward-with-tinybird-for-the-experimental-json-data

Our Tinybird workspace currently uses a JSON field type for the `payload` in our main datasource. This JSON field is a beta feature that isn't _quite_ ready for production yet, so for now we need to migrate back to storing the payload in a String field.

Luckily, this change only impacts two important files:
- `analytics_events.datasource`: our main datasource
- `mv_hits.pipe`: pipe that feeds our main materialized view

All of our other pipes and endpoints read from the `mv_hits` materialized view, rather than reading from the `analytics_events` datasource directly, which means there are no required changes to any of our other pipes/endpoints.

Testing this change:
- [x] This change should deploy successfully to Tinybird, starting with a fresh/empty workspace (clear local workspace with `tb local remove`)
- [x] This change should deploy successfully to Tinybird, starting with the existing workspace/schema (checkout main, deploy to tb local, checkout this branch, deploy to tb local)
- [x] Existing Tinybird endpoint tests should all pass without modification
- [x] The full Ghost -> Analytics Service -> Tinybird -> Ghost Admin event ingestion and querying flow should work after the migration
- [x] No existing data should be lost/corrupted/duplicated in the course of the migration/deployment (checkout main, deploy to tb local, add some events, checkout this branch, deploy to tb local)
- [x] Querying the `mv_hits.pipe` before and after should return exactly the same results (`tb sql 'select * from mv_hits;'`)
- [x] An event recorded before the migration in the mv_hits materialized view should exactly match the same event recorded after the migration, except for the timestamp (including the full ingestion flow with the analytics service)
- [x] There should be no downtime during the deployment process: events recorded continuously through the migration should continue to be ingested and seamlessly populate the mv_hits materialized view without dropping or corrupting any events (technically this isn't critical/blocking yet since we're still in beta, but it's a good stress test on our deployment process)

Note: this also changes the `analytics_events_test.datasource` to match, but this datasource is not used at all in production, so testing this is not a concern.
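The "exactly match, except for the timestamp" check in the list above can be sketched as a small comparison helper. This is a minimal Python sketch and not part of the commit: `rows_match_ignoring` and the sample rows are hypothetical, and it assumes each `mv_hits` row has already been fetched as a dictionary keyed by column name.

```python
def rows_match_ignoring(before: dict, after: dict, ignored=("timestamp",)) -> bool:
    """Return True when two rows agree on every column outside `ignored`."""

    def strip(row: dict) -> dict:
        # Drop the columns that are allowed to differ between the two rows.
        return {key: value for key, value in row.items() if key not in ignored}

    return strip(before) == strip(after)


# Hypothetical rows for the same page hit, recorded before and after the migration.
before = {"timestamp": "2025-07-01 10:00:00", "action": "page_hit",
          "pathname": "/about/", "locale": "en-US"}
after = {"timestamp": "2025-07-01 10:05:00", "action": "page_hit",
         "pathname": "/about/", "locale": "en-US"}

assert rows_match_ignoring(before, after)
```

A missing column counts as a mismatch here, so a field that silently disappears after the migration would fail the comparison rather than pass it.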
1 parent 6bc1d68 commit 726746c

3 files changed: +19 -13 lines changed

ghost/core/core/server/data/tinybird/datasources/analytics_events.datasource

Lines changed: 4 additions & 1 deletion
@@ -6,9 +6,12 @@ SCHEMA >
 `session_id` String `json:$.session_id`,
 `action` LowCardinality(String) `json:$.action`,
 `version` LowCardinality(String) `json:$.version`,
-`payload` JSON(max_dynamic_types=4, max_dynamic_paths=32) `json:$.payload`,
+`payload` String `json:$.payload`,
 `site_uuid` LowCardinality(String) `json:$.payload.site_uuid`
 
 ENGINE "MergeTree"
 ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
 ENGINE_SORTING_KEY "site_uuid, timestamp"
+
+FORWARD_QUERY >
+SELECT timestamp, session_id, action, version, toString(payload) as payload, site_uuid
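
As a rough illustration of what the added `FORWARD_QUERY` is meant to achieve, the Python sketch below models converting an existing row's structured payload into its string form while leaving the other columns untouched. It is a model only: `forward_row` and the sample row are hypothetical, `json.dumps` stands in for `toString(payload)`, and the exact text ClickHouse produces (key order, whitespace) is not claimed to match.

```python
import json


def forward_row(row: dict) -> dict:
    """Model of the forward query: same columns, payload serialized to a string."""
    migrated = dict(row)  # shallow copy, so the original row is left unchanged
    migrated["payload"] = json.dumps(row["payload"])  # stand-in for toString(payload)
    return migrated


# Hypothetical row as it might look under the old JSON-typed schema.
old_row = {
    "timestamp": "2025-07-01 10:00:00",
    "session_id": "abc",
    "action": "page_hit",
    "version": "1",
    "payload": {"site_uuid": "site-1", "pathname": "/about/"},
    "site_uuid": "site-1",
}
new_row = forward_row(old_row)

assert isinstance(new_row["payload"], str)
assert json.loads(new_row["payload"]) == old_row["payload"]  # no payload data lost
```

The round-trip assertion mirrors the "no existing data should be lost" item in the commit message: the string form should still parse back to the same payload.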

ghost/core/core/server/data/tinybird/datasources/analytics_events_test.datasource

Lines changed: 4 additions & 1 deletion
@@ -8,9 +8,12 @@ SCHEMA >
 `session_id` String `json:$.session_id`,
 `action` LowCardinality(String) `json:$.action`,
 `version` LowCardinality(String) `json:$.version`,
-`payload` JSON(max_dynamic_types=4, max_dynamic_paths=32) `json:$.payload`,
+`payload` String `json:$.payload`,
 `site_uuid` LowCardinality(String) `json:$.payload.site_uuid`
 
 ENGINE "MergeTree"
 ENGINE_PARTITION_KEY "toYYYYMM(timestamp)"
 ENGINE_SORTING_KEY "site_uuid, timestamp"
+
+FORWARD_QUERY >
+SELECT timestamp, session_id, action, version, toString(payload) as payload, site_uuid

ghost/core/core/server/data/tinybird/pipes/mv_hits.pipe

Lines changed: 11 additions & 11 deletions
@@ -5,20 +5,20 @@ SQL >
 action,
 version,
 coalesce(session_id, '0') as session_id,
-toString(payload.locale) as locale,
-toString(payload.location) as location,
+JSONExtractString(payload, 'locale') as locale,
+JSONExtractString(payload, 'location') as location,
 case
-when isNull(payload.referrerSource) then toString(payload.meta.referrerSource)
-else toString(payload.referrerSource)
+when JSONExtractString(payload, 'referrerSource') = '' then JSONExtractString(payload, 'meta', 'referrerSource')
+else JSONExtractString(payload, 'referrerSource')
 end as referrer,
-toString(payload.pathname) as pathname,
-toString(payload.href) as href,
+JSONExtractString(payload, 'pathname') as pathname,
+JSONExtractString(payload, 'href') as href,
 site_uuid,
-toString(payload.member_uuid) as member_uuid,
-toString(payload.member_status) as member_status,
-toString(payload.post_uuid) as post_uuid,
-toString(payload.post_type) as post_type,
-lower(toString(getSubcolumn(payload,'user-agent'))) as user_agent
+JSONExtractString(payload, 'member_uuid') as member_uuid,
+JSONExtractString(payload, 'member_status') as member_status,
+JSONExtractString(payload, 'post_uuid') as post_uuid,
+JSONExtractString(payload, 'post_type') as post_type,
+lower(JSONExtractString(payload, 'user-agent')) as user_agent
 FROM analytics_events
 where action = 'page_hit'
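
The referrer fallback in this pipe now tests for an empty string instead of `isNull`, because `JSONExtractString` returns `''` when a key is missing. The Python sketch below models that logic so the edge cases are easy to see. It is an approximation, not ClickHouse itself: `json_extract_string` and `referrer` are hypothetical helpers, and the sample payloads are made up.

```python
import json


def json_extract_string(payload: str, *keys: str) -> str:
    """Rough model of JSONExtractString: '' when the path is missing or not a string."""
    try:
        value = json.loads(payload)
    except json.JSONDecodeError:
        return ""  # assumption: unparseable input is treated like a missing value
    for key in keys:
        if not isinstance(value, dict) or key not in value:
            return ""
        value = value[key]
    return value if isinstance(value, str) else ""


def referrer(payload: str) -> str:
    """Mirror of the pipe's CASE: fall back to meta.referrerSource on ''."""
    top_level = json_extract_string(payload, "referrerSource")
    if top_level == "":
        return json_extract_string(payload, "meta", "referrerSource")
    return top_level


assert referrer('{"referrerSource": "Google"}') == "Google"
assert referrer('{"meta": {"referrerSource": "Bing"}}') == "Bing"
assert referrer('{"referrerSource": "", "meta": {"referrerSource": "Bing"}}') == "Bing"
assert referrer('{"pathname": "/"}') == ""
```

In this model, the third case shows that a `referrerSource` that is present but empty also takes the `meta.referrerSource` fallback, since the new condition cannot tell an empty value from a missing one. Whether the old `isNull` check treated that case the same way is not stated in the commit.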
