fix(firestore-bigquery-export): accept ISO 8601 string partition values#2813
Conversation
Restores 0.2.x behavior for Firestore string partition fields. The 0.3.0 refactor of PartitionValueConverter narrowed accepted inputs to Firestore Timestamp / timestamp-like / Date, silently coercing strings (including ISO 8601 dates such as "2026-01-01") to NULL and landing rows in the __NULL__ partition. PartitionValueConverter.convert now parses strings via new Date(value). Unparseable strings still return null and trigger the existing warning. Adds a defensive try/catch around the BigQuery formatter switch so any serialization failure degrades to null + warn rather than crashing the row write. Fixes #2803
There was a problem hiding this comment.
Code Review
This pull request restores support for ISO 8601 date/datetime strings as partition field values, addressing a regression introduced in version 0.3.0. The changes update the PartitionValueConverter to parse string inputs and include comprehensive unit tests for various date formats and edge cases. Feedback indicates that the DATETIME conversion needs to strip the 'Z' suffix for BigQuery compatibility and suggests adding a default case to the switch statement for improved robustness.
…witch Adds default: return null to PartitionValueConverter.convert. fieldType is typed as a strict union, so exhaustiveness is already enforced at compile time, but the value comes from external config — a runtime mismatch falls through cleanly to null instead of returning undefined. Per gemini-code-assist review on #2813.
|
Thanks for the review. Addressing each point: 1. Stripping 2. 3. Logging in catch: The caller |
|
The overall direction looks right, but I think this needs one more change before merging. The current implementation uses For example: new Date("2024-02-30").toISOString(); // Wrong: "2024-02-30" is not a real date, but JS normalizes it to "2024-03-01T00:00:00.000Z".
new Date("2024-01").toISOString(); // Wrong: this is a partial date, but JS treats it as "2024-01-01T00:00:00.000Z".
new Date("1").toISOString(); // Wrong: this is not a date format we should accept, but JS parses it as "2001-01-01T00:00:00.000Z". |
…ion strings
JavaScript's Date parser is too permissive for partition value validation:
new Date("2024-02-30") -> 2024-03-01 (silent month overflow)
new Date("2024-01") -> 2024-01-01 (silent partial-date fill)
new Date("1") -> 2001-01-01 (bare numeric as year)
new Date("2023-02-29") -> 2023-03-01 (non-leap-year overflow)
new Date("2024-01-15T10:30") -> engine-dependent local-time interpretation
Replaces the loose new Date() check with a strict YYYY-MM-DD prefix regex
that requires an explicit timezone designator (Z or +/-HH:MM) when a time
component is present, plus a calendar validator built via setUTCFullYear
that rejects calendar-invalid components like Feb 30, non-leap-year Feb 29,
month 13, and day 32.
Per CorieW review on #2813.
|
Good catch, agreed — Tightened in 4392a4f:
|
BigQuery DATE / DATETIME / TIMESTAMP all reject year 0 — the supported range is 0001-01-01 to 9999-12-31. The previous strict validator passed "0000-01-01" through (setUTCFullYear(0, 0, 1) yields year 0, matching input components) but BigQuery rejects it server-side, causing the same NULL-partition symptom this PR is fixing. Reject yearN < 1 client-side so the user gets a clear warning log.
|
One more gap caught on self-review, fixed in fba4ca3.
But BigQuery DATE / DATETIME / TIMESTAMP all reject year 0 (the supported range is 0001-01-01 to 9999-12-31), so the row would have failed server-side and ended up as the same NULL-partition symptom this PR is meant to fix, just on a more obscure input. Added an explicit Other edge cases I probed against the regex + validator + parser chain and confirmed are handled:
41/41 converter tests passing. |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request updates the firestore-bigquery-export extension to version 0.3.2, restoring support for ISO 8601 date and datetime strings as partition field values. The implementation introduces strict regex-based parsing and calendar validation within the PartitionValueConverter to ensure data integrity before BigQuery insertion. Comprehensive tests have been added to verify various date formats and boundary conditions. Feedback suggests optimizing performance by moving the regex to a static constant.
Adds 4 cases to PartitionValueConverter TIMESTAMP suite:
- out-of-range hour ("25:00:00Z") -> null
- out-of-range minute ("23:60:00Z") -> null
- timezone offset without colon ("+0800") -> accepted
- fractional seconds beyond millisecond precision -> accepted
The first two pin the implicit time-validation behavior (regex matches
H/M/S as 2-digit pairs, then new Date() rejects out-of-range values).
The last two document accepted RFC 3339 variants.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request restores support for ISO 8601 date and datetime strings as partition field values, addressing a regression introduced in version 0.3.0. The changes include strict validation logic using regex and calendar checks to ensure inputs fall within BigQuery's supported ranges (0001-9999) and formats. Extensive unit tests have been added to verify these improvements. Feedback indicates that the 'Z' suffix should be removed when converting to BigQuery DATETIME types to avoid potential serialization errors.
Adds a strict-equality test that asserts the DATETIME converter outputs "2024-01-15 10:30:00.000" (space separator, no Z) when fed an ISO 8601 string with a Z suffix. The existing tests only checked that the result contained "2024-01-15", which left the canonical-form contract unpinned. @google-cloud/bigquery's BigQuery.datetime() already normalises Z-suffixed input to BigQuery's canonical DATETIME form, so the production code is correct. This test guards against future regressions.
The regex in PartitionValueConverter accepts both "T" and a literal space as the date/time separator (RFC 3339 allows either), but no test covered the space form. Pinned so the contract is explicit.
ISO 8601 allows "24:00:00" as an alias for next-day midnight, and JS Date parses it that way. The new string parsing path in 0.3.2 would therefore silently roll inputs like "2024-01-15T24:00:00Z" forward to the "2024-01-16" partition. That is a regression vs 0.2.x, where the raw string was passed straight to BigQuery and BigQuery rejected hour 24 outright, surfacing the bad row instead of misfiling it. Reject hour 24 explicitly so the caller logs firestoreTimePartitionFieldError and the row lands in the __NULL__ partition (loud failure, matching the calendar-validator philosophy already applied to Feb 30 etc.). Minute and second out-of-range values are still caught by new Date() returning Invalid Date in the existing check below. Adds a test pinning the hour 24 rejection.
Bumps 7 extensions to capture security and minor dep bumps that have accumulated since eea0e40 (the last collective bump on 2026-04-14). firestore-bigquery-export is already current at 0.3.2 from #2813. - delete-user-data 0.1.28 -> 0.1.29 - firestore-counter 0.2.15 -> 0.2.16 - firestore-send-email 0.2.8 -> 0.2.9 - firestore-shorten-urls-bitly 0.2.4 -> 0.2.5 - firestore-translate-text 0.1.28 -> 0.1.29 - rtdb-limit-child-nodes 0.1.18 -> 0.1.19 - storage-resize-images 0.3.3 -> 0.3.4
Summary
order_week: "2026-01-01"withTIME_PARTITIONING_FIELD_TYPE=DATE) are accepted. The 0.3.0 refactor narrowedPartitionValueConverterto FirestoreTimestamp/ timestamp-like /Dateonly, silently coercing strings toNULLand landing rows in the__NULL__partition.PartitionValueConverter.convertnow parses strings vianew Date(value). Unparseable strings still returnnulland trigger the existing warning log.try/catcharound the BigQuery formatter switch so any serialization failure degrades tonull+ warn rather than crashing the row write.Fixes #2803
Test plan
cd firestore-bigquery-export/firestore-bigquery-change-tracker && npm run test:local -- src/__tests__/bigquery/partitioning/converter.test.ts— 29/29 passnpm run test:local— converter + config + unit suites pass; the 8 pre-existing failures ine2e.test.tsare present onorigin/nextbaseline (verified) and unrelated to this changenpx tsc --noEmit— cleanextension.yamlversion to0.3.2as part of release workflow (CHANGELOG entry added)