Commit 697f501
### Rationale for this change
As of today, it's not possible to write Parquet `TIME` data whose `isAdjustedToUTC` parameter is `false`. Instead, `isAdjustedToUTC` is hard-coded to `true` [here](https://github.com/apache/arrow/blob/2dd3ccda6437f79aa34641bd3197dd7392ae4aec/cpp/src/parquet/arrow/schema.cc#L431).
Unfortunately, some Parquet consumers only support `TIME` data if the `isAdjustedToUTC` parameter is `false`, meaning they cannot import Parquet `TIME` data generated by our Parquet Writer. For example, the apache/spark Parquet reader only supports Parquet `TIME` columns if [`isAdjustedToUTC=false` and `units=MICROSECONDS`](https://github.com/apache/spark/blob/554f6b64f1e2b2346499f6d3340a3695244bfc84/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetSchemaConverter.scala#L309).
Adding support for writing `TIME` data with `isAdjustedToUTC` set to `false` would unblock users who need to write Spark-compatible Parquet data.
### What changes are included in this PR?
1. Added a `write_time_adjusted_to_utc` property to `parquet::ArrowWriterProperties`. If `true`, all `TIME` columns are written with `isAdjustedToUTC` set to `true`. Otherwise, `isAdjustedToUTC` is set to `false` for all `TIME` columns. This property is `false` by default.
2. Added `enable_write_time_adjusted_to_utc()` and `disable_write_time_adjusted_to_utc()` methods to `parquet::ArrowWriterProperties::Builder`.
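
The behavior described in the two items above can be sketched with a small stand-alone model. This is not the Arrow source: `WriterProperties`, `TimeAnnotation`, `TimeUnit`, and `AnnotateTime` are hypothetical stand-ins for illustration, and the getter name `write_time_adjusted_to_utc()` is assumed from the property name. Only the two builder method names and the `false` default come from this PR description.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the unit of a Parquet TIME logical type.
enum class TimeUnit { kMillis, kMicros, kNanos };

// Hypothetical stand-in for the TIME annotation written to the Parquet schema.
struct TimeAnnotation {
  bool is_adjusted_to_utc;
  TimeUnit unit;
};

// Simplified model of parquet::ArrowWriterProperties (assumption: the real
// class carries many more settings; only the new flag is modeled here).
class WriterProperties {
 public:
  class Builder {
   public:
    // Method names taken from the PR description.
    Builder* enable_write_time_adjusted_to_utc() {
      write_time_adjusted_to_utc_ = true;
      return this;
    }
    Builder* disable_write_time_adjusted_to_utc() {
      write_time_adjusted_to_utc_ = false;
      return this;
    }
    std::shared_ptr<WriterProperties> build() const {
      return std::make_shared<WriterProperties>(write_time_adjusted_to_utc_);
    }

   private:
    // Defaults to false, as stated in the PR description.
    bool write_time_adjusted_to_utc_ = false;
  };

  explicit WriterProperties(bool write_time_adjusted_to_utc)
      : write_time_adjusted_to_utc_(write_time_adjusted_to_utc) {}

  // Assumed getter name, mirroring the property name.
  bool write_time_adjusted_to_utc() const { return write_time_adjusted_to_utc_; }

 private:
  bool write_time_adjusted_to_utc_;
};

// Models the schema-conversion step: every TIME column receives the same
// isAdjustedToUTC value, taken from the writer properties.
TimeAnnotation AnnotateTime(TimeUnit unit, const WriterProperties& props) {
  return TimeAnnotation{props.write_time_adjusted_to_utc(), unit};
}
```

In this model, a caller who needs the previous behavior would call `enable_write_time_adjusted_to_utc()` on the builder before `build()`, while a caller targeting Spark would keep the default and use microsecond units.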
### Are these changes tested?
Yes. I added the test case `ParquetTimeAdjustedToUTC` to the test suite `TestConvertArrowSchema`.
### Are there any user-facing changes?
Yes. Users can now configure the `isAdjustedToUTC` parameter for Parquet `TIME` data.
NOTE: This change introduces an incompatibility. The default value of the `isAdjustedToUTC` parameter for `TIME` columns is now `false` instead of `true`.
### NOTE
1. I did not update the PyArrow interface because I am not familiar with that code base. I plan to create a new GitHub issue to track that work separately.
2. An open PR (#43268) already addresses this issue. However, that PR was last active over a year ago and seems stale.
* GitHub Issue: #41476
Lead-authored-by: Sarah Gilmore <[email protected]>
Co-authored-by: Sarah Gilmore <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Sarah Gilmore <[email protected]>
1 parent de52048 commit 697f501
3 files changed under `cpp/src/parquet` (including `arrow/`): +89, -13 lines