First of all, thanks for this awesome work. It's been great to see the progress on all this :-)
In the README example, you load a .parquet file that contains a geometry column. Since there is no separate naming convention (e.g. .geo.parquet or .geoparquet), I might not know that there is a geometry in there, so I would just load arrow and open the dataset as normal. The geometry column would then be confusing to look at, and the behavior differs depending on whether the geoarrow package is loaded.
library(tidyverse)
library(arrow)
open_dataset("~/Desktop/nc.parquet") |>
head(n = 1) |>
pull(geometry, as_vector = TRUE)
#> <arrow_binary[1]>
#> [1] 01, 06, 00, 00, 00, 01, 00, 00, 00, 01, 03, 00, 00, 00, 01, 00, 00, 00, 1b, 00, 00, 00, 00, 00, 00, a0, 41, 5e, 54, c0, 00, 00, ...
library(geoarrow)
open_dataset("~/Desktop/nc.parquet") |>
head(n = 1) |>
pull(geometry, as_vector = TRUE)
#> <geoarrow_wkb[1]>
#> [1] MULTIPOLYGON (((-81.47276 36.23436, -81.54084 36.27251, -81.56198 36.27359, -81.63306 36.34069, -81.74107 36.39178, -81.69828 36.47178...
This issue might belong in the R arrow package instead, but I'm wondering whether arrow should detect when a geometry column is present and adjust its behavior (the metadata is in there, so this information is known). For example, when calling collect(), should there be a warning that a geometry column is being collected and that geoarrow::st_collect() might be the better option (as in #21)? Or a warning when opening a geoparquet file without geoarrow loaded?
library(tidyverse)
library(arrow)
nc = open_dataset("~/Desktop/nc.parquet")
# We know there is a geometry from the metadata
nc$metadata[[1]]
#> [1] "{\"version\":\"0.3.0\",\"primary_column\":\"geometry\",\"columns\":{\"geometry\":{\"encoding\":\"WKB\",\"crs\":\"GEOGCS[\\\"NAD27\\\",DATUM[\\\"North_American_Datum_1927\\\",SPHEROID[\\\"Clarke 1866\\\",6378206.4,294.978698213898]],PRIMEM[\\\"Greenwich\\\",0],UNIT[\\\"degree\\\",0.0174532925199433,AUTHORITY[\\\"EPSG\\\",\\\"9122\\\"]],AXIS[\\\"Latitude\\\",NORTH],AXIS[\\\"Longitude\\\",EAST],AUTHORITY[\\\"EPSG\\\",\\\"4267\\\"]]\",\"bbox\":[-84.3239,33.882,-75.457,36.5896],\"geometry_type\":\"MultiPolygon\"}}}"
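To make the idea concrete, here is a rough sketch of the kind of check I have in mind. It is not how arrow or geoarrow currently work: `geo_column_names()` and `warn_if_geometry()` are hypothetical helper names, and the sketch assumes the jsonlite package is available and that the metadata string has the shape shown above (a `columns` object keyed by column name).

```r
library(jsonlite)

# Hypothetical helper (not part of arrow or geoarrow): return the names of the
# geometry columns declared in a GeoParquet-style metadata JSON string.
geo_column_names <- function(geo_json) {
  if (is.null(geo_json)) {
    return(character(0))
  }
  geo <- fromJSON(geo_json, simplifyVector = FALSE)
  cols <- names(geo$columns)
  if (is.null(cols)) character(0) else cols
}

# Hypothetical helper: warn when geometry columns are declared in the metadata
# but the geoarrow namespace has not been loaded.
warn_if_geometry <- function(geo_json) {
  cols <- geo_column_names(geo_json)
  if (length(cols) > 0 && !isNamespaceLoaded("geoarrow")) {
    warning(
      "Geometry column(s) found: ", paste(cols, collapse = ", "),
      ". Consider loading geoarrow before collecting.",
      call. = FALSE
    )
  }
  invisible(cols)
}

# Shortened version of the metadata string shown above
geo_json <- '{"version":"0.3.0","primary_column":"geometry","columns":{"geometry":{"encoding":"WKB","geometry_type":"MultiPolygon"}}}'
geo_column_names(geo_json)
```

With the dataset from the example, this could be called as `warn_if_geometry(nc$metadata[[1]])`, assuming the geo metadata is the first metadata entry as it is here.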