-
Notifications
You must be signed in to change notification settings - Fork 22
GeoPackage reader #183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
GeoPackage reader #183
Conversation
Will comb through for various edges cases (empty, 3D, etc.) once the writer is gpkg spec-compliant implementation
…d import DBInterface
… commits and removing unneccesary branches
…idiomatic modifications, and map do syntax for meshes collection
… abstract WKBGeometry type added, handling failing tests (1 erroring gisissues.jl, 1 failing novalues.jl), fixed bugs gpkgmeshattrs and gpkgmesh
… function for ring vs rope extents
…ed, idiomatic refactor of SQLite.load! internal transactions, small fixes
…ite.Stmt and Transaction
…coords already float
…nd table inserts, updated geopackage test conform to specification req 29
…ures... and other cases
src/extra/gpkg/read.jl
Outdated
| code = metadata.code | ||
| srsid = metadata.srsid | ||
| if srsid == 0 || srsid == 4326 | ||
| crs = isone(metadata.zextent) ? LatLonAlt{WGS84Latest} : LatLon{WGS84Latest} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bounded crs conditionally to LatLonAlt if z field in the GeoPackage gpkg_geometry_columns table has a 1.
If LatLon is bounded to a crs that has an axis values recorded for {x,y,{z}}.
The outputted Geometry is reduced by one axis to fit LatLon{Datum, 2} instead of including the z axis.
This removal of the z axis happens in getcoord in conversion.jl where its calls raw(coords(p))
The method overload to retain the z/alt value
raw(coords::CRS) = coords.x, coords.y
raw(coords::LatLon) = coords.lon, coords.lat
raw(coords::LatLonAlt) = coords.lon, coords.lat, coords.alt
Is this not an appropriate solution to handling the z-dimension?
Should there also be {x,y,{z}} method overload for raw(coord::CRS) = coords.x, coords.y, coords.z as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to adhere to the GPKG spec. The srdid == 4326 probably means LatLon{WGS84Latest}, but you are using the zextent to switch to another type, which is probably incorrect.
If you use the spec as your only source of truth, we will be much more safe.
src/extra/gpkg/read.jl
Outdated
| table = map(gpkgbinary) do row | ||
| # According to https://www.geopackage.org/spec/#r30 | ||
| # A feature table or view SHALL have only one geometry column. | ||
| geomindex = findfirst(==(Symbol(geomcolumn)), keys(row)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per the previous comments on simplifying this check for the geomcolumn:
The main conflict is the test for duplicate fieldnames in the values Tuple, otherwise we could simplify this iteration to conditionally check for geomcolumn by its name rather than using its geomindex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main conflict is the test for duplicate fieldnames in the values Tuple, otherwise we could simplify this iteration to conditionally check for
geomcolumnby its name rather than using itsgeomindex
Is that possible? I thought that the rows of this table contained the attributes of the geometry as named tuples, which by definition are unique names. You can't have duplicate field names in a named tuple.
| # calculate GeoPackageBinaryHeader size in byte stream given extent of envelope: | ||
| # envelope is [minx, maxx, miny, maxy, minz, maxz], 48 bytes or envelope is [minx, maxx, miny, maxy], 32 bytes or no envelope, 0 bytes | ||
| # byte[2] magic + byte[1] version + byte[1] flag + byte[4] srs_id + byte[(8*2)×(x,y{,z})] envelope | ||
| skiplen = iszero(envelope) ? 4 : 4 + 8 * 2 * (envelope + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm swapping to using skip (before seek) and starting from byte[4] given previous reads.
if we have no envelope then we will skip srs_id byte[4]
otherwise we calculate the amount of bytes needed to skip srs_id and envelope bytes
src/extra/gpkg/read.jl
Outdated
| # According to https://www.geopackage.org/spec/#r6 and https://www.geopackage.org/spec/#r7 | ||
| # PRAGMA integrity_check returns a single row with the value 'ok' | ||
| # PRAGMA foreign_key_check (w/ no parameter value) returns an empty result set | ||
| if first(DBInterface.execute(db, "PRAGMA integrity_check;")).integrity_check != "ok" || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if first(DBInterface.execute(db, "PRAGMA integrity_check;")).integrity_check != "ok" || | |
| if only(DBInterface.execute(db, "PRAGMA integrity_check;")).integrity_check != "ok" || |
When the query returns a single entry, we can replace first by only to get a safer check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please split these two checks into separate paragraphs.
src/extra/gpkg/read.jl
Outdated
| # According to https://www.geopackage.org/spec/#r10 and https://www.geopackage.org/spec/#r13 | ||
| # A GeoPackage SHALL include a 'gpkg_spatial_ref_sys' table and a 'gpkg_contents table' | ||
| if first(DBInterface.execute( | ||
| db, | ||
| """ | ||
| SELECT COUNT(*) AS n FROM sqlite_master WHERE | ||
| name IN ('gpkg_spatial_ref_sys', 'gpkg_contents') AND | ||
| type IN ('table', 'view'); | ||
| """ | ||
| )).n != 2 | ||
| throw(ErrorException("missing required metadata tables in the GeoPackage SQL database")) | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the most idiomatic query to find out if a given table is present in the database? It seems you are using a count to find out if the table is present. I wonder if we couldn't be more explicit with some special query that simply says yes or no when a table is present.
We should probably split this check into two checks, one for each required table.
src/extra/gpkg/read.jl
Outdated
| # gpkg_contents table. | ||
| function gpkgextract(db; layer=1) | ||
| # get the first (and only) feature table returned in sqlite query results | ||
| metadata = first(DBInterface.execute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| metadata = first(DBInterface.execute( | |
| metadata = only(DBInterface.execute( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ERROR: LoadError: TypeError: non-boolean (Missing) used in boolean context
Each row is only valid when called and for some reason only invalidates the forward-only iterator context
SQLite.Row{false}:
:tablename missing
:geomcolumn missing
:srsid missing
:org missing
:orgcoordsysid missing
first retains the cursor's current step after it calls iterate as I understand it
src/extra/gpkg/read.jl
Outdated
| # According to Geometry Columns Table Requirements | ||
| # https://www.geopackage.org/spec/#:~:text=2.1.5.1.2.%20Table%20Data%20Values | ||
| #------------------------------------------------------------------------------ | ||
| # Requirement 16: https://www.geopackage.org/spec/#r16 | ||
| # Values of the gpkg_contents table srs_id column | ||
| # SHALL reference values in the gpkg_spatial_ref_sys table srs_id column | ||
| # | ||
| # Requirement 18: https://www.geopackage.org/spec/#r18 | ||
| # The gpkg_contents table SHALL contain a row | ||
| # with a lowercase data_type column value of "features" | ||
| # for each vector features user data table or view. | ||
| # | ||
| # Requirement 22: gpkg_geometry_columns table | ||
| # SHALL contain one row record for the geometry column | ||
| # in each vector feature data table | ||
| # | ||
| # Requirement 23: gpkg_geometry_columns table_name column | ||
| # SHALL reference values in the gpkg_contents table_name column | ||
| # for rows with a data_type of 'features' | ||
| # | ||
| # Requirement 24: The column_name column value in a gpkg_geometry_columns row | ||
| # SHALL be the name of a column in the table or view specified by the table_name | ||
| # column value for that row. | ||
| # | ||
| # Requirement 25: The geometry_type_name value in a gpkg_geometry_columns row | ||
| # SHALL be one of the uppercase geometry type names specified | ||
| # | ||
| # Requirement 146: The srs_id value in a gpkg_geometry_columns table row | ||
| # SHALL match the srs_id column value from the corresponding row in the | ||
| # gpkg_contents table. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try to place comments right before the lines of code where they are needed. Right now it is difficult to see which lines of code require some of these assumptions.
|
|
||
| function wkbgeombuffer(row, geomcolumn) | ||
| # get the column of SQL Geometry Binary specified by gpkg_geometry_columns table in column_name field | ||
| buff = IOBuffer(getproperty(row, Symbol(geomcolumn))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of object do you get if you don't wrap the result with IOBuffer? Isn't it a valid IO stream already? Or a Vector{UInt8}? Depending on the type we can decide what to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is a Vector{UInt8}
sample:
UInt8[0x47, 0x50, 0x00, 0x01, 0xe6, 0x10, 0x00, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
src/extra/gis.jl
Outdated
| elseif endswith(fname, ".gpkg") | ||
| return gpkgread(fname; layer, kwargs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notice that you are adding the function call in the wrong place. The gpkgread function already returns a GeoTable, it needs to be called in the body of gisread. If you call it inside gistable, then the result will be forwarded to asgeotable which is a duplication of work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should even consider moving this call to the root load function. The pipeline we are discussing for gpkg is very different from the other GIS backends.
| aspatial = getindex.(table, 1) | ||
|
|
||
| # check if type of aspatial elements generated are calls to the macro for declaring `NamedTuple` types | ||
| eltype(aspatial) != @NamedTuple{} || return nothing, getindex.(table, 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returns here if aspatial = [NamedTuple(), ..., NamedTuple()]


No description provided.