Skip to content

Conversation

@jph6366
Copy link

@jph6366 jph6366 commented Nov 18, 2025

No description provided.

jph6366 and others added 30 commits August 4, 2025 09:07
Will comb through for various edges cases (empty, 3D, etc.) once the writer is gpkg spec-compliant implementation
…idiomatic modifications, and map do syntax for meshes collection
… abstract WKBGeometry type added,

handling failing tests (1 erroring gisissues.jl, 1 failing novalues.jl), fixed bugs gpkgmeshattrs and gpkgmesh
…ed, idiomatic refactor of SQLite.load! internal transactions, small fixes
…nd table inserts, updated geopackage test conform to specification req 29
@jph6366 jph6366 marked this pull request as ready for review November 18, 2025 22:59
code = metadata.code
srsid = metadata.srsid
if srsid == 0 || srsid == 4326
crs = isone(metadata.zextent) ? LatLonAlt{WGS84Latest} : LatLon{WGS84Latest}
Copy link
Author

@jph6366 jph6366 Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I bounded crs conditionally to LatLonAlt if z field in the GeoPackage gpkg_geometry_columns table has a 1.

If LatLon is bounded to a crs that has an axis values recorded for {x,y,{z}}.
The outputted Geometry is reduced by one axis to fit LatLon{Datum, 2} instead of including the z axis.

This removal of the z axis happens in getcoord in conversion.jl where its calls raw(coords(p))

The method overload to retain the z/alt value

raw(coords::CRS) = coords.x, coords.y
raw(coords::LatLon) = coords.lon, coords.lat
raw(coords::LatLonAlt) = coords.lon, coords.lat, coords.alt

Is this not an appropriate solution to handling the z-dimension?

Should there also be {x,y,{z}} method overload for raw(coord::CRS) = coords.x, coords.y, coords.z as well?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to adhere to the GPKG spec. The srdid == 4326 probably means LatLon{WGS84Latest}, but you are using the zextent to switch to another type, which is probably incorrect.

If you use the spec as your only source of truth, we will be much more safe.

table = map(gpkgbinary) do row
# According to https://www.geopackage.org/spec/#r30
# A feature table or view SHALL have only one geometry column.
geomindex = findfirst(==(Symbol(geomcolumn)), keys(row))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the previous comments on simplifying this check for the geomcolumn:

The main conflict is the test for duplicate fieldnames in the values Tuple, otherwise we could simplify this iteration to conditionally check for geomcolumn by its name rather than using its geomindex

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main conflict is the test for duplicate fieldnames in the values Tuple, otherwise we could simplify this iteration to conditionally check for geomcolumn by its name rather than using its geomindex

Is that possible? I thought that the rows of this table contained the attributes of the geometry as named tuples, which by definition are unique names. You can't have duplicate field names in a named tuple.

# calculate GeoPackageBinaryHeader size in byte stream given extent of envelope:
# envelope is [minx, maxx, miny, maxy, minz, maxz], 48 bytes or envelope is [minx, maxx, miny, maxy], 32 bytes or no envelope, 0 bytes
# byte[2] magic + byte[1] version + byte[1] flag + byte[4] srs_id + byte[(8*2)×(x,y{,z})] envelope
skiplen = iszero(envelope) ? 4 : 4 + 8 * 2 * (envelope + 1)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm swapping to using skip (before seek) and starting from byte[4] given previous reads.

if we have no envelope then we will skip srs_id byte[4]

otherwise we calculate the amount of bytes needed to skip srs_id and envelope bytes

# According to https://www.geopackage.org/spec/#r6 and https://www.geopackage.org/spec/#r7
# PRAGMA integrity_check returns a single row with the value 'ok'
# PRAGMA foreign_key_check (w/ no parameter value) returns an empty result set
if first(DBInterface.execute(db, "PRAGMA integrity_check;")).integrity_check != "ok" ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if first(DBInterface.execute(db, "PRAGMA integrity_check;")).integrity_check != "ok" ||
if only(DBInterface.execute(db, "PRAGMA integrity_check;")).integrity_check != "ok" ||

When the query returns a single entry, we can replace first by only to get a safer check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please split these two checks into separate paragraphs.

Comment on lines 24 to 35
# According to https://www.geopackage.org/spec/#r10 and https://www.geopackage.org/spec/#r13
# A GeoPackage SHALL include a 'gpkg_spatial_ref_sys' table and a 'gpkg_contents table'
if first(DBInterface.execute(
db,
"""
SELECT COUNT(*) AS n FROM sqlite_master WHERE
name IN ('gpkg_spatial_ref_sys', 'gpkg_contents') AND
type IN ('table', 'view');
"""
)).n != 2
throw(ErrorException("missing required metadata tables in the GeoPackage SQL database"))
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the most idiomatic query to find out if a given table is present in the database? It seems you are using a count to find out if the table is present. I wonder if we couldn't be more explicit with some special query that simply says yes or no when a table is present.

We should probably split this check into two checks, one for each required table.

# gpkg_contents table.
function gpkgextract(db; layer=1)
# get the first (and only) feature table returned in sqlite query results
metadata = first(DBInterface.execute(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
metadata = first(DBInterface.execute(
metadata = only(DBInterface.execute(

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ERROR: LoadError: TypeError: non-boolean (Missing) used in boolean context

Each row is only valid when called and for some reason only invalidates the forward-only iterator context

SQLite.Row{false}:
 :tablename      missing
 :geomcolumn     missing
 :srsid          missing
 :org            missing
 :orgcoordsysid  missing

first retains the cursor's current step after it calls iterate as I understand it

Comment on lines 40 to 69
# According to Geometry Columns Table Requirements
# https://www.geopackage.org/spec/#:~:text=2.1.5.1.2.%20Table%20Data%20Values
#------------------------------------------------------------------------------
# Requirement 16: https://www.geopackage.org/spec/#r16
# Values of the gpkg_contents table srs_id column
# SHALL reference values in the gpkg_spatial_ref_sys table srs_id column
#
# Requirement 18: https://www.geopackage.org/spec/#r18
# The gpkg_contents table SHALL contain a row
# with a lowercase data_type column value of "features"
# for each vector features user data table or view.
#
# Requirement 22: gpkg_geometry_columns table
# SHALL contain one row record for the geometry column
# in each vector feature data table
#
# Requirement 23: gpkg_geometry_columns table_name column
# SHALL reference values in the gpkg_contents table_name column
# for rows with a data_type of 'features'
#
# Requirement 24: The column_name column value in a gpkg_geometry_columns row
# SHALL be the name of a column in the table or view specified by the table_name
# column value for that row.
#
# Requirement 25: The geometry_type_name value in a gpkg_geometry_columns row
# SHALL be one of the uppercase geometry type names specified
#
# Requirement 146: The srs_id value in a gpkg_geometry_columns table row
# SHALL match the srs_id column value from the corresponding row in the
# gpkg_contents table.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try to place comments right before the lines of code where they are needed. Right now it is difficult to see which lines of code require some of these assumptions.


function wkbgeombuffer(row, geomcolumn)
# get the column of SQL Geometry Binary specified by gpkg_geometry_columns table in column_name field
buff = IOBuffer(getproperty(row, Symbol(geomcolumn)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What type of object do you get if you don't wrap the result with IOBuffer? Isn't it a valid IO stream already? Or a Vector{UInt8}? Depending on the type we can decide what to do.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is a Vector{UInt8}
sample:

UInt8[0x47, 0x50, 0x00, 0x01, 0xe6, 0x10, 0x00, 0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]

src/extra/gis.jl Outdated
Comment on lines 66 to 67
elseif endswith(fname, ".gpkg")
return gpkgread(fname; layer, kwargs...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice that you are adding the function call in the wrong place. The gpkgread function already returns a GeoTable, it needs to be called in the body of gisread. If you call it inside gistable, then the result will be forwarded to asgeotable which is a duplication of work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should even consider moving this call to the root load function. The pipeline we are discussing for gpkg is very different from the other GIS backends.

aspatial = getindex.(table, 1)

# check if type of aspatial elements generated are calls to the macro for declaring `NamedTuple` types
eltype(aspatial) != @NamedTuple{} || return nothing, getindex.(table, 2)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returns here if aspatial = [NamedTuple(), ..., NamedTuple()]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants