fix(gdal): st_read_meta returns empty results for /vsi* paths#775

Open
jatorre wants to merge 1 commit into duckdb:v1.4-andium from jatorre:fix/st-read-meta-vsi-paths
@jatorre jatorre commented Mar 18, 2026

Summary

st_read_meta() silently returns 0 rows when called with GDAL virtual filesystem paths (/vsizip/, /vsicurl/, /vsigs/, etc.). The Bind function resolves the path with MultiFileReader::CreateFileList(), which doesn't understand /vsi* prefixes and returns an empty list. Because the call passes ALLOW_EMPTY, the empty list is accepted without an error.

This is inconsistent with st_read(), which correctly handles /vsi* paths by passing them directly to GDAL via GetPrefix().

The Fix

When the input path starts with /vsi, bypass MultiFileReader and pass the path directly to GDAL, matching the existing behavior of st_read(). Non-/vsi paths continue to use MultiFileReader for glob expansion.

Before:

-- Returns 0 rows silently
SELECT * FROM st_read_meta('/vsizip//vsicurl/https://example.com/data.zip');

After:

-- Returns full metadata (layers, geometry types, CRS, fields)
SELECT * FROM st_read_meta('/vsizip//vsicurl/https://example.com/data.zip');
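
The branching described under "The Fix" can be sketched in isolation as follows. This is a minimal, self-contained illustration and not the code in this PR: `IsGdalVirtualPath`, `ResolveMetaFiles`, and the `GlobExpander` callback are hypothetical names, and the callback merely stands in for whatever MultiFileReader does for ordinary paths.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for glob expansion of ordinary paths.
// It is NOT DuckDB's MultiFileReader API, only a placeholder for it.
using GlobExpander = std::function<std::vector<std::string>(const std::string &)>;

// True when the path starts with "/vsi" (e.g. /vsizip/, /vsicurl/, /vsigs/).
inline bool IsGdalVirtualPath(const std::string &path) {
    return path.rfind("/vsi", 0) == 0; // rfind at position 0 is a prefix test
}

// Decide which file names the metadata reader should hand to GDAL.
inline std::vector<std::string> ResolveMetaFiles(const std::string &path,
                                                 const GlobExpander &expand_glob) {
    assert(expand_glob && "a glob expander must be supplied");
    if (IsGdalVirtualPath(path)) {
        // Bypass glob expansion: GDAL interprets the /vsi* path itself.
        return {path};
    }
    // Ordinary paths keep going through glob expansion.
    return expand_glob(path);
}
```

With this shape, a /vsi* path always produces exactly one entry, so the empty-list case can no longer arise for it.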

Root Cause

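In gdal_module.cpp, ST_Read_Meta::Bind currently builds its file list like this:

// Resolve the input through DuckDB's multi-file machinery.
const auto mfreader = MultiFileReader::Create(input.table_function);
// ALLOW_EMPTY: zero matches is not an error, so a /vsi* path yields an empty list.
const auto mflist = mfreader->CreateFileList(context, input.inputs[0], FileGlobOptions::ALLOW_EMPTY);
// BindData receives no files, so the function returns 0 rows.
return make_uniq_base<FunctionData, BindData>(mflist->GetAllFiles());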

MultiFileReader::CreateFileList tries to resolve /vsizip/... through DuckDB's VFS layer, fails, and returns an empty list. The ALLOW_EMPTY flag suppresses the error.
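
Why the failure is silent can be shown with a toy model. The sketch below is not DuckDB's implementation: `ToyEmptyPolicy` and `ToyCreateFileList` are hypothetical, and the "filesystem" is just a list of known names matched exactly. It only illustrates how a permissive empty-result policy turns "nothing resolved" into a valid empty list.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical policy flag, loosely modelled on the ALLOW_EMPTY option above.
enum class ToyEmptyPolicy { Disallow, Allow };

// Toy resolver: returns every known file whose name equals the pattern.
// A /vsi* path is unknown to this "filesystem", so it never matches.
inline std::vector<std::string> ToyCreateFileList(const std::string &pattern,
                                                  const std::vector<std::string> &known_files,
                                                  ToyEmptyPolicy policy) {
    std::vector<std::string> matches;
    for (const auto &file : known_files) {
        if (file == pattern) {
            matches.push_back(file);
        }
    }
    // Only the strict policy reports the problem; the permissive one
    // hands back an empty list that the caller treats as "no rows".
    if (matches.empty() && policy == ToyEmptyPolicy::Disallow) {
        throw std::runtime_error("no files match: " + pattern);
    }
    return matches;
}
```

Under the permissive policy the caller cannot distinguish "the path is wrong" from "the path is valid but unsupported", which is the symptom reported here.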

Meanwhile, st_read avoids this by using GDALClientContextState::GetPrefix() which already has explicit /vsi handling (line 446):

if (StringUtil::StartsWith(value, "/vsi")) {
    return value;  // Pass through to GDAL unchanged
}

Use Case

This enables listing layers in remote archives without downloading them:

-- Discover layers in a remote shapefile zip
SELECT unnest(layers) FROM st_read_meta('/vsizip//vsicurl/https://storage.googleapis.com/bucket/data.zip');

-- Works with signed URLs using the {braces} syntax
SELECT * FROM st_read_meta('/vsizip/{/vsicurl/https://...?X-Goog-Signature=...}/layer.shp');

Previously, the only workarounds were calling ogrinfo as an external process or downloading the file locally first.
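
For callers that assemble these chained paths programmatically, a small helper can keep the two forms from the queries above straight. This is only a sketch: `VsiCurl` and `VsiZip` are hypothetical helpers, not part of the extension or of GDAL. They simply concatenate strings in the two shapes shown in this description and do no validation or URL escaping.

```cpp
#include <string>

// Wrap a remote URL so GDAL reads it through /vsicurl/.
inline std::string VsiCurl(const std::string &url) {
    return "/vsicurl/" + url;
}

// Chain /vsizip/ over an inner path.
// Without an entry, the plain chained form is produced.
// With an entry, the inner path is wrapped in braces so the archive
// member name can follow it, as in the signed-URL example.
inline std::string VsiZip(const std::string &inner, const std::string &entry = "") {
    if (entry.empty()) {
        return "/vsizip/" + inner;
    }
    return "/vsizip/{" + inner + "}/" + entry;
}
```

For example, `VsiZip(VsiCurl("https://example.com/data.zip"))` yields the path used in the Before/After queries, and passing `"layer.shp"` as the second argument yields the braces form. The resulting string would still need to be quoted appropriately before being embedded in SQL.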

Related Issues

st_read_meta() uses MultiFileReader to resolve file paths, but
MultiFileReader doesn't understand GDAL virtual filesystem prefixes
(/vsizip/, /vsicurl/, /vsigs/, etc.). It silently returns an empty
file list (due to ALLOW_EMPTY), causing st_read_meta to return 0 rows.

This is inconsistent with st_read(), which bypasses MultiFileReader
for /vsi* paths and passes them directly to GDAL.

The fix detects /vsi* prefixes in the Bind function and creates the
file list directly, matching the behavior of st_read. Non-/vsi paths
continue to use MultiFileReader for glob expansion.

Before: st_read_meta('/vsizip//vsicurl/https://...') → 0 rows
After:  st_read_meta('/vsizip//vsicurl/https://...') → full metadata

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jatorre added a commit to jatorre/duckdb-spatial that referenced this pull request Mar 19, 2026
…duckdb#776 (KML axis order)

Combines two fixes for the CARTO import/export pipeline:

1. st_read_meta: bypass MultiFileReader for /vsi* paths (PR duckdb#775)
   - Enables pure DuckDB layer discovery on remote files
   - Eliminates ogrinfo external dependency

2. COPY TO: set OAMS_TRADITIONAL_GIS_ORDER when SRS is provided (PR duckdb#776)
   - Fixes KML export for coordinates with longitude > 90°
   - Eliminates ogr2ogr fallback for KML

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jatorre added a commit to jatorre/duckdb-spatial that referenced this pull request Mar 19, 2026
…uckdb#775)

Based on v1.5-variegata branch (compatible with DuckDB v1.5.0).
KML axis order fix (PR duckdb#776) is already in this branch via geometry_always_xy setting.