feat: support remote pmtiles directory auto-discovery#2704

Draft
carderne wants to merge 1 commit into maplibre:main from carderne:remote-pmtiles-dirs

@carderne
Contributor

NB: Still a work-in-progress, but tested to be working with a GCS bucket I control.

Resolves this issue: #2180

In theory quite simple, but there are a bunch of design choices I'm not really sure about...

Considerations

Sequential load

It has to do a sequential GET for each pmtiles file, which will be very slow for large collections. Tried pointing it at a bucket with 5k+ pmtiles :P That would be super slow, and I'm sure memory would be an issue too.

A single failing header open inside a prefix aborts the whole prefix (same as the existing single-URL behavior). This could be softened to per-file warnings without much work if maintainers prefer.
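For illustration, the per-file-warning variant mentioned above could look roughly like the sketch below. It is not the code in this PR: `open_all` and its `open` callback are hypothetical names standing in for whatever opens a PMTiles header, and the warning goes to stderr rather than through the project's logging macros.

```rust
use std::fmt::Display;

/// Hypothetical helper: try to open every listed URL, keep the ones that
/// succeed, and record (rather than propagate) the ones that fail.
fn open_all<T, E: Display>(
    urls: &[String],
    mut open: impl FnMut(&str) -> Result<T, E>,
) -> (Vec<T>, Vec<String>) {
    let mut sources = Vec::new();
    let mut skipped = Vec::new();
    for url in urls {
        match open(url) {
            Ok(source) => sources.push(source),
            Err(err) => {
                // One bad file no longer aborts the whole prefix.
                eprintln!("warning: skipping {url}: {err}");
                skipped.push(url.clone());
            }
        }
    }
    (sources, skipped)
}
```

The loop is still sequential, so this only changes the failure behavior, not the slow startup on large prefixes.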

Path vs file detection

Just does "URL path ends with .pmtiles" -> single file, else list as prefix
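A minimal sketch of that rule, using only the standard library. The names `Target`, `url_path`, and `classify` are made up for this example and are not taken from the PR; the real code presumably works on a parsed URL type instead of raw strings.

```rust
/// Hypothetical result of the detection step.
#[derive(Debug, PartialEq, Eq)]
enum Target {
    SingleFile,
    Prefix,
}

/// Extract the path component of a URL string, ignoring query and fragment.
fn url_path(url: &str) -> &str {
    let no_fragment = url.split('#').next().unwrap_or(url);
    let no_query = no_fragment.split('?').next().unwrap_or(no_fragment);
    // Skip "scheme://" if present, then skip the authority (bucket or host).
    let after_scheme = match no_query.find("://") {
        Some(i) => &no_query[i + 3..],
        None => no_query,
    };
    match after_scheme.find('/') {
        Some(i) => &after_scheme[i..],
        None => "",
    }
}

/// "URL path ends with .pmtiles" -> single file, else treat it as a prefix.
fn classify(url: &str) -> Target {
    if url_path(url).ends_with(".pmtiles") {
        Target::SingleFile
    } else {
        Target::Prefix
    }
}
```

Under this sketch `s3://bucket/tiles/world.pmtiles` is a single file and `s3://bucket/tiles/` is a prefix. The check is case-sensitive, so a key ending in `.PMTILES` would be listed as a prefix.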

New URL creation

Each listed object's URL is rebuilt via child.set_path("/{location}"). This works for s3://, gs://, az://, and file:// because in all of them the URL path component is the object_store key. Not tested across backends... http:// listing is rare and may behave unevenly; the error path should handle it.
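To make the reconstruction step concrete, here is a standard-library-only sketch of the same idea. `rebuild_url` is a hypothetical stand-in for the `set_path` call, not the PR's code, and it simplifies one thing: any query string or fragment on the base URL is dropped here.

```rust
/// Hypothetical stand-in for `child.set_path("/{location}")`:
/// keep the scheme and authority of `base`, replace the path with the object key.
fn rebuild_url(base: &str, location: &str) -> Option<String> {
    // Everything up to and including "://" is the scheme part.
    let scheme_end = base.find("://")? + 3;
    let rest = &base[scheme_end..];
    // The authority (bucket or host) ends at the first '/', '?' or '#'.
    let authority_len = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..authority_len];
    // Assumption: `location` is the full object key relative to the bucket root.
    Some(format!(
        "{}{}/{}",
        &base[..scheme_end],
        authority,
        location.trim_start_matches('/')
    ))
}
```

With a base of `s3://bucket/tiles/` and a listed key of `tiles/a.pmtiles`, this yields `s3://bucket/tiles/a.pmtiles`. For `file://` URLs the authority is empty, so the key becomes the absolute path.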

Did not refactor the TileSourceConfiguration trait to pass (store, path) pairs instead of URLs, even though that would avoid the URL reconstruction step. Kept the trait-shape change minimal (one default method).

No generic HTTP paths

E.g. https://host/prefix/ will obviously fail unless the host exposes the appropriate listing APIs.

Testing

Unit tests cover expand_url against file:// (tempdir) for the three main cases (prefix-with-matches, direct file URL, empty prefix). No integration test hits real S3... the existing test.sh still uses a single-file S3 URL.

Other

Sorry about the Arc moving around, I ran just fmt.

@carderne changed the title from "support remote pmtiles directory auto-discovery" to "feat: support remote pmtiles directory auto-discovery" on Apr 17, 2026
@github-actions

Performance Comparison: main → remote-pmtiles-dirs

Total Elapsed Time: 74.87s → 72.97s (-2.5%)
CPU Baseline: 91.58µs → 91.41µs (-0.2%)
Benchmark ID: timing

timing - Function execution time metrics.

+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| Function                   | Calls                        | Avg                            | P95                            | Total                          | % Total                      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| content::get_tile          | 2100500 → 2100500 (+0.0%)    | 41.28µs → 39.34µs (-4.7%)      | 45.63µs → 45.57µs (-0.1%)      | 86.71s → 82.64s (-4.7%)        | 115.81% → 113.25% (-2.2%)    |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| martin::main               | 1 → 1 (+0.0%)                | 74.86s → 72.98s (-2.5%)        | 74.89s → 73.01s (-2.5%)        | 74.87s → 72.97s (-2.5%)        | 100.00% → 100.00% (+0.0%)    |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| martin::start              | 1 → 1 (+0.0%)                | 74.86s → 72.98s (-2.5%)        | 74.89s → 73.01s (-2.5%)        | 74.87s → 72.97s (-2.5%)        | 100.00% → 100.00% (+0.0%)    |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| content::get_http_response | 2100500 → 2100500 (+0.0%)    | 31.49µs → 30.53µs (-3.0%)      | 38.88µs → 38.62µs (-0.7%)      | 66.15s → 64.12s (-3.1%)        | 88.36% → 87.87% (-0.6%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| content::get_tile_content  | 2100500 → 2100500 (+0.0%)    | 28.42µs → 26.11µs (-8.1%)      | 36.67µs → 36.03µs (-1.7%)      | 59.69s → 54.83s (-8.1%)        | 79.72% → 75.15% (-5.7%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| content::recompress        | 2100500 → 2100500 (+0.0%)    | 10.50µs → 10.32µs (-1.7%)      | 28.18µs → 27.86µs (-1.1%)      | 22.05s → 21.68s (-1.7%)        | 29.45% → 29.71% (+0.9%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| content::new               | 2100500 → 2100500 (+0.0%)    | 3.79µs → 2.60µs (-31.4%) 🚀    | 1.87µs → 1.86µs (-0.5%)        | 7.97s → 5.46s (-31.5%) 🚀      | 10.65% → 7.49% (-29.7%) 🚀   |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| content::encode            | 100100 → 100100 (+0.0%)      | 39.26µs → 36.82µs (-6.2%)      | 49.60µs → 67.65µs (+36.4%) ⚠️  | 3.93s → 3.69s (-6.1%)          | 5.25% → 5.05% (-3.8%)        |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| source::get_sources        | 2100500 → 2100500 (+0.0%)    | 1.14µs → 1.13µs (-0.9%)        | 1.48µs → 1.49µs (+0.7%)        | 2.39s → 2.38s (-0.4%)          | 3.19% → 3.27% (+2.5%)        |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+
| server::new_server         | 1 → 1 (+0.0%)                | 226.50µs → 228.93µs (+1.1%)    | 226.56µs → 228.99µs (+1.1%)    | 226.47µs → 228.92µs (+1.1%)    | 0.00% → 0.00% (+0.0%)        |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+------------------------------+

alloc-bytes - Exclusive allocation bytes by each function (excluding nested calls).

+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| Function                   | Calls                        | Avg                            | P95                            | Total                          | % Total                    |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| content::get_tile_content  | 2100500 → 2100500 (+0.0%)    | 95.7 KB → 95.7 KB (+0.0%)      | 180.0 KB → 180.0 KB (+0.0%)    | 191.7 GB → 191.7 GB (+0.0%)    | 84.37% → 84.37% (+0.0%)    |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| content::encode            | 100100 → 100100 (+0.0%)      | 344.6 KB → 344.6 KB (+0.0%)    | 347.5 KB → 347.5 KB (+0.0%)    | 32.9 GB → 32.9 GB (+0.0%)      | 14.48% → 14.48% (+0.0%)    |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| source::get_sources        | 2100500 → 2100500 (+0.0%)    | 932 B → 932 B (+0.0%)          | 1.2 KB → 1.2 KB (+0.0%)        | 1.8 GB → 1.8 GB (+0.0%)        | 0.80% → 0.80% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| content::get_tile          | 2100500 → 2100500 (+0.0%)    | 200 B → 200 B (+0.0%)          | 200 B → 200 B (+0.0%)          | 400.6 MB → 400.6 MB (+0.0%)    | 0.17% → 0.17% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| content::get_http_response | 2100500 → 2100500 (+0.0%)    | 181 B → 181 B (+0.0%)          | 180 B → 180 B (+0.0%)          | 364.5 MB → 364.5 MB (+0.0%)    | 0.16% → 0.16% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| content::recompress        | 2100500 → 2100500 (+0.0%)    | 22 B → 22 B (+0.0%)            | 22 B → 22 B (+0.0%)            | 44.1 MB → 44.1 MB (+0.0%)      | 0.02% → 0.02% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| martin::start              | 1 → 1 (+0.0%)                | 3.1 MB → 3.1 MB (+0.0%)        | 3.1 MB → 3.1 MB (+0.0%)        | 3.1 MB → 3.1 MB (+0.0%)        | 0.00% → 0.00% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| martin::main               | 1 → 1 (+0.0%)                | 141.9 KB → 139.8 KB (-1.5%)    | 141.9 KB → 139.9 KB (-1.4%)    | 141.9 KB → 139.8 KB (-1.5%)    | 0.00% → 0.00% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| server::new_server         | 1 → 1 (+0.0%)                | 30.9 KB → 30.9 KB (+0.0%)      | 30.9 KB → 30.9 KB (+0.0%)      | 30.9 KB → 30.9 KB (+0.0%)      | 0.00% → 0.00% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+
| content::new               | 2100500 → 2100500 (+0.0%)    | 0 B → 0 B (+0.0%)              | 0 B → 0 B (+0.0%)              | 0 B → 0 B (+0.0%)              | 0.00% → 0.00% (+0.0%)      |
+----------------------------+------------------------------+--------------------------------+--------------------------------+--------------------------------+----------------------------+

Threads

Total Alloc: 2.7 MB → 2.7 MB (+0.0%)
Total Dealloc: 272.8 MB → 272.8 MB (+0.0%)
Mem Diff: -270.1 MB → -270.1 MB (+0.0%)

+----------+--------------------------+------------------------------+------------------------------+--------------------------------+----------------------------------+
| Thread   | CPU % Avg                | CPU % Max                    | Alloc                        | Dealloc                        | Mem Diff                         |
+----------+--------------------------+------------------------------+------------------------------+--------------------------------+----------------------------------+
| martin   | 0.00% → 0.00% (+0.0%)    | 8.00% → 11.90% (+48.8%) ⚠️   | 2.7 MB → 2.7 MB (+0.0%)      | 1.8 MB → 1.8 MB (+0.0%)        | 900.3 KB → 899.2 KB (-0.1%)      |
+----------+--------------------------+------------------------------+------------------------------+--------------------------------+----------------------------------+
| hp-mcp   | 0.00% → 0.00% (+0.0%)    | 0.00% → 0.00% (+0.0%)        | 54.7 KB → 54.7 KB (+0.0%)    | 3.4 KB → 3.4 KB (+0.0%)        | 51.4 KB → 51.4 KB (+0.0%)        |
+----------+--------------------------+------------------------------+------------------------------+--------------------------------+----------------------------------+
| hp-debug | 7.00% → 6.70% (-4.3%)    | 19.90% → 23.90% (+20.1%) ⚠️  | 10.4 KB → 10.9 KB (+4.8%)    | 271.0 MB → 271.0 MB (+0.0%)    | -271.0 MB → -271.0 MB (+0.0%)    |
+----------+--------------------------+------------------------------+------------------------------+--------------------------------+----------------------------------+

Generated with hotpath-rs

Member

@CommanderStorm left a comment

I am fine with merging this code (even though we want to move away from this concrete way of implementing init and towards reloadable init).
I would like a testcase though for the path you are adding.
The test you added is great, but I would really like a test for the "actual path" you care about.

This is important because this way we can also catch problems in the path you care about.

Comment on lines +445 to +448
warn!(
    "No files matching {extension:?} found under {}",
    sanitize_url(&url)
);

this likely should trigger our on-invalid setting

pub enum OnInvalid {
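
As a rough illustration of the suggestion, an empty listing could be routed through a policy instead of always warning. The variants of `OnInvalid` are not shown in this thread, so the sketch defines its own hypothetical `EmptyPrefixPolicy` enum with assumed `Warn` and `Abort` variants; `check_listing` is likewise an invented name.

```rust
/// Hypothetical policy enum; the real `OnInvalid` may have different variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EmptyPrefixPolicy {
    Warn,
    Abort,
}

/// Decide what to do when listing a prefix returned `found` matching files.
fn check_listing(policy: EmptyPrefixPolicy, prefix: &str, found: usize) -> Result<(), String> {
    if found > 0 {
        return Ok(());
    }
    let msg = format!("No matching files found under {prefix}");
    match policy {
        // Assumed behavior: log and continue with zero sources.
        EmptyPrefixPolicy::Warn => {
            eprintln!("warning: {msg}");
            Ok(())
        }
        // Assumed behavior: surface the problem as a startup error.
        EmptyPrefixPolicy::Abort => Err(msg),
    }
}
```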

/// Source types that support remote listing (e.g. `PMTiles` via `object_store`) may override
/// this to enumerate objects matching `allowed_extension` under a prefix.
#[allow(unused_variables)]
fn expand_url(

I am not entirely a fan of this architecture, since it means we have to maintain it.

We will need to refactor this towards the ReloadAdvisory approach referenced below sometime in the future.
If this PR comes with good tests, we can still go forward with this, but it needs tests showing that this behaviour works as intended.

pub struct ReloadAdvisory {

}

#[cfg(test)]
mod tests {

I would like to see a test behind a cfg-scoped feature (like the postgres feature),
i.e. one that tests the S3 support.

I think this should work via minio, localstack, or something similar.

@carderne
Contributor Author

@CommanderStorm we don't need this feature anymore, so if you're not convinced this is useful/necessary I won't be sad if you don't merge it. I'm also not likely to have the time in the next week or two to get this to a mergeable state.

About the test: I tried with the existing S3 bucket you have in the test cases, but there weren't many files and I wasn't completely sure how to do it without breaking the existing S3 test. The files I tested it with are not public unfortunately.


2 participants