feat: support remote pmtiles directory auto-discovery#2704
feat: support remote pmtiles directory auto-discovery#2704carderne wants to merge 1 commit intomaplibre:mainfrom
Conversation
Performance Comparison
|
There was a problem hiding this comment.
I am fine with merging this code (even though we want to move away from this concrete way of implementing init and towards reloadable init).
I would like a testcase though for the path you are adding.
The test you added is great, but I would reaaaly like a test for the "actual path" you care about.
This is important because this way we can also catch problems in the path you care about
| warn!( | ||
| "No files matching {extension:?} found under {}", | ||
| sanitize_url(&url) | ||
| ); |
There was a problem hiding this comment.
this likely should trigger our on-invalid setting
martin/martin/src/config/file/main.rs
Line 477 in 7071f12
| /// Source types that support remote listing (e.g. `PMTiles` via `object_store`) may override | ||
| /// this to enumerate objects matching `allowed_extension` under a prefix. | ||
| #[allow(unused_variables)] | ||
| fn expand_url( |
There was a problem hiding this comment.
I am not entriely a fan of this achritecture, since this means maintaining it.
We will need to refactor this to this sometime in the future.
If this PR comes with good tests, we can still go forward with this, but it needs to come with tests that this behaviour works as intended.
Line 29 in 7071f12
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
I would like to see a test using a (cfg-scoped test like the postgres feature) feature.
I.e. testing the s3 support.
I think this should work via either minio, localstack or some other version.
|
@CommanderStorm we don't need this feature anymore, so if you're not convinced this is useful/necessary I won't be sad if you don't merge it. I'm also not likely to have the time in the next week or two to get this to a mergable state. About the test: I tried with the existing S3 bucket you have in the test cases, but there weren't many files and I wasn't completely sure how to do it without breaking the existing S3 test. The files I tested it with are not public unfortunately. |
NB: Still a work-in-progress, but tested to be working with a GCS bucket I control.
Resolves this issue: #2180
In theory quite simple, but there are a bunch of design choices I'm not really sure about...
Considerations
Sequential load
It has to do sequential GETs for each pmtiles. This will be very slow for large collections. Tried pointing it at a bucket with 5k+ pmtiles :P Would be super slow + memory issue I'm sure. A single failing header open inside a prefix aborts the whole prefix (same as the existing single-URL behavior). Could be softened to per-file warnings without much work if maintainers prefer.
Path vs file detection
Just does "URL path ends with .pmtiles" -> single file, else list as prefix
New URL creation
Listed object's URL is rebuilt via child.set_path("/{location}"). Works for s3://, gs://, az://, file:// because in all of them the URL path component is the object_store key. Not tested across backends... http:// listing is rare and may behave unevenly, error path should handle.
Did not refactor the TileSourceConfiguration trait to pass (store, path) pairs instead of URLs, even though that would avoid the URL reconstruction step. Kept the trait-shape change minimal (one default method).
No generic HTTP paths
Eg https://host/prefix/ will obviously fail unless it has the appropriate APIs.
Testing
Unit tests cover expand_url against file:// (tempdir) for the three main cases (prefix-with-matches, direct file URL, empty prefix). No integration test hits real S3... the existing test.sh still uses a single-file S3 URL.
Other
Sorry about the
Arcmoving around, I ranjust fmt.