Add extract functionality #106

Draft
michaelkirk wants to merge 3 commits into stadiamaps:main from michaelkirk:mkirk/extract-stream

@michaelkirk
Contributor

@michaelkirk michaelkirk commented Feb 7, 2026

In a Rust project I'm working on, I want to slice out a subset of a pmtiles archive (just like the Go `pmtiles extract` command).

You can see it in action in this video, where I extract an area around the city of Seattle from a planet.pmtiles using the HTTP (range request) backend.

pmtile-extract-seattle.mov.mp4

In the log, the first 8 requests traverse the necessary metadata and directories to figure out where all the tiles live. The next 36 requests fetch the tile data.
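One way tile data can end up in a few dozen requests rather than one request per tile is to merge nearby byte ranges before fetching. The following is a minimal sketch of that idea, assuming tiles are described by byte ranges within the archive. The names `coalesce_ranges`, `range_header`, and `max_gap` are hypothetical and are not taken from this branch.

```rust
use std::ops::Range;

/// Merge byte ranges whose gap is at most `max_gap` bytes, so that tiles
/// stored close together can be fetched with one HTTP range request.
/// Hypothetical helper for illustration; not the API from this branch.
fn coalesce_ranges(mut ranges: Vec<Range<u64>>, max_gap: u64) -> Vec<Range<u64>> {
    // Drop empty ranges, then sort by start offset so one pass can merge.
    ranges.retain(|r| r.start < r.end);
    ranges.sort_by_key(|r| r.start);

    let mut merged: Vec<Range<u64>> = Vec::with_capacity(ranges.len());
    for r in ranges {
        if let Some(last) = merged.last_mut() {
            // Overlapping, adjacent, or within `max_gap` of the previous
            // range: extend the previous range instead of starting a new one.
            if r.start <= last.end.saturating_add(max_gap) {
                last.end = last.end.max(r.end);
                continue;
            }
        }
        merged.push(r);
    }
    merged
}

/// Format a non-empty half-open range as an HTTP `Range` header value.
/// HTTP byte ranges are inclusive, hence the `end - 1`.
fn range_header(r: &Range<u64>) -> String {
    format!("bytes={}-{}", r.start, r.end - 1)
}
```

A larger `max_gap` trades some wasted bytes for fewer round trips, which tends to pay off when latency dominates.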

It was a large endeavor, and to be honest, when I started, I wasn't sure it was going to work out, but it went relatively well! Unfortunately, this branch is kind of a mess. I'm going to carve out some bite-sized PRs from this branch, more suitable for human consumption, but I wanted to have this as a reference in case anyone wanted the bigger picture.

@michaelkirk
Contributor Author

I guess I should also say: It's obviously fine if you don't want this feature. We didn't discuss it up front or anything, so I won't feel burned. It was an itch I wanted to scratch for myself, but if it's useful to you, I would definitely rather upstream it than live in permanent fork territory.

@codecov

codecov bot commented Feb 7, 2026

Codecov Report

❌ Patch coverage is 85.07266% with 113 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.59%. Comparing base (eb788e6) to head (31fd7d2).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/extract/extractor.rs | 70.07% | 54 Missing and 25 partials ⚠️ |
| src/extract/mod.rs | 92.41% | 10 Missing and 1 partial ⚠️ |
| src/directory.rs | 79.54% | 4 Missing and 5 partials ⚠️ |
| src/extract/bbox.rs | 82.60% | 2 Missing and 6 partials ⚠️ |
| src/extract/ranges.rs | 97.88% | 2 Missing and 1 partial ⚠️ |
| src/backends/http.rs | 95.23% | 0 Missing and 1 partial ⚠️ |
| src/tile.rs | 98.68% | 0 Missing and 1 partial ⚠️ |
| src/writer.rs | 80.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #106      +/-   ##
==========================================
+ Coverage   77.84%   80.59%   +2.74%     
==========================================
  Files          11       15       +4     
  Lines        1422     2128     +706     
  Branches     1422     2128     +706     
==========================================
+ Hits         1107     1715     +608     
- Misses        210      277      +67     
- Partials      105      136      +31     


nyurik pushed a commit that referenced this pull request Feb 7, 2026
This was pulled out of #106, but you might find it nice on its own.

Note this is a breaking change, since `TileId::new` is public and we've
added a new PmtError enum.
@michaelkirk
Contributor Author

@lseelenbinder - is adding the extract functionality to the library (not necessarily the CLI) something you're likely to consider from a feature perspective? Is there some discussion you'd like to have before I continue working on upstreaming my changes?

@lseelenbinder
Member

@michaelkirk extract is definitely within scope of the library, and assuming it doesn't add a lot of dependencies, I'd say it doesn't even need a feature gate.

@nyurik
Collaborator

nyurik commented Feb 11, 2026

any CLI automatically requires Clap dep

@lseelenbinder
Member

> any CLI automatically requires Clap dep

Agreed. :) I'm just saying the extract functionality is fine to have in the library, not that it's super useful without a CLI.

@michaelkirk
Contributor Author

michaelkirk commented Feb 11, 2026 via email

Currently only HTTP has an interesting implementation.
I suspect I'll need to clean up feature flags before upstreaming.
This involves more seeking, but we're never holding the lock while
waiting for network activity.

Measurements show a 20% improvement for a West Coast (1.2 GB) download with
50 ms latency @ 100MBPS.

I suspect a bigger improvement on slower networks, and a smaller
improvement on slower disks.
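
For readers unfamiliar with the locking pattern mentioned above (more seeking, but no lock held during network waits), here is a minimal sketch of one way it could look. It is an assumption about the general shape, not the code from this branch: `fetch_chunk` is a hypothetical stand-in for an HTTP range request, and an in-memory `Cursor` stands in for the output file.

```rust
use std::io::{self, Cursor, Seek, SeekFrom, Write};
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a network fetch. A real backend would issue an
/// HTTP range request here; this just returns `len` bytes tagged with `index`.
fn fetch_chunk(index: usize, len: usize) -> Vec<u8> {
    vec![index as u8; len]
}

/// Write `data` at `offset`. The lock is held only for the seek and write,
/// never while waiting on the network.
fn write_at<W: Write + Seek>(out: &Mutex<W>, offset: u64, data: &[u8]) -> io::Result<()> {
    let mut w = out.lock().expect("output lock poisoned");
    w.seek(SeekFrom::Start(offset))?;
    w.write_all(data)
}

/// Fetch `chunks` chunks concurrently and place each one at its own offset.
fn download_parallel(chunks: usize, chunk_len: usize) -> io::Result<Vec<u8>> {
    let out = Arc::new(Mutex::new(Cursor::new(vec![0u8; chunks * chunk_len])));

    let handles: Vec<_> = (0..chunks)
        .map(|i| {
            let out = Arc::clone(&out);
            thread::spawn(move || {
                // Slow part: runs without holding the lock.
                let data = fetch_chunk(i, chunk_len);
                // Fast part: seek to this chunk's offset and write under the lock.
                write_at(&out, (i * chunk_len) as u64, &data)
            })
        })
        .collect();

    for h in handles {
        h.join().expect("worker thread panicked")?;
    }

    // Every worker has been joined, so this is the only remaining reference.
    let cursor = Arc::try_unwrap(out)
        .ok()
        .expect("all workers have finished")
        .into_inner()
        .expect("output lock poisoned");
    Ok(cursor.into_inner())
}
```

Because chunks can arrive in any order, each write needs its own seek, which is the extra seeking traded for not serializing the network waits.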