Commit 0b9997e

jayakumarpujar, pre-commit-ci[bot], and giswqs authored
Add CLIP-based zero-shot classification for vector features (#585)
* Add CLIP-based zero-shot classification for vector features

  Add CLIPVectorClassifier class and clip_classify_vector convenience function that classify vector polygon features using zero-shot CLIP inference. For each polygon, extracts the bounding-box image chip from a raster, encodes it with CLIP, and assigns the best-matching category label from a user-provided list via cosine similarity.

  Features:
  - Accepts GeoDataFrame or file path + any raster format
  - Pre-computes text embeddings once for efficient batch inference
  - Automatic CRS alignment between vector and raster
  - Handles multi-band, single-band, and float rasters
  - Configurable batch size, top-k predictions, min chip size
  - Output save support (.geojson, .parquet, .gpkg)

  Closes #129

* Address Copilot review: device, logging, chip logic, tests
  - Use get_device() for CUDA/MPS/CPU auto-detection (lazy import)
  - Replace print() with logger.info() for model load status
  - Fix min_chip_size: use 'or' so either dimension below threshold rejects
  - Move Window import to module level alongside window_from_bounds
  - Collapse identical single-band/two-band branches in _to_rgb_uint8
  - Add FileNotFoundError for vector_data to convenience func docstring
  - Add end-to-end batch classification test covering _process_batch
  - Add test_narrow_chip_returns_none for the 'or' fix

* Add notebook example for CLIP zero-shot vector classification
  - New docs/examples/clip_classify_vector.ipynb demonstrating:
    - clip_classify_vector() convenience function
    - CLIPVectorClassifier class API with top-k predictions
    - Result visualization with color-coded labels and confidence maps
    - Export to GeoJSON
  - Register notebook in mkdocs.yml nav and execute_ignore

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* Fix transformers 5.x compatibility and notebook issues
  - Handle BaseModelOutputWithPooling from get_text_features() and get_image_features() which changed return type in transformers>=5.0
  - Reproject vector data to raster CRS in preview plot to fix empty figure
  - Skip unclassified polygons (None top-k) in top-3 display loop
  - Shrink confidence colorbar legend for better readability

* Improve normalization, immutability, and CLIPSeg consistency
  - Per-channel float normalization in _to_rgb_uint8 preserves color balance across bands with different value ranges
  - uint8 passthrough skips unnecessary renormalization
  - GeoDataFrame immutability: classify() never mutates the input
  - CLIPSegmentation: use get_device() for CUDA/MPS/CPU auto-detection, replace print() with logger, align tile normalization with _to_rgb_uint8 approach
  - Add tests for immutability, uint8 passthrough, per-channel normalization, transformers 5.x compat, and CLIPSeg improvements

* Rework notebook to use RS-CLIP model and diverse land-cover grid

  Switch the clip_classify_vector notebook from generic CLIP on building footprints to flax-community/clip-rsicd-v2 (remote-sensing fine-tuned) with a 60 m grid of sample patches covering diverse land types.
  - Primary demo: land-cover classification (residential, vegetation, road, parking lot, grass) on grid patches (82.6% mean confidence)
  - Secondary demo: building footprint classification with RS-CLIP
  - Top-k predictions section using CLIPVectorClassifier class API
  - Model recommendations table comparing RS-CLIP vs generic CLIP
  - Black formatting applied to test files

* Update notebook to Chesapeake dataset with ground-truth validation

  Replace the suburban NAIP example with the Chesapeake Bay Watershed NAIP scene (eastern Maryland, 2018) paired with its 13-class land-cover ground truth raster.

  This produces significantly better results:
  - 88.6% overall agreement with Chesapeake ground truth (zero-shot)
  - Forest recall: 97%, Agricultural field recall: 83%
  - Mean confidence: 0.928
  - Balanced label distribution (244 forest / 256 agricultural)

  Key changes:
  - Dataset: m_3807511_ne_18_060_20181104.tif + landcover raster
  - Labels: forest / agricultural field (matching dominant land cover)
  - New section: ground-truth comparison with per-class accuracy
  - Downsampled raster display (10x) to prevent OOM on Colab
  - Side-by-side CLIP prediction vs ground truth visualization

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: giswqs <giswqs@gmail.com>
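The label-assignment step described in the first bullet (pre-computed text embeddings, one embedding per image chip, best match by cosine similarity) can be sketched with plain NumPy. This is an illustration only, not the code from this commit: `l2_normalize` and `assign_labels` are hypothetical names, the embeddings are toy vectors rather than CLIP outputs, and the commit message does not say how similarities are converted to a confidence value, so raw cosine similarity is returned.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def assign_labels(image_emb, text_emb, labels, top_k=1):
    """Return the top_k (label, cosine similarity) pairs for each image chip.

    image_emb: (n_chips, dim) array, one embedding per chip.
    text_emb:  (n_labels, dim) array, computed once and reused for every batch.
    Hypothetical helper; names and return shape are assumptions.
    """
    img = l2_normalize(np.asarray(image_emb, dtype=np.float64))
    txt = l2_normalize(np.asarray(text_emb, dtype=np.float64))
    sims = img @ txt.T  # shape (n_chips, n_labels)
    order = np.argsort(-sims, axis=1)[:, :top_k]  # best match first
    return [
        [(labels[j], float(sims[i, j])) for j in order[i]]
        for i in range(sims.shape[0])
    ]


# Toy embeddings standing in for CLIP text and image features.
labels = ["forest", "agricultural field"]
text_emb = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
image_emb = np.array([[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]])
result = assign_labels(image_emb, text_emb, labels, top_k=2)
```

Normalizing both sides once means each batch reduces to a single matrix product against the cached text embeddings, which is the efficiency the "pre-computes text embeddings once" bullet refers to.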
1 parent 627a5aa commit 0b9997e
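Two of the fixes named in the commit message, the `min_chip_size` 'or' check and per-channel float normalization with uint8 passthrough, are easy to get subtly wrong, so a minimal sketch may help. The functions below are hypothetical stand-ins (`chip_too_small`, `to_rgb_uint8`), not the `_to_rgb_uint8` implementation from this commit; the sketch assumes a 3-band `(H, W, 3)` chip with no NaN values and min-max scaling per band.

```python
import numpy as np


def chip_too_small(height, width, min_chip_size):
    """Reject a chip if EITHER dimension is below the threshold ('or', not 'and')."""
    return height < min_chip_size or width < min_chip_size


def to_rgb_uint8(chip):
    """Convert an (H, W, 3) chip to uint8, scaling each band independently.

    Assumptions of this sketch: min-max scaling per band, no NaNs in the input,
    and constant bands mapped to 0.
    """
    chip = np.asarray(chip)
    if chip.dtype == np.uint8:
        return chip  # passthrough: already displayable, skip renormalization
    chip = chip.astype(np.float64)
    out = np.zeros(chip.shape, dtype=np.uint8)
    for c in range(chip.shape[-1]):
        band = chip[..., c]
        lo, hi = band.min(), band.max()
        if hi > lo:  # avoid division by zero on constant bands
            scaled = (band - lo) / (hi - lo) * 255.0
            out[..., c] = np.round(scaled).astype(np.uint8)
    return out


# Bands with very different value ranges end up on the same 0-255 scale.
chip = np.stack(
    [
        np.linspace(0.0, 1.0, 4).reshape(2, 2),
        np.linspace(0.0, 1000.0, 4).reshape(2, 2),
        np.full((2, 2), 5.0),
    ],
    axis=-1,
)
rgb = to_rgb_uint8(chip)
already_uint8 = np.zeros((2, 2, 3), dtype=np.uint8)
```

With a single global min and max, the 0 to 1 band above would collapse to near zero next to the 0 to 1000 band; scaling each band separately is what preserves color balance. A narrow chip such as 4 x 64 pixels is rejected with a threshold of 32 because one dimension is enough to fail the check.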

File tree

7 files changed: +2172 -39 lines changed

0 commit comments
