Squeeze Every Bit of Insight: Leveraging Few-shot Models with a Compact Support Set for Domain Transfer in Object Detection from Pineapple Fields
1 Atlantic Campus, Universidad de Costa Rica, Cartago, Costa Rica
2 Computer Science Department, Instituto Tecnológico de Costa Rica, Cartago, Costa Rica
Object Detection in deep learning typically requires large manually labeled datasets and significant computational resources for model training, making it costly and resource-intensive. To address these challenges, we propose a novel framework featuring a two-stage pipeline that eliminates the need for additional training. Our framework leverages the Segment Anything Model (SAM) as an object proposal generator combined with few-shot models to construct an efficient object detector.
We introduce the use of the Mahalanobis distance with support and context prototypes, which significantly improves performance compared to traditional Euclidean-based distance metrics. The proposed pipeline was validated through a custom pineapple detection application, demonstrating its effectiveness in real-world scenarios. Furthermore, we show that our approach, utilizing only a few labeled samples, can outperform state-of-the-art few-shot models without additional training. Finally, we evaluated several SAM variants for the object proposal network and found that FastSAM achieves the highest mean average precision (mAP) for drone imagery collected from pineapple fields, outperforming other Segment Anything Model variants.
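The two-stage idea can be summarized in a few lines. The sketch below is illustrative only, not the repository's implementation: it assumes the `segment_anything` and `timm` packages, uses the SAM checkpoint path from the weights section below, and, for brevity, scores proposals with a plain Euclidean distance to a single support prototype; names such as `support_crops` and `threshold` are hypothetical placeholders.

```python
# Minimal sketch of the two-stage pipeline (illustrative, not the repository's exact code).
# Stage 1: class-agnostic proposals from SAM. Stage 2: label each proposal by its
# distance to a prototype built from a handful of labeled support crops.
import numpy as np
import torch
import torch.nn.functional as F
import timm
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: SAM proposal generator (boxes are returned as [x, y, w, h]).
sam = sam_model_registry["vit_h"](checkpoint="weights/sam_vit_h_4b8939.pth").to(device)
proposer = SamAutomaticMaskGenerator(sam)

# Stage 2: frozen backbone used only to embed image crops (no training involved).
backbone = timm.create_model("resnet50", pretrained=True, num_classes=0).eval().to(device)

@torch.no_grad()
def embed(crops):
    """Embed a list of HxWx3 uint8 crops into feature vectors."""
    tensors = [torch.from_numpy(c).permute(2, 0, 1).float() / 255.0 for c in crops]
    batch = torch.stack([F.interpolate(t.unsqueeze(0), size=(224, 224), mode="bilinear").squeeze(0)
                         for t in tensors])
    return backbone(batch.to(device)).cpu().numpy()

def detect(image, support_crops, threshold):
    """Keep SAM proposals whose embedding lies close to the support prototype."""
    prototype = embed(support_crops).mean(axis=0)               # compact support set -> prototype
    boxes = [m["bbox"] for m in proposer.generate(image)]       # SAM proposals, [x, y, w, h]
    crops = [image[int(y):int(y + h), int(x):int(x + w)] for x, y, w, h in boxes]
    dists = np.linalg.norm(embed(crops) - prototype, axis=1)    # Euclidean variant for brevity
    return [box for box, d in zip(boxes, dists) if d < threshold]
```

Replacing the Euclidean distance above with the Mahalanobis distance to support and context prototypes is the variant that gives the reported gains; `methods.py` (documented below) exposes that choice via `--method fewshotMahalanobis`.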
- Python == 3.8
- Virtual environment (recommended)
- Dependencies listed in `requirements.txt`
- Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Clone the repository
git clone https://github.com/imagine-laboratory/squeeze_every_bit.git
cd squeeze_every_bit/src
- Install dependencies
pip install --upgrade pip
pip install -r requirements.txt
This will install:
- PyTorch with torchvision and torchaudio
- Image/video libraries: OpenCV, PyYAML, etc.
- Utilities: `timm`, `scikit-learn`, `psutil`
- Few-shot segmentation models: SAM, FastSAM, MobileSAM, EdgeSAM, etc.
- Use Python 3.8.
- For GPU acceleration, install the PyTorch build that matches your CUDA version.
- If you encounter issues with `git+https` dependencies, make sure Git is installed:
sudo apt install git # For Debian/Ubuntu
The dataset follows the COCO format, commonly used for object detection tasks.
dataset/
├── annotations/
│ ├── instances_train.json
│ └── instances_val.json
├── train/
│ ├── image1.jpg
│ └── ...
├── val/
│ ├── image101.jpg
│ └── ...
- `images` — Image metadata
- `annotations` — Object instances
- `categories` — Class definitions
{
"images": [
{
"id": 1,
"file_name": "image1.jpg",
"width": 1920,
"height": 1080
}
],
"annotations": [
{
"id": 10,
"image_id": 1,
"category_id": 1,
"bbox": [100, 150, 200, 300],
"iscrowd": 0
}
],
"categories": [
{
"id": 1,
"name": "pineapple",
"supercategory": "fruit"
}
]
}
- Bounding boxes use the `[x, y, width, height]` format.
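As a quick check of the layout described above, a minimal snippet (assuming the `dataset/` paths shown) loads the annotation file and converts each box from COCO format to corner coordinates:

```python
# Minimal sketch: load the COCO-style annotations and print each box in corner format.
import json

with open("dataset/annotations/instances_train.json") as f:
    coco = json.load(f)

file_names = {img["id"]: img["file_name"] for img in coco["images"]}
class_names = {cat["id"]: cat["name"] for cat in coco["categories"]}

for ann in coco["annotations"]:
    x, y, w, h = ann["bbox"]            # COCO: [x, y, width, height]
    corners = (x, y, x + w, y + h)      # -> [x1, y1, x2, y2]
    print(file_names[ann["image_id"]], class_names[ann["category_id"]], corners)
```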
To get started quickly, use the provided script to download pretrained weights.
pip install gdown
sudo apt install wget -y
`gdown` is used for downloading files from Google Drive.
- Make the script executable:
chmod +x download_weights.sh
- Run the script:
./download_weights.sh
The following files will be saved in the `weights/` directory: `FastSAM-x.pt`, `edge_sam.pth`, `sam_vit_h_4b8939.pth`, `mobile_sam.pt`, `sam_hq.pth`.
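To confirm the download completed, a quick sanity check (illustrative, not part of the repository) can verify that all expected checkpoints are present:

```python
# Sanity check (illustrative): verify the expected checkpoints exist under weights/.
from pathlib import Path

expected = ["FastSAM-x.pt", "edge_sam.pth", "sam_vit_h_4b8939.pth",
            "mobile_sam.pt", "sam_hq.pth"]
missing = [name for name in expected if not (Path("weights") / name).is_file()]
print("All weights present." if not missing else f"Missing checkpoints: {missing}")
```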
`methods.py` is a Python script for evaluating few-shot models with various configurable options, including TIMM backbone models, Segment Anything Model variants, and dimensionality reduction.
Run the script via the command line with optional arguments to customize evaluation:
python methods.py [OPTIONS]
| Argument | Type | Default | Description |
|---|---|---|---|
| `--root` | str | `.` | Root directory path. |
| `--num-classes` | int | `1` | Number of output classes. |
| `--use-sam-embeddings` | int | `0` | Use SAM embeddings (0 = False, 1 = True). |
| `--timm-model` | str | `""` | Name of the TIMM model architecture to use (see the TIMM collections). |
| `--ood-labeled-samples` | int | `1` | Number of labeled out-of-distribution samples. |
| `--ood-unlabeled-samples` | int | `10` | Number of unlabeled out-of-distribution samples. |
| `--ood-histogram-bins` | int | `15` | Number of bins for the OOD histogram. |
| `--dataset` | str | `"coco17"` | Dataset name to use. |
| `--batch-size` | int | `4` | Batch size for training. |
| `--batch-size-val` | int | `64` | Batch size for validation. |
| `--img-resolution` | int | `512` | Input image resolution. |
| `--new-sample-size` | int | `224` | Size of new samples after augmentation. |
| `--batch-size-labeled` | int | `1` | Batch size for labeled data (in semi-supervised learning). |
| `--batch-size-unlabeled` | int | `4` | Batch size for unlabeled data (in semi-supervised learning). |
| `--method` | str | `"None"` | Few-shot method to use: `samAlone`, `fewshot1`, `fewshot2`, `fewshotOOD`, `fewshotRelationalNetwork`, `fewshotMatching`, `fewshotBDCSPN`, `fewshotMahalanobis`, or `ss`. |
| `--numa` | int | `None` | NUMA node to use (for CPU affinity). |
| `--output-folder` | str | `None` | Folder to save outputs or checkpoints. |
| `--run-name` | str | `None` | Name of the run/experiment. |
| `--seed` | int | `None` | Random seed for reproducibility. |
| `--sam-model` | str | `None` | SAM model weight size (`"b"` or `"h"`); only used when `--sam-proposal` is `"sam"`. |
| `--device` | str | `"cuda"` | Device to run on (`"cuda"` or `"cpu"`). |
| `--sam-proposal` | str | `"sam"` | SAM proposal type: `"sam"`, `"edgsam"`, `"mobilesam"`, `"samhq"`, or `"fastsam"`. |
| `--dim-red` | str | `"svd"` | Dimensionality reduction method (`"svd"`). |
| `--n-components` | int | `10` | Number of components for dimensionality reduction. |
| `--beta` | int | `1` | Beta parameter (context-dependent). |
| `--mahalanobis` | str | `"normal"` | Mahalanobis distance variant. |
| `--batch-size-validation` | int | `4` | Batch size during validation. |
| `--ood-validation-samples` | int | `10` | Number of OOD validation samples. |
| `--mahalanobis-lambda` | float | `-1.0` | Lambda parameter for the Mahalanobis metric. |
The code can be executed with one of the following few-shot model options passed as a parameter (e.g., `--method fewshot1`). Each corresponds to a different few-shot strategy:
| Method Name | Description |
|---|---|
| `samAlone` | SAM Alone. Performs object segmentation using one of several Segment Anything Models (SAM-H, HQ-SAM-H, MobileSAM, EdgeSAM, SlimSAM-50, or FastSAM), treating all region proposals as independent object hypotheses without class-level adaptation. |
| `fewshot1`, `fewshot2` | Euclidean Prototype. Implements the prototypical network method: class prototypes are computed as the mean of the embedded support samples and classification uses the Euclidean distance to these prototypes; `fewshot1` handles one class and `fewshot2` two classes. |
| `ss` | Selective Search (SS). A classical, non-deep-learning region proposal method that uses hierarchical segmentation based on pixel intensity to generate object candidates. Useful as a baseline against the deep learning models. |
| `fewshotOOD` | Density Prototype. Extends the prototypical approach by estimating class prototypes with density functions, providing robustness for out-of-distribution (OOD) detection. |
| `fewshotBDCSPN` | BD-CSPN. Based on Liu et al., this approach modifies the prototypical method by dynamically refining class centroids. |
| `fewshotMahalanobis` | Mahalanobis Distance Prototype. Replaces the Euclidean distance with the Mahalanobis distance to account for feature covariance, enabling more adaptive decision boundaries. |
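To make the last row concrete, here is a hedged sketch of prototype scoring with a Mahalanobis distance. It is illustrative rather than the repository's exact implementation; the shrinkage weight `lam` is only analogous in role to the `--mahalanobis-lambda` option.

```python
# Illustrative Mahalanobis-distance prototype scoring (not the exact repository code).
import numpy as np

def mahalanobis_scores(support_emb, query_emb, lam=0.5):
    """support_emb: (n_support, d) support embeddings; query_emb: (n_query, d)."""
    prototype = support_emb.mean(axis=0)
    cov = np.cov(support_emb, rowvar=False)
    # With only a few support samples the covariance is singular, so shrink it
    # toward the identity to keep it invertible.
    cov = (1.0 - lam) * cov + lam * np.eye(cov.shape[0])
    cov_inv = np.linalg.inv(cov)
    diff = query_emb - prototype
    # Squared Mahalanobis distance of each query embedding to the prototype.
    return np.einsum("nd,dk,nk->n", diff, cov_inv, diff)
```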
python methods.py \
--root ./pineapples_5m --num-classes 1 --use-sam-embeddings 0 --timm-model "resnet50" \
--dataset "coco17" --batch-size 16 --img-resolution 224 --device "cuda" \
--run-name "experiment_01" --sam-proposal "fastsam" --method "samAlone"
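To compare several strategies in one go, a small helper (not part of the repository) can invoke `methods.py` repeatedly with the flags documented in the table above:

```python
# Illustrative sweep over a few of the methods listed above.
import subprocess

for method in ["samAlone", "fewshot1", "fewshotBDCSPN", "fewshotMahalanobis"]:
    subprocess.run([
        "python", "methods.py",
        "--root", "./pineapples_5m",
        "--num-classes", "1",
        "--timm-model", "resnet50",
        "--dataset", "coco17",
        "--sam-proposal", "fastsam",
        "--method", method,
        "--run-name", f"sweep_{method}",
    ], check=True)
```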
If you find this repository useful, please star ⭐ the repository and cite:
@InProceedings{squeeze_bit_insight,
author={Fallas-Moya, Fabian and Xie-Li, Danny and Calderon-Ramirez, Saul},
title={Squeeze Every Bit of Insight: Leveraging Few-shot Models with a Compact Support Set for Domain Transfer in Object Detection from Pineapple Fields},
year={2025}
}
For related work on training-free object detection for agriculture applications:
Simple Object Detection Framework without Training [bib]
@InProceedings{simple_object_detection_framework_without_training,
author={Xie-Li, Danny and Fallas-Moya, Fabian and Calderon-Ramirez, Saul},
booktitle={2024 IEEE 6th International Conference on BioInspired Processing (BIP)},
title={Simple Object Detection Framework without Training},
year={2024},
pages={1-6},
doi={10.1109/BIP63158.2024.10885396}
}
@InProceedings{fallas2023object,
title={Object detection in pineapple fields drone imagery using few shot learning and the segment anything model},
author={Fallas-Moya, Fabian and Calderon-Ramirez, Saul and Sadovnik, Amir and Qi, Hairong},
booktitle={2023 International Conference on Machine Learning and Applications (ICMLA)},
pages={1635--1642},
year={2023},
organization={IEEE}
}
This research was partially supported by computational resources provided through a machine allocation on the Kabré supercomputer at the Costa Rica National High Technology Center. Additional support was received from the University of Costa Rica (project C4612) and the Postgraduate Office of the Instituto Tecnológico de Costa Rica, which facilitated this publication.