DeepForest 2.0 Announcement #1177
bw4sz announced in Announcements
It is with great excitement that we release DeepForest 2.0! This is the first major release since the project took shape in its current form. The codebase has been updated to match modern machine learning workflows, including better command line tools, configurations and package management. These improvements will help users scale research projects quickly, test out new ideas and make their work more reproducible and long-lasting.
Our experience with machine learning for biology is that projects are not limited by models, or even data, but by the capacity for implementing workflows that can be scaled and maintained over time. The technical overhead of machine learning continues to grow. The drumbeat of new model architectures, benchmarks and ideas obscures the reality that most applications are limited by organization, data sharing and maintainability. DeepForest 2.0 is built specifically to help with these details. We have moved model sharing to Hugging Face to allow easy contributions from users and sharing among research teams. We have refined prediction workflows to scale with massive new GPUs, while providing options for users with limited computing power. We focused on general biodiversity categories, 'trees', 'birds', 'livestock', rather than detailed taxonomic models. DeepForest 2.0 adds more support for the Detection + CropModel concept, in which two separate models are used: one for object detection and a second for classification. We have found this to be simple, generalizable, and easier to interpret.
The next step for DeepForest is to expand the toolsets for the common workflows in airborne biodiversity monitoring. Want to annotate images with a single keypoint instead of more time-intensive boxes? Or maybe you need that extra geometric detail to get your tree boundaries? A keypoint model and a polygon model are coming soon. Have a huge dataset and don't know which images contain biodiversity objects? An integrated active learning workflow is coming soon. At the same time, DeepForest will continue to pursue integrations with the growing libraries of machine learning tools for environmental science. TorchGeo, PyTorch-Wildlife, and many other libraries are great contributions to open science and we want to work closely with these teams to allow models and data to be shared among packages.

There is much work to be done and we welcome contributors to get involved to share their experience and needs. Together we can make a better open-source ecosystem to bring AI innovations to everyone. DeepForest can only improve with user issues and contributions. We welcome issues, requests and discussions of ideas big and small to make DeepForest the best it can be.
Release Notes
There are a small number of breaking changes to DeepForest as we standardize the API. We have coalesced around Hydra for configuration management. A scalable customization tool for managing scientific experiments, Hydra allows users to specify a range of configuration values and pathways.
1. Hugging Face for Model-Sharing and Archiving
The explosion of machine learning for biology means there are more models than ever before. Rather than focusing on creating models ourselves, we are committed to creating a simple avenue to plug models from the community into DeepForest. By adopting functionality from the popular `transformers` library and sharing models through Hugging Face repositories, we let users upload models to their own repos and load them through the new `load_model` structure. This works for detection or classification models and will allow us to quickly integrate new models. We welcome contributions and pull requests to list your model as available for others in the community.

As an example, we recently included support for Deformable DETR, a state-of-the-art transformer-based object detection architecture. In the near future we will support different backbones for our default RetinaNet model, including pretrained foundation models like DINOv3.
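A minimal sketch of loading a community model from a Hugging Face repository. The repository name `weecology/deepforest-tree`, the `model_name` and `revision` keywords, and the helper `load_detector` are assumptions made for illustration; check the DeepForest documentation for the exact signature in your installed version.

```python
# Hypothetical helper: wraps the load_model call described above.
# The repo name and keyword arguments are assumptions, not confirmed API.
MODEL_NAME = "weecology/deepforest-tree"


def load_detector(model_name: str = MODEL_NAME, revision: str = "main"):
    """Return a DeepForest model with weights fetched from a Hugging Face repo."""
    # Imported lazily so this file can be imported without deepforest installed.
    from deepforest import main

    model = main.deepforest()
    # Downloads the weights from the named repository on first use.
    model.load_model(model_name=model_name, revision=revision)
    return model
```

Calling `load_detector()` requires the `deepforest` package and network access for the first download. Passing the name of your own repository should load a model you uploaded yourself, provided it follows the same layout.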
The `use_release` functions have been replaced with the general `load_model`.

2. Configuration using Hydra
Hydra is another one of those quiet but essential entries to machine learning workflows. Hydra is based on OmegaConf, a configuration manager that uses the YAML file format. Each config file inherits from a default config that we provide, which means you can easily adjust one or two parameters without having to copy and paste everything else. For example, here is how you might specify a new config to train on your own dataset, with a different batch size:
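A minimal sketch of such a config, written as a Hydra-style YAML file. The base config name `config` and the file paths are assumptions for illustration. The `train.csv_file` and `train.root_dir` keys are assumed to mirror the `validation.csv_file` and `validation.root_dir` keys mentioned in this announcement.

```yaml
# Hypothetical config sketch; the base config name and all paths are assumptions.
defaults:
  - config      # assumed name of the default config shipped with DeepForest
  - _self_      # Hydra keyword: apply the values below on top of the defaults

batch_size: 8   # the one default value being changed

train:
  csv_file: /path/to/annotations.csv   # placeholder path to your annotations
  root_dir: /path/to/images            # placeholder path to your image folder
```

Everything not listed here would keep the value from the default config, which is the point of the inheritance described above.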
With the new command line interface, you don't need to write a config file at all; you can set parameters as arguments to the tool (see below). This replaces the old interface, in which arguments were passed to `deepforest.main`.

3. Better Data Reading and Visualizations
In the theme of under-appreciated obstacles to project success, getting data into the correct format and visualizing machine learning models and their outputs is chronically undervalued. To help with data loading, a new `read_file` method reads a large array of standard formats, including `.csv`, `.json`, `.xml` and `.shp` files in common object detection formats like COCO and Pascal VOC. DeepForest 2.0 adopts the `supervision` package from Roboflow to help smoothly create plots of images, annotations and model outputs. Where possible we look to create connections with other open source packages, and `supervision` will help us support a range of annotation formats, including points, polygons and videos. Check out their documentation and examples!

5. Deprecations and Name Changes
DeepForest prioritizes simple functions to complete common tasks. Users often need to convert predictions overlaid on images into geographic coordinates on the earth's surface. We have revamped this workflow and introduced two functions, `image_to_geo_coordinates` and `geo_to_image_coordinates`, to replace the old `boxes_to_shapefile` workflow.

All 1.0 visualization functions have been deprecated, including `plot_predictions()`. Please use `plot_results()`.

A full list of migrations is here:
https://github.com/weecology/DeepForest/blob/main/HISTORY.md#version-200rc1-date-october-21-2025
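As a rough illustration of the visualization migration, the sketch below predicts on one image and plots the output with `plot_results()` instead of the deprecated `plot_predictions()`. The helper `predict_and_plot`, the repository name and the `path` keyword are assumptions for illustration; consult the migration list above for the authoritative changes.

```python
# Hypothetical helper showing the plot_results() replacement for 1.0 plotting.
# The repo name and keyword arguments are assumptions, not confirmed API.
def predict_and_plot(image_path: str, model_name: str = "weecology/deepforest-tree"):
    """Predict boxes for one image and display them with plot_results()."""
    # Imported lazily so this file can be imported without deepforest installed.
    from deepforest import main
    from deepforest.visualize import plot_results

    model = main.deepforest()
    model.load_model(model_name=model_name)
    # Predictions come back as a table with one row per detected object.
    results = model.predict_image(path=image_path)
    # Replaces the deprecated 1.0 call to plot_predictions().
    plot_results(results)
    return results
```

Returning the results table keeps it available for later steps such as converting image coordinates to geographic coordinates.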
6. Command Line Tools
Not all of our users are comfortable writing custom scripts, and there are tasks like training, prediction and evaluation which require writing more-or-less the same code each time. You can now use the `deepforest` command to perform these actions, which also supports config overrides. You can also look at the command line script as a reference example that you could modify in your own code. The command can train a model with the default configuration or with your own config.
If your configuration has `validation.csv_file` and `validation.root_dir`, running `deepforest evaluate` will run our standard detection metrics on that validation set.
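For scripted runs, the command line can be assembled programmatically. The sketch below builds an argument list for the `deepforest` command, assuming that config overrides are passed Hydra-style as `key=value` arguments and that the subcommands are named `train`, `predict` and `evaluate` (only `deepforest evaluate` is confirmed by this announcement). The helper `build_command` and the file names are hypothetical.

```python
import shlex

# Assumed subcommand names; only "evaluate" is confirmed by the announcement.
ACTIONS = {"train", "predict", "evaluate"}


def build_command(action: str, overrides: dict[str, object]) -> list[str]:
    """Build an argument list of the form: deepforest <action> key=value ..."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Each override becomes one key=value argument (assumed Hydra-style syntax).
    return ["deepforest", action] + [f"{key}={value}" for key, value in overrides.items()]


train_cmd = build_command(
    "train",
    {
        "batch_size": 8,
        "train.csv_file": "annotations.csv",  # placeholder file name
        "train.root_dir": "images",           # placeholder folder name
    },
)
# Print the equivalent shell command for copy and paste.
print(shlex.join(train_cmd))
```

The list can be handed to `subprocess.run(train_cmd, check=True)` once DeepForest is installed. The option for pointing the tool at your own config file is not shown here; `deepforest --help` should list the options available in your version.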