Add CropModel backbone factory and multi-architecture support#1313
bw4sz merged 7 commits into weecology:main
Conversation
I love that you not only added a model, but also took on improving the general CropModel structure, which is not used much. Let's have a look at the tests.
Codecov Report: ❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1313 +/- ##
==========================================
+ Coverage 87.35% 87.42% +0.06%
==========================================
Files 24 24
Lines 2981 3037 +56
==========================================
+ Hits 2604 2655 +51
- Misses 377 382 +5
Flags with carried forward coverage won't be shown.
Friendly reminder that we decided to host this in the weecology huggingface org. @Ritesh313 - I just invited you and you should have sufficient permissions to create your own repos and write to them. Let me know if anything isn't working.
Thanks both! A few other things I added in this latest push:
Based on the discussion in #1115, my understanding is that this PR should serve as a reference example for user-contributed CropModels. That's why I made the structural changes to
@bw4sz, on the tests, are you looking for more unit test coverage of the new
Unit tests —
I'm unconcerned about coverage compared to making sure that the model reflects the evaluated performance. There are so many details in how inference versus training happen that it's quite easy to have some difference in preprocessing that can quietly corrode expected accuracy; we found one of these just yesterday (#1310). As a first step, can you produce a few predictions from this PR, using CropModel.load_from_checkpoint and the validation data from your training, to illustrate that the model produces predictions as intended or that match your expected performance?
You were right, I found a similar preprocessing issue during testing. The models were trained with nearest-neighbor resize, but CropModel uses bilinear by default. On the held-out test set (7,196 samples), bilinear gives 34% accuracy while nearest gives 87%, matching my training pipeline results (species 86.9%, genus 87.8%). I've made the resize interpolation a configurable parameter so the right mode gets used automatically, and now the results match. But this change touches a few files, so it needs more testing; I'll get back to this and update the PR. If you think this change should be a separate PR and issue, let me know.
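The mismatch described above can be sketched in one dimension without torchvision (a toy illustration, not the library's actual resize code): nearest-neighbor copies source pixels and keeps hard edges intact, while (bi)linear interpolation introduces intermediate values that a classifier trained on nearest-resized crops never saw during training.

```python
# Toy 1-D illustration (not DeepForest code) of why resize interpolation
# matters for small crops: nearest keeps edges sharp, linear smooths them.

def resize_nearest(row, new_len):
    """1-D nearest-neighbor resize: pick the closest source pixel."""
    scale = len(row) / new_len
    return [row[min(int(i * scale), len(row) - 1)] for i in range(new_len)]

def resize_linear(row, new_len):
    """1-D linear resize (the 1-D analogue of bilinear), align-corners style."""
    if new_len == 1:
        return [row[0]]
    scale = (len(row) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        x = i * scale
        lo = int(x)
        hi = min(lo + 1, len(row) - 1)
        frac = x - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out

edge = [0, 0, 100, 100]  # a hard edge, like a crown boundary in a ~40 px crop
print(resize_nearest(edge, 8))  # only 0s and 100s: the edge stays sharp
print(resize_linear(edge, 8))   # intermediate values appear: the edge is blurred
```

On a real 224-px model input built from a ~40-px crop, that smoothing is applied to every edge in the image, which is consistent with the large accuracy gap reported above.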
No, that's fine as part of this PR; it helps connect the 'why' of what was changed. I want to double-check your numbers there: 87% accuracy for RGB species? That seems really high, and looking at https://github.com/GatorSense/NeonTreeClassification/blob/main/docs/training.md#baseline-results I was expecting much lower.
@Ritesh313 #1310 has been merged and I believe the accuracy would be different now. Since CropModel was initially saving crops as BGR, it was impacting the accuracy negatively, as inspected, and was also delivering undesired results as mentioned here earlier. @Ritesh313 the change has been pushed to main, so I suggest you check the results again for the below:
Aside from that, I will also do my own tests based on the contents discussed here.
I've gone through the full CropModel prediction pipeline step by step and pushed the changes. The high training number is indeed wrong; it was caused by data leakage (addressed now), and the actual accuracy is around 48%. The BGR issue doesn't affect the inference pipeline since no crops are saved to disk there (rasterio reads → transform → model). The main thing I found was that torchvision.transforms.Resize defaults to bilinear interpolation, but the NEON models were trained with nearest-neighbor (via the NTC pipeline). That mismatch was silently hurting accuracy, so I made resize_interpolation a configurable parameter. The library defaults to bilinear for backward compatibility, and the NEON HuggingFace configs specify nearest, so they load it automatically. I verified this by loading the HF model and evaluating on my val set; it reproduces the same accuracy as my training pipeline, so the end-to-end flow is working correctly now. Other than that, the rest of the pipeline checked out fine.
Great. I'm looking at the code, and I want to download it and give it a try against new data to see what kind of detail we get.
@ethanwhite what's your opinion here? Part of the deepforest narrative is to find architectures from users. I don't think any user will care that it's 'neon' or that it's 'resnet-18'; I think we want to keep the idea that this is generic, or at least a generic place to start finetuning. The metadata and model card are where this information lives. I proposed cropmodel-tree-species and cropmodel-tree-genus. @Ritesh313 I don't think it should be difficult to find-and-replace and re-upload.
Makes sense to keep it simple for now. I was thinking of including the data source in the name in case we eventually get user-contributed models trained on different datasets, something like
I'll also update these to the latest models trained on clean data (removed duplicates and rare species).
I don't feel strongly so I'm fine with this change. Generally I think that most users won't look at a particular name on HF, they'll look at a docs page that we have on contributed models, identify the one they want, and copy and paste the code for loading it. |
@Ritesh313 am I waiting on retraining here, or should I merge these?
You can merge this. I can update the model on HF directly.
- Add `_CROP_BACKBONES` registry and `create_crop_backbone()` factory
- `CropModel.create_model()` accepts an `architecture` parameter
- Rewrite `from_pretrained()` to read config before loading weights
- Add `architecture` field to config.yaml and schema.py
- Add NEON species/genus model docs to 02_prebuilt.md
- Add CropModel contribution guide to CONTRIBUTING.md
- Add backbone factory tests to test_crop_model.py
- Add NEON crop test images

Closes weecology#1115
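The registry/factory pattern named in this commit can be sketched as follows. The names `_CROP_BACKBONES` and `create_crop_backbone()` come from the PR, but the signatures and the `register_backbone` helper here are illustrative, not DeepForest's actual API; real builders would return torchvision models rather than dicts.

```python
# Hedged sketch of a backbone registry + factory. Builders are stand-ins;
# in the real library they would construct torchvision models.
_CROP_BACKBONES = {}

def register_backbone(name):
    """Decorator that adds a builder function to the registry under `name`."""
    def wrap(builder):
        _CROP_BACKBONES[name] = builder
        return builder
    return wrap

@register_backbone("resnet50")
def _resnet50(num_classes):
    return {"arch": "resnet50", "num_classes": num_classes}

@register_backbone("resnet18")
def _resnet18(num_classes):
    return {"arch": "resnet18", "num_classes": num_classes}

def create_crop_backbone(architecture, num_classes):
    """Look up a registered builder and construct the backbone."""
    try:
        builder = _CROP_BACKBONES[architecture]
    except KeyError:
        raise ValueError(
            f"Unknown architecture {architecture!r}; "
            f"choose from {sorted(_CROP_BACKBONES)}"
        )
    return builder(num_classes)

model = create_crop_backbone("resnet18", num_classes=12)
print(model["arch"])  # resnet18
```

A registry keeps the default (`resnet50`) path untouched while letting contributors add architectures with one decorated function, which matches the backward-compatibility goal stated in the PR description.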
- Move NEON ResNet-18 model references from ritesh313/ to the weecology/ HuggingFace org (weecology/cropmodel-neon-resnet18-species, weecology/cropmodel-neon-resnet18-genus)
- Add tests/validate_cropmodel.py: runs DeepForest detection + CropModel classification on a bundled tile and checks predictions are non-trivial (multiple classes, confidence > 0.3, no NaN labels)
- Update .gitignore: add .venv/ and uv.lock
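The "non-trivial predictions" checks this commit describes can be sketched without running the models. This is a hedged, simplified version of the idea behind tests/validate_cropmodel.py, operating on a plain list of (label, score) pairs rather than the real prediction dataframe; the function name and thresholds mirror the commit message but are illustrative.

```python
import math

def validate_predictions(preds, min_confidence=0.3, min_classes=2):
    """Raise AssertionError if predictions look degenerate.

    preds: list of (label, score) pairs, a stand-in for the real output frame.
    """
    labels = [label for label, _ in preds]
    scores = [score for _, score in preds]
    # No NaN/None labels.
    assert not any(
        label is None or (isinstance(label, float) and math.isnan(label))
        for label in labels
    ), "NaN/None labels present"
    # Model should not collapse to a single class on a diverse tile.
    assert len(set(labels)) >= min_classes, "model predicts a single class"
    # At least one prediction should clear the confidence floor.
    assert max(scores) > min_confidence, "no prediction above confidence floor"
    return True

preds = [("PSMEM", 0.91), ("TSCA", 0.84), ("PSMEM", 0.35)]
print(validate_predictions(preds))  # True
```

Checks like these catch the failure mode discussed earlier in the thread, where a preprocessing mismatch quietly degrades outputs without raising any error.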
The CropModel resize was hardcoded to bilinear (the torchvision default), but some models need nearest-neighbor; e.g. the NEON species/genus models are trained on small crops (~40 px) where bilinear smoothing kills accuracy. This adds a `resize_interpolation` config option that gets threaded through training, inference, and the bounding box dataset. The default is bilinear so nothing changes for existing users. The NEON HF models already have `resize_interpolation: nearest` in their config.json so they pick it up automatically.

- Add `resize_interpolation` to cropmodel config, schema, and all transform paths
- Thread it through predict.py -> BoundingBoxDataset -> bounding_box_transform()
- Extend test_crop_model_configurable_resize to cover interpolation mode
- Update docs: 02_prebuilt, 03_cropmodels, CONTRIBUTING config example
Force-pushed from db6f241 to afaf4cb
bw4sz left a comment:
Ready when the pre-commit passes.
Please rebase on main and that should be fixed.
Description
Adding pretrained RGB models from NeonTreeClassification. There are some setup decisions for CropModel that need to be discussed. I have mentioned these at the end.
Where applicable, I've noted the specific problem or reason for each change in the solution below.
Main functionality added: Support for using a species and genus level classifier on DeepForest crowns.
Solution:
Backbone support in CropModel. CropModel is hardcoded to resnet50, so users cannot submit models with other architectures (e.g. resnet18). Uploaded 2 resnet18-based models (one at species level, one at genus) to HuggingFace. Added support for different backbones in src/deepforest/model.py. The new setup replaces the `simple_resnet_50()` method while still supporting the older functionality as is. Major changes are in model.py and should be straightforward to follow. The main addition is support for multiple backbones:

- `_CROP_BACKBONES` registry
- `create_crop_backbone()` factory
- `CropModel.create_model()` updated
- `from_pretrained()` rewritten
- `push_to_hub_in_memory()` updated

Tests. Added tests verifying the default matches the current behavior (based on `simple_resnet_50()`), plus assert tests for the output shape of the new architecture. See tests/test_crop_model.py.

Docs. Added instructions for using these models in docs/user_guide/02_prebuilt.md, as mentioned in the issue.
Contributing guide. Added the setup I followed for adding the model to CONTRIBUTING.md. This will need some discussion, let me know your thoughts.
Config. Added support for an `architecture` parameter in config.yaml and schema.py. It defaults to `resnet50` so things are compatible with the existing setup.

Sample images. Added 3 sample images for classification testing: NEON_PSMEM_crop.png, NEON_TSCA_crop.png, NEON_PICOL_crop.png.

Formatting. Some purely formatting changes were made in code I didn't functionally touch, specifically in tests/test_crop_model.py.
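The `from_pretrained()` rewrite listed in the solution above, reading the config before loading weights, can be sketched like this. The key names (`architecture`, `resize_interpolation`) and their backward-compatible defaults follow this PR, but the function is illustrative, not DeepForest's actual implementation; real code would then call the backbone factory and load the checkpoint's state dict.

```python
import json
import os
import tempfile

def read_model_config(config_path):
    """Read a HuggingFace-style config.json, applying backward-compatible defaults.

    Hypothetical helper: reading config before instantiating the model is what
    lets the right architecture and resize mode be chosen automatically.
    """
    with open(config_path) as f:
        config = json.load(f)
    return {
        "architecture": config.get("architecture", "resnet50"),
        "resize_interpolation": config.get("resize_interpolation", "bilinear"),
    }

with tempfile.TemporaryDirectory() as tmp:
    # A repo that predates the new fields still resolves to the old defaults:
    old = os.path.join(tmp, "config.json")
    with open(old, "w") as f:
        json.dump({"num_classes": 12}, f)
    print(read_model_config(old))
    # {'architecture': 'resnet50', 'resize_interpolation': 'bilinear'}

    # A new-style repo (like the NEON models) declares both fields:
    new = os.path.join(tmp, "config2.json")
    with open(new, "w") as f:
        json.dump({"architecture": "resnet18", "resize_interpolation": "nearest"}, f)
    print(read_model_config(new))
    # {'architecture': 'resnet18', 'resize_interpolation': 'nearest'}
```

Defaulting missing keys to `resnet50`/bilinear is what makes previously uploaded checkpoints keep working unchanged, which is the compatibility claim the description makes.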
Testing: 5 new tests added. Full test suite: 304 passed, 1 xfailed, 0 failures. Pre-commit: all hooks passed.
Breaking changes: None. All defaults are `resnet50` so things are compatible with the existing setup.

Discussion points:
Restricted to torchvision backbones. I think it keeps things simple for now. We can consider expanding if there's more interest down the line. Let me know if another approach is preferred.
How do we want to manage documentation for user-contributed classifiers? I'm not sure if the heading in 02_prebuilt.md is too generic for the long term. Do we want to keep adding model details here on this page? Just thinking about whether it might get cluttered with multiple models over time. CONTRIBUTING.md will change accordingly.
I have not added integration tests that load models from HuggingFace. A few options: a test that only runs when explicitly requested (not part of the default test suite, since it downloads from HuggingFace), no in-repo tests for contributed models, or something else. I'd like your input.
Related Issue(s)
Closes #1115
AI-Assisted Development
AI tools used (if applicable): GitHub Copilot (Claude)