Add keypoint dataset class #1311

Merged
bw4sz merged 1 commit into weecology:main from jveitchmichaelis:keypoint_dataset
Mar 25, 2026
Conversation

@jveitchmichaelis
Collaborator

@jveitchmichaelis jveitchmichaelis commented Feb 17, 2026

Description

This PR adds a keypoint dataset that returns either (x, y) coordinates or a 2D density mask as the target. The labels dict entry is consistent with transformers semantic segmentation models. This should cover the majority of the models that we might want to train. The structure is essentially the same as BoxDataset.

The diff is better than it looks. I've pulled out some duplicate logic from the box dataset and made a base class, which will be used for polygons as well. Most of the additions are in the test suite.
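To make the shape of the change concrete, here is a minimal sketch of a shared base class with a keypoint subclass that returns either points or a dense target under a labels key. It is not the code in this PR: Annotation, BaseDataset, build_target, the density flag and the "points" key are all hypothetical names, and plain lists stand in for tensors so that the example has no dependencies.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """One labelled keypoint in pixel coordinates (hypothetical container)."""
    x: float
    y: float
    label: int


class BaseDataset:
    """Shared indexing logic that box, keypoint and polygon datasets could reuse."""

    def __init__(self, annotations_by_image, image_size):
        # Mapping of image name -> list of Annotation
        self.annotations_by_image = annotations_by_image
        self.image_names = sorted(annotations_by_image)
        self.height, self.width = image_size

    def __len__(self):
        return len(self.image_names)

    def __getitem__(self, idx):
        name = self.image_names[idx]
        return name, self.build_target(self.annotations_by_image[name])

    def build_target(self, annotations):
        # Each geometry type supplies its own target format.
        raise NotImplementedError


class KeypointDataset(BaseDataset):
    """Returns either raw (x, y) points or a per-pixel count map."""

    def __init__(self, annotations_by_image, image_size, density=False):
        super().__init__(annotations_by_image, image_size)
        self.density = density

    def build_target(self, annotations):
        if not self.density:
            return {
                "points": [(a.x, a.y) for a in annotations],
                "labels": [a.label for a in annotations],
            }
        # Dense target: one count per pixel, stored under "labels".
        mask = [[0.0] * self.width for _ in range(self.height)]
        for a in annotations:
            # Round to the nearest pixel and clamp to the image bounds.
            row = min(max(int(round(a.y)), 0), self.height - 1)
            col = min(max(int(round(a.x)), 0), self.width - 1)
            mask[row][col] += 1.0
        return {"labels": mask}
```

With this layout a box or polygon dataset would only need to override build_target, which is the kind of deduplication the description refers to.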

Models now have a task attribute which is used to load the correct dataset type (e.g. all current models have task = box). I think this is cleaner than having the user specify this via config, since it should be obvious by the choice of architecture.
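As a rough illustration of the task-based lookup, the sketch below maps a model's task attribute to a dataset class. Only the attribute name task and the value "box" come from the description above; the "keypoint" value, the registry dict, the helper function and the stand-in classes are assumptions made for the example.

```python
class BoxDataset:
    """Stand-in for the existing box dataset."""


class KeypointDataset:
    """Stand-in for the dataset added in this PR."""


# Hypothetical registry; the "keypoint" task name is an assumption.
DATASET_FOR_TASK = {
    "box": BoxDataset,
    "keypoint": KeypointDataset,
}


class BoxModel:
    """Stand-in for a current detection model."""
    task = "box"


class PointModel:
    """Stand-in for a future keypoint model."""
    task = "keypoint"


def dataset_class_for(model):
    """Pick the dataset class from the model's task attribute."""
    task = getattr(model, "task", None)
    try:
        return DATASET_FOR_TASK[task]
    except KeyError:
        raise ValueError(f"No dataset registered for task {task!r}") from None
```

Because the architecture fixes the task, the user never has to keep a config value in sync with the model choice.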

Training is not supported, but I'd like to add that separately to reduce review load.

Related Issue(s)

#809
Supersedes #1182 as features from there are migrated.

AI-Assisted Development

Some suggested tests and refactoring.

  • I used AI tools (e.g., GitHub Copilot, ChatGPT, etc.) in developing this PR
  • I understand all the code I'm submitting
  • I have reviewed and validated all AI-generated code

AI tools used (if applicable):

Claude Code, Gemini

@codecov

codecov bot commented Feb 18, 2026

Codecov Report

❌ Patch coverage is 87.85714% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.87%. Comparing base (884502e) to head (18fe41b).
⚠️ Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/deepforest/datasets/training.py | 88.97% | 14 Missing ⚠️ |
| src/deepforest/main.py | 62.50% | 3 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1311      +/-   ##
==========================================
- Coverage   87.35%   86.87%   -0.49%     
==========================================
  Files          24       24              
  Lines        2981     3176     +195     
==========================================
+ Hits         2604     2759     +155     
- Misses        377      417      +40     

| Flag | Coverage Δ |
|---|---|
| unittests | 86.87% <87.85%> (-0.49%) ⬇️ |

@jveitchmichaelis
Collaborator Author

Coverage looks OK. There are some edge cases that we could add to the unit tests later, and main coverage can be ignored for now since we don't have models in yet.

@jveitchmichaelis jveitchmichaelis marked this pull request as ready for review February 18, 2026 00:33
@jveitchmichaelis jveitchmichaelis marked this pull request as draft February 18, 2026 02:13
@jveitchmichaelis
Collaborator Author

Pausing review for a moment while I check some naming consistencies with utilities.

@bw4sz
Collaborator

bw4sz commented Mar 12, 2026

@jveitchmichaelis can you either take me off the review request or take this out of draft? I'm trying to organize the issue queue.

@jveitchmichaelis jveitchmichaelis removed the request for review from bw4sz March 12, 2026 17:09
@jveitchmichaelis
Collaborator Author

Minor change here - the density map is normalized to sum to the object count for each class.
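A minimal sketch of what this normalization could look like, assuming Gaussian kernels and a rescale of each class channel after all points have been drawn. The function name, the sigma parameter and the rescale-per-channel approach are illustrative guesses rather than the implementation in this PR, and nested lists stand in for arrays.

```python
import math


def density_maps(points, labels, num_classes, height, width, sigma=1.5):
    """Draw one Gaussian per point, then rescale each class channel so it sums to its count."""
    maps = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]

    # Accumulate an unnormalized Gaussian bump for every point.
    for (x, y), label in zip(points, labels):
        for row in range(height):
            for col in range(width):
                dist_sq = (col - x) ** 2 + (row - y) ** 2
                maps[label][row][col] += math.exp(-dist_sq / (2 * sigma ** 2))

    # Rescale per class so that the channel integrates to the object count.
    for c in range(num_classes):
        count = sum(1 for label in labels if label == c)
        total = sum(map(sum, maps[c]))
        if total > 0:
            scale = count / total
            maps[c] = [[value * scale for value in row] for row in maps[c]]
    return maps
```

Rescaling after truncation at the image border means a point near the edge still contributes a full count, so summing a channel recovers the number of objects of that class; channels with no points stay all zero.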

@jveitchmichaelis jveitchmichaelis marked this pull request as ready for review March 18, 2026 17:56
@jveitchmichaelis
Collaborator Author

@bw4sz this is good for review now, ignoring the pre-commit.ci failure.

@jveitchmichaelis jveitchmichaelis requested a review from bw4sz March 18, 2026 18:08
Collaborator

@bw4sz bw4sz left a comment

I'm happy with the /src. I always am unsure about the amount of tests. For example, do we need flip tests? I think it's fine and they are pretty cheap to run. I'm going to approve, but something for others to weigh in on as we think about the size of the codebase, which I sense is going to swell a lot over the next year.

@bw4sz
Collaborator

bw4sz commented Mar 18, 2026

Happy to merge once the pre-commit is solved.

@jveitchmichaelis
Collaborator Author

jveitchmichaelis commented Mar 18, 2026

Agree. I added the flip test to make sure that we have coverage for augmentations, since we haven't really tested them with points. Dataset tests are at least very fast, as you say.
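For readers wondering what such a test checks, here is a small sketch of a horizontal-flip test for points. It is not the test added in this PR: the helper names are hypothetical, and it assumes integer pixel-centre coordinates, so column c maps to width - 1 - c (libraries that treat coordinates as continuous may use width - x instead).

```python
def hflip_image(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def hflip_points(points, width):
    """Mirror (x, y) points, assuming integer pixel-centre coordinates."""
    return [(width - 1 - x, y) for x, y in points]


def test_hflip_keeps_points_on_marked_pixels():
    width, height = 6, 4
    points = [(1, 0), (4, 3)]

    # Mark each keypoint location in an otherwise empty image.
    image = [[0] * width for _ in range(height)]
    for x, y in points:
        image[y][x] = 1

    flipped_image = hflip_image(image)
    flipped_points = hflip_points(points, width)

    # The flipped points must still land on the marked pixels.
    assert flipped_points == [(4, 0), (1, 3)]
    assert all(flipped_image[y][x] == 1 for x, y in flipped_points)
    # Flipping twice returns the original coordinates.
    assert hflip_points(flipped_points, width) == points
```

The key assertion is that image and annotations are transformed consistently, which is the failure mode an augmentation pipeline is most likely to hit when points are introduced alongside boxes.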

@bw4sz
Collaborator

bw4sz commented Mar 19, 2026

Please rebase main and that should be fixed.

@jveitchmichaelis
Collaborator Author

Good to go when the CI passes

@bw4sz
Collaborator

bw4sz commented Mar 20, 2026

#1343 introduced a tiny conflict that I think you can handle right in GitHub. Then I will merge.

@jveitchmichaelis
Collaborator Author

jveitchmichaelis commented Mar 20, 2026

Rebased

@jveitchmichaelis
Collaborator Author

All good, first rebase had a small conflict

@bw4sz bw4sz merged commit 5d18f0a into weecology:main Mar 25, 2026
7 checks passed

Labels

None yet

Projects

None yet

2 participants