Fix scheduler stepping logic and checkpoint state persistence in PR#11 by Copilot · Pull Request #12 · thinksyncs/YOLOZU

Copilot · 2026-02-06T09:16:30Z

Code review of PR#11 revealed three critical correctness issues in the optimizer/scheduler factory implementation that would cause incorrect training behavior.

Issues Fixed

1. LinearWarmupWrapper dual counter desynchronization

The wrapper maintained both _step_count and inherited last_epoch, causing warmup phase to be off by one step. Now uses only last_epoch from parent _LRScheduler.

Before:

def __init__(self, ...):
    self._step_count = 0  # Custom counter
    super().__init__(optimizer, last_epoch)  # Parent has last_epoch=-1
    
def get_lr(self):
    if self._step_count < self.warmup_steps:  # Uses custom counter
        alpha = self._step_count / self.warmup_steps

After:

def __init__(self, ...):
    # Removed self._step_count
    super().__init__(optimizer, last_epoch)
    
def get_lr(self):
    current_step = self.last_epoch + 1  # Uses parent counter consistently
    if current_step < self.warmup_steps:
        alpha = current_step / self.warmup_steps

2. EMA shadow weights lost on checkpoint resume

EMA.state_dict() existed but was never called. Training resume would reinitialize EMA from current weights, losing accumulated shadow.

Added scheduler and ema parameters to save_checkpoint_bundle() and load_checkpoint_into()
Reorganized initialization: EMA and scheduler created before checkpoint resume
Both save_checkpoint_bundle() call sites updated

3. Scheduler state not persisted

Similar to EMA - scheduler would restart from step 0 on resume, breaking learning rate schedule continuity.

Testing

All 12 unit tests pass
Ruff linting clean
CodeQL: 0 vulnerabilities

Files Modified

rtdetr_pose/rtdetr_pose/sched_factory.py - Fixed LinearWarmupWrapper logic
rtdetr_pose/tools/train_minimal.py - Added checkpoint persistence for scheduler/EMA state

Original prompt

review PR#11

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: thinksyncs <42225585+thinksyncs@users.noreply.github.com>

Copilot

Pull request overview

This PR attempts to fix three critical issues identified in PR#11's optimizer and scheduler factory implementation: dual counter desynchronization in LinearWarmupWrapper, missing EMA state persistence in checkpoints, and missing scheduler state persistence in checkpoints.

Changes:

Added checkpoint persistence for scheduler and EMA state by extending save_checkpoint_bundle() and load_checkpoint_into() functions
Reorganized initialization order to create scheduler and EMA before checkpoint resume to enable state loading
Removed dual counter from LinearWarmupWrapper to use only the inherited last_epoch counter

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

File	Description
rtdetr_pose/rtdetr_pose/sched_factory.py	New scheduler factory module with LinearWarmupWrapper, build_scheduler(), and EMA class implementation
rtdetr_pose/tools/train_minimal.py	Added scheduler/EMA parameters to checkpoint functions, integrated factory scheduler/EMA into training loop, added CLI arguments for scheduler/EMA configuration
SECURITY_SUMMARY.md	Added security review summary documenting CodeQL scan results and security considerations for PR#11
PR11_REVIEW_FIXES.md	Documentation describing the three issues found in PR#11 and the fixes applied

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

rtdetr_pose/rtdetr_pose/sched_factory.py

rtdetr_pose/tools/train_minimal.py

rtdetr_pose/rtdetr_pose/sched_factory.py

Initial plan

11c9ed2

Copilot AI assigned Copilot and thinksyncs Feb 6, 2026

Copilot started work on behalf of thinksyncs February 6, 2026 09:16 View session

Copilot AI and others added 2 commits February 6, 2026 09:28

Complete PR#11 review with documented fixes

ae94c1f

Co-authored-by: thinksyncs <42225585+thinksyncs@users.noreply.github.com>

Update security summary with PR#11 review results

0aac165

Copilot AI changed the title ~~[WIP] Review changes in pull request #11~~ Fix scheduler stepping logic and checkpoint state persistence in PR#11 Feb 6, 2026

Copilot AI requested a review from thinksyncs February 6, 2026 09:33

Copilot finished work on behalf of thinksyncs February 6, 2026 09:33

thinksyncs marked this pull request as ready for review February 6, 2026 09:40

Copilot AI review requested due to automatic review settings February 6, 2026 09:40

Copilot started reviewing on behalf of thinksyncs February 6, 2026 09:40 View session

Copilot AI reviewed Feb 6, 2026

View reviewed changes

thinksyncs approved these changes Feb 6, 2026

View reviewed changes

Merge branch 'main' into copilot/review-pr-11

08f9f87

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix scheduler stepping logic and checkpoint state persistence in PR#11#12

Fix scheduler stepping logic and checkpoint state persistence in PR#11#12
Copilot wants to merge 4 commits intomainfrom
copilot/review-pr-11

Copilot AI commented Feb 6, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Copilot AI commented Feb 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issues Fixed

1. LinearWarmupWrapper dual counter desynchronization

2. EMA shadow weights lost on checkpoint resume

3. Scheduler state not persisted

Testing

Files Modified

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Feb 6, 2026 •

edited

Loading