Fix deadlock in BatchSamplerShard with drop_last=True #1

Open
Kabir08 wants to merge 1 commit into main from fix/dataloader-epoch-deadlock-multi-task


Kabir08 commented Feb 25, 2026

This fix addresses issue huggingface#3814, where a deadlock could occur when using custom batch samplers (generators) with drop_last=True in multi-process scenarios.

…h samplers without __len__


Changes:
- Added logic to determine last_batch_idx from the batch sampler; when the batch
  sampler doesn't define __len__, last_batch_idx is set to -1
- Added immediate-yield logic when drop_last=True and either:
  1. The batch sampler doesn't define __len__ (generator-based), OR
  2. The current batch is the last one and it is incomplete

Also added test_batch_sampler_with_drop_last_no_len() to verify the fix works
correctly with custom batch samplers that don't have __len__.
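
For reviewers, the two changes described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `GeneratorBatchSampler`, `resolve_last_batch_idx`, and `should_yield_immediately` are hypothetical names, and the predicate is a literal transcription of the two conditions listed in the commit message.

```python
from typing import Iterator, List, Sequence


class GeneratorBatchSampler:
    """Hypothetical batch sampler that yields batches lazily and defines no __len__."""

    def __init__(self, num_samples: int, batch_size: int) -> None:
        self.num_samples = num_samples
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[List[int]]:
        for start in range(0, self.num_samples, self.batch_size):
            stop = min(start + self.batch_size, self.num_samples)
            yield list(range(start, stop))


def resolve_last_batch_idx(batch_sampler) -> int:
    """Return the index of the final batch, or -1 when the length is unknown."""
    try:
        return len(batch_sampler) - 1
    except TypeError:
        # len() raises TypeError for objects without __len__ (e.g. generators).
        return -1


def should_yield_immediately(
    drop_last: bool,
    last_batch_idx: int,
    idx: int,
    batch: Sequence[int],
    batch_size: int,
) -> bool:
    """Assumed form of the condition: drop_last and (unknown length OR incomplete last batch)."""
    if not drop_last:
        return False
    length_unknown = last_batch_idx == -1
    incomplete_last = idx == last_batch_idx and len(batch) < batch_size
    return length_unknown or incomplete_last
```

With a generator-based sampler the length is unknown, so `resolve_last_batch_idx` returns -1 and the predicate holds for every batch when drop_last=True. With a sized sampler it holds only for an incomplete final batch.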