
increase timeout in skip-to-end script#167

Merged
jhutar merged 1 commit into
redhat-performance:mainfrom
NewtonChutney:main
Apr 1, 2026


@NewtonChutney
Contributor

Claude analyzed the cause of the failure in the skip_to_end script:

Log of CPT:

> script-skip-to-end.py --kafka-timeout 24000
2026-03-30 16:01:05,062 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:06,064 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:07,067 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:08,069 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:09,071 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:10,073 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:11,076 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:12,077 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:13,079 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:14,081 root MainThread WARNING Retrying as seek to end failed with: No partitions are currently assigned
2026-03-30 16:01:15,082 root MainThread ERROR Out of attempts when trying to seek to end

Claude's analysis

The issue is clear. The consumer calls subscribe() (line 115), then poll() + seek_to_end() in a retry loop
(10 attempts, 1s apart). But subscribe() is async — partition assignment happens in the background via the
consumer group coordinator. With only 10 retries at 1s each, the consumer doesn't get enough time to receive
its partition assignment from the ephemeral Kafka broker, which is likely slower than the production
cluster
..
The cleanest fix is in skip_to_end.py — increase the poll timeout and
retry count. Want me to update opl/opl/skip_to_end.py to give more time for partition assignment (e.g.,
increase poll(timeout_ms=0) to poll(timeout_ms=5000) and bump retries from 10 to 30)?
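
For illustration, the proposed change could look roughly like the sketch below. It is not the code merged in this PR: the function name `seek_to_end_with_retry` and its parameters are hypothetical, and the consumer is assumed to expose `poll(timeout_ms=...)`, `assignment()` and `seek_to_end()` in the way kafka-python's `KafkaConsumer` does. The defaults mirror the values suggested in the analysis (5000 ms poll timeout, 30 attempts).

```python
import logging
import time


def seek_to_end_with_retry(consumer, attempts=30, poll_timeout_ms=5000, delay_s=1.0):
    """Poll until partitions are assigned, then seek to the end.

    Hypothetical helper: `consumer` is assumed to behave like
    kafka-python's KafkaConsumer after subscribe() has been called.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, attempts + 1):
        # poll() drives the consumer group join; a non-zero timeout gives
        # the coordinator time to hand out a partition assignment.
        consumer.poll(timeout_ms=poll_timeout_ms)
        if consumer.assignment():
            # Partitions are assigned, so seek_to_end() has something to act on.
            consumer.seek_to_end()
            return attempt
        logging.warning(
            "No partitions assigned yet (attempt %d/%d)", attempt, attempts
        )
        time.sleep(delay_s)
    raise TimeoutError("Out of attempts when trying to seek to end")
```

Checking `assignment()` before seeking avoids relying on the "No partitions are currently assigned" failure as control flow. With these defaults the worst case wait is about 30 × (5 s + 1 s) = 180 s, compared with roughly 10 s in the log above.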

@jhutar jhutar merged commit e4a2695 into redhat-performance:main Apr 1, 2026
1 check passed
