-
Notifications
You must be signed in to change notification settings - Fork 0
fix: use created_at as canonical anchor for stale-claim detection #14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -62,6 +62,9 @@ class Commit: | |
| output_ref: ObjectRef | None = None | ||
| parent_hash: str | None = None | ||
| status: TaskStatus = TaskStatus.PENDING | ||
| # created_at is the canonical anchor for task lifetime; it never changes | ||
| # and is used for stale-claim detection so that pending tasks cannot hide | ||
| # behind a recent heartbeat. | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. P3 Low The comment on |
||
| created_at: datetime = field(default_factory=lambda: datetime.now(UTC)) | ||
| claimed_at: datetime = field(default_factory=lambda: datetime.now(UTC)) | ||
| error: str | None = None | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -299,6 +299,27 @@ def slow() -> int: | |
| with pytest.raises(TaskError, match="TimeoutError"): | ||
| client.submit(slow, _timeout=0.01) | ||
|
|
||
| def test_old_created_at_trumps_recent_claimed_at(self, store_dir: Path) -> None: | ||
|
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. P2 Medium The test |
||
| import cashet.dag as dag | ||
| import cashet.hashing as hashing | ||
| from cashet.models import TaskStatus | ||
|
|
||
| client = Client(store_dir=store_dir) | ||
|
|
||
| def work() -> int: | ||
| return 42 | ||
|
|
||
| task_def = hashing.build_task_def(work, (), {}) | ||
| input_refs = dag.resolve_input_refs((), {}) | ||
| commit = dag.build_commit(task_def, input_refs) | ||
| commit.status = TaskStatus.RUNNING | ||
| commit.created_at = datetime.now(UTC) - timedelta(seconds=400) | ||
| commit.claimed_at = datetime.now(UTC) - timedelta(seconds=5) | ||
| client.store.put_commit(commit) | ||
|
|
||
| ref = client.submit(work) | ||
| assert ref.load() == 42 | ||
|
|
||
| def test_running_claim_lookup_is_not_limited_to_1000_rows(self, store_dir: Path) -> None: | ||
| import cashet.dag as dag | ||
| import cashet.hashing as hashing | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P0 Critical The change from
commit.claimed_attocommit.created_atcorrectly ensures that stale-claim detection uses the task's immutable creation time. However, consider adding a safeguard for edge cases wherecreated_atmight be in the future (e.g., clock skew). While not present in current code, defensive checking could prevent unexpected behavior. This is a minor concern; the logic is otherwise sound.