feat(spider-scheduler): Add scheduler crate skeleton with trait and type abstractions.#330
feat(spider-scheduler): Add scheduler crate skeleton with trait and type abstractions.#330LinZhihao-723 wants to merge 5 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (8)
WalkthroughThis pull request introduces the ChangesSpider Scheduler Core
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| /// Returns an error if: | ||
| /// | ||
| /// * [`SchedulerError::DispatchQueueClosed`] if the dispatching queue is closed. | ||
| async fn enqueue(&self, assignment: TaskAssignment) -> Result<(), SchedulerError>; |
There was a problem hiding this comment.
We should probably have a batched enqueue method for better performance.
There was a problem hiding this comment.
The current planned implementation won't benefit from a batch operation:
- The dispatch queue is implemented using async channel, meaning that all enqueue operations will be serialized.
- The scheduler decision maker pops assignments from the queue one by one; a batch operation means we need to construct/destruct vector on top of the popped results, which introduces unnecessary overhead.
Description
This PR introduces a new crate
spider-schedulerand lands the trait and type abstractions that the scheduler will be built on top of. No concrete implementations are included — those follow in subsequent PRs.Architecture
The scheduler is the serial decision maker that turns ready tasks discovered by the storage layer into assignments for execution managers. It owns placement and ordering policy, not dependency resolution: storage decides what is ready, and the scheduler decides in what order and with what throttling ready tasks are offered to the fleet.
The pipeline:
Trait seams
SchedulerStorageClient— the scheduler's view of storage. Three lane-specific polls (poll_ready,poll_commit_ready,poll_cleanup_ready) mirror storage'sReadyQueueReceiverHandlelanes; each returns(SessionId, Vec<InboundEntry>)so a stale-session batch can be detected downstream.job_state(JobId) -> JobStateexposes a read-only lookup for placement policies that gate on job lifecycle.SchedulerCore— the algorithm seam. Owns its decision loop: poll the inbound queue through its associatedStorageClient, apply the scheduling algorithm, and write assignments to its associatedSink. Generic over both, so a real algorithm and a mock can share the same runtime. The loop terminates when itstokio_util::sync::CancellationTokenis cancelled.DispatchQueueSink— the writer side of the dispatching queue.enqueue(TaskAssignment)awaits when the bounded queue is full, providing the back-pressure that throttles the core to fleet drain rate.bump_session_id(SessionId)advances the queue's current session and invalidates everything currently queued; the core calls it when it observes a strictly-higher session from a poll.DispatchQueueSource— the reader side, drained by the EM-facing service.dequeue() -> (SessionId, TaskAssignment)returns the next assignment paired with the session it was enqueued under, so the EM can compare against storage's current session at registration time and discard stale assignments.Checklist
breaking change.
Validation performed
Summary by CodeRabbit