tzel-operator: --dal-slot-range CLI flag for slot selection#28
Open
saroupille wants to merge 1 commit into
Previously the operator round-robined across all DAL slots reported by the protocol (0..number_of_slots from the /protocol_parameters endpoint), with no way to partition slot usage between multiple operators sharing a DAL node.

Add `--dal-slot-range START..END` (half-open, Rust idiom) to restrict the slot pool. Round-robin is preserved within the configured range.

Validation:
- Parse-time: clap value_parser rejects malformed input, empty ranges (start == end) and inverted ranges (start > end).
- Runtime: on each publish we re-validate `range.end <= number_of_slots` against the live DAL protocol parameters, and fail fast with a clear message if the operator was configured outside the protocol-allowed band, or if the protocol shrinks number_of_slots out from under us.

The retry loop in `publish_dal_chunk_with_protocol` now bounds attempts on the configured range width rather than `number_of_slots`. The shared AtomicU64 means that under concurrent publishes a single chunk attempt may revisit a slot or skip one before exhausting the range. This is no behavioural change relative to the pre-patch shared-counter scheme, but worth noting.

`select_slot_index` short-circuits on `span == 0` BEFORE bumping the counter, so a degenerate config (unreachable via the CLI but defensible at the API surface) does not burn counter ticks and break fairness for concurrent callers.

Tests: 2 table-driven tests on `parse_dal_slot_range` covering well-formed input (whitespace, max-u16-end) and the seven rejection paths (empty range, inverted, missing separator, non-numeric start/end, overflowing u16, empty string).

Default behaviour (no flag) is unchanged: full 0..number_of_slots round-robin.
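The two validation layers described above can be sketched roughly as follows. This is an illustration, not the code in the patch: the real parser is wired in as a clap `value_parser`, whereas this sketch is a plain function with a clap-compatible shape. The signature of `validate_dal_slot_range`, the `u16` type for `number_of_slots`, and all error messages are assumptions.

```rust
use std::ops::Range;

/// Parse-time check (sketch): accepts `START..END`, half-open, both ends u16.
/// Rejects a missing separator, non-numeric or overflowing bounds,
/// and empty or inverted ranges.
fn parse_dal_slot_range(raw: &str) -> Result<Range<u16>, String> {
    let (start, end) = raw
        .split_once("..")
        .ok_or_else(|| format!("expected START..END, got {raw:?}"))?;
    // Surrounding whitespace is tolerated, matching the whitespace test case.
    let start: u16 = start
        .trim()
        .parse()
        .map_err(|e| format!("invalid start {start:?}: {e}"))?;
    let end: u16 = end
        .trim()
        .parse()
        .map_err(|e| format!("invalid end {end:?}: {e}"))?;
    if start >= end {
        return Err(format!("range {start}..{end} is empty or inverted"));
    }
    Ok(start..end)
}

/// Runtime check (sketch): the configured range must fit inside the slot
/// count currently reported by the protocol. `None` (flag unset) always passes.
fn validate_dal_slot_range(
    range: Option<&Range<u16>>,
    number_of_slots: u16, // assumed type; the source does not state it
) -> Result<(), String> {
    match range {
        Some(r) if r.end > number_of_slots => Err(format!(
            "--dal-slot-range {}..{} exceeds number_of_slots ({})",
            r.start, r.end, number_of_slots
        )),
        _ => Ok(()),
    }
}
```

Because the runtime check takes the live `number_of_slots` as an argument, calling it on every publish also catches the case where the protocol later shrinks the slot count below the configured `END`.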
Why
`tzel-operator` round-robins DAL slot publication across all slots reported by the protocol (`0..number_of_slots` returned by `/protocol_parameters`). There's currently no way to partition slot usage between operators sharing a DAL node: every operator competes for every slot, and the only collision-handling lives in the inner retry loop (already proposed → next slot).

This lands the smallest CLI knob that enables external coordination: each operator picks a disjoint slot subrange.
What
New flag:
`--dal-slot-range START..END` (half-open, Rust idiom: `0..8` means slots `0..=7`). Default: unset → behaviour unchanged.

How
- `dal_slot_range: Option<Range<u16>>` with a custom `value_parser` (`parse_dal_slot_range`) that splits on `..` and rejects empty/inverted/malformed input at parse-time.
- `dal_slot_range: Option<Range<u16>>` on `OperatorConfig`.
- `select_slot_index` now computes `base + (counter % span)` where `(base, span)` derives from the configured range; falls back to `(0, number_of_slots)` when unset.
- `publish_dal_chunk_with_protocol` now bounds its retry loop on the range width (`dal_slot_attempt_budget`) instead of `number_of_slots`, and calls `validate_dal_slot_range` per publish, failing fast with a clear message if the configured range exceeds the protocol's `number_of_slots`.
- `select_slot_index` short-circuits on `span == 0` before the AtomicU64 `fetch_add`, so a degenerate config (defensible-only path, unreachable via the CLI parser) does not waste counter ticks.

Test plan
- `cargo build -p tzel-services --bin tzel-operator`: green (verified locally on top of `origin/main`).
- `cargo test -p tzel-services --bin tzel-operator parse_dal_slot_range`: 2/2 green.
- Table-driven cases cover well-formed input (`0..8`, `3..7`, `1..4` w/ whitespace, `65534..65535` near-max) and the rejection paths (`0..0` empty, `8..3` inverted, `5` missing separator, `0..bar` non-numeric end, `foo..5` non-numeric start, `0..65536` u16 overflow, `""` empty).
- […] `0..1` collision-on-first-failure is intentional contract; concurrent-counter slot-skipping preexisted the patch and the commit message is now factual about it).
- […] `Some(range)` set has not been observed against a live DAL node.

Compatibility
Default behaviour (no flag) is bit-for-bit unchanged: `select_slot_index` reduces to `counter % number_of_slots`, identical to the pre-patch implementation.

🤖 Generated with Claude Code
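For reviewers who want to check the compatibility claim, here is a minimal sketch of the selection arithmetic. It is not the patched function itself: the signature, the `Option<u16>` return used for the `span == 0` case, the `u16` type for `number_of_slots`, and the `Relaxed` ordering are all assumptions made for illustration.

```rust
use std::ops::Range;
use std::sync::atomic::{AtomicU64, Ordering};

/// Sketch of round-robin slot selection inside an optional half-open range.
/// Returns `None` for a zero-width window (hypothetical way to signal the
/// degenerate case) without touching the shared counter.
fn select_slot_index(
    counter: &AtomicU64,
    range: Option<&Range<u16>>,
    number_of_slots: u16,
) -> Option<u16> {
    // (base, span) comes from the configured range, or the full slot set.
    let (base, span) = match range {
        Some(r) => (r.start, r.end.saturating_sub(r.start)),
        None => (0, number_of_slots),
    };
    // Short-circuit BEFORE fetch_add so no counter tick is consumed.
    if span == 0 {
        return None;
    }
    let tick = counter.fetch_add(1, Ordering::Relaxed);
    // tick % span < span, so base + offset stays below the range end.
    Some(base + (tick % u64::from(span)) as u16)
}
```

With `range = None` the window is `(0, number_of_slots)`, so the result is `counter % number_of_slots`, which is the pre-patch behaviour. With `3..6` configured, successive calls starting from a zero counter yield 3, 4, 5, 3, and so on.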