Commit dbdfee3

feat: add wrap-around and time-based iteration support

Iterator cycling features for workloads exceeding keyspace size:
- Add duration_ms to SamplingConfig for time-based iteration limits
- RandomIter cycles when limit > range_size (e.g., 1M requests on 100K keys)
- SequentialIter auto-enables wrap_around when duration_ms is set
- FilterContext tracks start_time_ms and checks both count and duration limits

New tests for multi-threaded wrap-around behavior:
- test_random_cycling_limit_exceeds_keyspace
- test_random_cycling_multithreaded
- test_random_duration_based_cycling / _multithreaded
- test_sequential_duration_based / _multithreaded
- test_partitioned_with_limit / _multithreaded_disjoint
- test_delete_iter_single_pass
- test_limit_with_duration_limit_wins
- test_duration_with_limit_duration_wins

Documentation updates with examples for:
- Wrap-around mode (.wrap_around())
- Cycling with limit > keyspace
- Time-based iteration (duration_ms)
1 parent 713fa2d commit dbdfee3

4 files changed

Lines changed: 716 additions & 22 deletions

INTEGRATION_GUIDE.md

Lines changed: 98 additions & 7 deletions
@@ -228,16 +228,18 @@ Statistical distributions for realistic workload simulation:
228228
| **Overlapping** | `.overlapping().random()` | Yes | Contention testing |
229229
| **Claim** | `.write()` / `.continue_write()` | Yes (atomic) | Exactly-once processing |
230230

231-
### Wrap-Around for Indefinite Iteration
231+
### Wrap-Around and Time-Based Iteration
232232

233233
By default, iterators stop when the keyspace is exhausted. For benchmarks requiring
234-
indefinite iteration, use `.wrap_around()`:
234+
indefinite iteration or fixed-duration runs, use wrap-around or time-based modes:
235235

236-
| Iterator | Default | With `.wrap_around()` |
237-
|----------|---------|----------------------|
238-
| `sequential()` | Stops at end | Cycles back to start |
239-
| `write()` | Stops when full | Clears bitmap, restarts |
240-
| `random()` | Samples with replacement | Already cycles (no wrap needed) |
236+
| Iterator | Default | With `.wrap_around()` | With `duration_ms` |
237+
|----------|---------|----------------------|-------------------|
238+
| `sequential()` | Stops at end | Cycles back to start | Auto wrap-around |
239+
| `write()` | Stops when full | Clears bitmap, restarts | N/A |
240+
| `random()` | Stops at keyspace size | N/A (use limit) | Cycles until the duration expires |
241+
242+
#### Wrap-Around Mode
241243

242244
```rust
243245
// Sequential read - cycles through existing keys forever
@@ -259,6 +261,51 @@ let handles: Vec<_> = (0..num_workers).map(|_| {
259261
}).collect();
260262
```
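
The difference between the default and wrap-around behavior of a sequential iterator can be illustrated with a small standalone sketch. It does not use the crate's types: `SeqKeys` and `sequential_keys` are hypothetical names, and keys are assumed to be the plain integers `0..range_size`.

```rust
/// Hypothetical stand-in for a sequential key iterator (not the crate's type).
struct SeqKeys {
    next: u64,
    range_size: u64,
    wrap_around: bool,
}

impl Iterator for SeqKeys {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        if self.range_size == 0 {
            return None; // empty keyspace: nothing to yield, even when wrapping
        }
        if self.next >= self.range_size {
            if !self.wrap_around {
                return None; // default: stop once the keyspace is exhausted
            }
            self.next = 0; // wrap-around: cycle back to the start
        }
        let key = self.next;
        self.next += 1;
        Some(key)
    }
}

/// Build the sketch iterator over keys `0..range_size`.
fn sequential_keys(range_size: u64, wrap_around: bool) -> SeqKeys {
    SeqKeys { next: 0, range_size, wrap_around }
}
```

With `wrap_around` set, the iterator never ends on its own, so a caller would bound it with something like `.take(n)` or an external stop condition.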
261263

264+
#### Random Cycling with Limit > Keyspace
265+
266+
When `limit` exceeds keyspace size, `RandomIter` automatically cycles to fulfill
267+
the request. This enables "1M requests on 100K keys" scenarios:
268+
269+
```rust
270+
use keyspace_tracker::SamplingConfig;
271+
272+
// Request 1,000,000 random samples from 100,000-key space
273+
let items: Vec<_> = tracker.iter()
274+
.set_only()
275+
.with_sampling(SamplingConfig::default().with_limit(1_000_000))
276+
.random()
277+
.collect();
278+
assert_eq!(items.len(), 1_000_000); // Cycles through keyspace ~10 times
279+
```
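
One way to picture cycling when `limit` exceeds the keyspace is a loop that makes repeated shuffled passes until the requested count is reached. The sketch below is only an illustration under that assumption; it is not `RandomIter`'s actual algorithm. `cycling_random_keys` and the small `XorShift64` generator are hypothetical helpers, used so the example needs no external crates.

```rust
/// Minimal xorshift64 generator so the sketch has no dependencies.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Return `limit` keys drawn from `0..range_size`. Each pass visits every key
/// once in a freshly shuffled order; a new pass starts when one is exhausted.
fn cycling_random_keys(range_size: u64, limit: u64, seed: u64) -> Vec<u64> {
    if range_size == 0 {
        return Vec::new(); // nothing to sample from
    }
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let mut keys: Vec<u64> = (0..range_size).collect();
    let mut out = Vec::with_capacity(limit as usize);
    while (out.len() as u64) < limit {
        // Fisher-Yates shuffle gives this pass its own random order.
        for i in (1..keys.len()).rev() {
            let j = (rng.next_u64() % (i as u64 + 1)) as usize;
            keys.swap(i, j);
        }
        // Take a whole pass, or only what is still needed to hit the limit.
        let remaining = (limit - out.len() as u64) as usize;
        out.extend(keys.iter().copied().take(remaining));
    }
    out
}
```

Under this scheme a request for 1,000 items over 100 keys makes exactly ten passes, which matches the "cycles through keyspace ~10 times" intuition in the example above.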
280+
281+
#### Time-Based Iteration
282+
283+
Use `duration_ms` to run for a fixed duration (auto-enables wrap-around):
284+
285+
```rust
286+
use keyspace_tracker::SamplingConfig;
287+
288+
// Run for 60 seconds, cycling through keyspace
289+
let items: Vec<_> = tracker.iter()
290+
.set_only()
291+
.with_sampling(SamplingConfig::default().with_duration_ms(60_000))
292+
.random()
293+
.collect();
294+
// Collects items until 60 seconds have elapsed
295+
296+
// Combine limit and duration - stops when either is reached first
297+
let items: Vec<_> = tracker.iter()
298+
.set_only()
299+
.with_sampling(
300+
SamplingConfig::default()
301+
.with_limit(1_000_000)
302+
.with_duration_ms(30_000) // 30 seconds max
303+
)
304+
.random()
305+
.collect();
306+
```
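
The "whichever is reached first" rule can be sketched as a stop check that tests the count limit and then the deadline. This is a standalone illustration, not the crate's `FilterContext`: `StopCheck`, `should_continue`, and `record_yield` are hypothetical names. The sketch also measures elapsed time with `std::time::Instant`, a monotonic clock, which is a substitution made here; the `FilterContext` code in this commit reads wall-clock milliseconds from `SystemTime`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stop-condition tracker; names are illustrative only.
struct StopCheck {
    yielded: u64,
    limit: Option<u64>,
    deadline: Option<Instant>,
}

impl StopCheck {
    fn new(limit: Option<u64>, duration_ms: Option<u64>) -> Self {
        // The clock starts at construction. If the deadline would overflow
        // `Instant`, `checked_add` returns None and no deadline is applied.
        let deadline = duration_ms
            .and_then(|ms| Instant::now().checked_add(Duration::from_millis(ms)));
        Self { yielded: 0, limit, deadline }
    }

    /// True while neither the count limit nor the deadline has been reached.
    fn should_continue(&self) -> bool {
        if let Some(limit) = self.limit {
            if self.yielded >= limit {
                return false;
            }
        }
        if let Some(deadline) = self.deadline {
            if Instant::now() >= deadline {
                return false;
            }
        }
        true
    }

    /// Record that one item was handed to the caller.
    fn record_yield(&mut self) {
        self.yielded += 1;
    }
}
```

A monotonic clock is a reasonable choice for measuring elapsed time because it cannot jump backwards when the system time is adjusted during a long benchmark run.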
307+
308+
262309
### Partitioning Strategies
263310

264311
| Strategy | Method | Use Case |
@@ -799,6 +846,7 @@ impl<'a> TrackerIterBuilder<'a> {
799846
pub fn sample(self, probability: f64) -> Self;
800847
pub fn limit(self, max_items: u64) -> Self;
801848
pub fn seed(self, seed: u64) -> Self;
849+
pub fn with_sampling(self, config: SamplingConfig) -> Self;
802850

803851
// === Distribution ===
804852
pub fn distribution(self, dist: AccessDistribution) -> Self;
@@ -833,6 +881,49 @@ impl WriteIter<'a> {
833881
}
834882
```
835883

884+
### SamplingConfig
885+
886+
```rust
887+
/// Configuration for iteration sampling, limits, and duration.
888+
pub struct SamplingConfig {
889+
pub set_ratio: Option<f64>, // Ratio of set vs unset bits (0.0-1.0)
890+
pub limit: Option<u64>, // Maximum items to return
891+
pub duration_ms: Option<u64>, // Maximum duration in milliseconds
892+
pub sample_probability: f64, // Probabilistic sampling (0.0-1.0)
893+
pub seed: Option<u64>, // Random seed for reproducibility
894+
pub distribution: AccessDistribution, // Key access distribution
895+
pub overlapping: bool, // Allow multiple threads to visit same keys
896+
}
897+
898+
impl SamplingConfig {
899+
pub const fn new() -> Self;
900+
901+
/// Set mixed ratio: proportion of set (existing) vs unset (new) items.
902+
pub const fn with_set_ratio(self, ratio: f64) -> Self;
903+
904+
/// Limit number of items returned.
905+
/// For RandomIter: if limit > keyspace size, iterator cycles automatically.
906+
pub const fn with_limit(self, limit: u64) -> Self;
907+
908+
/// Set duration limit in milliseconds.
909+
/// Iteration continues (with wrap-around) until duration expires.
910+
/// Useful for time-based workloads like "run for 60 seconds".
911+
pub const fn with_duration_ms(self, duration_ms: u64) -> Self;
912+
913+
/// Set probabilistic sampling (each item has `prob` chance of being returned).
914+
pub const fn with_sample_probability(self, prob: f64) -> Self;
915+
916+
/// Set seed for reproducible random iteration.
917+
pub const fn with_seed(self, seed: u64) -> Self;
918+
919+
/// Set access distribution (Uniform, Zipfian, Exponential, etc.).
920+
pub const fn with_distribution(self, dist: AccessDistribution) -> Self;
921+
922+
/// Enable overlapping mode for contention testing.
923+
pub const fn with_overlapping(self, enabled: bool) -> Self;
924+
}
925+
```
926+
836927
### BitmapSnapshot
837928

838929
```rust

src/config.rs

Lines changed: 45 additions & 4 deletions
@@ -68,6 +68,11 @@ pub struct SamplingConfig {
6868
/// Maximum number of items to return. None = unlimited.
6969
pub limit: Option<u64>,
7070

71+
/// Duration limit for iteration (in milliseconds). None = unlimited.
72+
/// When set, iteration continues until the duration expires.
73+
/// Combines with `limit` - iteration stops when either is reached.
74+
pub duration_ms: Option<u64>,
75+
7176
/// Probabilistic sampling ratio (0.0-1.0).
7277
/// Each matching item has this probability of being returned.
7378
/// 1.0 = return all matching, 0.5 = return ~50% of matching.
@@ -92,6 +97,7 @@ impl SamplingConfig {
9297
Self {
9398
set_ratio: None,
9499
limit: None,
100+
duration_ms: None,
95101
sample_probability: 1.0,
96102
seed: None,
97103
distribution: AccessDistribution::Uniform,
@@ -117,6 +123,16 @@ impl SamplingConfig {
117123
self
118124
}
119125

126+
/// Set duration limit for iteration (in milliseconds).
127+
///
128+
/// Iteration will continue (with wrap-around) until the duration expires.
129+
/// This enables time-based workloads like "run for 60 seconds".
130+
#[inline]
131+
pub const fn with_duration_ms(mut self, duration_ms: u64) -> Self {
132+
self.duration_ms = Some(duration_ms);
133+
self
134+
}
135+
120136
/// Set probabilistic sampling (each item has `prob` chance of being returned).
121137
///
122138
/// Use this to randomly sample a percentage of the keyspace.
@@ -164,6 +180,7 @@ impl SamplingConfig {
164180
#[inline]
165181
pub fn has_sampling(&self) -> bool {
166182
self.limit.is_some()
183+
|| self.duration_ms.is_some()
167184
|| self.sample_probability < 1.0
168185
|| self.set_ratio.is_some()
169186
|| !matches!(self.distribution, AccessDistribution::Uniform)
@@ -224,28 +241,52 @@ pub struct FilterContext {
224241
pub(crate) sampling: SamplingConfig,
225242
rng: fastrand::Rng,
226243
yielded: u64,
244+
/// Start time for duration-based iteration (milliseconds since UNIX epoch).
245+
start_time_ms: Option<u64>,
227246
}
228247

229248
impl FilterContext {
230249
/// Create a new filter context.
231250
#[inline]
232251
pub fn new(filter: BitFilter, sampling: SamplingConfig) -> Self {
233252
let rng = sampling.make_rng();
253+
let start_time_ms = sampling.duration_ms.map(|_| Self::current_time_ms());
234254
Self {
235255
filter,
236256
sampling,
237257
rng,
238258
yielded: 0,
259+
start_time_ms,
239260
}
240261
}
241262

242-
/// Check if we should continue iterating (respects limit).
263+
/// Get current time in milliseconds since UNIX epoch.
264+
#[inline]
265+
fn current_time_ms() -> u64 {
266+
use std::time::{SystemTime, UNIX_EPOCH};
267+
SystemTime::now()
268+
.duration_since(UNIX_EPOCH)
269+
.map(|d| d.as_millis() as u64)
270+
.unwrap_or(0)
271+
}
272+
273+
/// Check if we should continue iterating (respects limit and duration).
243274
#[inline]
244275
pub fn check_limit(&self) -> bool {
245-
match self.sampling.limit {
246-
Some(limit) => self.yielded < limit,
247-
None => true,
276+
// Check count limit
277+
if let Some(limit) = self.sampling.limit {
278+
if self.yielded >= limit {
279+
return false;
280+
}
248281
}
282+
// Check duration limit
283+
if let (Some(duration_ms), Some(start_time_ms)) = (self.sampling.duration_ms, self.start_time_ms) {
284+
let elapsed = Self::current_time_ms().saturating_sub(start_time_ms);
285+
if elapsed >= duration_ms {
286+
return false;
287+
}
288+
}
289+
true
249290
}
250291

251292
/// Check if item matches filter (with mixed-ratio support).
