@@ -7,12 +7,14 @@ A Python package providing two implementations of a time-based storage system fo
 
 ## Features
 
-- Two storage implementations:
-  - `TimeBasedStorage`: Uses a sorted list for efficient range queries
+- Three storage implementations:
+  - `TimeBasedStorage`: Uses a dictionary for simple key-value access
   - `TimeBasedStorageHeap`: Uses a heap for efficient insertion and earliest event access
+  - `TimeBasedStorageRBTree`: Uses a Red-Black Tree for balanced performance (O(log n) insertions and efficient range queries)
 - Thread-safe variants:
   - `ThreadSafeTimeBasedStorage`: Thread-safe version of TimeBasedStorage
   - `ThreadSafeTimeBasedStorageHeap`: Thread-safe version of TimeBasedStorageHeap
+  - `ThreadSafeTimeBasedStorageRBTree`: Thread-safe version of TimeBasedStorageRBTree
 - Support for:
   - Event creation and deletion
   - Range queries
@@ -106,31 +108,37 @@ consumer_thread.start()
 ## Choosing the Right Implementation
 
 ### TimeBasedStorage
-- **Best for**: Applications with frequent range queries or sorted access patterns
-- **Advantages**: Efficient range queries, direct index access
-- **Trade-offs**: Slower insertion (O(n))
+- **Best for**: Applications with small to medium datasets and simple access patterns
+- **Advantages**: Efficient range queries, direct index access, simple implementation
+- **Trade-offs**: Slower insertion (O(n)) especially with sorted data
 
 ### TimeBasedStorageHeap
 - **Best for**: Applications needing fast insertion or frequent access to earliest events
-- **Advantages**: Fast insertion, efficient earliest event access
-- **Trade-offs**: Less efficient for range queries
+- **Advantages**: Fast insertion (O(log n)), efficient earliest event access (O(1))
+- **Trade-offs**: Less efficient for range queries (O(n log n))
+
+### TimeBasedStorageRBTree
+- **Best for**: Applications requiring balanced performance across operations, especially range queries
+- **Advantages**: Fast insertion (O(log n)), highly efficient range queries (O(log n + k)), maintains performance with sorted data
+- **Trade-offs**: Slightly higher memory overhead, dependency on sortedcontainers package
+- **Benchmark highlights**: Up to 470x faster for small precise range queries, 114x average speedup for range operations
 
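The difference between an O(n) range query and an O(log n + k) one can be sketched with the standard library alone. This is an illustration, not the package's code: the function names `range_scan` and `range_sorted` are hypothetical, timestamps are plain integers for brevity, and an inclusive `end` bound is an assumption.

```python
from bisect import bisect_left, bisect_right


def range_scan(events: dict, start, end):
    """O(n): every stored timestamp is examined, whatever the range size."""
    return [value for ts, value in events.items() if start <= ts <= end]


def range_sorted(timestamps: list, values: list, start, end):
    """O(log n + k): two binary searches find the slice, then k items are copied.

    Assumes `timestamps` is sorted and `values` is aligned with it.
    """
    lo = bisect_left(timestamps, start)
    hi = bisect_right(timestamps, end)  # inclusive end bound (assumption)
    return values[lo:hi]


# Hypothetical data: one event every 10 time units.
events = {ts: f"event-{ts}" for ts in range(0, 1000, 10)}
timestamps = sorted(events)
values = [events[ts] for ts in timestamps]

print(range_scan(events, 100, 130))
print(range_sorted(timestamps, values, 100, 130))
```

Both calls return the same four events. The scan touches all 100 entries, while the sorted variant only touches the matching slice, which is why the gap grows as the queried range shrinks relative to the dataset.
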
 ## API Reference
 
-### Common Methods (Both Implementations)
-
-| Method | Description | Time Complexity |
-|--------|-------------|-----------------|
-| `add(timestamp, value)` | Add a value at a specific timestamp | O(n) / O(log n) |
-| `get_value_at(timestamp)` | Get value at a specific timestamp | O(1) / O(n) |
-| `get_range(start, end)` | Get values in a time range | O(log n) / O(n log n) |
-| `get_duration(seconds)` | Get values within a duration | O(log n) / O(n log n) |
-| `remove(timestamp)` | Remove value at a timestamp | O(n) / O(log n) |
-| `clear()` | Remove all values | O(1) |
-| `size()` | Get number of stored events | O(1) |
-| `is_empty()` | Check if storage is empty | O(1) |
-| `get_all()` | Get all stored values | O(1) |
-| `get_timestamps()` | Get all timestamps | O(1) |
+### Common Methods (All Implementations)
+
+| Method | Description | Time Complexity (Standard/Heap/RBTree) |
+|--------|-------------|-----------------------------------------|
+| `add(timestamp, value)` | Add a value at a specific timestamp | O(n) / O(log n) / O(log n) |
+| `get_value_at(timestamp)` | Get value at a specific timestamp | O(1) / O(n) / O(1) |
+| `get_range(start, end)` | Get values in a time range | O(n) / O(n log n) / O(log n + k) |
+| `get_duration(seconds)` | Get values within a duration | O(n) / O(n log n) / O(log n + k) |
+| `remove(timestamp)` | Remove value at a timestamp | O(n) / O(log n) / O(log n) |
+| `clear()` | Remove all values | O(1) / O(1) / O(1) |
+| `size()` | Get number of stored events | O(1) / O(1) / O(1) |
+| `is_empty()` | Check if storage is empty | O(1) / O(1) / O(1) |
+| `get_all()` | Get all stored values | O(1) / O(1) / O(1) |
+| `get_timestamps()` | Get all timestamps | O(1) / O(1) / O(1) |
 | `add_unique_timestamp()` | Add with timestamp collision handling | Varies |
 
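To show how the method names in the table fit together, here is a minimal dictionary-backed stand-in. It is a sketch under stated assumptions, not the package's `TimeBasedStorage`. The class name `DictStorageSketch` is hypothetical, timestamps are assumed to be `datetime` objects, `add` is assumed to raise on a duplicate timestamp, and `get_range` is assumed to be inclusive at both ends.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Optional


class DictStorageSketch:
    """Hypothetical stand-in that mirrors a few method names from the table."""

    def __init__(self) -> None:
        self._events: Dict[datetime, Any] = {}

    def add(self, timestamp: datetime, value: Any) -> None:
        # Assumption: a duplicate timestamp is an error rather than an overwrite.
        if timestamp in self._events:
            raise ValueError(f"timestamp already present: {timestamp}")
        self._events[timestamp] = value

    def get_value_at(self, timestamp: datetime) -> Optional[Any]:
        # Dictionary lookup: O(1) on average.
        return self._events.get(timestamp)

    def get_range(self, start: datetime, end: datetime) -> List[Any]:
        # Full scan of all keys, then ordering of the matches by timestamp.
        matches = [ts for ts in self._events if start <= ts <= end]
        return [self._events[ts] for ts in sorted(matches)]

    def remove(self, timestamp: datetime) -> bool:
        # Returns True if something was removed (return type is an assumption).
        return self._events.pop(timestamp, None) is not None

    def size(self) -> int:
        return len(self._events)

    def is_empty(self) -> bool:
        return not self._events


base = datetime(2024, 1, 1, 12, 0, 0)
store = DictStorageSketch()
for minute in range(5):
    store.add(base + timedelta(minutes=minute), f"event-{minute}")

print(store.get_value_at(base))
print(store.get_range(base + timedelta(minutes=1), base + timedelta(minutes=3)))
store.remove(base)
print(store.size())
```

The range query here scans every key, which matches the O(n) figure listed for the standard implementation in the table.
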
 ### Thread-Safe Additional Methods
@@ -144,16 +152,33 @@ consumer_thread.start()
 
 ### TimeBasedStorage
 - Insertion: O(n)
-- Range Queries: O(log n)
-- Duration Queries: O(log n)
-- Earliest/Latest: O(1)
+- Range Queries: O(n)
+- Duration Queries: O(n)
+- Earliest/Latest: O(n)
+- Memory Usage: Lower overhead per element
 
 ### TimeBasedStorageHeap
 - Insertion: O(log n)
 - Range Queries: O(n log n)
 - Duration Queries: O(n log n)
 - Earliest Event: O(1)
 - Latest Event: O(n log n)
+- Memory Usage: Moderate overhead
+
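The heap figures above follow from how a binary min-heap behaves, which the standard `heapq` module can illustrate. This is a sketch rather than the package's implementation: the helper names `push_event`, `earliest`, and `range_query` are hypothetical, and timestamps are plain integers for brevity.

```python
import heapq
from typing import Any, List, Tuple

Event = Tuple[int, Any]  # (timestamp, value); integer timestamps are an assumption


def push_event(heap: List[Event], timestamp: int, value: Any) -> None:
    # O(log n): the new item sifts up through at most log n levels.
    heapq.heappush(heap, (timestamp, value))


def earliest(heap: List[Event]) -> Event:
    # O(1): the smallest timestamp always sits at index 0 of a min-heap.
    return heap[0]


def range_query(heap: List[Event], start: int, end: int) -> List[Event]:
    # A heap is only partially ordered, so every item must be checked and the
    # matches sorted afterwards: O(n log n) in the worst case.
    return sorted(item for item in heap if start <= item[0] <= end)


heap: List[Event] = []
for ts, value in [(30, "c"), (10, "a"), (40, "d"), (20, "b")]:
    push_event(heap, ts, value)

print(earliest(heap))
print(range_query(heap, 15, 35))
```

The heap keeps only the minimum in a known position, which is why earliest-event access is cheap while range queries cannot skip any element.
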
+### TimeBasedStorageRBTree
+- Insertion: O(log n)
+- Range Queries: O(log n + k) where k is the number of items in range
+- Duration Queries: O(log n + k)
+- Earliest Event: O(log n)
+- Latest Event: O(log n)
+- Memory Usage: Slightly higher overhead
+
+**Benchmark Results** (500,000 entries):
+- Range query performance: **~114x average speedup** over standard implementation
+- Small precise range queries (0.01% of data): **~470x faster**
+- Small range queries (0.1% of data): **~87x faster**
+- Medium range queries (1% of data): **~12x faster**
+- Most beneficial for targeted range queries on large datasets
 
 ## Use Cases
 