-
Notifications
You must be signed in to change notification settings - Fork 178
Improve LRU cache test coverage by validating eviction order in load_genome_seq (test_veff.py) #1244
Description
Validate correct LRU eviction order in _load_genome_seq
Overview: Aim to add a test to verify that the genome sequence cache in Annotator follows true Least Recently Used (LRU) behaviour.
The Gap
The current tests check:
- Cache size (
currsize) - Cache capacity (
maxsize) - Cache clearing
- That eviction occurs when the cache is full
However, they do not verify that the correct entry is evicted when capacity is exceeded.
What is Missing?
LRU caching requires that the least recently used entry be removed when the cache is full.
Existing tests confirm eviction happens, but not whether eviction follows the correct order.
Effect
Without validating the eviction order:
- The cache could remove the wrong entry (e.g., most recently used)
- The cache could behave like FIFO or random eviction
- Current tests would still pass
This can lead to inefficient reuse of genome sequences and subtle performance issues.
Example
Cache size = 2
Access pattern:
chr1 → chr2 → chr1 → chr3
Expected:
chr2is least recently used and should be evicted
This behavior is not currently tested.
Proposed Solution
Add a test that ensures:
- Recently accessed entries are retained
- The least recently used entry is evicted
- Cache behavior matches LRU semantics under repeated access
Add Test (in test_veff.py)
def test_lru_order_correct():
contigs = [("chr1", FWD_SEQ), ("chr2", FWD_SEQ), ("chr3", FWD_SEQ)]
genome = make_genome(*contigs)
ann = Annotator(
genome=genome,
genome_features=FEATURES_BASIC_FWD.copy(),
genome_cache_maxsize=2,
)
ann._load_genome_seq("chr1")
ann._load_genome_seq("chr2")
ann._load_genome_seq("chr1")
ann._load_genome_seq("chr3")
ann._load_genome_seq("chr2")
info = ann._load_genome_seq.cache_info()
assert info.misses == 4Impact
- Improves test coverage for cache behaviour
- Ensures correctness of LRU eviction logic
- Helps prevent subtle bugs in repeated genome access patterns
- Does not change any existing functionality
I noticed that the eviction order was not explicitly validated in the current tests, so I added this case to ensure the cache behaves as expected under repeated access patterns. Happy to refine this further if needed. @jonbrenas, please let me know if this approach looks correct.