Skip to content

Improve LRU cache test coverage by validating eviction order in load_genome_seq (test_veff.py) #1244

@simplyshree

Description

@simplyshree

Validate correct LRU eviction order in _load_genome_seq

Overview: Aim to add a test to verify that the genome sequence cache in Annotator follows true Least Recently Used (LRU) behaviour.

The Gap

The current tests check:

  • Cache size (currsize)
  • Cache capacity (maxsize)
  • Cache clearing
  • That eviction occurs when the cache is full

However, they do not verify that the correct entry is evicted when capacity is exceeded.

What is Missing?

LRU caching requires that the least recently used entry be removed when the cache is full.
Existing tests confirm eviction happens, but not whether eviction follows the correct order.

Effect

Without validating the eviction order:

  • The cache could remove the wrong entry (e.g., most recently used)
  • The cache could behave like FIFO or random eviction
  • Current tests would still pass

This can lead to inefficient reuse of genome sequences and subtle performance issues.

Example

Cache size = 2

Access pattern:

chr1 → chr2 → chr1 → chr3

Expected:

  • chr2 is least recently used and should be evicted

This behavior is not currently tested.

Proposed Solution

Add a test that ensures:

  • Recently accessed entries are retained
  • The least recently used entry is evicted
  • Cache behavior matches LRU semantics under repeated access

Add Test (in test_veff.py)

def test_lru_order_correct():
    contigs = [("chr1", FWD_SEQ), ("chr2", FWD_SEQ), ("chr3", FWD_SEQ)]
    genome = make_genome(*contigs)
    ann = Annotator(
        genome=genome,
        genome_features=FEATURES_BASIC_FWD.copy(),
        genome_cache_maxsize=2,
    )

    ann._load_genome_seq("chr1")
    ann._load_genome_seq("chr2")

    ann._load_genome_seq("chr1")  

    ann._load_genome_seq("chr3")  

    ann._load_genome_seq("chr2")  

    info = ann._load_genome_seq.cache_info()
    assert info.misses == 4

Impact

  • Improves test coverage for cache behaviour
  • Ensures correctness of LRU eviction logic
  • Helps prevent subtle bugs in repeated genome access patterns
  • Does not change any existing functionality

I noticed that the eviction order was not explicitly validated in the current tests, so I added this case to ensure the cache behaves as expected under repeated access patterns. Happy to refine this further if needed. @jonbrenas, please let me know if this approach looks correct.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions