Add DurableCacheEvaluator #225

@ptomecek

Description

Extend the existing in-memory caching prototype with a durable backend so that expensive CallableModel results survive process restarts. Long-running ETL, training, and reporting workflows can then resume without redoing completed work, and repeated runs across sessions reuse prior outputs.

Semantically this is the same as the in-memory cache: a write-through, invalidatable optimization. A miss simply triggers recomputation. It composes naturally with the in-memory cache as an L1/L2 layer.
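To make the L1/L2 composition concrete, here is a minimal sketch of the lookup order. `LayeredCache`, `get_or_compute`, and `invalidate` are hypothetical names chosen for illustration, not existing evaluator APIs, and plain mappings stand in for both tiers.

```python
from typing import Any, Callable, Hashable, MutableMapping


class LayeredCache:
    """Hypothetical two-tier cache: in-memory L1 in front of a durable L2."""

    def __init__(self, l1: MutableMapping, l2: MutableMapping) -> None:
        self.l1 = l1  # fast, lost on restart
        self.l2 = l2  # durable, survives restart (any mapping-like store)

    def get_or_compute(self, key: Hashable, compute: Callable[[], Any]) -> Any:
        # 1. L1 hit: cheapest path.
        try:
            return self.l1[key]
        except KeyError:
            pass
        # 2. L2 hit: promote into L1 so later lookups stay in memory.
        try:
            value = self.l2[key]
        except KeyError:
            # 3. Miss in both tiers: recompute, then write through to both.
            value = compute()
            self.l2[key] = value
        self.l1[key] = value
        return value

    def invalidate(self, key: Hashable) -> None:
        # Invalidation must clear both tiers, or L2 would repopulate L1.
        self.l1.pop(key, None)
        self.l2.pop(key, None)
```

A process restart corresponds to constructing a new `LayeredCache` with an empty `l1` and the same `l2`; the first lookup then hits L2 instead of recomputing.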

Behavior

  • Pluggable storage backend (diskcache is a reasonable default; object storage is a useful follow-on).
  • Same identity / cache-key semantics as the in-memory evaluator.
  • Standard cache controls: TTL, size limits, manual invalidation.
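As a rough illustration of what the pluggable backend boundary could look like, the sketch below defines a hypothetical `CacheBackend` protocol and a standard-library SQLite implementation with TTL, an entry-count limit, and manual invalidation. It is not the proposed design: the names, the table schema, and the oldest-first eviction policy are all assumptions, and a diskcache-based backend would sit behind the same three methods.

```python
import sqlite3
import time
from typing import Optional, Protocol


class CacheBackend(Protocol):
    """Hypothetical storage interface; values are already-serialized bytes."""

    def get(self, key: str) -> Optional[bytes]: ...
    def set(self, key: str, value: bytes, ttl: Optional[float] = None) -> None: ...
    def delete(self, key: str) -> None: ...


class SqliteBackend:
    """Illustrative durable backend built on the standard library only."""

    def __init__(self, path: str, max_entries: int = 1000) -> None:
        self.conn = sqlite3.connect(path)
        self.max_entries = max_entries
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value BLOB, expires_at REAL, stored_at REAL)"
        )

    def get(self, key: str) -> Optional[bytes]:
        row = self.conn.execute(
            "SELECT value, expires_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None  # miss: the caller recomputes
        value, expires_at = row
        if expires_at is not None and expires_at <= time.time():
            self.delete(key)  # expired entries behave like misses
            return None
        return value

    def set(self, key: str, value: bytes, ttl: Optional[float] = None) -> None:
        now = time.time()
        expires_at = None if ttl is None else now + ttl
        with self.conn:  # commits on success
            self.conn.execute(
                "INSERT OR REPLACE INTO cache VALUES (?, ?, ?, ?)",
                (key, value, expires_at, now),
            )
            # Size limit: keep only the newest max_entries rows.
            self.conn.execute(
                "DELETE FROM cache WHERE key NOT IN ("
                "SELECT key FROM cache ORDER BY stored_at DESC, rowid DESC LIMIT ?)",
                (self.max_entries,),
            )

    def delete(self, key: str) -> None:
        # Manual invalidation of a single entry.
        with self.conn:
            self.conn.execute("DELETE FROM cache WHERE key = ?", (key,))
```

Keeping the backend bytes-only leaves serialization (pickle, arrow IPC, or something else) as a separate decision in the evaluator.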

Open questions

  • Identity and staleness. What goes into the cache key beyond model identity and context? Source hashing, explicit user-bumped versions, or a combination?
  • Large results. Inline storage of multi-GB results is impractical. Should result types be able to spill to external storage and store only a pointer in the cache record, and how does that interact with the arrow/narwhals/pandas result types?
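On the identity question, one possible shape for the "combination" option is sketched below: the key hashes model identity and context together with an optional source hash and an optional user-bumped version. `make_cache_key` and its parameters are hypothetical, and how the source text would be obtained (for example via `inspect.getsource`) is left open.

```python
import hashlib
import json
from typing import Any, Mapping, Optional


def make_cache_key(
    model_id: str,
    context: Mapping[str, Any],
    *,
    source: Optional[str] = None,   # source text of the model, if hashed
    version: Optional[str] = None,  # explicit user-bumped version, if any
) -> str:
    """Hypothetical key builder combining identity, context, and staleness inputs."""
    source_sha256 = (
        hashlib.sha256(source.encode("utf-8")).hexdigest()
        if source is not None
        else None
    )
    payload = {
        "model": model_id,
        "context": dict(context),
        "source_sha256": source_sha256,
        "version": version,
    }
    # Canonical JSON: sorted keys make the key independent of dict ordering.
    # default=str is a crude fallback for non-JSON values (an assumption).
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), default=str
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this scheme, editing the model source or bumping the version changes the key, so stale entries are simply never hit again and can be reclaimed later by TTL or size limits.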
