Add optional response caching #6

@GmausDev

Description

Summary

Add an optional caching layer to avoid repeated identical API calls, reducing latency and API costs for cacheable requests.

Use Cases

  • Identical prompts - Same question asked multiple times
  • System prompt reuse - Caching responses with identical system prompts
  • Development/testing - Avoid hitting API during debugging
  • Rate limiting - Reduce API calls when approaching limits

Proposed API

// Enable caching via options
services.AddCompactifAI(options =>
{
    options.ApiKey = "...";
    options.EnableCaching = true;
    options.CacheDuration = TimeSpan.FromMinutes(5);
});

// Or per-request control
var response = await client.ChatAsync(
    "What is 2+2?",
    cacheOptions: new CacheOptions 
    { 
        Enabled = true,
        Duration = TimeSpan.FromHours(1)
    });

Implementation Options

Option 1: IMemoryCache Integration

public class CachingCompactifAIClient : ICompactifAIClient
{
    private readonly IMemoryCache _cache;
    private readonly ICompactifAIClient _inner;
    private readonly TimeSpan _cacheDuration;

    public async Task<ChatCompletionResponse> CreateChatCompletionAsync(ChatCompletionRequest request)
    {
        var cacheKey = GenerateCacheKey(request);
        if (_cache.TryGetValue(cacheKey, out ChatCompletionResponse cached))
            return cached;

        var response = await _inner.CreateChatCompletionAsync(request);
        _cache.Set(cacheKey, response, _cacheDuration);
        return response;
    }
}

Option 2: Decorator Pattern

Allow users to wrap the client with their own caching strategy.
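A rough sketch of what Option 2 could look like on the user side. The type names (`ICompactifAIClient`, `ChatCompletionRequest`/`Response`) are taken from the snippets above; the serialized-request key and the Scrutor registration are illustrative assumptions, not part of this proposal:

```csharp
using System.Collections.Concurrent;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical user-written decorator with a simple in-process cache.
public class MyCachedClient : ICompactifAIClient
{
    private readonly ICompactifAIClient _inner;
    private readonly ConcurrentDictionary<string, ChatCompletionResponse> _cache = new();

    public MyCachedClient(ICompactifAIClient inner) => _inner = inner;

    public async Task<ChatCompletionResponse> CreateChatCompletionAsync(ChatCompletionRequest request)
    {
        // Naive key: the full serialized request (see "Cache Key Generation" below
        // for why a hash over deterministic fields would be better).
        var key = JsonSerializer.Serialize(request);
        if (_cache.TryGetValue(key, out var hit))
            return hit;

        var response = await _inner.CreateChatCompletionAsync(request);
        _cache[key] = response;
        return response;
    }
}

// Registration: wrap the library's client with the user's decorator,
// e.g. via the Scrutor package:
// services.Decorate<ICompactifAIClient, MyCachedClient>();
```

The advantage of this option is that the library stays cache-agnostic: eviction, sizing, and invalidation are entirely the user's choice.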

Cache Key Generation

private static string GenerateCacheKey(ChatCompletionRequest request)
{
    // Hash based on: model + messages + temperature + other deterministic params
    // Exclude: stream, user, etc.
}
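One possible implementation of the stub above: serialize only the deterministic request fields and hash them. The property names (`Model`, `Messages`, `Temperature`) are assumptions about the request type's shape:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

private static string GenerateCacheKey(ChatCompletionRequest request)
{
    // Serialize only the fields that affect the completion deterministically;
    // Stream, User, etc. are deliberately excluded.
    var payload = JsonSerializer.Serialize(new
    {
        request.Model,
        request.Messages,
        request.Temperature
    });

    // SHA-256 keeps keys fixed-length regardless of prompt size.
    var hash = SHA256.HashData(Encoding.UTF8.GetBytes(payload));
    return Convert.ToHexString(hash);
}
```

Hashing also avoids storing full prompt text as cache keys, which matters if keys end up in a shared or distributed cache.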

Expected Benefits

  • Reduced API costs - Avoid paying for duplicate requests
  • Lower latency - Instant responses for cached results
  • Offline development - Work with cached responses during development

Priority

🟢 P2 - Medium Impact

Considerations

  • Cache invalidation strategy
  • Memory limits for cache size
  • Consider distributed cache support (IDistributedCache)
  • Non-deterministic responses (temperature > 0) may not be suitable for caching
  • Should be opt-in to avoid unexpected behavior
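If `IDistributedCache` support lands, the Option 1 decorator could target it with only serialization added on top. A sketch, assuming `ChatCompletionResponse` round-trips through `System.Text.Json` (not verified here):

```csharp
using System;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;

public class DistributedCachingClient : ICompactifAIClient
{
    private readonly IDistributedCache _cache;
    private readonly ICompactifAIClient _inner;
    private readonly TimeSpan _duration;

    public DistributedCachingClient(IDistributedCache cache, ICompactifAIClient inner, TimeSpan duration)
        => (_cache, _inner, _duration) = (cache, inner, duration);

    public async Task<ChatCompletionResponse> CreateChatCompletionAsync(ChatCompletionRequest request)
    {
        var key = GenerateCacheKey(request);

        // IDistributedCache stores raw bytes, so responses must be serialized.
        var cachedBytes = await _cache.GetAsync(key);
        if (cachedBytes is not null)
            return JsonSerializer.Deserialize<ChatCompletionResponse>(cachedBytes)!;

        var response = await _inner.CreateChatCompletionAsync(request);
        await _cache.SetAsync(
            key,
            JsonSerializer.SerializeToUtf8Bytes(response),
            new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = _duration });
        return response;
    }
}
```

This keeps the opt-in, time-boxed semantics of the proposed options while letting multiple app instances share cached responses.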

Metadata


Labels

enhancement (New feature or request)
