Labels: enhancement (New feature or request)
Description
Summary
Add an optional caching layer to avoid repeated identical API calls, reducing latency and API costs for cacheable requests.
Use Cases
- Identical prompts - Same question asked multiple times
- System prompt reuse - Caching responses with identical system prompts
- Development/testing - Avoid hitting API during debugging
- Rate limiting - Reduce API calls when approaching limits
Proposed API
```csharp
// Enable caching via options
services.AddCompactifAI(options =>
{
    options.ApiKey = "...";
    options.EnableCaching = true;
    options.CacheDuration = TimeSpan.FromMinutes(5);
});
```
```csharp
// Or per-request control
var response = await client.ChatAsync(
    "What is 2+2?",
    cacheOptions: new CacheOptions
    {
        Enabled = true,
        Duration = TimeSpan.FromHours(1)
    });
```
Implementation Options
Option 1: IMemoryCache Integration
```csharp
public class CachingCompactifAIClient : ICompactifAIClient
{
    private readonly IMemoryCache _cache;
    private readonly ICompactifAIClient _inner;
    private readonly TimeSpan _cacheDuration; // was referenced but not declared

    public async Task<ChatCompletionResponse> CreateChatCompletionAsync(...)
    {
        var cacheKey = GenerateCacheKey(request);
        if (_cache.TryGetValue(cacheKey, out ChatCompletionResponse cached))
            return cached;

        var response = await _inner.CreateChatCompletionAsync(request);
        _cache.Set(cacheKey, response, _cacheDuration);
        return response;
    }
}
```
Option 2: Decorator Pattern
Allow users to wrap the client with their own caching strategy.
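A minimal sketch of how a user could compose such a decorator with Microsoft.Extensions.DependencyInjection (the concrete `CompactifAIClient` type name and the constructor parameters are assumptions for illustration, not the library's actual API):

```csharp
// Register the concrete inner client, then expose the interface
// through a factory that wraps it in the caching decorator.
services.AddMemoryCache();
services.AddSingleton<CompactifAIClient>(); // hypothetical concrete inner client
services.AddSingleton<ICompactifAIClient>(sp =>
    new CachingCompactifAIClient(
        inner: sp.GetRequiredService<CompactifAIClient>(),
        cache: sp.GetRequiredService<IMemoryCache>(),
        cacheDuration: TimeSpan.FromMinutes(5)));
```

This keeps caching entirely opt-in: users who skip the factory registration get the undecorated client, and users can substitute their own decorator with a different strategy.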
Cache Key Generation
```csharp
private static string GenerateCacheKey(ChatCompletionRequest request)
{
    // Hash based on: model + messages + temperature + other deterministic params.
    // Exclude: stream, user, etc. (property names below are illustrative)
    var key = $"{request.Model}|{string.Join("|", request.Messages.Select(m => $"{m.Role}:{m.Content}"))}|{request.Temperature}";
    return Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(key)));
}
```
Expected Benefits
- Reduced API costs - Avoid paying for duplicate requests
- Lower latency - Instant responses for cached results
- Offline development - Work with cached responses during development
Priority
🟢 P2 - Medium Impact
Considerations
- Cache invalidation strategy
- Memory limits for cache size
- Consider distributed cache support (IDistributedCache)
- Non-deterministic responses (temperature > 0) may not be suitable for caching
- Should be opt-in to avoid unexpected behavior
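
For the distributed-cache consideration, a rough sketch of what an `IDistributedCache`-backed lookup could look like. Unlike `IMemoryCache`, a distributed cache stores bytes, so the response must be serialized; JSON serialization of `ChatCompletionResponse` and the method shape here are assumptions, not part of the proposal:

```csharp
public async Task<ChatCompletionResponse> GetOrCreateAsync(
    ChatCompletionRequest request,
    IDistributedCache cache,
    Func<ChatCompletionRequest, Task<ChatCompletionResponse>> callApi,
    TimeSpan duration)
{
    var key = GenerateCacheKey(request);

    // Distributed caches round-trip strings/bytes, so deserialize on a hit.
    var cachedJson = await cache.GetStringAsync(key);
    if (cachedJson is not null)
        return JsonSerializer.Deserialize<ChatCompletionResponse>(cachedJson)!;

    var response = await callApi(request);
    await cache.SetStringAsync(key, JsonSerializer.Serialize(response),
        new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = duration });
    return response;
}
```

The same `GenerateCacheKey` works for both backends, which would let the in-memory and distributed paths share one decorator with the cache abstraction injected.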