An LLM semantic caching system aiming to enhance user experience by reducing response time via cached query-result pairs.
Updated Jun 30, 2025 · Python
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
mimir is a drop-in proxy that caches LLM API responses using semantic similarity, reducing costs and latency for repeated or similar queries.
SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.
Unified AI gateway for 30+ LLMs (OpenAI, Anthropic, Bedrock, Azure, etc.) with caching, guardrails, A/B testing, and cost controls. A fast, scalable, Go-native alternative to LiteLLM and Kong AI Gateway.
Reliable and Efficient Semantic Prompt Caching with vCache
Redis integration for Google Agent Development Kit (ADK) - Memory, Sessions, Search Tools, MCP
Redis Vector Library (RedisVL) -- the AI-native Java client for Redis.
A RAG-based chatbot that incorporates a semantic cache and guardrails.
This repository contains sample code demonstrating how to implement a verified semantic cache using Amazon Bedrock Knowledge Bases to prevent hallucinations in Large Language Model (LLM) responses while improving latency and reducing costs.
High-performance LLM query cache with semantic search. Reduces API costs by 80% and latency from 8.5 s to 1 ms using Redis and the Qdrant vector DB. Multi-provider support (OpenAI, Anthropic).
Local-first semantic cache for AI agents. A small C daemon + CLI that remembers what your agent learned across sessions. Plugs into Claude Code, Codex, Gemini CLI, and Claude Desktop / ChatGPT via MCP. No LLM calls, no SaaS, no API key.
OpenAI-compatible LLM gateway that reduces API costs using Redis exact cache and Qdrant semantic cache.
Enhance LLM retrieval performance with Azure Cosmos DB Semantic Cache. Learn how to integrate and optimize caching strategies in real-world web applications.
Ultra-fast Semantic Cache Proxy written in pure C
An operating system for autonomous AI agents — 5-tier cache-first routing (97.5% cost reduction), Ed25519 constitution enforcement, 130 agents, 106 plugins. Rust.
AI real-estate automation platform: Telegram bot, RAG, apartment search, CRM workflows, voice agent, Langfuse observability, and Dockerized AI runtime.
VCAL Core — high-performance semantic cache and vector cache library for LLM applications.
Redis Vector Similarity Search, Semantic Caching, Recommendation Systems and RAG
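Several of the projects above describe the same basic pattern: check an exact-match cache first, then fall back to a similarity search over stored query embeddings, and only call the LLM on a miss. The following is a minimal sketch of that idea, not the implementation of any listed project. The names `SemanticCache`, `embed`, `answer`, and `call_llm` are hypothetical, the 0.8 threshold is an arbitrary assumption, and the bag-of-words `embed` function is a toy stand-in for a real embedding model and vector store.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical two-tier cache: exact string match, then similarity."""

    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold  # assumed value; tune per application
        self.exact = {}             # query string -> response
        self.entries = []           # list of (embedding, response)

    def get(self, query: str):
        # Tier 1: exact match on the raw query string.
        if query in self.exact:
            return self.exact[query]
        # Tier 2: linear scan for the most similar stored query.
        q = embed(query)
        best_score, best_response = 0.0, None
        for vec, response in self.entries:
            score = cosine(q, vec)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, query: str, response: str) -> None:
        self.exact[query] = response
        self.entries.append((embed(query), response))


def answer(query: str, cache: SemanticCache, call_llm) -> str:
    """Return a cached response if one is close enough, else call the LLM."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    response = call_llm(query)
    cache.put(query, response)
    return response
```

In this sketch, `call_llm` is any callable that takes a query string and returns a response string. A real deployment would replace the linear scan with a vector index and would need to choose the threshold carefully, since a value that is too low returns wrong answers for merely similar-looking queries.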