Production-grade multi-tenant AI Agent memory backend built with FastAPI, LangGraph, LangMem, Redis, Qdrant, and PostgreSQL.
```
┌─────────────────────────────────────────────────────────┐
│                   API Layer (FastAPI)                   │
│     /prompt  /llm  /memory  /emotion  /personality      │
└───────────────────────────┬─────────────────────────────┘
                            │ DI (dependency-injector)
┌───────────────────────────▼─────────────────────────────┐
│                      Service Layer                      │
│      PromptService    LLMService    MemoryService       │
│         EmotionService    PersonalityService            │
└─────────┬────────────────┬──────────────────────────────┘
          │                │
┌─────────▼──────┐  ┌──────▼──────────────────────────────┐
│   LangGraph    │  │            Store Layer              │
│   Workflow     │  │ Redis | Qdrant | PostgreSQL (async) │
└────────────────┘  └─────────────────────────────────────┘
```
| Layer | Storage | Purpose |
|---|---|---|
| Short-term | Redis | Recent messages, session summary, emotion state |
| Long-term | Qdrant | Semantic vector search over user history |
| Core Memory | PostgreSQL | Stable user facts, personality, preferences |
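As an illustrative sketch of how the three layers might be combined at prompt-build time (the in-memory values stand in for the real Redis/Qdrant/PostgreSQL stores; the function name and format are assumptions, not the project's actual API):

```python
# Illustrative sketch: merging the three memory layers into one prompt
# context block. Dicts/lists stand in for Redis, Qdrant, and PostgreSQL.

def build_context(short_term: list[str], long_term_hits: list[str],
                  core: dict[str, str]) -> str:
    """Assemble a context block from the three memory layers."""
    parts = []
    if core:               # PostgreSQL: stable user facts
        parts.append("Core facts: " + "; ".join(f"{k}={v}" for k, v in core.items()))
    if long_term_hits:     # Qdrant: semantically relevant history
        parts.append("Relevant history: " + " | ".join(long_term_hits))
    if short_term:         # Redis: recent conversation turns
        parts.append("Recent messages: " + " / ".join(short_term))
    return "\n".join(parts)

ctx = build_context(
    short_term=["user: hi", "assistant: hello"],
    long_term_hits=["user likes hiking"],
    core={"name": "Ada", "lang": "en"},
)
```

Ordering core facts first mirrors the usual pattern of putting stable context before volatile context in a system prompt.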
All features are toggled via environment variables:
| Flag | Default | Effect |
|---|---|---|
| MEMORY_ENABLED | true | Master switch: if false, requests pass directly to the LLM |
| SHORT_TERM_MEMORY_ENABLED | true | Redis short-term memory |
| LONG_TERM_MEMORY_ENABLED | true | Qdrant vector memory |
| CORE_MEMORY_ENABLED | true | PostgreSQL core memory |
| EMOTION_ENABLED | true | Emotion detection and injection |
| PERSONALITY_ENABLED | true | Personality profile injection |
| LLM_ENABLED | true | Whether to call the LLM at all |
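A minimal sketch of how such boolean flags could be parsed from the environment (the real project loads them in `config/settings.py`; the helper below is a hypothetical stand-in):

```python
# Hypothetical sketch: parsing boolean feature flags from environment
# variables, with MEMORY_ENABLED acting as the master switch.
import os

def flag(name: str, default: bool = True) -> bool:
    """Parse an environment variable as a boolean feature flag."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}

os.environ["MEMORY_ENABLED"] = "false"
memory_on = flag("MEMORY_ENABLED")
# Sub-flags only matter when the master switch is on.
short_term_on = memory_on and flag("SHORT_TERM_MEMORY_ENABLED")
```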
```bash
cp .env.example .env
# Edit .env with your OpenAI API key, DB credentials, etc.
```

```bash
pip install -e ".[dev]"
# or with uv:
uv sync
```

You need PostgreSQL, Redis, and Qdrant running. Minimal quick-start commands:
```bash
# PostgreSQL
docker run -d --name pg -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:16

# Redis
docker run -d --name redis -p 6379:6379 redis:7

# Qdrant
docker run -d --name qdrant -p 6333:6333 qdrant/qdrant
```

```bash
# From the project root (where alembic.ini lives):
alembic upgrade head
```

```bash
cd src
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

Navigate to: http://localhost:8000/docs
| Method | Path | Description |
|---|---|---|
| POST | /api/v1/prompt/build | Build prompt from all memory layers |
| POST | /api/v1/llm/invoke | Build prompt + call LLM |
| POST | /api/v1/memory/short-term/message | Write message to Redis |
| GET | /api/v1/memory/short-term/{tenant_id}/{user_id}/{session_id} | Get short-term memory |
| POST | /api/v1/memory/short-term/summary | Update session summary |
| POST | /api/v1/memory/core/upsert | Upsert core memory variable |
| GET | /api/v1/memory/core/{tenant_id}/{user_id} | Get all core memory |
| POST | /api/v1/memory/long-term/upsert | Store fact + embedding in Qdrant |
| POST | /api/v1/memory/long-term/search | Semantic search in Qdrant |
| POST | /api/v1/memory/extract | LangMem memory extraction |
| POST | /api/v1/emotion/analyze | Analyse emotion from text |
| GET | /api/v1/emotion/{tenant_id}/{user_id}/{session_id} | Get emotion state |
| POST | /api/v1/emotion/config | Update emotion feature flags |
| GET | /api/v1/personality/{tenant_id}/{user_id} | Get personality profile |
| POST | /api/v1/personality/upsert | Update personality variables |
| POST | /api/v1/personality/config | Toggle personality injection |
```
src/
├── app/
│   ├── main.py               # FastAPI app + lifespan
│   ├── containers.py         # DI container (dependency-injector)
│   ├── constants.py          # Env-loaded constants
│   ├── lifecycle.py          # @Init / @Destroy decorators
│   └── events.py             # Event subscribers
├── config/
│   └── settings.py           # All settings and feature flags
├── api/
│   ├── deps.py               # FastAPI dependency factories
│   ├── memory_api.py         # Memory endpoints
│   ├── prompt_api.py         # Prompt build endpoint
│   ├── llm_api.py            # LLM invoke endpoint
│   ├── emotion_api.py        # Emotion endpoints
│   └── personality_api.py    # Personality endpoints
├── models/                   # SQLAlchemy ORM models
│   ├── core_memory.py
│   ├── personality.py
│   └── emotion.py
├── schemas/                  # Pydantic request/response schemas
│   ├── common.py
│   ├── memory.py
│   ├── emotion.py
│   ├── personality.py
│   └── prompt.py
├── stores/                   # Data access layer
│   ├── redis_store.py        # Async Redis (short-term memory)
│   ├── qdrant_store.py       # Async Qdrant (long-term memory)
│   └── postgres_repo.py      # Async SQLAlchemy repos
├── services/                 # Business logic
│   ├── memory_service.py
│   ├── prompt_service.py
│   └── llm_service.py
├── workflows/
│   └── prompt_workflow.py    # LangGraph 7-node workflow
├── agents/
│   └── memory_agent.py       # LangMem extraction agent
├── prompt/
│   └── builder.py            # PromptBuilder (assembles blocks)
├── emotion/
│   ├── analyzer.py           # LLM + rule-based emotion classifiers
│   └── service.py            # Emotion orchestration
├── personality/
│   ├── service.py
│   └── prompt_adapter.py     # Personality → prompt block
├── llm/
│   ├── base.py               # LLMProvider ABC
│   └── openai_provider.py    # OpenAI + Mock implementations
├── embeddings/
│   ├── base.py               # EmbeddingProvider ABC
│   └── openai_embeddings.py  # OpenAI + Mock implementations
├── db/
│   ├── base.py               # SQLAlchemy DeclarativeBase
│   └── session.py            # Async session factory
├── migrations/
│   ├── env.py                # Alembic env
│   ├── script.py.mako        # Migration template
│   └── versions/
│       └── 20240101_001_initial_schema.py
└── utils/                    # Existing utilities (unchanged)
```
Migrations are managed with Alembic. Tables are created in the schema specified by DB_SCHEMA (default public; set to mem in the current .env).

Make sure your .env has the correct schema:

```bash
DB_SCHEMA=mem      # all tables land here
DB_DATABASE=demo   # target database
```

```bash
# From the project root (where alembic.ini lives)
alembic upgrade head
```

This will:

- Create the schema if it does not exist (handled by search_path in asyncpg connect args)
- Create alembic_version in the target schema to track state
- Apply every pending migration in src/migrations/versions/
```bash
# Rollback the last applied migration
alembic downgrade -1

# Rollback all the way to a clean slate
alembic downgrade base

# Auto-generate a migration after changing a model
alembic revision --autogenerate -m "add user_preference table"

# Show applied / pending migration history
alembic history --verbose

# Show current DB revision
alembic current
```

- Schema isolation: DB_SCHEMA is read at runtime. Changing it in .env and re-running alembic upgrade head creates tables in the new schema without touching the old one.
- asyncpg search_path: env.py passes server_settings={"search_path": DB_SCHEMA} to asyncpg so the schema is active before Alembic opens its transaction. This is why ?options= in the URL is NOT used.
- Existing tables: all migration scripts include explicit schema=SCHEMA arguments, so they are idempotent and schema-aware regardless of the connection's default search path.
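The search_path mechanism in the bullets above can be sketched as a plain connect-args builder. It is shown without a live database connection; the real wiring lives in `src/migrations/env.py`, and the helper name is an assumption.

```python
# Sketch of the asyncpg connect-args pattern described above: the target
# schema goes in server_settings["search_path"] rather than ?options= in
# the URL, so it is active before Alembic opens its transaction.

def asyncpg_connect_args(schema: str) -> dict:
    """Build connect_args that make `schema` the active search_path."""
    return {"server_settings": {"search_path": schema}}

# Would typically be passed as:
#   create_async_engine(url, connect_args=asyncpg_connect_args("mem"))
args = asyncpg_connect_args("mem")
```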
```bash
# Build image
docker build -t mem-backend:latest .

# Run (assumes external services are accessible)
docker run -d \
  --name mem-backend \
  -p 8000:8000 \
  --env-file .env \
  mem-backend:latest
```

Every API request includes tenant_id, user_id, and session_id.

- Redis keys: shortmem:{tenant_id}:{user_id}:{session_id}:{suffix}
- Qdrant points: filtered by tenant_id + user_id in payload
- PostgreSQL rows: all tables have a tenant_id column with unique indexes
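The Redis key scheme above can be sketched as a one-line helper (the key pattern is from the docs; the function itself is illustrative):

```python
# Minimal sketch of the tenant-scoped Redis key scheme:
# shortmem:{tenant_id}:{user_id}:{session_id}:{suffix}

def short_term_key(tenant_id: str, user_id: str,
                   session_id: str, suffix: str) -> str:
    """Compose a tenant-scoped short-term-memory Redis key."""
    return f"shortmem:{tenant_id}:{user_id}:{session_id}:{suffix}"

key = short_term_key("acme", "u-42", "s-1", "messages")
# key == "shortmem:acme:u-42:s-1:messages"
```

Because the tenant id is the first path segment after the prefix, per-tenant cleanup can be done with a single `SCAN` pattern such as `shortmem:acme:*`.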
The prompt-building pipeline runs as a 7-node LangGraph directed graph:
```
load_core_memory → load_short_term → search_long_term
  → load_emotion → load_personality → build_prompt → END
```

Each node can be disabled independently via a feature flag without modifying the graph.
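One way to get "disabled without modifying the graph" is to wrap each node so it passes the state through unchanged when its flag is off. The sketch below mimics that pattern in plain Python; it is not the project's actual LangGraph code.

```python
# Illustrative sketch: a node becomes a no-op when its feature flag is off,
# so the graph topology never changes.
from typing import Callable

State = dict

def gated(node: Callable[[State], State], enabled: bool) -> Callable[[State], State]:
    """Wrap a node so it passes state through unchanged when disabled."""
    def run(state: State) -> State:
        return node(state) if enabled else state
    return run

def load_emotion(state: State) -> State:
    return {**state, "emotion": "neutral"}

pipeline = [gated(load_emotion, enabled=False)]
state: State = {"message": "hi"}
for step in pipeline:
    state = step(state)
# "emotion" is absent because the node was disabled
```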
POST /api/v1/memory/extract runs LangMem extraction against a conversation.
If langmem is not installed, the endpoint falls back to a direct LLM-based extractor.
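The fallback can be sketched as an import-or-fallback selector (the returned names are hypothetical stand-ins for the two extraction paths):

```python
# Sketch of the optional-dependency fallback described above: use langmem's
# extractor when installed, otherwise a direct LLM-based one.

def get_extractor() -> str:
    try:
        import langmem  # noqa: F401  # optional dependency
    except ImportError:
        return "llm_fallback_extractor"  # stand-in for the direct LLM path
    return "langmem_extractor"           # stand-in for the LangMem path

extractor = get_extractor()
```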
| Component | Strategy |
|---|---|
| PostgreSQL | pg_dump daily, point-in-time recovery (WAL archiving) |
| Redis | BGSAVE + RDB snapshots, or AOF for persistence |
| Qdrant | Built-in snapshot API: POST /collections/{name}/snapshots |
See .env.example for the full list with descriptions.