DuMF-Agent is a long-term memory architecture for conversational AI. It addresses memory fragmentation, temporal confusion, and cross-session reasoning instability through a unified memory representation, a closed retrieval-reading loop, and temporal version consistency mechanisms.
┌─────────────────────────────────────────────────────────────────────────────┐
│                           DuMF-Agent Architecture                           │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐    ┌──────────────────────────────────────────────────┐    │
│  │    User     │    │               Dual-Channel Memory                │    │
│  │    Query    │───▶│  ┌────────────────┐  ┌────────────────────────┐  │    │
│  └─────────────┘    │  │  RAW Channel   │  │  CONSOLIDATED Channel  │  │    │
│                     │  │  (Evidence)    │  │  (SimpleFact + Triple) │  │    │
│                     │  └────────────────┘  └────────────────────────┘  │    │
│                     └──────────────────────────────────────────────────┘    │
│                                      │                                      │
│                     ┌────────────────▼────────────────┐                     │
│                     │        Hybrid Retrieval         │                     │
│                     │  • Query Expansion              │                     │
│                     │  • Vector + BM25 + Multi-hop    │                     │
│                     │  • Unified Re-ranking           │                     │
│                     └────────────────┬────────────────┘                     │
│                                      │                                      │
│                     ┌────────────────▼────────────────┐                     │
│                     │      Context Construction       │                     │
│                     │  • Version Detection            │                     │
│                     │  • Temporal Filtering           │                     │
│                     │  • Evidence Organization        │                     │
│                     └────────────────┬────────────────┘                     │
│                                      │                                      │
│                     ┌────────────────▼────────────────┐                     │
│                     │         LLM Generation          │                     │
│                     └─────────────────────────────────┘                     │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
- Dual-Channel Memory Architecture: The RAW channel preserves complete evidence, while the CONSOLIDATED channel structures facts for efficient retrieval, balancing completeness against retrieval efficiency
- Triple-SimpleFact Separation: The structured Triple layer is optimized for multi-hop reasoning and the SimpleFact layer for direct QA, so each can be tuned independently
- Generalized Extractor: Entities and relation types dynamically extracted from text without hardcoded schemas
- Multi-Factor Comprehensive Scoring: Fusion of semantic similarity, confidence, channel priority, and temporal decay for unified retrieval ranking
- Dual-Dimensional Temporal Decay: Time-aware weighting combining real-world timestamps and conversation turns for cross-session and intra-session reasoning
- Append-Only Full Retention Storage: Nothing is deleted; all historical versions are preserved, enabling version tracking and temporal queries
- Hybrid Retrieval + Query Expansion: Vector similarity search, BM25 full-text search, and multi-hop graph traversal with query expansion
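
The append-only storage idea in the list above can be illustrated with a small in-memory sketch. This is not the project's Neo4j-backed implementation: the `FactVersion` fields, the `AppendOnlyFactStore` class, and the `as_of` method are hypothetical names chosen for illustration. It only shows why keeping every version makes "what was true at time T" queries straightforward.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FactVersion:
    """One immutable version of a fact (hypothetical schema)."""
    subject: str
    relation: str
    value: str
    timestamp: float  # real-world time, e.g. a Unix timestamp
    turn: int         # conversation turn at which the fact was stated


class AppendOnlyFactStore:
    """Keeps every version of every fact; nothing is updated or deleted."""

    def __init__(self) -> None:
        self._log: List[FactVersion] = []

    def append(self, fact: FactVersion) -> None:
        self._log.append(fact)

    def history(self, subject: str, relation: str) -> List[FactVersion]:
        """All versions of one (subject, relation) pair, oldest first."""
        versions = [f for f in self._log
                    if f.subject == subject and f.relation == relation]
        return sorted(versions, key=lambda f: (f.timestamp, f.turn))

    def as_of(self, subject: str, relation: str,
              timestamp: float) -> Optional[FactVersion]:
        """Latest version recorded at or before `timestamp`, if any."""
        valid = [f for f in self.history(subject, relation)
                 if f.timestamp <= timestamp]
        return valid[-1] if valid else None


store = AppendOnlyFactStore()
store.append(FactVersion("user", "lives_in", "Berlin", timestamp=100.0, turn=3))
store.append(FactVersion("user", "lives_in", "Munich", timestamp=200.0, turn=9))

current = store.as_of("user", "lives_in", timestamp=250.0)
earlier = store.as_of("user", "lives_in", timestamp=150.0)
print(current.value, earlier.value)  # Munich Berlin
```

Because the older "Berlin" version is never overwritten, a question about the past can still be answered after the fact has changed.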
- Python 3.9+
- Neo4j 5.x (local or Aura cloud)
- CUDA-compatible GPU (optional, for local embeddings)
- Clone the repository:
git clone https://github.com/leyulv-wang/long_memory_agent.git --branch v1.0.0
cd long_memory_agent
- Create virtual environment:
python -m venv .venv
source .venv/bin/activate # Linux/Mac
# or
.venv\Scripts\activate # Windows
- Install dependencies:
pip install -r requirements.txt
- Configure environment:
cp .env.example .env
# Edit .env with your API keys and database credentials
- Initialize Neo4j schema:
python utils/init_neo4j_schema.py
python utils/create_fulltext_index.py
- (Optional) Start local embedding server:
python embedding_server.py

This project uses the LongMemEval benchmark for evaluation.
# Clone LongMemEval repository
git clone https://github.com/xiaowu0162/LongMemEval.git
# Copy test files to your project
mkdir -p data/long_memory_eval
cp LongMemEval/data/*.json data/long_memory_eval/

Expected directory structure:
data/
└── long_memory_eval/
    ├── longmemeval_oracle.json # Sample setting
    └── longmemeval_s.json # Hard setting
Copy .env.example to .env and configure the following:
# LLM API (OpenAI-compatible)
GRAPHRAG_API_BASE=https://api.openai.com/v1
GRAPHRAG_CHAT_API_KEY=sk-your-api-key-here
GRAPHRAG_CHAT_MODEL=gpt-4o-mini
# Cheap LLM for extraction tasks
CHEAP_GRAPHRAG_API_BASE=https://api.openai.com/v1
CHEAP_GRAPHRAG_CHAT_API_KEY=sk-your-api-key-here
CHEAP_GRAPHRAG_CHAT_MODEL=gpt-4o-mini
# Embedding Model
GRAPHRAG_EMBEDDING_API_BASE=http://127.0.0.1:8000 # Local server
GRAPHRAG_EMBEDDING_API_KEY=local
GRAPHRAG_EMBEDDING_MODEL=BAAI/bge-m3
# Neo4j Database
NEO4J_URI=neo4j://127.0.0.1:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password-here
# Evidence filtering: strict | medium | lenient
EVIDENCE_FILTER_LEVEL=lenient
# TextUnit fallback: off | order | always
EVIDENCE_TEXTUNIT_FALLBACK_SCOPE=order
# Confidence scores
RAW_REL_CONFIDENCE=0.95
CONSOLIDATED_REL_CONFIDENCE=0.85
CONSOLIDATED_ASSERTS_CONFIDENCE=0.6

| Parameter | Value | Description |
|---|---|---|
| SimpleFact k | 100 | Top-k for SimpleFact retrieval |
| TextUnit k | 10 | Top-k for TextUnit retrieval |
| Fulltext k | 20 | Top-k for BM25 fulltext search |
| Multi-hop limit | 20 | Max nodes in graph expansion |
| Multi-hop decay | 0.85 | Score decay per hop |
| Similarity weight | 0.7 | Weight for semantic similarity |
| Confidence weight | 0.2 | Weight for fact confidence |
| Channel weight | 0.1 | Weight for channel priority |
| Version threshold | 0.75 | Threshold for version detection |

See config.py for all configurable parameters.
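
To make the weights above concrete, here is a minimal sketch of how the three weighted factors, the per-hop decay, and a two-dimensional temporal decay could be combined into one ranking score. The weights (0.7, 0.2, 0.1) and the hop decay (0.85) come from the table; everything else is an assumption made for illustration, including the multiplicative combination, the exponential decay form, the half-life constants, the channel priority values, and the `Candidate` and `score` names. The actual formula is in the project source and may differ.

```python
from dataclasses import dataclass

# Values from the parameter table.
SIMILARITY_WEIGHT = 0.7
CONFIDENCE_WEIGHT = 0.2
CHANNEL_WEIGHT = 0.1
MULTI_HOP_DECAY = 0.85

# Hypothetical half-lives, not taken from the project.
HALF_LIFE_DAYS = 30.0
HALF_LIFE_TURNS = 50.0


@dataclass
class Candidate:
    similarity: float        # semantic similarity in [0, 1]
    confidence: float        # fact confidence in [0, 1]
    channel_priority: float  # hypothetical priority in [0, 1]
    age_days: float          # real-world age of the memory
    age_turns: int           # age in conversation turns
    hops: int = 0            # graph distance from the seed node


def temporal_decay(age_days: float, age_turns: int) -> float:
    """Product of two exponential decays, each halving at its half-life."""
    by_time = 0.5 ** (age_days / HALF_LIFE_DAYS)
    by_turn = 0.5 ** (age_turns / HALF_LIFE_TURNS)
    return by_time * by_turn


def score(c: Candidate) -> float:
    """Weighted sum of the three factors, scaled by temporal and hop decay."""
    base = (SIMILARITY_WEIGHT * c.similarity
            + CONFIDENCE_WEIGHT * c.confidence
            + CHANNEL_WEIGHT * c.channel_priority)
    return base * temporal_decay(c.age_days, c.age_turns) * MULTI_HOP_DECAY ** c.hops


# A fresh direct hit versus a month-old fact reached through two hops.
fresh = Candidate(similarity=0.9, confidence=0.95, channel_priority=1.0,
                  age_days=0.0, age_turns=0)
stale = Candidate(similarity=0.9, confidence=0.85, channel_priority=0.5,
                  age_days=30.0, age_turns=0, hops=2)

ranked = sorted([stale, fresh], key=score, reverse=True)
print(round(score(fresh), 4), round(score(stale), 4))  # 0.92 0.3071
```

With equal semantic similarity, the older two-hop candidate drops to about a third of the fresh candidate's score, which is the kind of ordering a unified re-ranking step is meant to produce.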
from agent.agent import DuMFAgent
# Initialize agent
agent = DuMFAgent(agent_id="user_001")
# Process conversation
response = agent.chat("What did we discuss about the project last week?")

Once you have the dataset and Neo4j database ready:
# Initialize database schema (first time only)
python utils/init_neo4j_schema.py
python utils/create_fulltext_index.py
# Run evaluation
python test/Long_Memory_test.py

Results will be saved to test/long_memory_results.json
Note: To test different settings (sample/hard), modify the DEFAULT_DATA_PATH in test/Long_Memory_test.py (line 47):
- Sample setting: "data/long_memory_eval/longmemeval_oracle.json"
- Hard setting: "data/long_memory_eval/longmemeval_s.json"
Or pass the dataset path as a command-line argument:
python test/Long_Memory_test.py --data data/long_memory_eval/longmemeval_s.json

For local embedding (recommended for development):
# Start the embedding server first
python embedding_server.py
# Configure in .env:
# GRAPHRAG_EMBEDDING_API_BASE=http://127.0.0.1:8000

For online embedding API, configure SiliconFlow or other providers in .env.
long_memory_agent/
├── agent/ # Core agent implementation
│ ├── agent.py # Main agent class
│ ├── simple_retriever.py # Hybrid retrieval system
│ └── context_builder.py # Context construction
├── memory/ # Dual-channel memory system
│ ├── dual_memory_system.py
│ ├── structured_memory.py
│ └── stores.py
├── temporal_reasoning/ # Temporal reasoning module
│ ├── executor.py
│ └── intent_router.py
├── prompts/ # Prompt templates
├── utils/ # Utility functions
└── test/ # Test scripts
# Check if Neo4j is running
neo4j status
# Start Neo4j
neo4j start
# Verify connection
python utils/connection_tests.py
# If using local embedding, check server status
curl http://127.0.0.1:8000/health
# Alternative: Use online embedding API
# Edit .env:
GRAPHRAG_EMBEDDING_API_BASE=https://api.siliconflow.cn/v1
GRAPHRAG_EMBEDDING_API_KEY=your-api-key
# Reduce batch size in .env
EMBED_BATCH_SIZE=1
EMBED_MAX_CONCURRENCY=1

Performance comparison on the LongMemEval benchmark. All results are averaged over 10 independent runs and reported as mean ± half-range.
- LLM: Direct LLM prompting with full conversation history
- RAG: Retrieval-augmented generation with vector search
- Mem0: Memory layer with fact extraction and consolidation
- Mem0Graph: Memory layer with graph-based structured memory
- LangMem: LangChain-based memory system
- LightMem: Lightweight memory architecture
- Generative Agent (GA): Stanford's generative agents with memory stream (recency, importance, relevance scoring)
- DuMF-Agent (ours): Dual-channel memory framework with structured reasoning and temporal consistency
| Method | Overall Acc. (sample) | Overall Acc. (hard) | Task-avg. Acc. (hard) |
|---|---|---|---|
| LLM | 75.00 ± 1.30 | 55.41 ± 0.68 | 54.20 ± 0.85 |
| RAG | 66.17 ± 1.51 | 49.33 ± 1.36 | 48.84 ± 1.34 |
| Mem0 | 50.22 ± 1.94 | 34.18 ± 1.53 | 33.97 ± 0.99 |
| Mem0Graph | 53.40 ± 0.31 | 36.52 ± 0.16 | 35.75 ± 0.10 |
| LangMem | 63.36 ± 1.22 | 46.40 ± 0.60 | 46.99 ± 0.53 |
| LightMem | 61.20 ± 0.40 | 50.00 ± 0.80 | 50.25 ± 0.75 |
| GA | 61.42 ± 0.65 | 23.56 ± 1.00 | 24.12 ± 1.26 |
| DuMF-Agent | 75.38 ± 0.37 | 69.59 ± 0.19 | 69.80 ± 0.23 |
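DuMF-Agent achieves the highest accuracy in every column. The margin is largest on the hard setting, where it leads the next-best method (LLM) by about 14 points in overall accuracy (69.59 vs. 55.41); on the sample setting it is roughly on par with direct LLM prompting (75.38 vs. 75.00). This indicates stronger handling of long-term conversational memory with complex reasoning requirements.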
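
For readers who want to reproduce the table's statistics from their own runs, the sketch below shows one way to compute them. It assumes that "± half-range" means (max − min) / 2 across runs, that overall accuracy is the fraction of all questions answered correctly, and that task-averaged accuracy is the unweighted mean of per-task accuracies. These interpretations, the function names, and the sample data are illustrative assumptions, not taken from the project's evaluation script.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple


def mean_and_half_range(values: List[float]) -> Tuple[float, float]:
    """Mean over runs and half the spread between the best and worst run."""
    return mean(values), (max(values) - min(values)) / 2


def overall_accuracy(results: List[Tuple[str, bool]]) -> float:
    """Percentage of all questions answered correctly."""
    return 100.0 * sum(ok for _, ok in results) / len(results)


def task_averaged_accuracy(results: List[Tuple[str, bool]]) -> float:
    """Unweighted mean of per-task accuracies, as a percentage."""
    per_task: Dict[str, List[bool]] = defaultdict(list)
    for task, ok in results:
        per_task[task].append(ok)
    return 100.0 * mean(sum(v) / len(v) for v in per_task.values())


# Made-up per-run accuracies and per-question results (task name, correct?).
run_accuracies = [70.0, 71.0, 72.0]
results = [
    ("temporal", True),
    ("temporal", True),
    ("temporal", False),
    ("multi-session", True),
]

avg, half_range = mean_and_half_range(run_accuracies)
print(f"{avg:.2f} ± {half_range:.2f}")               # 71.00 ± 1.00
print(f"{overall_accuracy(results):.2f}")            # 75.00
print(f"{task_averaged_accuracy(results):.2f}")      # 83.33
```

The two accuracy figures differ whenever tasks have unequal numbers of questions, which is why the table reports both for the hard setting.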
This project is licensed under the MIT License - see the LICENSE file for details.
- LongMemEval benchmark for evaluation framework
- Neo4j for graph database support
