
# GAM (General Agentic Memory via Deep Research in An Agent File System)

English | 中文版

A highly modular agentic file system framework that provides structured memory and operating environments for Large Language Models (LLMs). GAM supports both text and video modalities, offering four access levels: Python SDK, CLI, REST API, and Web Platform.

## Features

### 1. Core Features

- 📝 **Intelligent Chunking**: LLM-based text segmentation that automatically identifies semantic boundaries.
- 🧠 **Memory Generation**: Generates structured memory summaries (Memory + TLDR) for each text chunk.
- 📂 **Hierarchical Organization**: Automatically organizes memories into a hierarchical directory structure (Taxonomy).
- **Incremental Addition**: Appends new content to existing GAMs without rebuilding.
- 🐳 **Multi-environment Support**: Supports both local file systems and Docker container workspaces.
- 🔌 **Flexible LLM Backends**: Compatible with OpenAI, SGLang, and other inference engines.
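
To make the Memory + TLDR idea above concrete, a chunk-level memory record might look like the following sketch. The field names here are purely illustrative, not GAM's actual schema:

```python
from dataclasses import dataclass

@dataclass
class MemoryRecord:
    """Hypothetical chunk-level memory record (one per text chunk).

    Field names are illustrative only, not GAM's actual schema.
    """
    chunk_id: str  # identifier of the source chunk
    memory: str    # structured summary of the chunk's content
    tldr: str      # one-line digest for fast lookup
    path: str      # location within the hierarchical taxonomy

record = MemoryRecord(
    chunk_id="chunk_0003",
    memory="Describes how memories are organized into a directory taxonomy "
           "and appended to incrementally without rebuilding.",
    tldr="Memories live in a hierarchical taxonomy; appends are incremental.",
    path="architecture/taxonomy",
)
print(record.tldr)
```

Each record would then be written into the taxonomy directory given by its `path`, which is what makes the memory browsable as an ordinary file tree.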

### 2. Supported Tasks

- 📄 **Long Text**: Hierarchical memory organization and exploratory QA for long documents.
- 🎥 **Long Video**: Automated detection, segmentation, and description for building long-video memory.
- 🎞️ **Long-horizon (Agent Trajectory)**: Efficient compression and organization of long-sequence agent trajectories (e.g., complex reasoning steps, tool invocation logs), enabling agents to manage context across extended operations.
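
As a toy illustration of the long-horizon idea (not GAM's actual algorithm), trajectory compression can be as simple as collapsing consecutive repeats of the same tool call into counted summaries, so the agent's context grows with distinct actions rather than raw steps:

```python
from itertools import groupby

def compress_trajectory(steps):
    """Collapse consecutive repeats of the same tool call into one
    counted entry. Toy sketch of the concept, not GAM's algorithm."""
    compressed = []
    for tool, run in groupby(steps):
        count = len(list(run))
        compressed.append(tool if count == 1 else f"{tool} x{count}")
    return compressed

trajectory = ["search", "search", "search", "read_file", "search", "search"]
print(compress_trajectory(trajectory))
# six raw steps collapse to three summarized entries
```

A real implementation would also summarize step contents with an LLM, but the shape of the problem — many raw steps in, few organized entries out — is the same.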

### 3. Implementation Methods

- 🐍 **Python SDK**: High-level Python SDK for easy integration into agentic workflows.
- 💻 **CLI Tools**: Unified `gam-add` and `gam-request` commands for command-line interaction.
- 🚀 **REST API**: High-performance RESTful API (FastAPI + Uvicorn) with auto-generated OpenAPI docs, request validation, and CORS support.
- 🌐 **Web Platform**: Flask-based visualization and management interface.

## Quick Start

### Installation

```shell
# Full installation with all features
pip install -e ".[all]"
```

### Usage Overview

GAM can be used through the Python SDK, CLI, REST API, or Web interface.

#### 1. Python SDK (Workflow API)

```python
from gam import Workflow

wf = Workflow("text", gam_dir="./my_gam", model="gpt-4o-mini", api_key="sk-xxx")
wf.add(input_file="paper.pdf")
result = wf.request("What is the main conclusion?")
print(result.answer)
```

#### 2. CLI Tools

```shell
# Add content
gam-add --type text --gam-dir ./my_gam --input paper.pdf

# Query content
gam-request --type text --gam-dir ./my_gam --question "What is the main conclusion?"
```

#### 3. REST API

```shell
# Start the REST API server (FastAPI + Uvicorn)
python examples/run_api.py --port 5001
# Interactive docs available at http://localhost:5001/docs

# See the usage example
python examples/rest_api_client.py
```

#### 4. Web Interface

```shell
python examples/run_web.py --model gpt-4o-mini --api-key sk-xxx
```

## Configuration

Set environment variables to avoid passing the same parameters on every call. The GAM Agent (memory building) and the Chat Agent (Q&A) can be configured independently:

```shell
# GAM Agent (memory building)
export GAM_API_KEY="sk-your-api-key"
export GAM_MODEL="gpt-4o-mini"
export GAM_API_BASE="https://api.openai.com/v1"

# Chat Agent (Q&A); falls back to the GAM Agent config when not set
export GAM_CHAT_API_KEY="sk-your-chat-api-key"
export GAM_CHAT_MODEL="gpt-4o"
export GAM_CHAT_API_BASE="https://api.openai.com/v1"
```
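
The fallback behavior described above can be sketched as follows: each Chat Agent setting is read first from its `GAM_CHAT_*` variable, then from the corresponding `GAM_*` variable. This is a minimal illustration of the documented precedence, not GAM's actual config loader:

```python
import os

def chat_setting(name, default=None):
    """Resolve a Chat Agent setting, falling back to the GAM Agent
    variable when the chat-specific one is unset.
    Illustration of the documented precedence, not GAM's loader."""
    return (os.environ.get(f"GAM_CHAT_{name}")
            or os.environ.get(f"GAM_{name}")
            or default)

os.environ["GAM_MODEL"] = "gpt-4o-mini"   # GAM Agent default
os.environ.pop("GAM_CHAT_MODEL", None)    # no chat-specific override
print(chat_setting("MODEL"))              # falls back to gpt-4o-mini

os.environ["GAM_CHAT_MODEL"] = "gpt-4o"   # chat-specific override set
print(chat_setting("MODEL"))              # now gpt-4o
```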

## Documentation

Detailed usage instructions for each component can be found in the component guides in this repository.

## Examples

Check the `examples/` directory for sample projects and usage guides:

| Example | Description |
| --- | --- |
| `long_text/` | Text GAM building and QA. |
| `long_video/` | Video GAM building and QA. |
| `long_horizon/` | Long-horizon agent trajectory compression with search/memorize/recall. |

## Research

The `research/` directory contains the original research codebase for the GAM paper, including benchmark evaluation scripts (LoCoMo, HotpotQA, RULER, NarrativeQA) and the dual-agent (Memorizer + Researcher) implementation:

```shell
cd research
pip install -e .
```

```python
from gam_research import MemoryAgent, ResearchAgent
```

For more details, see the Research README.

## License

This project is licensed under the MIT License. See the LICENSE file for details.