diff --git a/docs/api-reference-index.md b/docs/api-reference-index.md index 68237fb5..0cd6e280 100644 --- a/docs/api-reference-index.md +++ b/docs/api-reference-index.md @@ -1,6 +1,8 @@ # API Reference -Complete reference documentation for all Redis Agent Memory Server interfaces. +Complete reference documentation for all Redis Agent Memory Server interfaces and client SDKs. + +## Server Interfaces
@@ -30,6 +32,36 @@ Complete reference documentation for all Redis Agent Memory Server interfaces.
+## Client SDKs + +
+ +- 🐍 **Python SDK** + + --- + + Async-first client with tool schemas for OpenAI and Anthropic + + [Python SDK →](python-sdk.md) + +- 📘 **TypeScript SDK** + + --- + + Type-safe client for Node.js and browser applications + + [TypeScript SDK →](typescript-sdk.md) + +- ☕ **Java SDK** + + --- + + Java client for JVM applications + + [Java SDK →](java-sdk.md) + +
+ ## Interface Comparison | Interface | Best For | Authentication | @@ -37,6 +69,9 @@ Complete reference documentation for all Redis Agent Memory Server interfaces. | REST API | Applications, backends, custom integrations | OAuth2/JWT or token | | MCP Server | Claude Desktop, MCP-compatible AI agents | Environment config | | CLI | Server administration, development | Local access | +| Python SDK | Python applications with LLM tool integration | Token or OAuth2 | +| TypeScript SDK | Node.js, browser, and TypeScript applications | Token or OAuth2 | +| Java SDK | JVM-based applications | Token or OAuth2 | ## Interactive API Docs diff --git a/docs/aws-bedrock.md b/docs/aws-bedrock.md deleted file mode 100644 index b61d5ef1..00000000 --- a/docs/aws-bedrock.md +++ /dev/null @@ -1,423 +0,0 @@ -# AWS Bedrock Models - -> **Note:** This documentation has been consolidated into [LLM Providers](llm-providers.md#aws-bedrock). -> This page is kept for reference but the LLM Providers guide is the authoritative source. - -The Redis Agent Memory Server supports [Amazon Bedrock](https://aws.amazon.com/bedrock/) for both **embedding models** and **LLM generation models**. This allows you to use AWS-native AI models while keeping your data within the AWS ecosystem. - -## Quick Reference - -For complete AWS Bedrock configuration, see [LLM Providers - AWS Bedrock](llm-providers.md#aws-bedrock). - -**Key points:** -- All LLM operations use [LiteLLM](https://docs.litellm.ai/) internally -- Bedrock embedding models require the `bedrock/` prefix (e.g., `bedrock/amazon.titan-embed-text-v2:0`) -- Bedrock generation models do not need a prefix (e.g., `anthropic.claude-sonnet-4-5-20250929-v1:0`) -- The `[aws]` extra installs `boto3` and `botocore` for AWS authentication - -## Overview - -Amazon Bedrock provides access to a wide variety of foundation models from leading AI providers. The Redis Agent Memory Server supports using Bedrock for: - -1. **Embedding Models** - For semantic search and memory retrieval -2. **LLM Generation Models** - For memory extraction, summarization, and topic modeling - -### Supported Embedding Models - -> **Important:** Use the `bedrock/` prefix for embedding models. - -| Model ID | Provider | Dimensions | Description | -|----------|----------|------------|-------------| -| `bedrock/amazon.titan-embed-text-v2:0` | Amazon | 1024 | Latest Titan embedding model | -| `bedrock/amazon.titan-embed-text-v1` | Amazon | 1536 | Original Titan embedding model | -| `bedrock/cohere.embed-english-v3` | Cohere | 1024 | English-focused embeddings | -| `bedrock/cohere.embed-multilingual-v3` | Cohere | 1024 | Multilingual embeddings | - -### Pre-configured LLM Generation Models - -The following models are pre-configured in the codebase: - -| Model ID | Provider | Max Tokens | Description | -|----------|----------|------------|-------------| -| `anthropic.claude-sonnet-4-5-20250929-v1:0` | Anthropic | 200,000 | Claude 4.5 Sonnet | -| `anthropic.claude-haiku-4-5-20251001-v1:0` | Anthropic | 200,000 | Claude 4.5 Haiku | -| `anthropic.claude-opus-4-5-20251101-v1:0` | Anthropic | 200,000 | Claude 4.5 Opus | - -## Installation - -AWS Bedrock support requires additional dependencies. 
Install them with: - -```bash -pip install agent-memory-server[aws] -``` - -This installs: - -- `boto3` - AWS SDK for Python -- `botocore` - Low-level AWS client library - -## Configuration - -### Environment Variables - -Configure the following environment variables to use Bedrock models: - -```bash -# Required: AWS region where Bedrock is available -AWS_REGION_NAME=us-east-1 - -# For Bedrock Embedding Models (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 - -# For Bedrock LLM Generation Models (no prefix needed) -GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 - -# AWS Credentials (choose one method below) -``` - -### AWS Credentials - -There are several ways to provide AWS credentials: - -#### Option 1: Environment Variables (Explicit) - -```bash -export AWS_ACCESS_KEY_ID=your-access-key-id -export AWS_SECRET_ACCESS_KEY=your-secret-access-key -export AWS_SESSION_TOKEN=your-session-token # Optional, for temporary credentials -``` - -#### Option 2: AWS Credentials File - -The server will automatically use credentials from `~/.aws/credentials`: - -```ini -[default] -aws_access_key_id = your-access-key-id -aws_secret_access_key = your-secret-access-key -``` - -#### Option 3: IAM Role (Recommended for AWS deployments) - -When running on AWS infrastructure (EC2, ECS, Lambda, etc.), use IAM roles for automatic credential management. No explicit credentials are needed. - -#### Option 4: AWS SSO / AWS CLI Profile - -If you've configured AWS SSO or named profiles: - -```bash -# First, login via SSO -aws sso login --profile your-profile - -# The server will use the default profile, or set explicitly -export AWS_PROFILE=your-profile -``` - -### Docker Configuration - -The Docker image supports two build targets: - -- **`standard`** (default): OpenAI/Anthropic support only -- **`aws`**: Includes AWS Bedrock embedding models support - -#### Building the AWS-enabled Image - -```bash -# Build directly with Docker -docker build --target aws -t agent-memory-server:aws . - -# Or use Docker Compose with the DOCKER_TARGET variable -DOCKER_TARGET=aws docker-compose up --build -``` - -#### Docker Compose Configuration - -When using Docker Compose, set the `DOCKER_TARGET` environment variable to `aws`: - -```bash -# Start with AWS Bedrock support -DOCKER_TARGET=aws docker-compose up --build - -# Or for the production-like setup -DOCKER_TARGET=aws docker-compose -f docker-compose-task-workers.yml up --build -``` - -Create a `.env` file with your credentials and configuration: - -```bash -# Docker build target -DOCKER_TARGET=aws - -# Embedding model (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 - -# AWS credentials -AWS_REGION_NAME=us-east-1 -AWS_ACCESS_KEY_ID=your-access-key-id -AWS_SECRET_ACCESS_KEY=your-secret-access-key -AWS_SESSION_TOKEN=your-session-token # Optional -``` - -The Docker Compose files already include the AWS environment variables, so you only need to set them in your `.env` file or environment. - -## Required IAM Permissions - -The AWS credentials must have permissions to invoke Bedrock models. 
Here's a minimal IAM policy: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": [ - "bedrock:InvokeModel", - "bedrock:ListFoundationModels" - ], - "Resource": "*" - } - ] -} -``` - -For production, scope down the `Resource` to specific model ARNs: - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "bedrock:InvokeModel", - "Resource": [ - "arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0", - "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-5-20250929-v1:0", - "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-haiku-4-5-20251001-v1:0" - ] - }, - { - "Effect": "Allow", - "Action": "bedrock:ListFoundationModels", - "Resource": "*" - } - ] -} -``` - -**Note**: When using Bedrock LLM models for generation tasks (memory extraction, summarization, topic modeling), ensure your IAM policy includes permissions for all the models you've configured (`GENERATION_MODEL`, `FAST_MODEL`, `SLOW_MODEL`, `TOPIC_MODEL`). - -## Vector Dimensions - -When using Bedrock embedding models, make sure to update the vector dimensions setting to match your chosen model: - -```bash -# For Titan v2 and Cohere models (1024 dimensions) -REDISVL_VECTOR_DIMENSIONS=1024 - -# For Titan v1 (1536 dimensions) -REDISVL_VECTOR_DIMENSIONS=1536 -``` - -## Complete Configuration Examples - -### Example 1: Bedrock Embeddings with OpenAI Generation - -```bash -# Embedding model (Bedrock - note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 - -# AWS Configuration -AWS_REGION_NAME=us-east-1 -AWS_ACCESS_KEY_ID=your-access-key-id -AWS_SECRET_ACCESS_KEY=your-secret-access-key - -# Vector store dimensions (must match embedding model) -REDISVL_VECTOR_DIMENSIONS=1024 - -# Generation model (OpenAI) -GENERATION_MODEL=gpt-4o -OPENAI_API_KEY=your-openai-key - -# Other settings -REDIS_URL=redis://localhost:6379 -``` - -### Example 2: Full Bedrock Stack (Recommended for AWS-only deployments) - -```bash -# AWS Configuration -AWS_REGION_NAME=us-east-1 -AWS_ACCESS_KEY_ID=your-access-key-id -AWS_SECRET_ACCESS_KEY=your-secret-access-key - -# Embedding model (Bedrock Titan - note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 -REDISVL_VECTOR_DIMENSIONS=1024 - -# Generation models (Bedrock Claude - no prefix needed) -GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 -SLOW_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -TOPIC_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 - -# Other settings -REDIS_URL=redis://localhost:6379 -``` - -### YAML Configuration - -```yaml -# config.yaml - Full Bedrock Stack -region_name: us-east-1 -embedding_model: amazon.titan-embed-text-v2:0 -redisvl_vector_dimensions: 1024 -generation_model: anthropic.claude-sonnet-4-5-20250929-v1:0 -fast_model: anthropic.claude-haiku-4-5-20251001-v1:0 -slow_model: anthropic.claude-sonnet-4-5-20250929-v1:0 -topic_model: anthropic.claude-haiku-4-5-20251001-v1:0 -redis_url: redis://localhost:6379 -``` - -## Model Validation - -The server validates that the specified Bedrock embedding model exists in your configured region at startup. If the model is not found, you'll see an error like: - -``` -ValueError: Bedrock embedding model amazon.titan-embed-text-v2:0 not found in region us-east-1. -``` - -This helps catch configuration errors early. Common causes: - -1. 
**Model not enabled**: You may need to enable the model in the Bedrock console -2. **Wrong region**: The model may not be available in your configured region -3. **Typo in model ID**: Double-check the model ID spelling - -## Enabling Bedrock Models in AWS Console - -Before using a Bedrock model, you must enable it in the AWS Console: - -1. Navigate to **Amazon Bedrock** in the AWS Console -2. Select **Model access** from the left navigation -3. Click **Manage model access** -4. Enable the embedding models you want to use -5. Wait for access to be granted (usually immediate for Amazon models) - -## Mixing Providers - -You can mix and match providers for different use cases: - -### Bedrock Embeddings with OpenAI Generation - -```bash -# Embeddings via Bedrock (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 -AWS_REGION_NAME=us-east-1 - -# Generation via OpenAI -GENERATION_MODEL=gpt-4o -OPENAI_API_KEY=your-openai-key -``` - -### Full Bedrock Stack (Embeddings + Generation) - -```bash -# All AWS - keep everything within your AWS environment -AWS_REGION_NAME=us-east-1 - -# Embeddings via Bedrock (note: bedrock/ prefix required) -EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 -REDISVL_VECTOR_DIMENSIONS=1024 - -# Generation via Bedrock Claude (no prefix needed) -GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -FAST_MODEL=anthropic.claude-haiku-4-5-20251001-v1:0 -``` - -### OpenAI Embeddings with Bedrock Generation - -```bash -# Embeddings via OpenAI -EMBEDDING_MODEL=text-embedding-3-small -OPENAI_API_KEY=your-openai-key - -# Generation via Bedrock -AWS_REGION_NAME=us-east-1 -GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 -``` - -This flexibility allows you to: -- Keep all data within AWS for compliance requirements -- Use the best model for each task -- Optimize costs by choosing appropriate models for different operations - -## Troubleshooting - -### "AWS-related dependencies might be missing" - -Install the AWS extras: - -```bash -pip install agent-memory-server[aws] -``` - -### "Missing environment variable 'AWS_REGION_NAME'" - -Set the AWS region: - -```bash -export AWS_REGION_NAME=us-east-1 -``` - -### "Bedrock embedding model not found" - -1. Verify the model ID is correct -2. Check the model is available in your region -3. Ensure the model is enabled in Bedrock console -4. Verify your IAM permissions include `bedrock:ListFoundationModels` - -### "Bedrock LLM model not responding correctly" - -1. Verify the model ID matches exactly (including version suffix like `:0`) -2. Check the model is enabled in your Bedrock console -3. Verify your IAM permissions include `bedrock:InvokeModel` for the specific model -4. Some models may have different regional availability - check AWS documentation - -### "Error creating chat completion with Bedrock" - -1. Check that the model ID is correct and the model is enabled -2. Verify your AWS credentials have `bedrock:InvokeModel` permission -3. Check the request isn't exceeding the model's token limits -4. Review CloudWatch logs for detailed error messages - -### Credential Errors - -If you see authentication errors: - -1. Verify your AWS credentials are correctly set -2. Check the credentials have the required IAM permissions -3. If using temporary credentials, ensure they haven't expired -4. 
Try running `aws sts get-caller-identity` to verify your credentials work - -### Model-Specific Issues - -**Model IDs**: Ensure you're using the Bedrock-specific model IDs (e.g., `anthropic.claude-sonnet-4-5-20250929-v1:0`) not the direct provider API model IDs. - -**Embedding Model Prefix**: Bedrock embedding models require the `bedrock/` prefix (e.g., `bedrock/amazon.titan-embed-text-v2:0`). Unprefixed names will work but emit a deprecation warning. - -**LiteLLM Backend**: All Bedrock operations use [LiteLLM](https://docs.litellm.ai/) internally, which provides a unified interface for all Bedrock models. Model-specific formatting is handled automatically. - -## Performance Considerations - -- **Latency**: Bedrock API calls may have different latency characteristics than OpenAI -- **Rate limits**: Check your Bedrock service quotas for your region -- **Caching**: The server caches model existence checks for 1 hour to reduce API calls -- **Cost**: Review Bedrock pricing for your chosen embedding model - -## Related Documentation - -- [LLM Providers](llm-providers.md) - Comprehensive LLM provider guide (recommended) -- [Embedding Providers](embedding-providers.md) - Embedding model configuration -- [Configuration](configuration.md) - Full configuration reference -- [Vector Store Backends](vector-store-backends.md) - Custom vector store setup -- [Getting Started](getting-started.md) - Initial setup guide diff --git a/docs/developer-guide-index.md b/docs/developer-guide-index.md index 6aec98c2..ddd777a0 100644 --- a/docs/developer-guide-index.md +++ b/docs/developer-guide-index.md @@ -1,6 +1,6 @@ # Developer Guide -Learn how to integrate memory into your AI applications. This guide covers integration patterns, memory types, extraction strategies, and production considerations. +Learn how to integrate memory into your AI applications. This guide covers integration patterns, memory types, extraction strategies, and memory lifecycle management. ## Core Concepts @@ -44,13 +44,9 @@ Learn how to integrate memory into your AI applications. This guide covers integ | Topic | Description | |-------|-------------| +| [Summary Views](summary-views.md) | Pre-computed memory summaries for efficient context | | [Memory Lifecycle](memory-lifecycle.md) | How memories are created, updated, and managed over time | -| [LLM Providers](llm-providers.md) | Configure OpenAI, Anthropic, AWS Bedrock, Ollama, and more | -| [Embedding Providers](embedding-providers.md) | Configure embedding models for semantic search | -| [Vector Store Backends](vector-store-backends.md) | Configure Redis, Pinecone, Chroma, or other backends | -| [AWS Bedrock](aws-bedrock.md) | AWS-specific setup for Bedrock models | -| [Authentication](authentication.md) | OAuth2/JWT and token-based authentication | -| [Security](security-custom-prompts.md) | Security considerations for custom prompts | +| [LangChain Integration](langchain-integration.md) | Use memory with LangChain agents and chains | ## Where to Start @@ -59,3 +55,5 @@ Learn how to integrate memory into your AI applications. This guide covers integ **Need to understand the data model?** Read [Working Memory](working-memory.md) and [Long-term Memory](long-term-memory.md). **Configuring extraction behavior?** See [Memory Extraction Strategies](memory-extraction-strategies.md). + +**Looking for server configuration?** See the [Operations Guide](operations-guide-index.md) for authentication, LLM providers, and deployment. 
diff --git a/docs/java-sdk.md b/docs/java-sdk.md
new file mode 100644
index 00000000..cf4f65bc
--- /dev/null
+++ b/docs/java-sdk.md
@@ -0,0 +1,320 @@
+# Java SDK
+
+The Java SDK (`agent-memory-client-java`) provides a type-safe client for integrating memory capabilities into JVM-based applications.
+
+**Version**: 0.1.0+
+**Requirements**: Java 21 or higher
+
+## Installation
+
+### Gradle (Kotlin DSL)
+
+```kotlin
+dependencies {
+    implementation("com.redis:agent-memory-client-java:0.1.0")
+}
+```
+
+### Gradle (Groovy)
+
+```groovy
+dependencies {
+    implementation 'com.redis:agent-memory-client-java:0.1.0'
+}
+```
+
+### Maven
+
+```xml
+<dependency>
+    <groupId>com.redis</groupId>
+    <artifactId>agent-memory-client-java</artifactId>
+    <version>0.1.0</version>
+</dependency>
+```
+
+## Quick Start
+
+```java
+import com.redis.agentmemory.MemoryAPIClient;
+import com.redis.agentmemory.models.longtermemory.*;
+import java.util.*;
+
+// Create client
+MemoryAPIClient client = MemoryAPIClient.builder("http://localhost:8000")
+    .defaultNamespace("my-app")
+    .timeout(30.0)
+    .build();
+
+// Store a memory
+MemoryRecord memory = MemoryRecord.builder()
+    .text("User prefers morning meetings")
+    .memoryType(MemoryType.SEMANTIC)
+    .topics(List.of("scheduling", "preferences"))
+    .userId("alice")
+    .build();
+
+client.longTermMemory().createLongTermMemories(List.of(memory));
+
+// Search memories
+SearchRequest request = SearchRequest.builder()
+    .text("when does user prefer meetings")
+    .userId("alice")
+    .topics(List.of("scheduling"))
+    .limit(5)
+    .build();
+
+MemoryRecordResults results = client.longTermMemory().searchLongTermMemories(request);
+
+for (MemoryRecordResult result : results.getMemories()) {
+    System.out.printf("%s (distance: %.3f)%n", result.getText(), result.getDist());
+}
+
+// Clean up
+client.close();
+```
+
+## Client Configuration
+
+The client uses the Builder pattern:
+
+```java
+MemoryAPIClient client = MemoryAPIClient.builder("http://localhost:8000")
+    .timeout(30.0)                    // Request timeout (seconds)
+    .defaultNamespace("production")   // Default namespace
+    .defaultModelName("gpt-4o")       // For auto-summarization
+    .defaultContextWindowMax(128000)  // Context window limit
+    .build();
+```
+
+The client implements `AutoCloseable` for try-with-resources:
+
+```java
+try (MemoryAPIClient client = MemoryAPIClient.builder("http://localhost:8000").build()) {
+    // Use client
+}
+```
+
+## Service Architecture
+
+The Java SDK uses a service-based architecture. Access services through the client:
+
+```java
+client.health()           // HealthService - health checks
+client.workingMemory()    // WorkingMemoryService - session management
+client.longTermMemory()   // LongTermMemoryService - persistent memories
+client.hydration()        // MemoryHydrationService - prompt hydration
+client.summaryViews()     // SummaryViewService - summary views
+client.tasks()            // TaskService - background tasks
+```
+
+## Memory Operations
+
+### Creating Memories
+
+```java
+List<MemoryRecord> memories = List.of(
+    MemoryRecord.builder()
+        .text("User works at TechCorp")
+        .memoryType(MemoryType.SEMANTIC)
+        .topics(List.of("career", "work"))
+        .entities(List.of("TechCorp"))
+        .userId("alice")
+        .build()
+);
+
+client.longTermMemory().createLongTermMemories(memories);
+```
+
+### Searching Memories
+
+```java
+// Using builder pattern
+SearchRequest request = SearchRequest.builder()
+    .text("user preferences")
+    .namespace("my-app")
+    .userId("alice")
+    .topics(List.of("preferences"))
+    .limit(10)
+    .offset(0)
+    .build();
+
+MemoryRecordResults results = client.longTermMemory().searchLongTermMemories(request);
+
+// Simple text search
+MemoryRecordResults simpleResults = client.longTermMemory()
+    .searchLongTermMemories("user preferences");
+```
+
+### Get, Edit, and Delete
+
+```java
+// Get a specific memory
+MemoryRecord memory = client.longTermMemory().getLongTermMemory("memory-id");
+
+// Edit a memory
+Map<String, Object> updates = Map.of(
+    "text", "Updated text content",
+    "topics", List.of("updated", "topics")
+);
+client.longTermMemory().editLongTermMemory("memory-id", updates);
+
+// Delete memories
+client.longTermMemory().deleteLongTermMemories(List.of("id1", "id2"));
+```
+
+## Working Memory
+
+```java
+import com.redis.agentmemory.models.workingmemory.*;
+
+// Get or create working memory
+WorkingMemoryResult result = client.workingMemory()
+    .getOrCreateWorkingMemory("session-123");
+
+boolean wasCreated = result.isCreated();
+WorkingMemoryResponse memory = result.getMemory();
+
+// Update with messages
+WorkingMemory update = WorkingMemory.builder()
+    .sessionId("session-123")
+    .messages(List.of(
+        new MemoryMessage("user", "I'm planning a trip to Italy"),
+        new MemoryMessage("assistant", "That sounds exciting!")
+    ))
+    .data(Map.of("destination", "Italy"))
+    .build();
+
+client.workingMemory().putWorkingMemory("session-123", update);
+
+// Append messages (more efficient than full update)
+List<MemoryMessage> newMessages = List.of(
+    new MemoryMessage("user", "What are the best places?")
+);
+client.workingMemory().appendMessagesToWorkingMemory("session-123", newMessages);
+
+// Delete session
+client.workingMemory().deleteWorkingMemory("session-123");
+```
+
+## Forgetting Memories
+
+```java
+Map<String, Object> policy = Map.of(
+    "max_age_days", 90,
+    "max_inactive_days", 30,
+    "budget", 100,
+    "memory_type_allowlist", List.of("episodic")
+);
+
+// Preview (dry run)
+ForgetResponse preview = client.longTermMemory().forgetLongTermMemories(
+    policy,
+    "my-app",                 // namespace
+    null,                     // userId
+    null,                     // sessionId
+    1000,                     // limit
+    true,                     // dryRun
+    List.of("keep-this-id")   // pinnedIds
+);
+System.out.printf("Would delete %d of %d%n", preview.getDeleted(), preview.getScanned());
+
+// Execute
+ForgetResponse result = client.longTermMemory().forgetLongTermMemories(
+    policy, "my-app", null, null, 1000, false, null
+);
+```
+
+## Bulk Operations
+
+```java
+// Bulk create with rate limiting
+List<List<MemoryRecord>> batches = List.of(memories1, memories2, memories3);
+var results = client.longTermMemory().bulkCreateLongTermMemories(
+    batches,
+    50,    // batchSize
+    100    // delayBetweenBatchesMs
+);
+
+// Auto-paginating search with Iterator
+Iterator<MemoryRecord> iterator = client.longTermMemory().searchAllLongTermMemories(
+    "user preferences",   // text
+    null,                 // sessionId
+    "my-app",             // namespace
+    null,                 // topics
+    null,                 // entities
+    "alice",              // userId
+    50                    // batchSize
+);
+
+while (iterator.hasNext()) {
+    MemoryRecord memory = iterator.next();
+    System.out.println(memory.getText());
+}
+
+// Or use Stream API
+Stream<MemoryRecord> stream = client.longTermMemory().searchAllLongTermMemoriesStream(
+    "user preferences", null, "my-app", null, null, "alice", 50
+);
+stream.forEach(m -> System.out.println(m.getText()));
+```
+
+## Summary Views
+
+```java
+import com.redis.agentmemory.models.summaryview.*;
+
+// Create a summary view
+CreateSummaryViewRequest request = CreateSummaryViewRequest.builder()
+    .name("User Topic Summaries")
+    .source("long_term")
+    .groupBy(List.of("user_id", "topics"))
+    .timeWindowDays(30)
+    .continuous(true)
+    .build();
+
+SummaryView view = client.summaryViews().createSummaryView(request);
+
+// Run a partition
+Map<String, Object> group = Map.of("user_id", "alice", "topics", "travel");
+SummaryViewPartitionResult partition = client.summaryViews()
+    .runSummaryViewPartition(view.getId(), group);
+
+// List views
+List<SummaryView> views = client.summaryViews().listSummaryViews();
+```
+
+## Error Handling
+
+```java
+import com.redis.agentmemory.exceptions.*;
+
+try {
+    MemoryRecord memory = client.longTermMemory().getLongTermMemory("invalid-id");
+} catch (MemoryNotFoundException e) {
+    System.out.println("Memory not found: " + e.getMessage());
+} catch (MemoryServerException e) {
+    System.out.println("Server error: " + e.getMessage());
+} catch (MemoryValidationException e) {
+    System.out.println("Validation error: " + e.getMessage());
+} catch (MemoryClientException e) {
+    System.out.println("Client error: " + e.getMessage());
+}
+```
+
+## Validation
+
+The client provides validation utilities:
+
+```java
+// Validate a memory record
+client.validateMemoryRecord(memory);
+
+// Validate search filters
+Map<String, Object> filters = Map.of(
+    "limit", 10,
+    "offset", 0,
+    "distance_threshold", 0.5
+);
+client.validateSearchFilters(filters);
+```
diff --git a/docs/langchain-integration.md b/docs/langchain-integration.md
index 7ca243d5..1715b4da 100644
--- a/docs/langchain-integration.md
+++ b/docs/langchain-integration.md
@@ -6,7 +6,7 @@ The Python SDK (agent-memory-client) provides a LangChain integration that helps
 The SDK provides a `get_memory_tools()` function that returns a list of LangChain `StructuredTool` instances. These tools give your LangChain LLMs and agents access to the memory server's capabilities.
 
-For details on available memory operations, see the [Tool Methods](python-sdk.md#tool-methods) section of the Python SDK documentation.
+For details on available memory operations, see the [Tool Integration](python-sdk.md#tool-integration) section of the Python SDK documentation.
 
 ### Direct LLM Integration
diff --git a/docs/llm-providers.md b/docs/llm-providers.md
index f475ae8b..911a579c 100644
--- a/docs/llm-providers.md
+++ b/docs/llm-providers.md
@@ -125,6 +125,16 @@ export EMBEDDING_MODEL=text-embedding-3-small
 AWS Bedrock provides access to foundation models from multiple providers (Anthropic Claude, Amazon Titan, Cohere, etc.) through AWS infrastructure.
 
+#### Installation
+
+AWS Bedrock support requires additional dependencies:
+
+```bash
+pip install agent-memory-server[aws]
+```
+
+This installs `boto3` and `botocore` for AWS authentication.
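+
+Before configuring the server, you can optionally sanity-check Bedrock access (a minimal check; assumes the AWS CLI is installed and your AWS credentials are already configured):
+
+```bash
+# Confirm your credentials resolve to an identity
+aws sts get-caller-identity
+
+# List the Bedrock model IDs visible in your region
+aws bedrock list-foundation-models --region us-east-1 --query 'modelSummaries[].modelId'
+```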
+ #### Authentication Bedrock uses standard AWS credentials. Configure using any of these methods: @@ -142,6 +152,10 @@ export AWS_REGION_NAME=us-east-1 # Option 3: IAM role (recommended for production on AWS) # No credentials needed - uses instance/container role export AWS_REGION_NAME=us-east-1 + +# Option 4: AWS SSO +aws sso login --profile your-profile +export AWS_PROFILE=your-profile ``` #### Generation Models @@ -170,16 +184,30 @@ export GENERATION_MODEL=amazon.titan-text-premier-v1:0 ```bash # Correct - use bedrock/ prefix export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +REDISVL_VECTOR_DIMENSIONS=1024 # Must match embedding model # Deprecated - unprefixed names emit a warning export EMBEDDING_MODEL=amazon.titan-embed-text-v2:0 # Works but shows deprecation warning ``` **Supported Bedrock embedding models:** -- `bedrock/amazon.titan-embed-text-v2:0` (1024 dimensions, recommended) -- `bedrock/amazon.titan-embed-text-v1` (1536 dimensions) -- `bedrock/cohere.embed-english-v3` (1024 dimensions) -- `bedrock/cohere.embed-multilingual-v3` (1024 dimensions) + +| Model ID | Dimensions | Description | +|----------|------------|-------------| +| `bedrock/amazon.titan-embed-text-v2:0` | 1024 | Latest Titan (recommended) | +| `bedrock/amazon.titan-embed-text-v1` | 1536 | Original Titan | +| `bedrock/cohere.embed-english-v3` | 1024 | English-focused | +| `bedrock/cohere.embed-multilingual-v3` | 1024 | Multilingual | + +#### Enabling Bedrock Models + +Before using a Bedrock model, enable it in the AWS Console: + +1. Navigate to **Amazon Bedrock** in the AWS Console +2. Select **Model access** from the left navigation +3. Click **Manage model access** +4. Enable the models you need +5. Wait for access to be granted (usually immediate for Amazon models) #### IAM Permissions @@ -193,7 +221,8 @@ Your IAM role/user needs these permissions: "Effect": "Allow", "Action": [ "bedrock:InvokeModel", - "bedrock:InvokeModelWithResponseStream" + "bedrock:InvokeModelWithResponseStream", + "bedrock:ListFoundationModels" ], "Resource": [ "arn:aws:bedrock:*::foundation-model/anthropic.claude-*", @@ -206,13 +235,27 @@ Your IAM role/user needs these permissions: #### Docker Configuration -When running in Docker, pass AWS credentials: +The Docker image supports two build targets: + +- **`standard`** (default): OpenAI/Anthropic support only +- **`aws`**: Includes AWS Bedrock support + +```bash +# Build AWS-enabled image +docker build --target aws -t agent-memory-server:aws . + +# Or with Docker Compose +DOCKER_TARGET=aws docker-compose up --build +``` + +When running, pass AWS credentials: ```bash docker run -e AWS_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY -e AWS_REGION_NAME \ -e GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 \ -e EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 \ - agent-memory-server + -e REDISVL_VECTOR_DIMENSIONS=1024 \ + agent-memory-server:aws ``` Or mount credentials: @@ -221,7 +264,26 @@ Or mount credentials: docker run -v ~/.aws:/root/.aws:ro \ -e AWS_PROFILE=my-profile \ -e AWS_REGION_NAME=us-east-1 \ - agent-memory-server + agent-memory-server:aws +``` + +#### Complete Example + +Full Bedrock stack (keep all AI operations within AWS): + +```bash +# AWS credentials +export AWS_REGION_NAME=us-east-1 +export AWS_ACCESS_KEY_ID=... +export AWS_SECRET_ACCESS_KEY=... 
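+export AWS_SESSION_TOKEN=...        # Optional: only needed for temporary credentials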
+ +# Embeddings (bedrock/ prefix required) +export EMBEDDING_MODEL=bedrock/amazon.titan-embed-text-v2:0 +export REDISVL_VECTOR_DIMENSIONS=1024 + +# Generation (no prefix needed) +export GENERATION_MODEL=anthropic.claude-sonnet-4-5-20250929-v1:0 +export FAST_MODEL=anthropic.claude-3-5-haiku-20241022-v1:0 ``` ### Ollama (Local Models) diff --git a/docs/operations-guide-index.md b/docs/operations-guide-index.md new file mode 100644 index 00000000..47b8ebb5 --- /dev/null +++ b/docs/operations-guide-index.md @@ -0,0 +1,84 @@ +# Operations Guide + +Configure, secure, and deploy your Redis Agent Memory Server in production. This guide covers server configuration, authentication, LLM providers, and infrastructure setup. + +## Server Configuration + +
+ +- ⚙️ **Configuration** + + --- + + Environment variables, YAML config files, and all server settings + + [Configuration Reference →](configuration.md) + +- 🔐 **Authentication** + + --- + + OAuth2/JWT, token-based auth, and multi-provider setup + + [Authentication Guide →](authentication.md) + +- 🛡️ **Security** + + --- + + Security considerations for custom prompts and production deployments + + [Security Guide →](security-custom-prompts.md) + +
+ +## AI Provider Setup + +
+ +- 🤖 **LLM Providers** + + --- + + Configure OpenAI, Anthropic, AWS Bedrock, Ollama, and 100+ providers via LiteLLM + + [LLM Providers →](llm-providers.md) + +- 📐 **Embedding Providers** + + --- + + Set up embedding models for semantic search + + [Embedding Providers →](embedding-providers.md) + +- 🗄️ **Vector Store Backends** + + --- + + Configure Redis, Pinecone, Chroma, or other vector stores + + [Vector Store Backends →](vector-store-backends.md) + +
+ +## Quick Reference + +| Topic | Description | +|-------|-------------| +| [Configuration](configuration.md) | All environment variables and YAML settings | +| [Authentication](authentication.md) | OAuth2, token auth, and development mode | +| [Security](security-custom-prompts.md) | Custom prompt security and best practices | +| [LLM Providers](llm-providers.md) | Generation models including AWS Bedrock | +| [Embedding Providers](embedding-providers.md) | Embedding models and dimensions | +| [Vector Store Backends](vector-store-backends.md) | Storage backend configuration | + +## Where to Start + +**Deploying to production?** Start with [Configuration](configuration.md) to understand all server settings. + +**Setting up authentication?** See [Authentication](authentication.md) for OAuth2 or token-based auth. + +**Using AWS?** The [LLM Providers](llm-providers.md) guide covers AWS Bedrock setup for both generation and embedding models. + +**Customizing storage?** Check [Vector Store Backends](vector-store-backends.md) for Redis, Pinecone, and other options. diff --git a/docs/python-sdk.md b/docs/python-sdk.md index b233b547..cb7f34b3 100644 --- a/docs/python-sdk.md +++ b/docs/python-sdk.md @@ -2,6 +2,8 @@ The Python SDK (`agent-memory-client`) provides the easiest way to integrate memory into your AI applications. It includes high-level abstractions, tool integration for OpenAI and Anthropic, and automatic function call resolution. +**Version**: 0.14.0+ + ## Installation **Requirements**: Python 3.10 or higher @@ -13,67 +15,83 @@ pip install agent-memory-client ## Quick Start ```python -from agent_memory_client import MemoryAPIClient +from agent_memory_client import MemoryAPIClient, MemoryClientConfig +from agent_memory_client.models import ClientMemoryRecord, MemoryTypeEnum -# Connect to your memory server -client = MemoryAPIClient( +# Configure and create client +config = MemoryClientConfig( base_url="http://localhost:8000", - api_key="your-api-key" # Optional if auth disabled + default_namespace="my-app" ) -# Store a memory -await client.create_long_term_memories([{ - "text": "User prefers morning meetings and hates scheduling calls after 4 PM", - "memory_type": "semantic", - "topics": ["scheduling", "preferences"], - "user_id": "alice" -}]) +async with MemoryAPIClient(config) as client: + # Store a memory + await client.create_long_term_memory([ + ClientMemoryRecord( + text="User prefers morning meetings", + memory_type=MemoryTypeEnum.SEMANTIC, + topics=["scheduling", "preferences"], + user_id="alice" + ) + ]) -# Search memories -results = await client.search_long_term_memory( - text="when does user prefer meetings", - limit=5 -) + # Search memories + results = await client.search_long_term_memory( + text="when does user prefer meetings", + limit=5 + ) + + for memory in results.memories: + print(f"{memory.text} (score: {1 - memory.dist:.2f})") ``` ## Client Configuration -### Basic Setup +### Using MemoryClientConfig ```python -from agent_memory_client import MemoryAPIClient +from agent_memory_client import MemoryAPIClient, MemoryClientConfig # Minimal configuration (development) -client = MemoryAPIClient(base_url="http://localhost:8000") +config = MemoryClientConfig(base_url="http://localhost:8000") +client = MemoryAPIClient(config) -# Production configuration -client = MemoryAPIClient( +# Production configuration with defaults +config = MemoryClientConfig( base_url="https://your-memory-server.com", - api_key="your-api-token", timeout=30.0, - session_id="user-session-123", - 
user_id="user-456", - namespace="production" + default_namespace="production", + default_model_name="gpt-4o", # For token counting + default_context_window_max=128000 # Override context window ) +client = MemoryAPIClient(config) ``` -### Authentication +### Configuration Options -```python -# Token authentication -client = MemoryAPIClient( - base_url="https://your-server.com", - api_key="your-token-here" -) +| Option | Type | Description | +|--------|------|-------------| +| `base_url` | `str` | Memory server URL (required) | +| `timeout` | `float` | HTTP timeout in seconds (default: 30.0) | +| `default_namespace` | `str` | Default namespace for all operations | +| `default_model_name` | `str` | Model name for context window sizing | +| `default_context_window_max` | `int` | Override max context window tokens | -# OAuth2/JWT authentication -client = MemoryAPIClient( - base_url="https://your-server.com", - bearer_token="your-jwt-token" -) +### Async Context Manager + +The client supports async context manager for proper resource cleanup: -# Development (no auth) -client = MemoryAPIClient(base_url="http://localhost:8000") +```python +async with MemoryAPIClient(config) as client: + # Client automatically closes when exiting the context + results = await client.search_long_term_memory(text="query") + +# Or manually manage lifecycle +client = MemoryAPIClient(config) +try: + results = await client.search_long_term_memory(text="query") +finally: + await client.close() ``` ## Tool Integration @@ -84,21 +102,22 @@ The SDK provides automatic tool schemas and function call resolution for OpenAI: ```python import openai -from agent_memory_client import MemoryAPIClient +from agent_memory_client import MemoryAPIClient, MemoryClientConfig # Setup clients -memory_client = MemoryAPIClient(base_url="http://localhost:8000") +config = MemoryClientConfig(base_url="http://localhost:8000") +memory_client = MemoryAPIClient(config) openai_client = openai.AsyncClient() -# Get tool schemas for OpenAI +# Get tool schemas for OpenAI (returns ToolSchemaCollection) memory_tools = MemoryAPIClient.get_all_memory_tool_schemas() async def chat_with_memory(message: str, session_id: str): - # Make request with memory tools + # Make request with memory tools (convert to list for API) response = await openai_client.chat.completions.create( model="gpt-4o", messages=[{"role": "user", "content": message}], - tools=memory_tools, + tools=memory_tools.to_list(), tool_choice="auto" ) @@ -149,20 +168,21 @@ Similar tool integration for Anthropic Claude: ```python import anthropic -from agent_memory_client import MemoryAPIClient +from agent_memory_client import MemoryAPIClient, MemoryClientConfig # Setup clients -memory_client = MemoryAPIClient(base_url="http://localhost:8000") +config = MemoryClientConfig(base_url="http://localhost:8000") +memory_client = MemoryAPIClient(config) anthropic_client = anthropic.AsyncClient() -# Get tool schemas for Anthropic +# Get tool schemas for Anthropic (returns ToolSchemaCollection) memory_tools = MemoryAPIClient.get_all_memory_tool_schemas_anthropic() async def chat_with_memory(message: str, session_id: str): response = await anthropic_client.messages.create( model="claude-3-5-sonnet-20241022", messages=[{"role": "user", "content": message}], - tools=memory_tools, + tools=memory_tools.to_list(), max_tokens=1000 ) @@ -360,119 +380,286 @@ response = await anthropic_client.messages.create( ## Memory Operations -### Creating Memories +### Creating Long-Term Memories ```python +from 
agent_memory_client.models import ClientMemoryRecord, MemoryTypeEnum + # Create multiple memories memories = [ - { - "text": "User works as a software engineer at TechCorp", - "memory_type": "semantic", - "topics": ["career", "work", "company"], - "entities": ["TechCorp", "software engineer"], - "user_id": "alice" - }, - { - "text": "User prefers Python and TypeScript for development", - "memory_type": "semantic", - "topics": ["programming", "preferences", "languages"], - "entities": ["Python", "TypeScript"], - "user_id": "alice" - } + ClientMemoryRecord( + text="User works as a software engineer at TechCorp", + memory_type=MemoryTypeEnum.SEMANTIC, + topics=["career", "work", "company"], + entities=["TechCorp", "software engineer"], + user_id="alice" + ), + ClientMemoryRecord( + text="User prefers Python and TypeScript for development", + memory_type=MemoryTypeEnum.SEMANTIC, + topics=["programming", "preferences", "languages"], + entities=["Python", "TypeScript"], + user_id="alice" + ) ] -result = await client.create_long_term_memories(memories) -print(f"Created {len(result.memories)} memories") +result = await client.create_long_term_memory(memories) +print(f"Created memories: {result.status}") ``` -### Searching Memories +### Searching Memories with Filters + +The SDK provides powerful filter classes for precise memory retrieval: ```python +from agent_memory_client.filters import ( + Topics, Entities, CreatedAt, UserId, Namespace, MemoryType +) +from datetime import datetime, timedelta, timezone + # Basic semantic search results = await client.search_long_term_memory( text="user programming experience", limit=10 ) -# Advanced filtering +# Filter using filter objects (recommended) results = await client.search_long_term_memory( text="user preferences", - user_id="alice", - topics=["programming", "food"], - limit=5, - min_relevance_score=0.7 + user_id=UserId(eq="alice"), + topics=Topics(any=["programming", "food"]), # Match any of these topics + distance_threshold=0.3, # Lower = more relevant (0-1 scale) + limit=5 ) -# Time-based filtering -from datetime import datetime, timedelta - -week_ago = datetime.now() - timedelta(days=7) +# Time-based filtering with CreatedAt +week_ago = datetime.now(timezone.utc) - timedelta(days=7) results = await client.search_long_term_memory( text="recent updates", - created_after=week_ago, + created_at=CreatedAt(gte=week_ago), # Greater than or equal + limit=10 +) + +# Filter by memory type +results = await client.search_long_term_memory( + text="events that happened", + memory_type=MemoryType(eq="episodic"), limit=10 ) # Process results for memory in results.memories: - print(f"Relevance: {memory.relevance_score:.2f}") + relevance = 1 - memory.dist if memory.dist else None + print(f"Relevance: {relevance:.2f}" if relevance else "No score") print(f"Text: {memory.text}") print(f"Topics: {', '.join(memory.topics or [])}") ``` +#### Filter Reference + +| Filter | Options | Description | +|--------|---------|-------------| +| `SessionId` | `eq`, `in_`, `not_eq`, `not_in`, `startswith` | Filter by session ID | +| `Namespace` | `eq`, `in_`, `not_eq`, `not_in`, `startswith` | Filter by namespace | +| `UserId` | `eq`, `in_`, `not_eq`, `not_in`, `startswith` | Filter by user ID | +| `Topics` | `any`, `all`, `none` | Filter by topics | +| `Entities` | `any`, `all`, `none` | Filter by entities | +| `CreatedAt` | `gte`, `lte`, `eq` | Filter by creation date | +| `LastAccessed` | `gte`, `lte`, `eq` | Filter by last access date | +| `MemoryType` | `eq`, `in_`, `not_eq`, 
`not_in` | Filter by memory type | + ### Memory Editing ```python -# Update a memory -await client.edit_memory( - memory_id="memory-123", +# Update a memory by ID (get ID from search results) +updated = await client.edit_long_term_memory( + memory_id="01HXYZ...", # ULID from search results updates={ "text": "User works as a senior software engineer at TechCorp", "topics": ["career", "work", "company", "senior"], "entities": ["TechCorp", "senior software engineer"] } ) +print(f"Updated: {updated.text}") -# Add context to existing memory -await client.edit_memory( - memory_id="memory-456", - updates={ - "text": "User prefers Python and TypeScript for development. Recently started learning Rust.", - "topics": ["programming", "preferences", "languages", "rust"], - "entities": ["Python", "TypeScript", "Rust"] - } -) +# Get a specific memory by ID +memory = await client.get_long_term_memory(memory_id="01HXYZ...") +print(f"Memory: {memory.text}") + +# Delete memories +await client.delete_long_term_memories(["memory-id-1", "memory-id-2"]) ``` ### Working Memory ```python -# Store conversation context -conversation = { - "messages": [ - {"role": "user", "content": "I'm planning a trip to Italy"}, - {"role": "assistant", "content": "That sounds exciting! What cities are you thinking of visiting?"}, - {"role": "user", "content": "Rome and Florence, maybe Venice too"} - ], - "memories": [ - { - "text": "User is planning a trip to Italy, considering Rome, Florence, and Venice", - "memory_type": "semantic", - "topics": ["travel", "italy", "vacation"], - "entities": ["Italy", "Rome", "Florence", "Venice"] - } - ] -} - -await client.set_working_memory("session-123", conversation) +from agent_memory_client.models import ( + WorkingMemory, MemoryMessage, ClientMemoryRecord, MemoryTypeEnum +) -# Retrieve or create working memory -created, memory = await client.get_or_create_working_memory("session-123") +# Get or create working memory (returns tuple of created, memory) +created, memory = await client.get_or_create_working_memory( + session_id="session-123", + user_id="alice", + namespace="my-app" +) if created: print("Created new session") else: - print("Found existing session") -print(f"Session has {len(memory.messages)} messages") + print(f"Found existing session with {len(memory.messages)} messages") + +# Store/update working memory with messages +working_memory = WorkingMemory( + session_id="session-123", + namespace="my-app", + messages=[ + MemoryMessage(role="user", content="I'm planning a trip to Italy"), + MemoryMessage(role="assistant", content="That sounds exciting!"), + ], + memories=[ + ClientMemoryRecord( + text="User is planning a trip to Italy", + memory_type=MemoryTypeEnum.SEMANTIC, + topics=["travel", "italy"] + ) + ], + data={"destination": "Italy", "budget": 2000} +) + +response = await client.put_working_memory("session-123", working_memory) +print(f"Stored {len(response.messages)} messages") + +# Convenience: Set only the data portion +await client.set_working_memory_data( + session_id="session-123", + data={"trip_destination": "Rome", "travel_dates": ["2024-06-01", "2024-06-07"]} +) + +# Convenience: Add memories to working memory +await client.add_memories_to_working_memory( + session_id="session-123", + memories=[ + ClientMemoryRecord( + text="User prefers boutique hotels", + memory_type=MemoryTypeEnum.SEMANTIC, + topics=["travel", "preferences"] + ) + ] +) + +# Delete working memory when session ends +await client.delete_working_memory(session_id="session-123") +``` + +### Forgetting 
Memories + +Use `ForgetPolicy` to clean up old or inactive memories: + +```python +from agent_memory_client.models import ForgetPolicy + +# Define a forget policy +policy = ForgetPolicy( + max_age_days=90, # Forget memories older than 90 days + max_inactive_days=30, # Or inactive for 30+ days + budget=100, # Process up to 100 memories per run + memory_type_allowlist=["episodic"] # Only forget episodic memories +) + +# Dry run to preview what would be deleted +preview = await client.forget_long_term_memories( + policy=policy, + namespace="my-app", + user_id="alice", + dry_run=True # Preview only, do not delete +) +print(f"Would delete {preview.deleted} of {preview.scanned} memories") + +# Execute forget operation +result = await client.forget_long_term_memories( + policy=policy, + namespace="my-app", + user_id="alice", + pinned_ids=["memory-to-keep-1", "memory-to-keep-2"], # Exclude these + dry_run=False +) +print(f"Deleted {result.deleted} memories: {result.deleted_ids}") +``` + +### Summary Views + +Summary Views create aggregated summaries of memories, grouped by fields you specify: + +```python +from agent_memory_client.models import CreateSummaryViewRequest, SummaryViewSource + +# Create a summary view that groups by user and topic +request = CreateSummaryViewRequest( + name="User Topic Summaries", + source=SummaryViewSource.LONG_TERM, + group_by=["user_id", "topics"], + time_window_days=30, # Only last 30 days + continuous=True, # Auto-refresh in background + prompt="Summarize these memories concisely:", # Custom prompt + model_name="gpt-4o-mini" # Override model +) + +view = await client.create_summary_view(request) +print(f"Created view: {view.id}") + +# List all views +views = await client.list_summary_views() +for v in views: + print(f"View: {v.name} (groups by: {v.group_by})") + +# Run a specific partition (sync) +partition_result = await client.run_summary_view_partition( + view_id=view.id, + group={"user_id": "alice", "topics": "travel"} +) +print(f"Summary: {partition_result.summary}") +print(f"Based on {partition_result.memory_count} memories") + +# Run full view as background task +task = await client.run_summary_view(view_id=view.id, force=True) +print(f"Task ID: {task.id}, Status: {task.status}") + +# Poll for completion +import asyncio +while True: + task = await client.get_task(task.id) + if task.status in ["completed", "failed"]: + break + await asyncio.sleep(1) + +# List computed partitions +partitions = await client.list_summary_view_partitions( + view_id=view.id, + user_id="alice" +) +for p in partitions: + print(f"Group: {p.group}, Summary: {p.summary[:100]}...") + +# Delete a view +await client.delete_summary_view(view.id) +``` + +### Recency Boosting + +Use `RecencyConfig` to boost recent memories in search results: + +```python +from agent_memory_client.models import RecencyConfig + +# Boost recently accessed memories +results = await client.search_long_term_memory( + text="user preferences", + recency=RecencyConfig( + decay_factor=0.9, # How fast relevance decays (0-1) + reference_timestamp=None # Use current time + ), + limit=10 +) ``` ## Memory-Enhanced Conversations @@ -541,6 +728,8 @@ async def chat_with_auto_memory(message: str, session_id: str): ### Bulk Memory Creation ```python +from agent_memory_client.models import ClientMemoryRecord, MemoryTypeEnum + # Process large datasets efficiently async def import_user_data(user_data: list, user_id: str): batch_size = 50 @@ -549,24 +738,26 @@ async def import_user_data(user_data: list, user_id: str): batch = 
user_data[i:i + batch_size] memories = [ - { - "text": item["description"], - "memory_type": "semantic", - "topics": item.get("categories", []), - "entities": item.get("entities", []), - "user_id": user_id, - "metadata": {"source": item["source"]} - } + ClientMemoryRecord( + text=item["description"], + memory_type=MemoryTypeEnum.SEMANTIC, + topics=item.get("categories", []), + entities=item.get("entities", []), + user_id=user_id, + ) for item in batch ] - result = await client.create_long_term_memories(memories) - print(f"Imported batch {i//batch_size + 1}, {len(result.memories)} memories") + result = await client.create_long_term_memory(memories) + print(f"Imported batch {i//batch_size + 1}: {result.status}") ``` ### Bulk Search Operations ```python +import asyncio +from agent_memory_client.filters import UserId + # Search multiple queries efficiently async def multi_search(queries: list[str], user_id: str): results = {} @@ -575,7 +766,7 @@ async def multi_search(queries: list[str], user_id: str): search_tasks = [ client.search_long_term_memory( text=query, - user_id=user_id, + user_id=UserId(eq=user_id), limit=3 ) for query in queries @@ -591,58 +782,66 @@ async def multi_search(queries: list[str], user_id: str): ## Error Handling +### Exception Classes + +```python +from agent_memory_client import ( + MemoryClientError, # Base exception for all client errors + MemoryNotFoundError, # Memory not found (404) + MemoryServerError, # Server error (5xx) + MemoryValidationError, # Invalid input (400) +) +``` + ### Robust Client Usage ```python -from agent_memory_client import MemoryAPIClient, MemoryError +from agent_memory_client import ( + MemoryAPIClient, MemoryClientConfig, + MemoryClientError, MemoryNotFoundError, MemoryServerError +) import asyncio import logging +config = MemoryClientConfig(base_url="http://localhost:8000") + async def robust_memory_operation(client: MemoryAPIClient): try: - # Attempt memory operation results = await client.search_long_term_memory( text="user preferences", limit=5 ) - return results.memories - except MemoryError as e: - if e.status_code == 401: - logging.error("Authentication failed - check API key") - elif e.status_code == 429: - logging.warning("Rate limited - waiting before retry") - await asyncio.sleep(5) - return await robust_memory_operation(client) - else: - logging.error(f"Memory API error: {e}") - return [] + except MemoryNotFoundError: + logging.warning("No matching memories found") + return [] + + except MemoryServerError as e: + logging.error(f"Server error: {e}") + await asyncio.sleep(5) + return await robust_memory_operation(client) # Retry + + except MemoryClientError as e: + logging.error(f"Client error: {e}") + return [] except Exception as e: logging.error(f"Unexpected error: {e}") return [] ``` -### Connection Management +### Using Async Context Manager ```python -import httpx -from agent_memory_client import MemoryAPIClient - -# Custom timeout and retry configuration -async with httpx.AsyncClient( - timeout=30.0, - limits=httpx.Limits(max_keepalive_connections=10, max_connections=20) -) as http_client: +from agent_memory_client import MemoryAPIClient, MemoryClientConfig - client = MemoryAPIClient( - base_url="http://localhost:8000", - http_client=http_client - ) +config = MemoryClientConfig(base_url="http://localhost:8000", timeout=30.0) - # Perform operations +# Recommended: Use context manager for automatic cleanup +async with MemoryAPIClient(config) as client: results = await client.search_long_term_memory(text="query") + # Client 
automatically closes when exiting ``` ## Advanced Features @@ -650,6 +849,10 @@ async with httpx.AsyncClient( ### Custom Tool Workflows ```python +from agent_memory_client import MemoryAPIClient, MemoryClientConfig +from agent_memory_client.models import ClientMemoryRecord, MemoryTypeEnum +from agent_memory_client.filters import UserId + class CustomMemoryAgent: def __init__(self, memory_client: MemoryAPIClient): self.memory = memory_client @@ -658,7 +861,7 @@ class CustomMemoryAgent: # Multi-stage search with refinement initial_results = await self.memory.search_long_term_memory( text=query, - user_id=user_id, + user_id=UserId(eq=user_id), limit=20 ) @@ -669,10 +872,10 @@ class CustomMemoryAgent: limit=10 ) - # Filter by relevance threshold + # Filter by distance (lower is more relevant) relevant_memories = [ m for m in initial_results.memories - if m.relevance_score > 0.7 + if m.dist and m.dist < 0.3 # Close matches ] return relevant_memories[:5] @@ -685,44 +888,46 @@ class CustomMemoryAgent: # Search for similar existing memories similar = await self.memory.search_long_term_memory( text=text, - user_id=user_id, + user_id=UserId(eq=user_id), limit=3, - min_relevance_score=0.8 + distance_threshold=0.2 # Close matches only ) if similar.memories: # Update existing memory instead of creating duplicate - await self.memory.edit_memory( - memory_id=similar.memories[0].id, + existing = similar.memories[0] + await self.memory.edit_long_term_memory( + memory_id=existing.id, updates={ - "text": f"{similar.memories[0].text}. {text}", - "topics": list(set(similar.memories[0].topics + topics)), - "entities": list(set(similar.memories[0].entities + entities)) + "text": f"{existing.text}. {text}", + "topics": list(set((existing.topics or []) + topics)), + "entities": list(set((existing.entities or []) + entities)) } ) else: # Create new memory - await self.memory.create_long_term_memories([{ - "text": text, - "memory_type": "semantic", - "topics": topics, - "entities": entities, - "user_id": user_id - }]) + await self.memory.create_long_term_memory([ + ClientMemoryRecord( + text=text, + memory_type=MemoryTypeEnum.SEMANTIC, + topics=topics, + entities=entities, + user_id=user_id + ) + ]) ``` ### Performance Optimization ```python -from functools import lru_cache import asyncio +from agent_memory_client.filters import UserId class OptimizedMemoryClient: def __init__(self, client: MemoryAPIClient): self.client = client self._search_cache = {} - @lru_cache(maxsize=100) def _cache_key(self, text: str, user_id: str, limit: int) -> str: return f"{text}:{user_id}:{limit}" @@ -734,7 +939,7 @@ class OptimizedMemoryClient: results = await self.client.search_long_term_memory( text=text, - user_id=user_id, + user_id=UserId(eq=user_id), limit=limit ) @@ -754,50 +959,72 @@ class OptimizedMemoryClient: ### 1. 
Client Management ```python +import os +from agent_memory_client import MemoryAPIClient, MemoryClientConfig + # Use a single client instance per application class MemoryService: def __init__(self): - self.client = MemoryAPIClient( - base_url=os.getenv("MEMORY_SERVER_URL"), - api_key=os.getenv("MEMORY_API_KEY") + config = MemoryClientConfig( + base_url=os.getenv("MEMORY_SERVER_URL", "http://localhost:8000"), + default_namespace=os.getenv("DEFAULT_NAMESPACE", "production"), + timeout=float(os.getenv("MEMORY_TIMEOUT", "30")) ) + self.client = MemoryAPIClient(config) async def close(self): await self.client.close() -# Singleton pattern -memory_service = MemoryService() +# Usage with context manager +async def main(): + service = MemoryService() + try: + # Use service.client + pass + finally: + await service.close() ``` ### 2. Memory Organization ```python +from agent_memory_client.models import ClientMemoryRecord, MemoryTypeEnum + # Use consistent naming patterns async def create_user_memory(text: str, user_id: str, category: str): - return await client.create_long_term_memories([{ - "text": text, - "memory_type": "semantic", - "topics": [category, "user-preference"], - "user_id": user_id, - "namespace": f"user:{user_id}:preferences" - }]) + return await client.create_long_term_memory([ + ClientMemoryRecord( + text=text, + memory_type=MemoryTypeEnum.SEMANTIC, + topics=[category, "user-preference"], + user_id=user_id, + namespace=f"user:{user_id}:preferences" + ) + ]) ``` ### 3. Context Management ```python +from agent_memory_client.models import ClientMemoryRecord, MemoryTypeEnum + # Implement context-aware memory storage -async def store_conversation_memory(conversation: dict, session_id: str): - # Extract key information - important_facts = extract_facts(conversation) - - if important_facts: - await client.create_long_term_memories([{ - "text": fact, - "memory_type": "semantic", - "session_id": session_id, - "metadata": {"conversation_turn": i} - } for i, fact in enumerate(important_facts)]) +async def store_conversation_memory( + facts: list[str], + session_id: str, + user_id: str +): + if facts: + memories = [ + ClientMemoryRecord( + text=fact, + memory_type=MemoryTypeEnum.EPISODIC, + session_id=session_id, + user_id=user_id + ) + for fact in facts + ] + await client.create_long_term_memory(memories) ``` ## Configuration Reference @@ -807,31 +1034,30 @@ async def store_conversation_memory(conversation: dict, session_id: str): ```bash # Client configuration MEMORY_SERVER_URL=http://localhost:8000 -MEMORY_API_KEY=your-api-token # Connection settings MEMORY_TIMEOUT=30 -MEMORY_MAX_RETRIES=3 -# Default user settings -DEFAULT_USER_ID=default-user +# Default settings DEFAULT_NAMESPACE=production ``` -### Client Options +### MemoryClientConfig Options ```python -client = MemoryAPIClient( - base_url="http://localhost:8000", - api_key="optional-token", - bearer_token="optional-jwt", - timeout=30.0, - max_retries=3, - session_id="default-session", - user_id="default-user", - namespace="default", - http_client=custom_httpx_client +from agent_memory_client import MemoryAPIClient, MemoryClientConfig + +config = MemoryClientConfig( + base_url="http://localhost:8000", # Required: Server URL + timeout=30.0, # HTTP timeout (seconds) + default_namespace="production", # Default namespace for all ops + default_model_name="gpt-4o", # Model for token counting + default_context_window_max=128000, # Override context window ) + +async with MemoryAPIClient(config) as client: + # Use client... 

 The Python SDK makes it easy to add sophisticated memory capabilities to any AI application, with minimal setup and maximum flexibility. Use the tool integrations for LLM-driven memory, direct API calls for code-driven approaches, or combine both patterns for hybrid solutions.
diff --git a/docs/typescript-sdk.md b/docs/typescript-sdk.md
new file mode 100644
index 00000000..dc0fe629
--- /dev/null
+++ b/docs/typescript-sdk.md
@@ -0,0 +1,357 @@
+# TypeScript SDK
+
+The TypeScript SDK (`agent-memory-client`) provides a type-safe client for integrating memory capabilities into Node.js and browser applications.
+
+**Version**: 0.3.2+
+**Requirements**: Node.js 20.0.0 or higher
+
+## Installation
+
+```bash
+npm install agent-memory-client
+# or
+yarn add agent-memory-client
+# or
+pnpm add agent-memory-client
+```
+
+## Quick Start
+
+```typescript
+import { MemoryAPIClient, UserId, Topics } from "agent-memory-client";
+
+// Create client
+const client = new MemoryAPIClient({
+  baseUrl: "http://localhost:8000",
+  defaultNamespace: "my-app",
+});
+
+// Store a memory
+await client.createLongTermMemory([
+  {
+    text: "User prefers morning meetings",
+    memory_type: "semantic",
+    topics: ["scheduling", "preferences"],
+    user_id: "alice",
+  },
+]);
+
+// Search memories with filters
+const results = await client.searchLongTermMemory({
+  text: "when does user prefer meetings",
+  userId: new UserId({ eq: "alice" }),
+  topics: new Topics({ any: ["scheduling"] }),
+  limit: 5,
+});
+
+for (const memory of results.memories) {
+  console.log(`${memory.text} (distance: ${memory.dist})`);
+}
+
+// Clean up
+client.close();
+```
+
+## Client Configuration
+
+```typescript
+import { MemoryAPIClient, type MemoryClientConfig } from "agent-memory-client";
+
+const config: MemoryClientConfig = {
+  baseUrl: "http://localhost:8000", // Required
+  timeout: 30000, // Request timeout (ms)
+  defaultNamespace: "production", // Default namespace
+  defaultModelName: "gpt-4o", // For auto-summarization
+  defaultContextWindowMax: 128000, // Context window limit
+  apiKey: "your-api-key", // Optional API key auth
+  bearerToken: "your-jwt", // Optional JWT auth
+};
+
+const client = new MemoryAPIClient(config);
+```
+
+## Memory Operations
+
+### Creating Memories
+
+```typescript
+import type { MemoryRecord } from "agent-memory-client";
+
+const memories: MemoryRecord[] = [
+  {
+    text: "User works as a software engineer at TechCorp",
+    memory_type: "semantic",
+    topics: ["career", "work"],
+    entities: ["TechCorp"],
+    user_id: "alice",
+  },
+];
+
+await client.createLongTermMemory(memories);
+```
+
+### Searching with Filters
+
+The SDK provides type-safe filter classes:
+
+```typescript
+import {
+  SessionId,
+  Namespace,
+  UserId,
+  Topics,
+  Entities,
+  CreatedAt,
+  LastAccessed,
+  MemoryType,
+} from "agent-memory-client";
+
+// Basic search
+const results = await client.searchLongTermMemory({
+  text: "user preferences",
+  limit: 10,
+});
+
+// With filters
+const filtered = await client.searchLongTermMemory({
+  text: "programming languages",
+  userId: new UserId({ eq: "alice" }),
+  topics: new Topics({ any: ["programming", "languages"] }),
+  memoryType: new MemoryType({ eq: "semantic" }),
+  createdAt: new CreatedAt({ gte: new Date("2024-01-01") }),
+  distanceThreshold: 0.3,
+  limit: 5,
+});
+
+// Process results (dist of 0.0 is an exact match, so check for null explicitly)
+for (const memory of filtered.memories) {
+  const relevance = memory.dist != null ? 1 - memory.dist : null;
+  console.log(`[${relevance?.toFixed(2)}] ${memory.text}`);
+}
+```
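+
+The filter classes compose freely. Here is a short sketch that combines
+entity and recency filters (it assumes the `entities` and `lastAccessed`
+search options mirror their filter class names, as `userId` and `topics`
+do above):
+
+```typescript
+import { Entities, LastAccessed } from "agent-memory-client";
+
+// Memories that mention TechCorp and were accessed in the last 7 days
+const recent = await client.searchLongTermMemory({
+  text: "recent work at TechCorp",
+  entities: new Entities({ any: ["TechCorp"] }),
+  lastAccessed: new LastAccessed({
+    gte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000),
+  }),
+  limit: 10,
+});
+```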
+
+### Filter Reference
+
+| Filter | Options | Description |
+|--------|---------|-------------|
+| `SessionId` | `eq`, `in_`, `not_eq`, `not_in` | Filter by session ID |
+| `Namespace` | `eq`, `in_`, `not_eq`, `not_in` | Filter by namespace |
+| `UserId` | `eq`, `in_`, `not_eq`, `not_in` | Filter by user ID |
+| `Topics` | `any`, `all`, `none` | Filter by topics |
+| `Entities` | `any`, `all`, `none` | Filter by entities |
+| `CreatedAt` | `gte`, `lte`, `eq` | Filter by creation date |
+| `LastAccessed` | `gte`, `lte`, `eq` | Filter by last access |
+| `MemoryType` | `eq`, `in_`, `not_eq`, `not_in` | Filter by type |
+
+### Editing and Deleting
+
+```typescript
+// Edit a memory
+const updated = await client.editLongTermMemory("memory-id", {
+  text: "Updated text content",
+  topics: ["updated", "topics"],
+});
+
+// Get a specific memory
+const memory = await client.getLongTermMemory("memory-id");
+
+// Delete memories
+await client.deleteLongTermMemories(["memory-id-1", "memory-id-2"]);
+```
+
+## Working Memory
+
+```typescript
+import type { WorkingMemory } from "agent-memory-client";
+
+// Get or create working memory
+const response = await client.getOrCreateWorkingMemory("session-123", {
+  userId: "alice",
+  namespace: "my-app",
+});
+
+// Update working memory
+const workingMemory: Partial<WorkingMemory> = {
+  messages: [
+    { role: "user", content: "I'm planning a trip to Italy" },
+    { role: "assistant", content: "That sounds exciting!" },
+  ],
+  memories: [
+    {
+      text: "User is planning a trip to Italy",
+      memory_type: "semantic",
+      topics: ["travel"],
+    },
+  ],
+  data: { destination: "Italy" },
+};
+
+await client.putWorkingMemory("session-123", workingMemory);
+
+// Delete working memory
+await client.deleteWorkingMemory("session-123");
+```
+
+## Forgetting Memories
+
+```typescript
+import type { ForgetPolicy } from "agent-memory-client";
+
+const policy: ForgetPolicy = {
+  max_age_days: 90,
+  max_inactive_days: 30,
+  budget: 100,
+  memory_type_allowlist: ["episodic"],
+};
+
+// Preview what would be deleted
+const preview = await client.forgetLongTermMemories({
+  policy,
+  namespace: "my-app",
+  dryRun: true,
+});
+console.log(`Would delete ${preview.deleted} of ${preview.scanned} scanned memories`);
+
+// Execute forget
+const result = await client.forgetLongTermMemories({
+  policy,
+  namespace: "my-app",
+  pinnedIds: ["keep-this-memory"],
+});
+```
+
+## Summary Views
+
+```typescript
+import type { CreateSummaryViewRequest } from "agent-memory-client";
+
+// Create a summary view
+const request: CreateSummaryViewRequest = {
+  name: "User Topic Summaries",
+  source: "long_term",
+  group_by: ["user_id", "topics"],
+  time_window_days: 30,
+  continuous: true,
+};
+
+const view = await client.createSummaryView(request);
+
+// Run a partition
+const partition = await client.runSummaryViewPartition(view.id, {
+  user_id: "alice",
+  topics: "travel",
+});
+console.log(`Summary: ${partition.summary}`);
+
+// Run full view as background task
+const task = await client.runSummaryView(view.id, { force: true });
+
+// Poll for completion
+let taskStatus = await client.getTask(task.id);
+while (taskStatus && !["completed", "failed"].includes(taskStatus.status)) {
+  await new Promise((r) => setTimeout(r, 1000));
+  taskStatus = await client.getTask(task.id);
+}
+
+// List and delete views
+const views = await client.listSummaryViews();
+await client.deleteSummaryView(view.id);
+```
+
+## Bulk Operations
+
+```typescript
+// Bulk 
create with rate limiting +const batches = [memories1, memories2, memories3]; +const results = await client.bulkCreateLongTermMemories(batches, { + batchSize: 50, + delayBetweenBatches: 100, +}); + +// Auto-paginating search +for await (const memory of client.searchAllLongTermMemories({ + text: "user preferences", + userId: new UserId({ eq: "alice" }), + batchSize: 50, +})) { + console.log(memory.text); +} +``` + +## Error Handling + +```typescript +import { + MemoryClientError, + MemoryNotFoundError, + MemoryServerError, + MemoryValidationError, +} from "agent-memory-client"; + +try { + const memory = await client.getLongTermMemory("invalid-id"); + if (memory === null) { + console.log("Memory not found"); + } +} catch (error) { + if (error instanceof MemoryNotFoundError) { + console.log("Memory does not exist"); + } else if (error instanceof MemoryServerError) { + console.log(`Server error: ${error.message}`); + } else if (error instanceof MemoryValidationError) { + console.log(`Invalid input: ${error.message}`); + } else if (error instanceof MemoryClientError) { + console.log(`Client error: ${error.message}`); + } +} +``` + +## Memory Prompt + +```typescript +import type { MemoryPromptRequest } from "agent-memory-client"; + +const request: MemoryPromptRequest = { + query: "What are the user's preferences?", + session: { + session_id: "session-123", + user_id: "alice", + model_name: "gpt-4o", + }, + long_term_search: { + text: "user preferences", + limit: 5, + }, +}; + +const context = await client.memoryPrompt(request); +// Use context.messages with your LLM +``` + +## Type Exports + +The SDK exports all types for TypeScript usage: + +```typescript +import type { + // Client config + MemoryClientConfig, + SearchOptions, + // Models + WorkingMemory, + WorkingMemoryResponse, + MemoryMessage, + MemoryRecord, + MemoryRecordResults, + // Forget + ForgetPolicy, + ForgetResponse, + // Summary Views + SummaryView, + CreateSummaryViewRequest, + SummaryViewPartitionResult, + // Tasks + Task, + TaskStatus, +} from "agent-memory-client"; +``` diff --git a/mkdocs.yml b/mkdocs.yml index d5eab753..3020b131 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -85,18 +85,16 @@ nav: - Summary Views: summary-views.md - Memory Extraction Strategies: memory-extraction-strategies.md - Memory Lifecycle: memory-lifecycle.md + - LangChain Integration: langchain-integration.md + + - Operations Guide: + - operations-guide-index.md + - Configuration: configuration.md + - Authentication: authentication.md + - Security: security-custom-prompts.md - LLM Providers: llm-providers.md - Embedding Providers: embedding-providers.md - Vector Store Backends: vector-store-backends.md - - AWS Bedrock: aws-bedrock.md - - Authentication: authentication.md - - Security: security-custom-prompts.md - - - Python SDK: - - python-sdk-index.md - - SDK Documentation: python-sdk.md - - LangChain Integration: langchain-integration.md - - Configuration: configuration.md - Examples: - Agent Examples: agent-examples.md @@ -113,6 +111,10 @@ nav: - REST API: api.md - MCP Server: mcp.md - CLI Reference: cli.md + - Client SDKs: + - Python SDK: python-sdk.md + - TypeScript SDK: typescript-sdk.md + - Java SDK: java-sdk.md - Development: - Development Guide: development.md