diff --git a/README.md b/README.md index f4c529f..a6e2426 100644 --- a/README.md +++ b/README.md @@ -1,860 +1,44 @@ -# 🏰 vCon Server +# vCon Server (Conserver) -vCon Server is a powerful conversation processing and storage system that enables advanced analysis and management of conversation data. It provides a flexible pipeline for processing, storing, and analyzing conversations through various modules and integrations. The system includes secure API endpoints for both internal use and external partner integration, allowing third-party systems to securely submit conversation data with scoped access controls. +vCon Server is a pipeline-based conversation processing and storage system. It ingests [vCon](https://datatracker.ietf.org/doc/draft-ietf-vcon-vcon-container/) (Voice Conversation) records, routes them through configurable processing chains (transcription, AI analysis, tagging, webhooks) and writes results to one or more storage backends. -## Table of Contents -- [🏰 vCon Server](#-vcon-server) - - [Table of Contents](#table-of-contents) - - [Prerequisites](#prerequisites) - - [Quick Start](#quick-start) - - [Installation](#installation) - - [Manual Installation](#manual-installation) - - [Automated Installation](#automated-installation) - - [Configuration](#configuration) - - [Environment Variables](#environment-variables) - - [Configuration File](#configuration-file) - - [API Endpoints](#api-endpoints) - - [Authentication](#authentication) - - [Main API Endpoints](#main-api-endpoints) - - [External Ingress API](#external-ingress-api) - - [Deployment](#deployment) - - [Docker Deployment](#docker-deployment) - - [Scaling](#scaling) - - [Storage Modules](#storage-modules) - - [PostgreSQL Storage](#postgresql-storage) - - [S3 Storage](#s3-storage) - - [Elasticsearch Storage](#elasticsearch-storage) - - [Milvus Vector Database Storage](#milvus-vector-database-storage) - - [Tracer Modules](#tracer-modules) - - [JLINC Zero-Knowledge 
Auditing](#jlinc-zero-knowledge-auditing) - - [Monitoring and Logging](#monitoring-and-logging) - - [Troubleshooting](#troubleshooting) - - [License](#license) - - [Production Deployment Best Practices](#production-deployment-best-practices) - - [Example Directory Layout](#example-directory-layout) - - [Example Redis Volume in docker-compose.yml](#example-redis-volume-in-docker-composeyml) - - [User Creation and Permissions](#user-creation-and-permissions) - -## Prerequisites - -- Docker and Docker Compose -- Git -- Python 3.12 or higher (for local development) -- Poetry (for local development) +**Full documentation:** https://vcon-dev.github.io/vcon-server/ ## Quick Start -For a quick start using the automated installation script: - -```bash -# Download the installation script -curl -O https://raw.githubusercontent.com/vcon-dev/vcon-server/main/scripts/install_conserver.sh -chmod +x install_conserver.sh - -# Run the installation script -sudo ./install_conserver.sh --domain your-domain.com --email your-email@example.com -``` - -### Running tests - -Run tests inside the Docker environment (so all dependencies are available): - -```bash -docker compose run --rm api poetry run pytest server/links/analyze/tests/ server/storage/milvus/test_milvus.py -v -``` - -To avoid OpenTelemetry export errors when Datadog is not configured, unset the OTLP endpoint: - -```bash -docker compose run --rm -e OTEL_EXPORTER_OTLP_ENDPOINT= api poetry run pytest server/links/analyze/tests/ -v -``` - -## Installation - -### Manual Installation - -1. Clone the repository: ```bash git clone https://github.com/vcon-dev/vcon-server.git cd vcon-server -``` - -2. Set up a docker-compose.yml file: -```bash cp example_docker-compose.yml docker-compose.yml -# Edit to customize as needed - -3. Create and configure the environment file: -```bash -cp .env.example .env -# Edit .env with your configuration -``` - -4. 
Create the Docker network: -```bash +cp .env.example .env # edit CONSERVER_API_TOKEN at minimum docker network create conserver - -``` - -5. Build and start the services: -```bash -docker compose build -docker compose up -d +docker compose up -d --build +curl http://localhost:8000/api/health ``` -### Automated Installation +## Documentation by Audience -The repository includes an automated installation script that handles the complete setup process. The script: +| Audience | Start here | +|----------|-----------| +| **New users** | [Getting Started](https://vcon-dev.github.io/vcon-server/getting-started/) | +| **Operators / DevOps** | [Installation](https://vcon-dev.github.io/vcon-server/installation/) · [Configuration](https://vcon-dev.github.io/vcon-server/configuration/) · [Operations](https://vcon-dev.github.io/vcon-server/operations/) | +| **Developers** | [Contributing](https://vcon-dev.github.io/vcon-server/contributing/) · [Extending](https://vcon-dev.github.io/vcon-server/extending/) · [Reference](https://vcon-dev.github.io/vcon-server/reference/) | -- Installs required dependencies -- Sets up Docker and Docker Compose -- Configures the environment -- Deploys the services -- Sets up monitoring +## Key Features -To use the automated installation: +- **Chain-based processing**: compose reusable links into pipelines driven by Redis queues +- **20+ processing links**: transcription (Deepgram, Whisper), AI analysis (OpenAI, Groq), tagging, routing, webhooks, compliance (SCITT, DataTrails) +- **10+ storage backends**: PostgreSQL, MongoDB, S3, Elasticsearch, Milvus, Redis, SFTP, and more +- **Multi-worker scaling**: parallel workers with configurable process count and parallel storage writes +- **External ingress**: scoped API keys let third-party systems submit vCons to specific queues +- **OpenTelemetry**: built-in tracing and metrics export -```bash -./scripts/install_conserver.sh --domain your-domain.com --email your-email@example.com 
[--token YOUR_API_TOKEN] -``` - -Options: -- `--domain`: Your domain name (required) -- `--email`: Email for DNS registration (required) -- `--token`: API token (optional, generates random token if not provided) - -## Configuration - -### Environment Variables - -Create a `.env` file in the root directory with the following variables: +## Running Tests ```bash -REDIS_URL=redis://redis -CONSERVER_API_TOKEN=your_api_token -CONSERVER_CONFIG_FILE=./config.yml -GROQ_API_KEY=your_groq_api_key -DNS_HOST=your-domain.com -DNS_REGISTRATION_EMAIL=your-email@example.com - -# LLM credentials (LiteLLM proxy or OpenAI/Azure) are passed as options to the link's config -``` - -### Configuration File - -The `config.yml` file defines the processing pipeline, storage options, chain configurations, and external API access. Here's an example configuration: - -```yaml -# External API access configuration -# Configure API keys for external partners to submit vCons to specific ingress lists -ingress_auth: - # Single API key for an ingress list - customer_data: "customer-api-key-12345" - - # Multiple API keys for the same ingress list (different clients) - support_calls: - - "support-api-key-67890" - - "support-client-2-key" - - "support-vendor-key-xyz" - - # Multiple API keys for sales leads - sales_leads: - - "sales-api-key-abcdef" - - "sales-partner-key-123" - -links: - webhook_store_call_log: - module: links.webhook - options: - webhook-urls: - - https://example.com/conserver - deepgram_link: - module: links.deepgram_link - options: - DEEPGRAM_KEY: your_deepgram_key - minimum_duration: 30 - api: - model: "nova-2" - smart_format: true - detect_language: true - summarize: - module: links.analyze - options: - OPENAI_API_KEY: your_openai_key - prompt: "Summarize this transcript..." 
- analysis_type: summary - model: 'gpt-4' - -storages: - postgres: - module: storage.postgres - options: - user: postgres - password: your_password - host: your_host - port: "5432" - database: postgres - s3: - module: storage.s3 - options: - aws_access_key_id: your_key - aws_secret_access_key: your_secret - aws_bucket: your_bucket - -chains: - main_chain: - links: - - deepgram_link - - summarize - - webhook_store_call_log - storages: - - postgres - - s3 - ingress_lists: - - customer_data - enabled: 1 -``` - -## Dynamic Module Installation - -The vCon server supports dynamic installation of modules from PyPI or GitHub repositories. This applies to both link modules and general imports, allowing you to use external packages without pre-installing them, making deployment more flexible. - -### Dynamic Imports - -For general module imports that need to be available globally, use the `imports` section: - -```yaml -imports: - # PyPI package with different module name - custom_utility: - module: custom_utils - pip_name: custom-utils-package - - # GitHub repository - github_helper: - module: github_helper - pip_name: git+https://github.com/username/helper-repo.git - - # Module name matches pip package name - requests_import: - module: requests - # pip_name not needed since it matches module name - - # Legacy format (string value) - still supported - legacy_module: some.legacy.module -``` - -### Dynamic Link Modules - -### Basic Usage - -For modules where the pip package name matches the module name: - -```yaml -links: - requests_link: - module: requests - # Will automatically install "requests" from PyPI if not found - options: - timeout: 30 -``` - -### Custom Pip Package Name - -For modules where the pip package name differs from the module name: - -```yaml -links: - custom_link: - module: my_module - pip_name: custom-package-name - options: - api_key: secret -``` - -### GitHub Repositories - -Install directly from GitHub repositories: - -```yaml -links: - github_link: - 
module: github_module - pip_name: git+https://github.com/username/repo.git@main - options: - debug: true -``` - -For private repositories, use a personal access token: - -```yaml -links: - private_link: - module: private_module - pip_name: git+https://token:your_github_token@github.com/username/private-repo.git - options: - config_param: value -``` - -The system will automatically detect missing modules and install them during processing. Modules are cached after installation for performance. - -## Module Version Management - -The vCon server supports sophisticated version management for dynamically installed modules (both imports and links). This allows you to control exactly which versions of external packages are used and when they should be updated. - -### Version Specification Methods - -#### 1. Exact Version Pinning - -Install a specific version of a package: - -```yaml -# For imports -imports: - my_import: - module: my_module - pip_name: my-package==1.2.3 - -# For links -links: - my_link: - module: my_module - pip_name: my-package==1.2.3 - options: - config: value -``` - -#### 2. Version Ranges - -Use version constraints to allow compatible updates: - -```yaml -links: - flexible_link: - module: flexible_module - pip_name: flexible-package>=1.0.0,<2.0.0 - options: - setting: value -``` - -#### 3. Git Repository Versions - -Install from specific Git tags, branches, or commits: - -```yaml -links: - # Install from specific tag - git_tag_link: - module: git_module - pip_name: git+https://github.com/username/repo.git@v1.2.3 - - # Install from specific branch - git_branch_link: - module: git_module - pip_name: git+https://github.com/username/repo.git@develop - - # Install from specific commit - git_commit_link: - module: git_module - pip_name: git+https://github.com/username/repo.git@abc123def456 -``` - -#### 4. 
Pre-release Versions - -Include pre-release versions: - -```yaml -links: - prerelease_link: - module: beta_module - pip_name: beta-package --pre - options: - experimental: true -``` - -### Version Updates - -To install a new version of an already-installed link, rebuild the Docker container: - -```yaml -links: - upgraded_link: - module: my_module - pip_name: my-package==2.0.0 # Updated from 1.0.0 - options: - new_feature: enabled -``` - -**Recommended approach for version updates:** -- Update the version in your configuration file -- Rebuild the Docker container to ensure clean installation -- This approach ensures consistent, reproducible deployments - -### Version Update Strategies - -#### Container Rebuild (Recommended) - -For all deployments, the recommended approach is to rebuild containers: - -1. Update your configuration file with the new version: -```yaml -# For imports -imports: - my_import: - module: my_module - pip_name: my-package==2.0.0 # Updated from 1.0.0 - -# For links -links: - my_link: - module: my_module - pip_name: my-package==2.0.0 # Updated from 1.0.0 -``` - -2. Rebuild and deploy the container: -```bash -docker compose build -docker compose up -d -``` - -This ensures clean, reproducible deployments without version conflicts. - -### Best Practices - -#### Development Environment -```yaml -links: - dev_link: - module: dev_module - pip_name: git+https://github.com/username/repo.git@develop - # Rebuild container frequently to get latest changes -``` - -#### Staging Environment -```yaml -links: - staging_link: - module: staging_module - pip_name: staging-package>=1.0.0,<2.0.0 - # Use version ranges for compatibility testing -``` - -#### Production Environment -```yaml -links: - prod_link: - module: prod_module - pip_name: prod-package==1.2.3 - # Exact version pinning for stability -``` - -### Troubleshooting Version Issues - -#### Container Rebuild Issues -If you're experiencing import issues after a version update: - -1. 
Ensure you've rebuilt the container: `docker compose build` -2. Clear any cached images: `docker system prune` -3. Restart with fresh containers: `docker compose up -d` - -#### Check Installed Versions -```bash -pip list | grep package-name -pip show package-name -``` - -#### Dependency Conflicts -If you encounter dependency conflicts: - -1. Use virtual environments -2. Check compatibility with `pip check` -3. Consider using dependency resolution tools like `pip-tools` - -### Version Monitoring - -Monitor link versions in your logs: - -```python -# Links log their versions during import -logger.info("Imported %s version %s", module_name, module.__version__) -``` - -Consider implementing version reporting endpoints for operational visibility. - -## API Endpoints - -The vCon Server provides RESTful API endpoints for managing conversation data. All endpoints require authentication using API keys. - -### Authentication - -API authentication is handled through the `x-conserver-api-token` header: - -```bash -curl -H "x-conserver-api-token: YOUR_API_TOKEN" \ - -X POST \ - "https://your-domain.com/api/endpoint" -``` - -### Main API Endpoints - -#### Standard vCon Submission - -For internal use with full system access: - -```bash -POST /vcon?ingress_list=my_ingress -Content-Type: application/json -x-conserver-api-token: YOUR_MAIN_API_TOKEN - -{ - "uuid": "123e4567-e89b-12d3-a456-426614174000", - "vcon": "0.0.1", - "created_at": "2024-01-15T10:30:00Z", - "parties": [...] -} -``` - -#### External Partner Submission - -For external partners and 3rd party systems with limited access: - -```bash -POST /vcon/external-ingress?ingress_list=partner_data -Content-Type: application/json -x-conserver-api-token: PARTNER_SPECIFIC_API_TOKEN - -{ - "uuid": "123e4567-e89b-12d3-a456-426614174000", - "vcon": "0.0.1", - "created_at": "2024-01-15T10:30:00Z", - "parties": [...] 
-} -``` - -### External Ingress API - -The `/vcon/external-ingress` endpoint is specifically designed for external partners and 3rd party systems to securely submit vCons with limited API access. - -#### Security Model - -- **Scoped Access**: Each API key grants access only to predefined ingress list(s) -- **Isolation**: No access to other API endpoints or system resources -- **Multi-Key Support**: Multiple API keys can be configured for the same ingress list -- **Configuration-Based**: API keys are managed through the `ingress_auth` section in `config.yml` - -#### Configuration - -Configure external API access in your `config.yml`: - -```yaml -ingress_auth: - # Single API key for customer data ingress - customer_data: "customer-api-key-12345" - - # Multiple API keys for support calls (different clients) - support_calls: - - "support-api-key-67890" - - "support-client-2-key" - - "support-vendor-key-xyz" - - # Multiple partners for sales leads - sales_leads: - - "sales-api-key-abcdef" - - "sales-partner-key-123" -``` - -#### Usage Examples - -**Single Partner Access:** -```bash -curl -X POST "https://your-domain.com/vcon/external-ingress?ingress_list=customer_data" \ - -H "Content-Type: application/json" \ - -H "x-conserver-api-token: customer-api-key-12345" \ - -d '{ - "uuid": "123e4567-e89b-12d3-a456-426614174000", - "vcon": "0.0.1", - "created_at": "2024-01-15T10:30:00Z", - "parties": [] - }' -``` - -**Multiple Partner Access:** -```bash -# Partner 1 using their key -curl -X POST "https://your-domain.com/vcon/external-ingress?ingress_list=support_calls" \ - -H "x-conserver-api-token: support-api-key-67890" \ - -d @vcon_data.json - -# Partner 2 using their key -curl -X POST "https://your-domain.com/vcon/external-ingress?ingress_list=support_calls" \ - -H "x-conserver-api-token: support-client-2-key" \ - -d @vcon_data.json -``` - -#### Response Format - -**Success (HTTP 204 No Content):** -``` -HTTP/1.1 204 No Content -``` - -**Authentication Error (HTTP 403 
Forbidden):** -```json -{ - "detail": "Invalid API Key for ingress list 'customer_data'" -} -``` - -**Validation Error (HTTP 422 Unprocessable Entity):** -```json -{ - "detail": [ - { - "loc": ["body", "uuid"], - "msg": "field required", - "type": "value_error.missing" - } - ] -} -``` - -#### Best Practices - -1. **Generate Strong API Keys**: Use cryptographically secure random strings -2. **Rotate Keys Regularly**: Update API keys periodically for security -3. **Monitor Usage**: Track API usage per partner for billing and monitoring -4. **Rate Limiting**: Consider implementing rate limiting for external partners -5. **Logging**: Monitor external submissions for security and compliance - -#### Integration Examples - -**Python Integration:** -```python -import requests -import json - -def submit_vcon_to_partner_ingress(vcon_data, ingress_list, api_key, base_url): - """Submit vCon to external ingress endpoint.""" - url = f"{base_url}/vcon/external-ingress" - headers = { - "Content-Type": "application/json", - "x-conserver-api-token": api_key - } - params = {"ingress_list": ingress_list} - - response = requests.post(url, json=vcon_data, headers=headers, params=params) - - if response.status_code == 204: - return {"success": True} - else: - return {"success": False, "error": response.json()} - -# Usage -result = submit_vcon_to_partner_ingress( - vcon_data=my_vcon, - ingress_list="customer_data", - api_key="customer-api-key-12345", - base_url="https://your-domain.com" -) -``` - -**Node.js Integration:** -```javascript -async function submitVconToIngress(vconData, ingressList, apiKey, baseUrl) { - const response = await fetch(`${baseUrl}/vcon/external-ingress?ingress_list=${ingressList}`, { - method: 'POST', - headers: { - 'Content-Type': 'application/json', - 'x-conserver-api-token': apiKey - }, - body: JSON.stringify(vconData) - }); - - if (response.status === 204) { - return { success: true }; - } else { - const error = await response.json(); - return { success: 
false, error }; - } -} -``` - -## Deployment - -### Docker Deployment - -The system is containerized using Docker and can be deployed using Docker Compose: - -```bash -# Build the containers -docker compose build - -# Start the services -docker compose up -d - -# Scale the conserver service -docker compose up --scale conserver=4 -d -``` - -### Scaling - -The system is designed to scale horizontally. The conserver service can be scaled to handle increased load: - -```bash -docker compose up --scale conserver=4 -d -``` - -## Storage Modules - -### PostgreSQL Storage - -```yaml -storages: - postgres: - module: storage.postgres - options: - user: postgres - password: your_password - host: your_host - port: "5432" - database: postgres -``` - -### S3 Storage - -```yaml -storages: - s3: - module: storage.s3 - options: - aws_access_key_id: your_key - aws_secret_access_key: your_secret - aws_bucket: your_bucket -``` - -### Elasticsearch Storage - -```yaml -storages: - elasticsearch: - module: storage.elasticsearch - options: - cloud_id: "your_cloud_id" - api_key: "your_api_key" - index: vcon_index -``` - -### Milvus Vector Database Storage - -For semantic search capabilities: - -```yaml -storages: - milvus: - module: storage.milvus - options: - host: "localhost" - port: "19530" - collection_name: "vcons" - embedding_model: "text-embedding-3-small" - embedding_dim: 1536 - api_key: "your-openai-api-key" - organization: "your-org-id" - create_collection_if_missing: true -``` - -## Tracer Modules - -Tracer modules enable functions to run on data as it passes through each link in the Conserver chain. - -### JLINC Zero-Knowledge Auditing - -JLINC provides cryptographic signing for tamper-proof data and provenance, coupled with zero-knowledge audit records that can be stored with third-parties to allow for secure and private third-party auditing. 
- -```yaml -tracers: - jlinc: - module: tracers.jlinc - options: - data_store_api_url: http://jlinc-server:9090 - data_store_api_key: your_data_store_api_key - archive_api_url: http://jlinc-server:9090 - archive_api_key: your_archive_api_key - system_prefix: VCONTest - agreement_id: 00000000-0000-0000-0000-000000000000 - hash_event_data: True - dlq_vcon_on_error: True -``` - -## Monitoring and Logging - -vCon Server is instrumented with OpenTelemetry and can send traces and metrics to any OTLP-compatible backend. See [docs/operations/monitoring.md](docs/operations/monitoring.md) for full configuration details, including how to fan out to multiple backends simultaneously using the OTel Collector. - -Quick setup โ€” add to your `.env`: - -```bash -OTEL_EXPORTER_OTLP_ENDPOINT= -OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf # or grpc -OTEL_EXPORTER_OTLP_HEADERS= -``` - -View logs using: -```bash -docker compose logs -f -``` - -## Troubleshooting - -Common issues and solutions: - -1. Redis Connection Issues: - - Check if Redis container is running: `docker ps | grep redis` - - Verify Redis URL in .env file - - Check Redis logs: `docker compose logs redis` - -2. Service Scaling Issues: - - Ensure sufficient system resources - - Check network connectivity between containers - - Verify Redis connection for all instances - -3. Storage Module Issues: - - Verify credentials and connection strings - - Check storage service availability - - Review storage module logs - -For additional help, check the logs: -```bash -docker compose logs -f [service_name] +docker compose run --rm api poetry run pytest server/links/analyze/tests/ -v ``` ## License -This project is licensed under the terms specified in the LICENSE file. - -## Production Deployment Best Practices - -- **Install as a non-root user**: Create a dedicated user (e.g., `vcon`) for running the application and Docker containers. 
-- **Clone repositories to /opt**: Place `vcon-admin` and `vcon-server` in `/opt` for system-wide, non-root access. -- **Use persistent Docker volumes**: Map Redis and other stateful service data to `/opt/vcon-data` for durability. -- **Follow the updated install script**: Use `scripts/install_conserver.sh` which now implements these best practices. - -### Example Directory Layout - -``` -/opt/vcon-admin -/opt/vcon-server -/opt/vcon-data/redis -``` - -### Example Redis Volume in docker-compose.yml - -```yaml -volumes: - - /opt/vcon-data/redis:/data -``` - -### User Creation and Permissions - -The install script creates the `vcon` user and sets permissions for all necessary directories. - ---- +See [LICENSE](LICENSE). diff --git a/docs/contributing.md b/docs/contributing.md new file mode 100644 index 0000000..3a28371 --- /dev/null +++ b/docs/contributing.md @@ -0,0 +1,253 @@ +# Contributing to vCon Server + +This guide covers everything you need to set up a local development environment, run the test suite, understand the project layout, and submit changes. + +--- + +## Prerequisites + +| Tool | Minimum Version | Notes | +|---|---|---| +| Python | 3.12 | Required for local development outside Docker | +| Poetry | latest stable | Dependency and virtualenv management | +| Docker | latest stable | Required for running the full stack | +| Docker Compose | v2 (bundled with Docker Desktop) | Used for all service orchestration | + +Verify you have these installed: + +```bash +python --version # 3.12+ +poetry --version +docker --version +docker compose version +``` + +--- + +## Cloning and Initial Setup + +```bash +# 1. Clone the repository +git clone https://github.com/vcon-dev/vcon-server.git +cd vcon-server + +# 2. Copy and configure the environment file +cp .env.example .env +# Edit .env; at minimum set REDIS_URL and CONSERVER_API_TOKEN + +# 3. Copy and customise the Docker Compose file +cp example_docker-compose.yml docker-compose.yml + +# 4. 
Create the shared Docker network +docker network create conserver + +# 5. Build the containers +docker compose build +``` + +To start all services: + +```bash +docker compose up -d +``` + +--- + +## Running the Test Suite + +Tests are run inside the Docker environment so that all service dependencies (Redis, etc.) are available. + +```bash +# Run the full test suite +docker compose run --rm api poetry run pytest server/links/analyze/tests/ server/storage/milvus/test_milvus.py -v +``` + +To suppress OpenTelemetry export errors when Datadog is not configured, unset the OTLP endpoint: + +```bash +docker compose run --rm -e OTEL_EXPORTER_OTLP_ENDPOINT= api poetry run pytest server/links/analyze/tests/ -v +``` + +Run a specific test file: + +```bash +docker compose run --rm api poetry run pytest server/links/my_link/tests/test_my_link.py -v +``` + +--- + +## Project Structure + +``` +vcon-server/ +├── server/ # All application source code +│ ├── main.py # vCon processing pipeline - chain management, +│ │ # worker processes, Redis queue consumption +│ ├── api.py # FastAPI REST API - CRUD for vCons, chain +│ │ # ingress/egress, config and DLQ endpoints +│ ├── vcon.py # vCon data model - Vcon class with all +│ │ # builder methods and property accessors +│ ├── config.py # Config loading - reads CONSERVER_CONFIG_FILE +│ │ # (YAML) and exposes worker/storage settings +│ ├── settings.py # Environment variables - all os.getenv() calls +│ │ # with defaults live here +│ ├── redis_mgr.py # Redis connection management - single shared +│ │ # connection pool used by API and pipeline +│ ├── follower.py # Ingress follower - BLPOP loop that drives +│ │ # vCons from queue into a chain +│ ├── hook.py # Pre/post-link hook system +│ ├── dlq_utils.py # Dead-letter queue helpers +│ ├── version.py # Version string utilities +│ │ +│ ├── links/ # Processing 
link modules (one subdirectory each) +│ │ ├── analyze/ # LLM-based analysis (summary, labels, etc.) +│ │ ├── deepgram_link/ # Deepgram speech-to-text transcription +│ │ ├── tag/ # Applies configurable tags to vCons +│ │ ├── webhook/ # HTTP webhook delivery +│ │ └── ... # Additional built-in links +│ │ +│ ├── storage/ # Storage adapter modules (one subdirectory each) +│ │ ├── base.py # Abstract Storage base class +│ │ ├── file/ # Local filesystem storage +│ │ ├── mongo/ # MongoDB storage +│ │ ├── postgres/ # PostgreSQL storage +│ │ ├── s3/ # AWS S3 (and S3-compatible) storage +│ │ ├── milvus/ # Milvus vector database storage +│ │ └── ... # Additional storage adapters +│ │ +│ ├── tracers/ # Audit tracer modules +│ │ └── jlinc/ # JLINC zero-knowledge auditing tracer +│ │ +│ └── lib/ # Shared utilities +│ ├── logging_utils.py # init_logger() - standard logger factory +│ ├── vcon_redis.py # VconRedis - high-level Redis get/store for vCons +│ ├── context_utils.py # OpenTelemetry trace context propagation +│ ├── metrics.py # Counter and histogram helpers +│ ├── error_tracking.py # Error tracker initialisation +│ └── ... +│ +├── docs/ # MkDocs documentation source +├── example_config.yml # Annotated reference configuration +├── example_docker-compose.yml # Starting point for docker-compose.yml +├── pyproject.toml # Poetry project and dependency manifest +└── .env.example # Template for the required environment file +``` + +--- + +## Environment Variables + +All environment variables are declared in `server/settings.py`. 
Key variables: + +| Variable | Default | Description | +|---|---|---| +| `REDIS_URL` | `redis://localhost` | Redis connection string | +| `CONSERVER_API_TOKEN` | _(none)_ | Bearer token for the REST API | +| `CONSERVER_CONFIG_FILE` | `./example_config.yml` | Path to the YAML config file | +| `CONSERVER_WORKERS` | `1` | Number of parallel worker processes | +| `CONSERVER_PARALLEL_STORAGE` | `true` | Write to storage backends concurrently | +| `CONSERVER_START_METHOD` | _(platform default)_ | Multiprocessing start method: `fork`, `spawn`, or `forkserver` | +| `LOG_LEVEL` | `DEBUG` | Python logging level | +| `ENV` | `dev` | Runtime environment label | +| `VCON_REDIS_EXPIRY` | `3600` | TTL (seconds) for vCons cached back in Redis | +| `VCON_INDEX_EXPIRY` | `86400` | TTL (seconds) for the vCon sorted-set index | +| `VCON_DLQ_EXPIRY` | `604800` | TTL (seconds) for dead-letter queue entries | +| `UUID8_DOMAIN_NAME` | `strolid.com` | DNS domain used when generating UUID v8 identifiers | +| `OPENAI_API_KEY` | _(none)_ | OpenAI key (used by analysis links) | +| `DEEPGRAM_KEY` | _(none)_ | Deepgram key (used by transcription links) | +| `OTEL_EXPORTER_OTLP_ENDPOINT` | _(none)_ | OTLP endpoint for OpenTelemetry traces/metrics | + +--- + +## Coding Conventions + +### Logger initialisation + +Every module that needs logging calls `init_logger` from `lib/logging_utils.py` rather than using `logging.getLogger` directly: + +```python +from lib.logging_utils import init_logger +logger = init_logger(__name__) +``` + +### The VconRedis pattern + +Links and other components that need to read or write vCons use `VconRedis` from `lib/vcon_redis.py`. This provides a consistent interface and ensures the correct Redis key format (`vcon:`): + +```python +from lib.vcon_redis import VconRedis + +vcon_redis = VconRedis() +vcon = vcon_redis.get_vcon(vcon_uuid) # returns a Vcon instance or None +# ... modify vcon ... 
+vcon_redis.store_vcon(vcon) # serialises and saves back to Redis +``` + +### default_options dict + +Every link module exposes a module-level `default_options` dictionary that declares all supported configuration keys and their defaults. The `run` function merges caller-supplied options on top of the defaults: + +```python +default_options = { + "threshold": 0.5, + "model": "gpt-4", +} + +def run(vcon_uuid, link_name, opts=default_options): + options = {**default_options, **opts} + ... +``` + +This convention makes options self-documenting and ensures backward compatibility when new options are added. + +### Error handling + +- Raise an exception (letting the pipeline move the vCon to the DLQ) for permanent or unknown failures. +- Return `None` to silently filter a vCon out of the chain without moving it to the DLQ. +- Return `vcon_uuid` to continue processing normally. + +--- + +## Creating a New Processing Link + +See [docs/extending/creating-links.md](extending/creating-links.md) for the complete guide, including the required `run()` signature, directory layout, configuration registration, testing patterns, and best practices. + +Quick reference โ€” minimum viable link: + +``` +server/links/my_link/ + __init__.py # must contain run() and default_options + tests/ + __init__.py + test_my_link.py +``` + +--- + +## Creating a New Storage Adapter + +See [docs/extending/creating-storage-adapters.md](extending/creating-storage-adapters.md) for the full guide, including how to subclass `storage.base.Storage`, the required method signatures, and how to register the adapter in `config.yml`. + +--- + +## Branch and PR Workflow + +1. **Branch from `main`**: + + ```bash + git checkout main + git pull origin main + git checkout -b your-feature-branch + ``` + +2. **Make your changes**, following the coding conventions above. Add or update tests. + +3. 
**Run the tests** inside Docker to confirm nothing is broken: + + ```bash + docker compose run --rm api poetry run pytest server/links/analyze/tests/ -v + ``` + +4. **Open a Pull Request against `main`** on GitHub at [https://github.com/vcon-dev/vcon-server](https://github.com/vcon-dev/vcon-server). Provide a clear description of what the PR does and why, and link any related issues. + +5. Maintainers will review the PR. Address any requested changes by pushing additional commits to the same branch โ€” do not force-push. diff --git a/docs/reference/links/analyze-and-label.md b/docs/reference/links/analyze-and-label.md new file mode 100644 index 0000000..1e3e1b1 --- /dev/null +++ b/docs/reference/links/analyze-and-label.md @@ -0,0 +1,79 @@ +# analyze_and_label + +Analyzes vCon dialog content with an OpenAI model and applies the returned labels as tags on the vCon. The model is prompted to return a JSON object with a `labels` array; each label is then added as both a structured analysis entry and a vCon tag. + +## Prerequisites + +- `OPENAI_API_KEY` environment variable (or provided in options) + +## Configuration + +```yaml +links: + analyze_and_label: + module: links.analyze_and_label + options: + model: gpt-4-turbo + prompt: "Analyze this transcript and provide a list of relevant labels..." +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `prompt` | string | `"Analyze this transcript and provide a list of relevant labels for categorization. 
Return your response as a JSON object with a single key 'labels' containing an array of strings."` | Prompt sent to the model; must instruct the model to return `{"labels": [...]}` | +| `analysis_type` | string | `labeled_analysis` | Type name stored on the resulting analysis entry | +| `model` | string | `gpt-4-turbo` | OpenAI model to use | +| `sampling_rate` | float | `1` | Fraction of vCons to process (1 = 100 %, 0.5 = 50 %) | +| `temperature` | float | `0.2` | Model temperature (0โ€“1) | +| `source.analysis_type` | string | `transcript` | Type of analysis to read as input text | +| `source.text_location` | string | `body.paragraphs.transcript` | Dot-separated path to the text field within the source analysis | +| `response_format` | dict | `{"type": "json_object"}` | OpenAI response format parameter | +| `OPENAI_API_KEY` | string | โ€” | OpenAI API key (overrides environment variable) | + +## Example + +```yaml +chains: + label_calls: + links: + - analyze_and_label: + model: gpt-4-turbo + prompt: | + Identify key topics, sentiments, and issues in this conversation. + Return your response as a JSON object with a single key 'labels' + containing an array of strings. + sampling_rate: 1 + temperature: 0.2 + storages: + - postgres + ingress_lists: + - transcribed + enabled: 1 +``` + +## Output + +Adds a `labeled_analysis` entry to the vCon and applies each returned label as a tag: + +```json +{ + "analysis": [ + { + "type": "labeled_analysis", + "vendor": "openai", + "dialog": 0, + "body": "{\"labels\": [\"billing\", \"refund\", \"escalation\"]}" + } + ] +} +``` + +The vCon will also have tags `billing`, `refund`, and `escalation` applied. 
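The tagging step above reduces to parsing the model's JSON reply and turning each entry in `labels` into a tag. A minimal sketch, not the shipped implementation; the helper name is illustrative and assumes the model honoured the prompt's `{"labels": [...]}` contract:

```python
import json

def labels_from_response(model_response: str) -> list[str]:
    """Parse the model's JSON reply into a clean list of label strings.

    Any reply without a 'labels' array raises, which would route the
    vCon to the dead-letter queue.
    """
    labels = json.loads(model_response).get("labels")
    if not isinstance(labels, list):
        raise ValueError("model reply is missing a 'labels' array")
    # Normalise: stringify, trim whitespace, drop empties and duplicates
    seen: set[str] = set()
    cleaned: list[str] = []
    for label in labels:
        text = str(label).strip()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned

print(labels_from_response('{"labels": ["billing", "refund", "escalation"]}'))
# ['billing', 'refund', 'escalation']
```

Each returned string would then be applied as a vCon tag, as shown in the output example above.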
+ +## Behavior + +- Skips dialogs that have no source analysis of the configured type +- Skips dialogs that already have a `labeled_analysis` (or the configured `analysis_type`) +- Respects `sampling_rate` to process only a fraction of vCons +- Retries API calls with exponential backoff (up to 6 attempts) diff --git a/docs/reference/links/analyze-vcon.md b/docs/reference/links/analyze-vcon.md new file mode 100644 index 0000000..304f393 --- /dev/null +++ b/docs/reference/links/analyze-vcon.md @@ -0,0 +1,80 @@ +# analyze_vcon + +Performs AI-powered analysis on the entire vCon object (rather than a single dialog) using an OpenAI GPT model. The whole vCon is serialized to JSON and sent to the model, which must return a structured JSON response. The result is stored as a single analysis entry on the vCon. + +## Prerequisites + +- `OPENAI_API_KEY` environment variable (or provided in options) + +## Configuration + +```yaml +links: + analyze_vcon: + module: links.analyze_vcon + options: + model: gpt-3.5-turbo-16k + prompt: "Analyze this vCon and return a JSON object with your analysis." 
+``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `prompt` | string | `"Analyze this vCon and return a JSON object with your analysis."` | User-level instruction given to the model | +| `analysis_type` | string | `json_analysis` | Type name stored on the resulting analysis entry | +| `model` | string | `gpt-3.5-turbo-16k` | OpenAI model to use | +| `sampling_rate` | float | `1` | Fraction of vCons to process (1 = 100 %, 0.5 = 50 %) | +| `temperature` | int | `0` | Model temperature (0โ€“1); 0 gives the most deterministic output | +| `system_prompt` | string | `"You are a helpful assistant that analyzes conversation data and returns structured JSON output."` | System-level context prompt | +| `remove_body_properties` | bool | `true` | Strip `body` fields from dialogs before sending to save tokens | +| `OPENAI_API_KEY` | string | โ€” | OpenAI API key (overrides environment variable) | + +## Example + +```yaml +chains: + full_vcon_analysis: + links: + - analyze_vcon: + model: gpt-3.5-turbo-16k + prompt: | + Analyze this vCon and return a JSON object containing: + - summary: a two-sentence summary + - sentiment: overall sentiment (positive/neutral/negative) + - topics: list of key topics + remove_body_properties: true + storages: + - postgres + ingress_lists: + - transcribed + enabled: 1 +``` + +## Output + +Adds a `json_analysis` entry to the vCon: + +```json +{ + "analysis": [ + { + "type": "json_analysis", + "vendor": "openai", + "dialog": 0, + "body": { + "summary": "Customer called about a billing dispute...", + "sentiment": "negative", + "topics": ["billing", "refund"] + } + } + ] +} +``` + +## Behavior + +- Skips the vCon if a `json_analysis` entry (or the configured `analysis_type`) already exists +- Respects `sampling_rate` to process only a fraction of vCons +- Validates that the model response is valid JSON before storing +- Retries API calls with exponential backoff (up to 6 attempts) diff --git 
a/docs/reference/links/check-and-tag.md b/docs/reference/links/check-and-tag.md new file mode 100644 index 0000000..ae43623 --- /dev/null +++ b/docs/reference/links/check-and-tag.md @@ -0,0 +1,86 @@ +# check_and_tag + +Evaluates dialog content against a yes/no question using an OpenAI model and conditionally applies a tag if the answer is positive. Useful for quality-assurance checks such as greeting compliance, issue resolution, or call-handling standards. + +## Prerequisites + +- `OPENAI_API_KEY` environment variable (or provided in options) + +## Configuration + +```yaml +links: + check_and_tag: + module: links.check_and_tag + options: + tag_name: portal:eval_proper_greeting + tag_value: "true" + evaluation_question: "Did the specialist identify United Way 211 with a warm tone and thank the caller?" +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `tag_name` | string | **Required** | Name of the tag to apply when the evaluation is positive | +| `tag_value` | string | **Required** | Value of the tag to apply when the evaluation is positive | +| `evaluation_question` | string | **Required** | Question the model evaluates against the dialog text (should yield a yes/no answer) | +| `analysis_type` | string | `tag_evaluation` | Type name stored on the resulting analysis entry | +| `model` | string | `gpt-5` | OpenAI model to use | +| `sampling_rate` | float | `1` | Fraction of vCons to process (1 = 100 %, 0.5 = 50 %) | +| `source.analysis_type` | string | `transcript` | Type of analysis to read as input text | +| `source.text_location` | string | `body` | Dot-separated path to the text field within the source analysis | +| `response_format` | dict | `{"type": "json_object"}` | OpenAI response format parameter | +| `verbosity` | string | `low` | Response verbosity hint passed to the model (`low`, `medium`, `high`) | +| `minimal_reasoning` | bool | `true` | Hint for faster model responses | +| 
`OPENAI_API_KEY` | string | โ€” | OpenAI API key (overrides environment variable) | + +## Example + +```yaml +chains: + qa_greeting: + links: + - check_and_tag: + tag_name: qa:proper_greeting + tag_value: pass + evaluation_question: "Did the agent introduce themselves by name and greet the caller warmly?" + model: gpt-5 + source: + analysis_type: transcript + text_location: body + storages: + - postgres + ingress_lists: + - transcribed + enabled: 1 +``` + +## Output + +Adds a `tag_evaluation` analysis entry to the vCon. If the evaluation is positive, the specified tag is also applied: + +```json +{ + "analysis": [ + { + "type": "tag_evaluation", + "vendor": "openai", + "dialog": 0, + "body": { + "link_name": "check_and_tag", + "tag": "qa:proper_greeting:pass", + "applies": true + } + } + ] +} +``` + +## Behavior + +- Raises a `ValueError` if `tag_name`, `tag_value`, or `evaluation_question` are missing +- Skips dialogs that have no source analysis of the configured type +- Skips dialogs that already have a `tag_evaluation` entry (or the configured `analysis_type`) +- Respects `sampling_rate` to process only a fraction of vCons +- Retries API calls with exponential backoff (up to 6 attempts) diff --git a/docs/reference/links/detect-engagement.md b/docs/reference/links/detect-engagement.md new file mode 100644 index 0000000..048d929 --- /dev/null +++ b/docs/reference/links/detect-engagement.md @@ -0,0 +1,76 @@ +# detect_engagement + +Determines whether both the customer and the agent were actively engaged in a conversation, using an OpenAI model to evaluate each dialog transcript. The result (`true` or `false`) is stored as an analysis entry and also applied as an `engagement` tag on the vCon. 
+ +## Prerequisites + +- `OPENAI_API_KEY` environment variable (or provided in options) + +## Configuration + +```yaml +links: + detect_engagement: + module: links.detect_engagement + options: + model: gpt-4.1 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `prompt` | string | `"Did both the customer and the agent speak? Respond with 'true' if yes, 'false' if not. Respond with only 'true' or 'false'."` | Prompt used to evaluate engagement | +| `analysis_type` | string | `engagement_analysis` | Type name stored on the resulting analysis entry | +| `model` | string | `gpt-4.1` | OpenAI model to use | +| `sampling_rate` | float | `1` | Fraction of vCons to process (1 = 100 %, 0.5 = 50 %) | +| `temperature` | float | `0.2` | Model temperature (0โ€“1) | +| `source.analysis_type` | string | `transcript` | Type of analysis to read as input text | +| `source.text_location` | string | `body.paragraphs.transcript` | Dot-separated path to the text field within the source analysis | +| `OPENAI_API_KEY` | string | โ€” | OpenAI API key (overrides environment variable) | + +## Example + +```yaml +chains: + engagement_check: + links: + - detect_engagement: + model: gpt-4.1 + temperature: 0.2 + source: + analysis_type: transcript + text_location: body.paragraphs.transcript + storages: + - postgres + ingress_lists: + - transcribed + enabled: 1 +``` + +## Output + +Adds an `engagement_analysis` entry and an `engagement` tag to the vCon: + +```json +{ + "analysis": [ + { + "type": "engagement_analysis", + "vendor": "openai", + "dialog": 0, + "body": "true" + } + ] +} +``` + +The vCon will also have a tag `engagement: true` (or `engagement: false`). 
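Because the prompt asks the model to answer with only `true` or `false`, converting the reply into the tag value is mostly defensive normalisation. A hedged sketch (the helper name is illustrative; the shipped link may handle malformed replies differently):

```python
def engagement_value(model_reply: str) -> str:
    """Normalise the model's reply to the literal 'true' or 'false'.

    Strips whitespace, trailing periods, and casing before validating;
    anything else raises rather than guessing.
    """
    reply = model_reply.strip().rstrip(".").lower()
    if reply not in ("true", "false"):
        raise ValueError(f"unexpected engagement reply: {model_reply!r}")
    return reply

print(engagement_value("True."))  # true
```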
+ +## Behavior + +- Skips dialogs that have no source transcript analysis +- Skips dialogs that already have an `engagement_analysis` entry +- Respects `sampling_rate` to process only a fraction of vCons +- Gracefully skips the vCon if no API credentials are configured +- Retries API calls with exponential backoff (up to 6 attempts) diff --git a/docs/reference/links/diet.md b/docs/reference/links/diet.md new file mode 100644 index 0000000..303a59e --- /dev/null +++ b/docs/reference/links/diet.md @@ -0,0 +1,87 @@ +# diet + +Reduces the size of vCon objects by selectively removing or offloading content. Useful for data minimization, privacy compliance, and keeping Redis memory usage low after downstream processing has already consumed the full data. + +## Configuration + +```yaml +links: + diet: + module: links.diet + options: + remove_dialog_body: true + remove_analysis: false + remove_system_prompts: false +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `remove_dialog_body` | bool | `false` | Remove the `body` field from every dialog entry. If an external storage target is also configured, the body is offloaded there first. | +| `post_media_to_url` | string | `""` | HTTP endpoint that receives the dialog body via `POST`. On success the body is replaced with the returned URL. Ignored when `s3_bucket` is set. | +| `remove_analysis` | bool | `false` | Delete the entire `analysis` array from the vCon. | +| `remove_attachment_types` | list | `[]` | List of MIME types (e.g. `["image/jpeg", "audio/mp3"]`) whose attachments should be deleted. | +| `remove_system_prompts` | bool | `false` | Recursively remove all `system_prompt` keys from the vCon to prevent LLM prompt-injection attacks. | +| `s3_bucket` | string | `""` | S3 bucket name for offloading dialog bodies. Takes precedence over `post_media_to_url`. | +| `s3_path` | string | `""` | Optional key prefix within the bucket (e.g. `"dialogs/archived"`). 
| +| `aws_access_key_id` | string | `""` | AWS access key ID. | +| `aws_secret_access_key` | string | `""` | AWS secret access key. | +| `aws_region` | string | `us-east-1` | AWS region for the S3 bucket. | +| `presigned_url_expiration` | int \| null | `null` | Presigned URL lifetime in seconds. `null` defaults to 3600 (1 hour). | + +## Example + +### Remove dialog bodies and offload to S3 + +```yaml +chains: + archive: + links: + - diet: + remove_dialog_body: true + s3_bucket: my-vcon-archive + s3_path: dialogs/processed + aws_access_key_id: "${AWS_ACCESS_KEY_ID}" + aws_secret_access_key: "${AWS_SECRET_ACCESS_KEY}" + aws_region: us-west-2 + presigned_url_expiration: 86400 + storages: + - postgres + ingress_lists: + - analyzed + enabled: 1 +``` + +### Strip analysis and system prompts for long-term storage + +```yaml +chains: + slim_storage: + links: + - diet: + remove_analysis: true + remove_system_prompts: true + storages: + - postgres + ingress_lists: + - archived + enabled: 1 +``` + +## Behavior + +1. Loads the vCon directly from Redis using `JSON.GET`. +2. For each dialog, if `remove_dialog_body` is enabled: + - If `s3_bucket` is set: uploads the body to S3 and replaces it with a presigned URL. + - Else if `post_media_to_url` is set: POSTs the body to the URL and replaces it with the returned URL. + - Otherwise: sets the body to an empty string. +3. Removes the `analysis` array if `remove_analysis` is `true`. +4. Removes attachments whose `mime_type` is listed in `remove_attachment_types`. +5. Recursively removes all `system_prompt` keys if `remove_system_prompts` is `true`. +6. Writes the modified vCon back to Redis using `JSON.SET`. + +## Prerequisites + +- AWS credentials with `s3:PutObject` and `s3:GetObject` permissions are required when using S3 offload. +- The `boto3` Python package must be installed when using S3 offload. 
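Step 5 of the Behavior section above (recursive `system_prompt` removal) fits in a few lines of Python. This is an illustrative sketch, not the shipped implementation:

```python
def strip_system_prompts(node):
    """Recursively remove every 'system_prompt' key from nested dicts/lists.

    Returns a new structure and leaves the input untouched.
    """
    if isinstance(node, dict):
        return {k: strip_system_prompts(v) for k, v in node.items()
                if k != "system_prompt"}
    if isinstance(node, list):
        return [strip_system_prompts(item) for item in node]
    return node

vcon = {"analysis": [{"body": {"system_prompt": "secret", "text": "hi"}}]}
print(strip_system_prompts(vcon))
# {'analysis': [{'body': {'text': 'hi'}}]}
```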
diff --git a/docs/reference/links/expire-vcon.md b/docs/reference/links/expire-vcon.md
new file mode 100644
index 0000000..8141b19
--- /dev/null
+++ b/docs/reference/links/expire-vcon.md
@@ -0,0 +1,59 @@
+# expire_vcon
+
+Sets a Redis TTL on a vCon key so that it is automatically deleted after a configured duration. Useful for enforcing data-retention policies and keeping Redis memory usage bounded.
+
+## Configuration
+
+```yaml
+links:
+  expire_vcon:
+    module: links.expire_vcon
+    options:
+      seconds: 86400
+```
+
+## Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `seconds` | int | `86400` | Number of seconds after which the vCon key expires and is removed from Redis. Default is 24 hours (60 × 60 × 24). |
+
+## Example
+
+```yaml
+chains:
+  retain_7_days:
+    links:
+      - expire_vcon:
+          seconds: 604800
+    storages:
+      - postgres
+    ingress_lists:
+      - default
+    enabled: 1
+```
+
+### Combine with other processing
+
+```yaml
+chains:
+  process_and_expire:
+    links:
+      - deepgram_link
+      - analyze
+      - expire_vcon:
+          seconds: 172800
+    storages:
+      - postgres
+    ingress_lists:
+      - audio_input
+    enabled: 1
+```
+
+## Behavior
+
+1. Calls Redis `EXPIRE vcon: ` to set the TTL on the vCon key.
+2. Logs the expiration at INFO level.
+3. Returns the vCon UUID so processing continues down the chain.
+
+The link does **not** retrieve or modify the vCon object itself — it only sets the expiry on the existing Redis key.
diff --git a/docs/reference/links/groq-whisper.md b/docs/reference/links/groq-whisper.md
new file mode 100644
index 0000000..599611a
--- /dev/null
+++ b/docs/reference/links/groq-whisper.md
@@ -0,0 +1,68 @@
+# groq_whisper
+
+Transcribes audio recordings in vCon dialogs using the Groq Whisper ASR API.
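The skip rules listed under "Behavior" further down this page reduce to a small predicate. A hedged sketch (field names follow the vCon draft; the shipped link's exact checks may differ):

```python
def should_transcribe(dialog: dict, dialog_index: int,
                      analyses: list[dict], minimum_duration: int = 30) -> bool:
    """Decide whether a dialog qualifies for transcription.

    Mirrors the documented skip rules: recording type, minimum duration,
    and no existing transcript analysis for this dialog index.
    """
    if dialog.get("type") != "recording":
        return False
    if dialog.get("duration", 0) < minimum_duration:
        return False
    has_transcript = any(
        a.get("type") == "transcript" and a.get("dialog") == dialog_index
        for a in analyses
    )
    return not has_transcript

print(should_transcribe({"type": "recording", "duration": 45}, 0, []))  # True
```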
+ +## Prerequisites + +- `GROQ_API_KEY` environment variable + +## Configuration + +```yaml +links: + groq_whisper: + module: links.groq_whisper + options: + minimum_duration: 30 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `API_KEY` | string | `$GROQ_API_KEY` | Groq API key for authentication | +| `minimum_duration` | int | `30` | Minimum recording duration in seconds to transcribe | +| `Content-Type` | string | `audio/flac` | MIME type of audio content sent to the API | + +## Example + +```yaml +chains: + transcription: + links: + - groq_whisper: + minimum_duration: 60 + storages: + - postgres + ingress_lists: + - audio_input + enabled: 1 +``` + +## Output + +Adds a transcript analysis entry to the vCon for each qualifying dialog: + +```json +{ + "analysis": [ + { + "type": "transcript", + "vendor": "groq_whisper", + "dialog": 0, + "body": { + "text": "Hello, how can I help you today?", + "language": "en" + } + } + ] +} +``` + +## Behavior + +- Skips dialogs that are not of type `recording` +- Skips dialogs shorter than `minimum_duration` +- Skips dialogs that already have a transcript analysis +- Supports both inline base64-encoded audio and external URL references +- Retries failed API calls with exponential backoff (up to 6 attempts) diff --git a/docs/reference/links/huggingface-whisper.md b/docs/reference/links/huggingface-whisper.md new file mode 100644 index 0000000..ea737be --- /dev/null +++ b/docs/reference/links/huggingface-whisper.md @@ -0,0 +1,75 @@ +# hugging_face_whisper + +Transcribes audio recordings in vCon dialogs using a Hugging Face hosted Whisper ASR endpoint. 
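The SHA-512 integrity check mentioned under "Behavior" below can be done with the standard library. A minimal sketch; note that real vCon dialog signatures may be base64url-encoded rather than hex, so the comparison shown here is an assumption:

```python
import hashlib

def audio_matches_signature(audio_bytes: bytes, expected_sha512_hex: str) -> bool:
    """Compare downloaded audio against the hex SHA-512 digest published
    alongside its external URL."""
    return hashlib.sha512(audio_bytes).hexdigest() == expected_sha512_hex.lower()

digest = hashlib.sha512(b"example audio").hexdigest()
print(audio_matches_signature(b"example audio", digest))  # True
```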
+ +## Prerequisites + +- A Hugging Face API key with access to a deployed Whisper inference endpoint +- The endpoint URL for your Hugging Face Whisper deployment + +## Configuration + +```yaml +links: + hugging_face_whisper: + module: links.hugging_face_whisper + options: + API_URL: https://your-endpoint.us-east-1.aws.endpoints.huggingface.cloud + API_KEY: hf_your_key_here + minimum_duration: 30 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `API_URL` | string | `https://xxxxxx.us-east-1.aws.endpoints.huggingface.cloud` | Hugging Face inference endpoint URL | +| `API_KEY` | string | `Bearer hf_XXXXX` | Hugging Face API key (include "Bearer " prefix) | +| `minimum_duration` | int | `30` | Minimum recording duration in seconds to transcribe | +| `Content-Type` | string | `audio/flac` | MIME type of the audio content sent to the endpoint | + +## Example + +```yaml +chains: + transcription: + links: + - hugging_face_whisper: + API_URL: https://abc123.us-east-1.aws.endpoints.huggingface.cloud + API_KEY: Bearer hf_xxxxxxxxxxxxxxxxxxxx + minimum_duration: 30 + Content-Type: audio/flac + storages: + - postgres + ingress_lists: + - audio_input + enabled: 1 +``` + +## Output + +Adds a transcript analysis entry to the vCon for each qualifying dialog: + +```json +{ + "analysis": [ + { + "type": "transcript", + "vendor": "hugging_face_whisper", + "dialog": 0, + "body": { + "text": "Hello, how can I help you today?" 
+ } + } + ] +} +``` + +## Behavior + +- Skips dialogs that are not of type `recording` +- Skips dialogs shorter than `minimum_duration` +- Skips dialogs that already have a transcript analysis +- Supports both inline base64-encoded audio and external URL references +- Verifies file integrity via SHA-512 signature when provided on external URLs +- Retries failed API calls with exponential backoff (up to 6 attempts) diff --git a/docs/reference/links/jq.md b/docs/reference/links/jq.md new file mode 100644 index 0000000..35b08ab --- /dev/null +++ b/docs/reference/links/jq.md @@ -0,0 +1,98 @@ +# jq_link + +Filters vCons using a [jq](https://jqlang.github.io/jq/) expression. The link evaluates the expression against the vCon and either forwards or drops the vCon based on whether the result is truthy and the `forward_matches` setting. + +No vCon content is modified โ€” the link only decides whether to continue chain processing. + +## Prerequisites + +- The `jq` Python package must be installed (`pip install jq`). + +## Configuration + +```yaml +links: + jq_link: + module: links.jq_link + options: + filter: ".dialog | length > 0" + forward_matches: true +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `filter` | string | `.` | A jq expression evaluated against the vCon dictionary. The first output value is cast to a boolean to determine a match. | +| `forward_matches` | bool | `true` | When `true`, the vCon is forwarded if the filter result is truthy. When `false`, the vCon is forwarded if the filter result is falsy (i.e. acts as a "drop if matches" gate). 
|
+
+## Example
+
+### Forward only vCons that have a transcript
+
+```yaml
+chains:
+  requires_transcript:
+    links:
+      - jq_link:
+          filter: '.analysis | map(select(.type == "transcript")) | length > 0'
+          forward_matches: true
+      - analyze
+    storages:
+      - postgres
+    ingress_lists:
+      - default
+    enabled: 1
+```
+
+### Drop vCons with no dialogs
+
+```yaml
+chains:
+  non_empty_only:
+    links:
+      - jq_link:
+          filter: ".dialog | length == 0"
+          forward_matches: false
+      - deepgram_link
+    storages:
+      - postgres
+    ingress_lists:
+      - default
+    enabled: 1
+```
+
+### Filter by a metadata attribute
+
+```yaml
+chains:
+  cats_only:
+    links:
+      - jq_link:
+          filter: '.meta.arc_display_type == "Cat"'
+          forward_matches: true
+    storages:
+      - postgres
+    ingress_lists:
+      - default
+    enabled: 1
+```
+
+## Common filter patterns
+
+| Goal | Filter expression |
+|------|-------------------|
+| vCon has at least one dialog | `.dialog \| length > 0` |
+| vCon has a transcript analysis | `any(.analysis[]; .type == "transcript")` |
+| Dialog duration over 60 s | `any(.dialog[]; .duration > 60)` |
+| Specific party role present | `any(.parties[]; .role == "agent")` |
+| At least two parties | `.parties \| length >= 2` |
+| Analysis list is empty | `.analysis \| length == 0` |
+
+## Behavior
+
+1. Retrieves the vCon from Redis and converts it to a dictionary.
+2. Compiles and runs the jq `filter` expression.
+3. Treats the first result as a boolean (`matches`).
+4. Returns the vCon UUID when `matches == forward_matches`, otherwise returns `None` to halt the chain.
+5. Returns `None` (drops the vCon) if the vCon cannot be found or the filter expression raises an error.
diff --git a/docs/reference/links/openai-transcribe.md b/docs/reference/links/openai-transcribe.md
new file mode 100644
index 0000000..73446b3
--- /dev/null
+++ b/docs/reference/links/openai-transcribe.md
@@ -0,0 +1,75 @@
+# openai_transcribe
+
+!!!
note "Stub" + No source directory for this link (`server/links/openai_transcribe/`) exists in the current codebase. This page is a placeholder based on the expected behaviour of a link that transcribes audio using the [OpenAI Whisper API](https://platform.openai.com/docs/guides/speech-to-text). + +Transcribes audio recordings in vCon dialogs using the OpenAI Whisper speech-to-text API. The transcription result is stored as a `transcript` analysis entry on the vCon. + +## Prerequisites + +- `OPENAI_API_KEY` environment variable must be set with a valid OpenAI API key. + +## Configuration + +```yaml +links: + openai_transcribe: + module: links.openai_transcribe + options: + model: whisper-1 + language: en + minimum_duration: 0 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `model` | string | `whisper-1` | OpenAI Whisper model to use for transcription. | +| `language` | string | โ€” | Optional BCP-47 language code (e.g. `en`, `es`). Omit to let the model auto-detect. | +| `minimum_duration` | int | `0` | Minimum recording duration in seconds. Dialogs shorter than this value are skipped. | +| `API_KEY` | string | `$OPENAI_API_KEY` | OpenAI API key. Reads from the `OPENAI_API_KEY` environment variable by default. | + +## Example + +```yaml +chains: + transcription: + links: + - openai_transcribe: + model: whisper-1 + language: en + minimum_duration: 5 + storages: + - postgres + ingress_lists: + - audio_input + enabled: 1 +``` + +## Output + +Adds a `transcript` analysis entry to the vCon for each processed dialog: + +```json +{ + "analysis": [ + { + "type": "transcript", + "vendor": "openai", + "dialog": 0, + "body": { + "text": "Hello, how can I help you today?" + } + } + ] +} +``` + +## Behavior + +- Skips dialogs that are not of type `recording`. +- Skips dialogs shorter than `minimum_duration` seconds. +- Skips dialogs that already have a `transcript` analysis entry. 
+- Supports both inline base64-encoded audio (`body`) and external URL references (`url`). +- Saves the updated vCon back to Redis after processing all dialogs. diff --git a/docs/reference/links/sampler.md b/docs/reference/links/sampler.md new file mode 100644 index 0000000..7a6b0f6 --- /dev/null +++ b/docs/reference/links/sampler.md @@ -0,0 +1,73 @@ +# sampler + +Selectively passes vCons through a processing chain based on a configurable sampling strategy. Returns the vCon UUID when the vCon is selected, or `None` to drop it from the chain. + +## Configuration + +```yaml +links: + sampler: + module: links.sampler + options: + method: percentage + value: 50 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `method` | string | `percentage` | Sampling algorithm to apply. One of `percentage`, `rate`, `modulo`, or `time_based`. | +| `value` | number | `50` | Parameter for the chosen method (see table below). | +| `seed` | int \| null | `null` | Optional random seed for reproducible sampling. If `null`, sampling is non-deterministic. | + +### Method reference + +| Method | `value` meaning | Notes | +|--------|-----------------|-------| +| `percentage` | Percentage of vCons to keep (0โ€“100) | Uses `random.uniform`; `value: 100` passes all vCons, `value: 0` drops all. | +| `rate` | Average seconds between kept samples | Uses an exponential distribution; lower values keep more vCons. | +| `modulo` | Keep every *n*th vCon | Uses a SHA-256 hash of the UUID modulo `value`; deterministic per UUID. | +| `time_based` | Interval in seconds | Keeps a vCon only when `current_unix_time % value == 0`. 
| + +## Example + +```yaml +chains: + sample_for_review: + links: + - sampler: + method: percentage + value: 10 + - analyze + storages: + - postgres + ingress_lists: + - transcribed + enabled: 1 +``` + +### Modulo example โ€” keep every 5th call + +```yaml +chains: + every_fifth: + links: + - sampler: + method: modulo + value: 5 + - deepgram_link + storages: + - postgres + ingress_lists: + - default + enabled: 1 +``` + +## Behavior + +1. Merges provided options with defaults. +2. Seeds the random number generator if `seed` is set. +3. Applies the configured sampling function to the vCon UUID. +4. Returns the UUID if the vCon passes, or `None` to drop it (halting further processing in the chain). +5. Raises `ValueError` if an unknown `method` is specified. diff --git a/docs/reference/links/slack.md b/docs/reference/links/slack.md new file mode 100644 index 0000000..38348b1 --- /dev/null +++ b/docs/reference/links/slack.md @@ -0,0 +1,80 @@ +# post_analysis_to_slack + +Posts vCon analysis results to Slack channels. Sends a formatted message with an inline summary and a details button when a matching analysis entry is found on the vCon. Supports routing to team-specific channels as well as a default fallback channel. + +## Prerequisites + +- A [Slack Bot Token](https://api.slack.com/authentication/token-types#bot) with `chat:write` scope. +- The bot must be invited to the target Slack channel(s). + +## Configuration + +```yaml +links: + post_analysis_to_slack: + module: links.post_analysis_to_slack + options: + token: "${SLACK_BOT_TOKEN}" + channel_name: vcon-alerts + default_channel_name: vcon-errors + url: https://app.example.com/conversations + analysis_to_post: summary + only_if: + analysis_type: customer_frustration + includes: "NEEDS REVIEW" +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `token` | string | `null` | Slack Bot OAuth token. 
| +| `channel_name` | string | `null` | Default Slack channel to post notifications to. | +| `default_channel_name` | string | โ€” | Fallback channel used when the team-specific channel does not exist. | +| `url` | string | `"Url to hex sheet"` | Base URL used to build the "Details" button link. The vCon UUID is appended as a query parameter. | +| `analysis_to_post` | string | `summary` | Type of analysis whose `body` is used as the Slack message text. | +| `only_if.analysis_type` | string | `customer_frustration` | Only post when an analysis of this type is found on the vCon. | +| `only_if.includes` | string | `NEEDS REVIEW` | Only post when the matching analysis body contains this substring. | + +## Example + +```yaml +chains: + slack_alerts: + links: + - analyze + - post_analysis_to_slack: + token: "${SLACK_BOT_TOKEN}" + channel_name: call-center-alerts + default_channel_name: call-center-errors + url: https://dashboard.example.com/calls + analysis_to_post: summary + only_if: + analysis_type: customer_frustration + includes: "NEEDS REVIEW" + storages: + - postgres + ingress_lists: + - analyzed + enabled: 1 +``` + +## Slack Message Format + +Each notification includes three Slack blocks: + +1. A header section with a neutral-face emoji. +2. The summary text from the `analysis_to_post` analysis body. +3. A "Details" button linking to `url?_vcon_id=""`. + +If the vCon has a `strolid_dealer` attachment with a `team` field, the message is also posted to a channel named `team--alerts` with the dealer name appended to the summary text. + +## Behavior + +1. Retrieves the vCon from Redis. +2. Iterates over analysis entries looking for ones matching `only_if.analysis_type` and containing `only_if.includes` in the body. +3. Skips entries already marked `was_posted_to_slack: true`. +4. Finds the corresponding `summary` analysis entry for the same dialog. +5. Posts the formatted message to the team-specific channel (if applicable) and to `channel_name`. +6. 
Marks the analysis entry `was_posted_to_slack: true` and saves the vCon. +7. If posting to a team channel fails (channel does not exist), an error is sent to `default_channel_name`. diff --git a/docs/reference/links/tag-router.md b/docs/reference/links/tag-router.md new file mode 100644 index 0000000..c50aab1 --- /dev/null +++ b/docs/reference/links/tag-router.md @@ -0,0 +1,67 @@ +# tag_router + +Routes vCon objects to one or more Redis lists based on their tags. Useful for fanning out processed vCons into category-specific queues without duplicating chain configuration. + +## Configuration + +```yaml +links: + tag_router: + module: links.tag_router + options: + tag_routes: + urgent: urgent_vcons + billing: billing_queue + forward_original: true +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `tag_routes` | dict | `{}` | Mapping of tag name to target Redis list name. When a vCon carries a matching tag, its UUID is pushed to that list. Multiple tags can match and route simultaneously. | +| `forward_original` | bool | `true` | If `true`, the link returns the vCon UUID so processing continues down the chain. If `false`, returns `None` to stop further processing after routing. | + +## Example + +```yaml +chains: + route_by_tag: + links: + - analyze_and_label + - tag_router: + tag_routes: + escalation: escalation_queue + billing: billing_queue + praise: praise_queue + forward_original: true + storages: + - postgres + ingress_lists: + - transcribed + enabled: 1 +``` + +## Tag Format + +The link reads tags from vCon attachments of type `tags`. Two body formats are supported: + +- **List format** โ€” each item is a `name:value` string; the part before the first colon is used as the tag name: + + ```json + ["billing:true", "escalation:true"] + ``` + +- **Dictionary format** โ€” keys are used as tag names: + + ```json + {"billing": "true", "escalation": "true"} + ``` + +## Behavior + +1. 
Retrieves the vCon from Redis. +2. Extracts all tag names from `tags`-typed attachments. +3. For each tag that matches a key in `tag_routes`, pushes the vCon UUID to the corresponding Redis list via `RPUSH`. +4. Returns the vCon UUID if `forward_original` is `true`, otherwise returns `None`. +5. If no `tag_routes` are configured, logs a warning and passes the vCon through unchanged. diff --git a/docs/reference/storage-adapters/chatgpt-files.md b/docs/reference/storage-adapters/chatgpt-files.md new file mode 100644 index 0000000..fa47b8c --- /dev/null +++ b/docs/reference/storage-adapters/chatgpt-files.md @@ -0,0 +1,77 @@ +# chatgpt-files + +Uploads vCons to the OpenAI Files API and optionally adds them to an OpenAI vector store, making them available for use with Assistants and ChatGPT Retrieval. + +## Prerequisites + +- An OpenAI account with API access +- A vector store created in the OpenAI platform (if using vector store integration) +- The `openai` Python package + +``` +pip install openai +``` + +## Configuration + +```yaml +storages: + chatgpt_files: + module: storage.chatgpt_files + options: + organization_key: ${OPENAI_ORG_ID} + project_key: ${OPENAI_PROJECT_ID} + api_key: ${OPENAI_API_KEY} + vector_store_id: ${OPENAI_VECTOR_STORE_ID} + purpose: assistants +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `organization_key` | string | Required | OpenAI organization ID (e.g. `org-xxxxx`) | +| `project_key` | string | Required | OpenAI project ID (e.g. `proj_xxxxxxx`) | +| `api_key` | string | Required | OpenAI API key (e.g. `sk-proj-xxxxxx`) | +| `vector_store_id` | string | Required | ID of the OpenAI vector store to attach uploaded files to | +| `purpose` | string | `assistants` | Purpose for the uploaded file. Must be a value accepted by the OpenAI Files API (e.g. 
`assistants`) | + +## Example + +```yaml +storages: + chatgpt_files: + module: storage.chatgpt_files + options: + organization_key: ${OPENAI_ORG_ID} + project_key: ${OPENAI_PROJECT_ID} + api_key: ${OPENAI_API_KEY} + vector_store_id: ${OPENAI_VECTOR_STORE_ID} + purpose: assistants + +chains: + main: + links: + - transcribe + - summarize + storages: + - chatgpt_files + ingress_lists: + - default + enabled: 1 +``` + +## File Naming + +Each vCon is written to a temporary local file named `{uuid}.vcon.json`, uploaded to the OpenAI Files API, and then the local file is deleted. The uploaded file is attached to the configured vector store. + +``` +{vcon_uuid}.vcon.json โ†’ uploaded to OpenAI Files API โ†’ added to vector store +``` + +## Notes + +- A temporary local file is created during the upload and removed immediately after a successful transfer. +- After uploading, the file is registered with the vector store identified by `vector_store_id`, enabling semantic search via OpenAI Assistants. +- File size limits and rate limits are governed by your OpenAI account tier. +- Store the `api_key` and other credentials in environment variables โ€” never commit them to source control. diff --git a/docs/reference/storage-adapters/dataverse.md b/docs/reference/storage-adapters/dataverse.md new file mode 100644 index 0000000..1c3af3b --- /dev/null +++ b/docs/reference/storage-adapters/dataverse.md @@ -0,0 +1,97 @@ +# dataverse + +Stores vCons in Microsoft Dataverse as custom entity records. Authentication uses Azure Active Directory via the MSAL library, making this adapter suitable for organizations already using the Microsoft Power Platform. + +## Prerequisites + +1. A Microsoft Dataverse environment (Dynamics 365 or Power Platform) +2. An Azure AD app registration with the `Dynamics CRM user_impersonation` API permission (or equivalent) +3. A custom entity in Dataverse with the fields described below +4. 
The `msal` and `requests` Python packages + +``` +pip install msal requests +``` + +### Dataverse Entity Setup + +Create a custom entity (table) in your Dataverse environment containing these columns: + +| Column | Type | Description | +|--------|------|-------------| +| `vcon_uuid` | Single Line of Text | vCon UUID (used as the lookup key) | +| `vcon_data` | Multiple Lines of Text | Full vCon serialized as JSON | +| `vcon_subject` | Single Line of Text | vCon subject | +| `vcon_created_at` | Date and Time | vCon creation timestamp | + +The column logical names must match the `uuid_field`, `data_field`, `subject_field`, and `created_at_field` options. + +## Configuration + +```yaml +storages: + dataverse: + module: storage.dataverse + options: + url: https://your-org.crm.dynamics.com + tenant_id: ${AZURE_TENANT_ID} + client_id: ${AZURE_CLIENT_ID} + client_secret: ${AZURE_CLIENT_SECRET} + entity_name: vcon_storage + uuid_field: vcon_uuid + data_field: vcon_data + subject_field: vcon_subject + created_at_field: vcon_created_at +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `name` | string | `dataverse` | Adapter name | +| `url` | string | `https://org.crm.dynamics.com` | Dataverse environment URL | +| `api_version` | string | `9.2` | Dataverse Web API version | +| `tenant_id` | string | `""` | Azure AD tenant (directory) ID | +| `client_id` | string | `""` | Azure AD application (client) ID | +| `client_secret` | string | `""` | Azure AD client secret | +| `entity_name` | string | `vcon_storage` | Logical name of the custom entity (table) in Dataverse | +| `uuid_field` | string | `vcon_uuid` | Column logical name that stores the vCon UUID | +| `data_field` | string | `vcon_data` | Column logical name that stores the full vCon JSON | +| `subject_field` | string | `vcon_subject` | Column logical name that stores the vCon subject | +| `created_at_field` | string | `vcon_created_at` | Column logical name that 
stores the creation timestamp | + +## Example + +```yaml +storages: + dataverse: + module: storage.dataverse + options: + url: https://contoso.crm.dynamics.com + api_version: "9.2" + tenant_id: ${AZURE_TENANT_ID} + client_id: ${AZURE_CLIENT_ID} + client_secret: ${AZURE_CLIENT_SECRET} + entity_name: cr123_vcon_storage + uuid_field: cr123_vcon_uuid + data_field: cr123_vcon_data + subject_field: cr123_vcon_subject + created_at_field: cr123_vcon_created_at + +chains: + main: + links: + - transcribe + storages: + - dataverse + ingress_lists: + - default + enabled: 1 +``` + +## Notes + +- An access token is acquired via MSAL client-credentials flow on every `save` or `get` call. Tokens are not cached between calls. +- On `save`, the adapter checks whether an entity with the same UUID already exists and performs a PATCH (update) if found, or a POST (create) otherwise. +- Store the `client_secret` in an environment variable or secrets manager โ€” never commit it to source control. +- Custom entity logical names in Dataverse typically include a publisher prefix (e.g. `cr123_vcon_storage`). Ensure the `entity_name` and field names match the logical names shown in the Power Apps maker portal. diff --git a/docs/reference/storage-adapters/elasticsearch.md b/docs/reference/storage-adapters/elasticsearch.md new file mode 100644 index 0000000..84c5896 --- /dev/null +++ b/docs/reference/storage-adapters/elasticsearch.md @@ -0,0 +1,124 @@ +# elasticsearch + +Indexes vCon components into Elasticsearch for full-text search and analytics. Each vCon is decomposed into separate documents stored across multiple indices: parties, attachments, analysis, and dialog. 
+ +## Prerequisites + +- Elasticsearch 7.x or 8.x (self-hosted or Elastic Cloud) +- The `elasticsearch` Python package + +``` +pip install elasticsearch +``` + +## Configuration + +### Self-Hosted + +```yaml +storages: + elasticsearch: + module: storage.elasticsearch + options: + url: http://localhost:9200 + username: elastic + password: changeme + index_prefix: "" +``` + +### Elastic Cloud + +```yaml +storages: + elasticsearch: + module: storage.elasticsearch + options: + cloud_id: my-deployment:dXMtZWFzdC0x... + api_key: VnVhQ2ZHY0JDZGJrUW... + index_prefix: prod_ +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `name` | string | `elasticsearch` | Adapter name | +| `cloud_id` | string | `""` | Elastic Cloud deployment ID. When set, `cloud_id` and `api_key` are used for authentication | +| `api_key` | string | `""` | Elastic Cloud API key | +| `url` | string | `None` | Self-hosted Elasticsearch URL (e.g. `http://localhost:9200`) | +| `username` | string | `None` | Basic auth username for self-hosted deployments | +| `password` | string | `None` | Basic auth password for self-hosted deployments | +| `ca_certs` | string | `None` | Path to CA certificate file for TLS verification | +| `index_prefix` | string | `""` | Prefix added to all index names (e.g. 
`prod_` yields `prod_vcon_dialog`) | + +## Index Structure + +vCon components are stored in separate indices based on type and role: + +| Index Pattern | Description | Example | +|---------------|-------------|---------| +| `{prefix}vcon_parties_{role}` | Party records grouped by role | `vcon_parties_agent` | +| `{prefix}vcon_attachments_{type}` | Attachments grouped by type | `vcon_attachments_transcript` | +| `{prefix}vcon_analysis_{type}` | Analysis records grouped by type | `vcon_analysis_summary` | +| `{prefix}vcon_dialog` | All dialog entries | `vcon_dialog` | + +Each document includes `vcon_id` and `started_at` fields (derived from the first dialog entry) for cross-index correlation. If a tenant attachment is present, `tenant_id` is also indexed. + +## Example + +```yaml +storages: + elasticsearch: + module: storage.elasticsearch + options: + url: https://elasticsearch:9200 + username: elastic + password: ${ELASTICSEARCH_PASSWORD} + ca_certs: /certs/ca.crt + index_prefix: vcon_ + +chains: + main: + links: + - transcribe + - summarize + storages: + - elasticsearch + ingress_lists: + - default + enabled: 1 +``` + +## Queries + +Search across Elasticsearch indices directly: + +```bash +# Find all vCon components for a specific vCon UUID +GET vcon_*/_search +{ + "query": { + "term": { "vcon_id": "abc-123-def" } + } +} + +# Full-text search across analysis summaries +GET vcon_analysis_summary/_search +{ + "query": { + "match": { "body": "billing issue" } + } +} + +# Find all agent parties +GET vcon_parties_agent/_search +{ + "query": { "match_all": {} } +} +``` + +## Notes + +- vCons with no dialog entries are silently skipped โ€” at least one dialog entry is required for indexing. +- JSON-encoded string bodies in attachments and analysis are automatically parsed before indexing. +- The `delete` function removes all documents for a vCon across every `{prefix}vcon*` index using `delete_by_query`. 
diff --git a/docs/reference/storage-adapters/file.md b/docs/reference/storage-adapters/file.md new file mode 100644 index 0000000..2c4c7c0 --- /dev/null +++ b/docs/reference/storage-adapters/file.md @@ -0,0 +1,106 @@ +# file + +Stores vCons as JSON files on the local filesystem. Supports optional gzip compression, date-based directory organization, and configurable Unix permissions. + +## Prerequisites + +- A writable directory on the host filesystem +- No additional Python packages required (uses the standard library) + +When running in Docker, mount a persistent volume: + +```yaml +services: + conserver: + volumes: + - vcon_files:/data/vcons + +volumes: + vcon_files: {} +``` + +## Configuration + +```yaml +storages: + file: + module: storage.file + options: + path: /data/vcons + organize_by_date: true + compression: false + max_file_size: 10485760 + file_permissions: 0o644 + dir_permissions: 0o755 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `path` | string | `/data/vcons` | Base directory where vCon files are written | +| `organize_by_date` | boolean | `true` | When enabled, files are placed under `YYYY/MM/DD/` subdirectories based on the vCon creation date | +| `compression` | boolean | `false` | Compress files with gzip (files are saved as `{uuid}.json.gz`) | +| `max_file_size` | integer | `10485760` | Maximum allowed file size in bytes (default: 10 MB). 
Raises an error if exceeded | +| `file_permissions` | integer | `0o644` | Unix permissions applied to each written file | +| `dir_permissions` | integer | `0o755` | Unix permissions applied to created directories | + +## File Layout + +### Date-based organization (default) + +``` +/data/vcons/ +โ””โ”€โ”€ 2024/ + โ””โ”€โ”€ 03/ + โ””โ”€โ”€ 14/ + โ”œโ”€โ”€ abc-123.json + โ””โ”€โ”€ def-456.json +``` + +### Flat layout (`organize_by_date: false`) + +``` +/data/vcons/ +โ”œโ”€โ”€ abc-123.json +โ””โ”€โ”€ def-456.json +``` + +### With compression enabled + +``` +/data/vcons/ +โ””โ”€โ”€ 2024/ + โ””โ”€โ”€ 03/ + โ””โ”€โ”€ 14/ + โ””โ”€โ”€ abc-123.json.gz +``` + +## Example + +```yaml +storages: + file: + module: storage.file + options: + path: /data/vcons + organize_by_date: true + compression: true + max_file_size: 52428800 + +chains: + archive: + links: + - transcribe + storages: + - file + ingress_lists: + - default + enabled: 1 +``` + +## Notes + +- On `delete`, empty parent directories created by date-based organization are automatically removed. +- The adapter searches both flat and date-organized layouts when locating a vCon by UUID, so the directory structure can be changed without losing access to previously stored files. +- For production deployments with large volumes of vCons, consider using the [S3](s3.md) adapter instead. diff --git a/docs/reference/storage-adapters/milvus.md b/docs/reference/storage-adapters/milvus.md new file mode 100644 index 0000000..ec77689 --- /dev/null +++ b/docs/reference/storage-adapters/milvus.md @@ -0,0 +1,125 @@ +# milvus + +Stores vCons as vector embeddings in Milvus for semantic similarity search. Text is extracted from the vCon (transcripts, summaries, party info, dialog), converted to an embedding vector via the OpenAI Embeddings API, and inserted into a Milvus collection. 
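A rough sketch of the text-extraction step described above (hypothetical helper; the real logic lives in `storage.milvus` and may gather additional fields):

```python
def extract_text(vcon: dict) -> str:
    """Gather embeddable text from a vCon: party names, inline text
    dialogs, and transcript/summary analysis bodies."""
    parts = [p.get("name", "") for p in vcon.get("parties", [])]
    parts += [
        d.get("body", "")
        for d in vcon.get("dialog", [])
        if d.get("type") == "text"
    ]
    for a in vcon.get("analysis", []):
        if a.get("type") in ("transcript", "summary") and isinstance(a.get("body"), str):
            parts.append(a["body"])
    return " ".join(p for p in parts if p)
```

The resulting string is what gets sent to the OpenAI Embeddings API before insertion into the collection.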
+ +## Prerequisites + +- A running Milvus instance (v2.x) +- An OpenAI API key (or a LiteLLM proxy) for generating embeddings +- The `pymilvus` and `openai` Python packages + +``` +pip install pymilvus openai +``` + +## Configuration + +```yaml +storages: + milvus: + module: storage.milvus + options: + host: localhost + port: "19530" + collection_name: vcons + api_key: ${OPENAI_API_KEY} + embedding_model: text-embedding-3-small + embedding_dim: 1536 + create_collection_if_missing: false +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `host` | string | `localhost` | Milvus server hostname | +| `port` | string | `19530` | Milvus server port | +| `collection_name` | string | `vcons` | Milvus collection to store vCons in | +| `embedding_model` | string | `text-embedding-3-small` | OpenAI embedding model used to generate vectors | +| `embedding_dim` | integer | `1536` | Vector dimension โ€” must match the chosen model output | +| `api_key` | string | `None` | OpenAI API key | +| `organization` | string | `None` | OpenAI organization ID | +| `create_collection_if_missing` | boolean | `false` | Create the collection automatically if it does not exist | +| `skip_if_exists` | boolean | `true` | Skip storing if the vCon UUID already exists in the collection | +| `index_type` | string | `IVF_FLAT` | Vector index type: `IVF_FLAT`, `IVF_SQ8`, `IVF_PQ`, `HNSW`, `ANNOY`, `FLAT` | +| `metric_type` | string | `L2` | Distance metric: `L2` (Euclidean), `IP` (Inner Product), `COSINE` | +| `nlist` | integer | `128` | Number of clusters for IVF-family indexes | +| `m` | integer | `16` | HNSW: number of edges per node | +| `ef_construction` | integer | `200` | HNSW: dynamic candidate list size during index construction | + +## Collection Schema + +When `create_collection_if_missing: true` the adapter creates a collection with these fields: + +| Field | Type | Description | +|-------|------|-------------| +| `id` | INT64 (auto) 
| Primary key | +| `vcon_uuid` | VARCHAR(100) | vCon UUID | +| `party_id` | VARCHAR(100) | Extracted party identifier | +| `text` | VARCHAR(65535) | Extracted text used for embedding | +| `embedding` | FLOAT_VECTOR | Embedding vector | +| `created_at` | VARCHAR(30) | vCon creation timestamp | +| `updated_at` | VARCHAR(30) | vCon update timestamp | +| `subject` | VARCHAR(255) | vCon subject | +| `metadata_title` | VARCHAR(255) | Title from vCon metadata | +| `has_transcript` | BOOL | Whether a transcript is present | +| `has_summary` | BOOL | Whether a summary analysis is present | +| `party_count` | INT16 | Number of parties | +| `embedding_model` | VARCHAR(50) | Model used to generate the embedding | +| `embedding_version` | VARCHAR(20) | Embedding schema version | + +## Example + +### Basic Setup + +```yaml +storages: + milvus: + module: storage.milvus + options: + host: milvus + port: "19530" + collection_name: vcons + api_key: ${OPENAI_API_KEY} + embedding_model: text-embedding-3-small + embedding_dim: 1536 + create_collection_if_missing: true + skip_if_exists: true + +chains: + main: + links: + - transcribe + - summarize + storages: + - milvus + ingress_lists: + - default + enabled: 1 +``` + +### HNSW Index + +```yaml +storages: + milvus: + module: storage.milvus + options: + host: milvus + port: "19530" + collection_name: vcons_hnsw + api_key: ${OPENAI_API_KEY} + embedding_model: text-embedding-3-small + embedding_dim: 1536 + create_collection_if_missing: true + index_type: HNSW + metric_type: COSINE + m: 16 + ef_construction: 200 +``` + +## Notes + +- vCons with no extractable text are silently skipped. +- The embedding dimension must match the model: `text-embedding-3-small` produces 1536-dimensional vectors, `text-embedding-3-large` produces 3072. +- The `get` function returns only the indexed fields (`uuid`, `text`, `embedding`, `party_id`) โ€” the full vCon is not stored in Milvus and must be retrieved from another storage backend. 
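The model-to-dimension pairing called out in the notes can be validated up front — a sketch covering only the two models named here:

```python
# Known output dimensions for the OpenAI models mentioned above.
EXPECTED_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}

def check_embedding_config(model: str, dim: int) -> None:
    """Fail fast when embedding_dim does not match the model's output size."""
    expected = EXPECTED_DIMS.get(model)
    if expected is None:
        raise ValueError(f"unknown embedding model: {model}")
    if dim != expected:
        raise ValueError(f"{model} produces {expected}-dim vectors, config says {dim}")

check_embedding_config("text-embedding-3-small", 1536)  # passes silently
```

Running a check like this at startup surfaces a mismatched `embedding_dim` before Milvus rejects the insert.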
diff --git a/docs/reference/storage-adapters/redis.md b/docs/reference/storage-adapters/redis.md new file mode 100644 index 0000000..e4004d9 --- /dev/null +++ b/docs/reference/storage-adapters/redis.md @@ -0,0 +1,80 @@ +# redis + +Stores vCons in Redis as serialized JSON with optional TTL expiry. + +## Prerequisites + +- A running Redis instance +- The `redis` Python package (included in server dependencies) + +## Configuration + +```yaml +storages: + redis: + module: storage.redis_storage + options: + redis_url: redis://:password@localhost:6379 + prefix: vcon_storage + expires: 604800 +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `redis_url` | string | `redis://:localhost:6379` | Redis connection URL | +| `prefix` | string | `vcon_storage` | Key prefix used when storing vCons (`{prefix}:{uuid}`) | +| `expires` | integer | `604800` | TTL in seconds before keys expire (default: 7 days) | + +## Example + +### Local Redis + +```yaml +storages: + redis: + module: storage.redis_storage + options: + redis_url: redis://localhost:6379 + prefix: vcon_storage + expires: 86400 + +chains: + main: + links: + - transcribe + storages: + - redis + ingress_lists: + - default + enabled: 1 +``` + +### Redis with Authentication + +```yaml +storages: + redis: + module: storage.redis_storage + options: + redis_url: redis://:${REDIS_PASSWORD}@redis.example.com:6379 + prefix: prod_vcons + expires: 604800 +``` + +## Key Structure + +vCons are stored as JSON strings under the key: + +``` +{prefix}:{vcon_uuid} +``` + +For example, with the default prefix: + +``` +vcon_storage:abc-123-def +``` + +Keys automatically expire after the configured `expires` duration. Set `expires` to `0` to disable expiry (keys persist indefinitely). 
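The key layout and TTL rules above can be sketched with the `redis` client — hypothetical helper names, not the adapter's actual interface:

```python
import json

def storage_key(prefix: str, vcon_uuid: str) -> str:
    """Build the Redis key for a stored vCon: {prefix}:{uuid}."""
    return f"{prefix}:{vcon_uuid}"

def save_vcon(client, vcon: dict, prefix: str = "vcon_storage", expires: int = 604800) -> None:
    """Serialize and store a vCon; expires=0 stores without a TTL."""
    key = storage_key(prefix, vcon["uuid"])
    data = json.dumps(vcon)
    if expires:
        client.set(key, data, ex=expires)  # key expires after `expires` seconds
    else:
        client.set(key, data)              # no TTL: key persists indefinitely
```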
diff --git a/docs/reference/storage-adapters/sftp.md b/docs/reference/storage-adapters/sftp.md new file mode 100644 index 0000000..01544c2 --- /dev/null +++ b/docs/reference/storage-adapters/sftp.md @@ -0,0 +1,92 @@ +# sftp + +Transfers vCons to a remote server over SFTP (SSH File Transfer Protocol). Each vCon is uploaded as a JSON file; the filename can optionally include a timestamp to avoid collisions. + +## Prerequisites + +- An accessible SFTP server with a valid account +- The `paramiko` Python package + +``` +pip install paramiko +``` + +## Configuration + +```yaml +storages: + sftp: + module: storage.sftp + options: + url: sftp.example.com + port: 22 + username: vcon_user + password: ${SFTP_PASSWORD} + path: /uploads/vcons + filename: vcon + extension: json + add_timestamp_to_filename: true +``` + +## Options + +| Option | Type | Default | Description | +|--------|------|---------|-------------| +| `name` | string | `sftp` | Adapter name | +| `url` | string | `sftp://localhost` | SFTP server hostname or address | +| `port` | integer | `22` | SSH/SFTP port | +| `username` | string | `username` | SFTP account username | +| `password` | string | `password` | SFTP account password | +| `path` | string | `.` | Remote directory where files are uploaded | +| `filename` | string | `vcon` | Base name for the uploaded file | +| `extension` | string | `json` | File extension | +| `add_timestamp_to_filename` | boolean | `true` | Append an ISO 8601 timestamp to the filename to prevent overwriting (e.g. 
`vcon_2024-03-14T10:30:00.json`) | + +## Example + +```yaml +storages: + sftp: + module: storage.sftp + options: + url: files.example.com + port: 22 + username: vcon_user + password: ${SFTP_PASSWORD} + path: /var/uploads/vcons + filename: vcon + extension: json + add_timestamp_to_filename: true + +chains: + main: + links: + - transcribe + storages: + - sftp + ingress_lists: + - default + enabled: 1 +``` + +## File Naming + +With `add_timestamp_to_filename: true` (default): + +``` +/var/uploads/vcons/vcon_2024-03-14T10:30:00.123456.json +``` + +With `add_timestamp_to_filename: false`: + +``` +/var/uploads/vcons/vcon.json +``` + +Disable the timestamp only when storing a single vCon at a time, or when your pipeline ensures the remote path is unique per transfer. + +## Notes + +- The adapter uses password-based authentication. SSH key-based authentication is not currently supported via configuration options. +- The `get` function lists the remote directory and returns the content of the lexicographically latest matching file, which corresponds to the most recently uploaded vCon when timestamps are enabled. +- The transport connection is opened and closed for each `save` or `get` call. diff --git a/docs/reference/vcon-data-model.md b/docs/reference/vcon-data-model.md new file mode 100644 index 0000000..e33003c --- /dev/null +++ b/docs/reference/vcon-data-model.md @@ -0,0 +1,282 @@ +# vCon Data Model Reference + +## What is vCon? + +vCon (Virtual Conversation) is an IETF standard container format for storing and exchanging conversation data. It is defined in [draft-ietf-vcon-vcon-container](https://datatracker.ietf.org/doc/draft-ietf-vcon-vcon-container/) and is designed to capture the full context of a conversation โ€” parties involved, the dialog content (audio, text, or video), attachments, and machine-generated analysis โ€” in a single portable JSON document. + +vCon Server uses version `"0.0.1"` of the format. 
A vCon travels through processing chains as a JSON object stored in Redis (keyed as `vcon:`), and is enriched by each link in the chain before being written to one or more storage backends. + +--- + +## Top-Level Fields + +| Field | Type | Required | Description | +|---|---|---|---| +| `vcon` | string | Required | Syntactic version of the vCon JSON format. Always `"0.0.1"` in this implementation. | +| `uuid` | string (UUID) | Required | Globally unique identifier for the vCon. Used to reference the vCon in Redis and storage. Must be globally unique. | +| `created_at` | string (ISO 8601) | Required | Creation timestamp. Set once when the vCon is built and must not change afterwards. Format: `2024-01-15T10:30:00.000+00:00`. | +| `updated_at` | string (ISO 8601) | Optional | Timestamp of the most recent modification to the vCon. | +| `subject` | string | Optional | A short human-readable description of the conversation topic. | +| `redacted` | object | Optional | An object describing any redacted content. Present as an empty object `{}` when no redactions have been applied. | +| `group` | array | Optional | An array of references to other vCons that are part of a group with this one. Present as an empty array `[]` when unused. | +| `parties` | array of Party objects | Required | The participants in the conversation. Must contain at least the parties involved in the dialog. | +| `dialog` | array of Dialog objects | Required | The actual conversation content โ€” recordings, transcripts, or signaling events. | +| `attachments` | array of Attachment objects | Required | Supplementary data attached to the vCon. Includes the special `tags` attachment type. | +| `analysis` | array of Analysis objects | Required | Machine-generated analysis results such as transcriptions, summaries, and sentiment scores. | + +--- + +## Party Object + +A party represents one participant in the conversation. 
+ +| Field | Type | Required | Description | +|---|---|---|---| +| `tel` | string | Optional | E.164 telephone number, e.g. `"+15551234567"`. | +| `mailto` | string | Optional | Email address, e.g. `"alice@example.com"`. | +| `name` | string | Optional | Human-readable display name. | +| `role` | string | Optional | Role in the conversation, e.g. `"agent"` or `"customer"`. | +| `uuid` | string | Optional | Identifier for the party, useful for linking to external systems. | +| `meta` | object | Optional | Additional metadata as key/value pairs. | + +At least one contact field (`tel` or `mailto`) is recommended to make the party identifiable. Parties are referenced by their zero-based index in the `parties` array from within dialog and analysis entries. + +### Example Party + +```json +{ + "tel": "+15551234567", + "name": "Alice Example", + "role": "customer" +} +``` + +--- + +## Dialog Object + +A dialog entry represents one unit of conversation content. Multiple dialog entries are common โ€” for example, separate entries for each leg of a transferred call or each message in a chat thread. + +| Field | Type | Required | Description | +|---|---|---|---| +| `type` | string | Required | The kind of dialog. See dialog types below. | +| `start` | string (ISO 8601) | Required | Start time of this dialog segment. | +| `duration` | number | Optional | Duration in seconds. | +| `parties` | array of integers | Required | Indices into the top-level `parties` array identifying who participated. | +| `originator` | integer | Optional | Index of the party who initiated this dialog segment. | +| `mimetype` | string | Optional | MIME type of the content when `body` or `url` is present, e.g. `"audio/x-wav"` or `"text/plain"`. | +| `filename` | string | Optional | Original filename for the content. | +| `body` | string | Optional | Inline content. For `text` dialogs this is the message text; for encoded content this is the encoded string. 
| +| `url` | string | Optional | URL pointing to the external content (e.g. an audio file in S3). | +| `encoding` | string | Optional | Encoding applied to `body`. See Encoding Types. | +| `alg` | string | Optional | Hash algorithm used for `signature`, e.g. `"SHA-512"`. | +| `signature` | string | Optional | Cryptographic signature of the content for integrity verification. | +| `disposition` | string | Optional | SIP disposition or call outcome, e.g. `"no-answer"`. | +| `meta` | object | Optional | Additional metadata specific to the dialog type. | + +### Dialog Types + +| Type | Description | +|---|---| +| `recording` | An audio or video recording of the conversation. `url` or `body` contains the media. | +| `text` | A text message or transcript segment. `body` contains the text content. | +| `transfer` | A call transfer event. Records the signaling of a call being redirected. | +| `incomplete` | A dialog segment that did not complete normally, e.g. a missed call. | + +### Example Dialog Entry + +```json +{ + "type": "recording", + "start": "2024-01-15T10:30:00.000+00:00", + "duration": 182.4, + "parties": [0, 1], + "originator": 0, + "mimetype": "audio/x-wav", + "url": "https://storage.example.com/recordings/abc123.wav", + "encoding": "none" +} +``` + +--- + +## Attachment Object + +Attachments carry supplementary data associated with the vCon as a whole (rather than with a specific dialog segment). Common uses include metadata tags, original source files, and structured documents. + +| Field | Type | Required | Description | +|---|---|---|---| +| `type` | string | Required | Identifies what the attachment represents. The special value `"tags"` is used for key/value metadata tags. | +| `body` | string, object, or array | Required | The attachment content. Shape depends on `type` and `encoding`. | +| `encoding` | string | Required | Encoding of `body`. See Encoding Types. | +| `filename` | string | Optional | Original filename if the attachment is a file. 
| +| `mimetype` | string | Optional | MIME type of the content. | + +### The `tags` Attachment Type + +The `tags` attachment is a special, well-known attachment type used throughout vCon Server for metadata tagging. When present, it has exactly one entry in the `attachments` array with `"type": "tags"`. + +- `body` is an **array of strings**. +- Each string has the format `"name:value"`. +- There is at most one `tags` attachment per vCon; additional tags are appended to its `body` array. + +Links add tags using `vcon.add_tag(name, value)`, and read them using `vcon.get_tag(name)`. + +#### Example tags attachment + +```json +{ + "type": "tags", + "body": [ + "status:processed", + "sentiment:positive", + "language:en" + ], + "encoding": "none" +} +``` + +### Example General Attachment + +```json +{ + "type": "transcript_source", + "body": "{\"provider\": \"deepgram\", \"model\": \"nova-2\"}", + "encoding": "json", + "mimetype": "application/json" +} +``` + +--- + +## Analysis Object + +Analysis entries hold the results of automated processing applied to the conversation โ€” transcripts, summaries, sentiment scores, named entity recognition output, and so on. + +| Field | Type | Required | Description | +|---|---|---|---| +| `type` | string | Required | The kind of analysis, e.g. `"transcript"`, `"summary"`, `"sentiment"`. | +| `dialog` | integer or array of integers | Required | Index (or indices) of the dialog entries this analysis applies to. | +| `vendor` | string | Required | Name of the system or service that produced the analysis, e.g. `"deepgram"`, `"openai"`. | +| `body` | string, object, or array | Required | The analysis result. Shape depends on the `type` and `encoding`. | +| `encoding` | string | Required | Encoding of `body`. See Encoding Types. | +| `vendor_schema` | string | Optional | Identifier for the vendor-specific schema version of `body`. 
| + +Extra fields can be added to an analysis entry via the `extra` parameter of `vcon.add_analysis()` and will be merged at the top level of the object. + +### Example Analysis Entry + +```json +{ + "type": "transcript", + "dialog": 0, + "vendor": "deepgram", + "body": { + "transcript": "Hello, how can I help you today?", + "confidence": 0.98, + "words": [] + }, + "encoding": "json" +} +``` + +--- + +## Encoding Types + +The `encoding` field appears on both `attachments` and `analysis` entries and controls how `body` should be interpreted. + +| Value | Description | +|---|---| +| `none` | `body` is a plain Python/JSON value โ€” a string, object, or array โ€” and requires no decoding. | +| `json` | `body` is a JSON-encoded string. The string must be valid JSON; the consumer should parse it with `json.loads()`. | +| `base64url` | `body` is a Base64url-encoded string (URL-safe alphabet, no padding). Used for binary content such as audio files embedded inline. Decoded with `base64.urlsafe_b64decode()`. | + +--- + +## Complete Annotated Example + +The following JSON shows a complete, minimal vCon containing two parties, one audio recording dialog, a `tags` attachment, and a transcript analysis entry. 
+ +```json +{ + "vcon": "0.0.1", + "uuid": "018e1b2c-3d4e-8f56-a789-0b1c2d3e4f50", + "created_at": "2024-01-15T10:30:00.000+00:00", + "updated_at": "2024-01-15T10:32:15.000+00:00", + "subject": "Customer support call โ€” billing inquiry", + "redacted": {}, + "group": [], + + "parties": [ + { + "tel": "+15551234567", + "name": "Alice Example", + "role": "customer" + }, + { + "tel": "+15559876543", + "name": "Bob Agent", + "role": "agent" + } + ], + + "dialog": [ + { + "type": "recording", + "start": "2024-01-15T10:30:00.000+00:00", + "duration": 182.4, + "parties": [0, 1], + "originator": 0, + "mimetype": "audio/x-wav", + "url": "https://storage.example.com/recordings/018e1b2c.wav", + "encoding": "none" + } + ], + + "attachments": [ + { + "type": "tags", + "body": [ + "status:processed", + "language:en", + "topic:billing" + ], + "encoding": "none" + } + ], + + "analysis": [ + { + "type": "transcript", + "dialog": 0, + "vendor": "deepgram", + "body": { + "transcript": "Hello, thank you for calling. How can I help you today?", + "confidence": 0.97 + }, + "encoding": "json" + } + ] +} +``` + +--- + +## UUID Generation + +vCon Server generates **UUID version 8** identifiers for new vCons. UUID v8 is a custom UUID format defined in [draft-peabody-dispatch-new-uuid-format](https://www.ietf.org/archive/id/draft-peabody-dispatch-new-uuid-format-04.txt) that allows implementors to embed domain-specific bits. + +The implementation in `server/vcon.py` (`Vcon.uuid8_domain_name`) works as follows: + +1. A SHA-1 hash is computed over the configured DNS domain name string. +2. The upper 62 bits of the hash are used as the `custom_c` portion of the UUID, making the identifier domain-scoped. +3. The `custom_a` and `custom_b` portions are derived from the current Unix timestamp in milliseconds and a sub-millisecond counter, providing monotonic ordering within a process. 
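The steps above can be sketched in Python. This is an illustrative reconstruction of the scheme, not the exact code in `server/vcon.py`: the bit layout follows the draft's `custom_a` (48 bits) / `custom_b` (12 bits) / `custom_c` (62 bits) fields, and the sub-millisecond counter is simplified to an explicit parameter.

```python
import hashlib
import time
import uuid


def uuid8_domain_name(domain_name: str, counter: int = 0) -> uuid.UUID:
    """Build a domain-scoped UUIDv8 (illustrative sketch, not the shipped code)."""
    # custom_c (62 bits): upper 62 bits of the SHA-1 hash of the domain name,
    # so every UUID from the same domain shares the same low bits.
    digest = hashlib.sha1(domain_name.encode("utf-8")).digest()
    custom_c = int.from_bytes(digest[:8], "big") >> 2

    # custom_a (48 bits): current Unix time in milliseconds.
    custom_a = int(time.time() * 1000) & ((1 << 48) - 1)

    # custom_b (12 bits): sub-millisecond counter for ordering within a process.
    custom_b = counter & 0xFFF

    value = (
        (custom_a << 80)
        | (0x8 << 76)      # version nibble = 8
        | (custom_b << 64)
        | (0b10 << 62)     # variant bits = RFC 4122
        | custom_c
    )
    return uuid.UUID(int=value)


u = uuid8_domain_name("mycompany.com")
assert u.version == 8 and u.variant == uuid.RFC_4122
```

Because the version and variant bits are fixed by the layout, two deployments configured with different domain names produce UUIDs that differ in their low 62 bits, which is what makes the identifiers domain-scoped.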
+ +The domain name defaults to `"strolid.com"` and can be overridden by setting the `UUID8_DOMAIN_NAME` environment variable: + +```bash +UUID8_DOMAIN_NAME=mycompany.com +``` + +This means UUIDs generated by different deployments that use different domain names will occupy different ranges of the UUID space, reducing collision risk across independently operating systems. diff --git a/mkdocs.yml b/mkdocs.yml index cb96302..063d6a3 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -74,16 +74,38 @@ nav: - Dynamic Modules: extending/dynamic-modules.md - Reference: - reference/index.md + - vCon Data Model: reference/vcon-data-model.md - Links: - Deepgram: reference/links/deepgram.md + - Groq Whisper: reference/links/groq-whisper.md + - HuggingFace Whisper: reference/links/huggingface-whisper.md + - OpenAI Transcribe: reference/links/openai-transcribe.md - Analyze: reference/links/analyze.md + - Analyze and Label: reference/links/analyze-and-label.md + - Analyze vCon: reference/links/analyze-vcon.md + - Detect Engagement: reference/links/detect-engagement.md - Tag: reference/links/tag.md + - Check and Tag: reference/links/check-and-tag.md + - Tag Router: reference/links/tag-router.md + - Sampler: reference/links/sampler.md - Webhook: reference/links/webhook.md + - Slack: reference/links/slack.md + - Diet: reference/links/diet.md + - Expire vCon: reference/links/expire-vcon.md + - jq: reference/links/jq.md - Storage Adapters: - PostgreSQL: reference/storage-adapters/postgres.md - S3: reference/storage-adapters/s3.md - MongoDB: reference/storage-adapters/mongo.md + - Redis: reference/storage-adapters/redis.md + - Elasticsearch: reference/storage-adapters/elasticsearch.md + - Milvus: reference/storage-adapters/milvus.md + - File: reference/storage-adapters/file.md + - SFTP: reference/storage-adapters/sftp.md + - Dataverse: reference/storage-adapters/dataverse.md + - ChatGPT Files: reference/storage-adapters/chatgpt-files.md - CLI Reference: reference/cli-reference.md + - 
Contributing: contributing.md
 
 plugins:
   - search