vCon Database Documentation Index

Created: November 19, 2025
Purpose: Guide for LLMs and AI systems to understand and interact with the vCon database

Overview

This project includes comprehensive documentation designed specifically for LLMs and AI systems to understand how the vCon database is organized, how it works, and how to build applications that interact with it.

Authoritative schema for coding agents (start here)

File: docs/reference/AGENT_DATABASE_SCHEMA.md
Purpose: Describes the deployed PostgreSQL schema as defined by supabase/migrations/ (tables, tenant columns, embeddings, materialized views, RLS, legacy dual columns). Prefer this over older monolithic schema pages that may drift.

The vCon database implements the IETF vCon (Virtual Conversation) specification in a PostgreSQL database with advanced features like semantic search, multi-tenant isolation, and GDPR compliance.

Documentation Structure

1. Complete Architecture Guide

File: DATABASE_ARCHITECTURE_FOR_LLMS.md
Size: ~48KB
Purpose: Comprehensive deep-dive into database architecture

Contents:

Overview of vCon and database technology stack
Complete data model explanation
Detailed table reference (12 tables)
Index and performance strategies
Search capabilities (keyword, semantic, hybrid, tag-based)
Multi-tenant architecture with RLS
Data relationships and foreign keys
Query patterns and best practices
Extensions (embeddings, caching, S3 sync)
GDPR compliance features

Best For: Understanding the complete system architecture, designing new features, troubleshooting complex issues

2. Quick Start Guide

File: DATABASE_QUICKSTART_FOR_LLMS.md
Size: ~25KB
Purpose: Rapid onboarding with practical code examples

Contents:

5-minute TL;DR overview
Complete code examples for:
- Creating a vCon (TypeScript and SQL)
- Retrieving vCons
- All 4 search types
- Updating vCons
- Deleting vCons
- Multi-tenant setup
Critical field name reference
Common error patterns and solutions
Performance tips
Testing script template

Best For: Getting started quickly, finding code snippets, avoiding common mistakes

3. Visual Schema Reference

File: DATABASE_SCHEMA_VISUAL.md
Size: ~20KB
Purpose: Visual diagrams and quick lookup reference

Contents:

Complete entity relationship diagram (ASCII art)
Table structures with all fields
Relationship summary
Unique constraints
Data type reference (enums, arrays, JSONB, vectors)
Index strategy overview
RLS policy structure
Search RPC function signatures
Common query patterns (SQL)
Database statistics queries
Performance monitoring queries
Migration history

Best For: Quick lookup, understanding relationships, finding SQL patterns

How to Use This Documentation

For Building Applications

Start with Quick Start: Read DATABASE_QUICKSTART_FOR_LLMS.md to understand basic operations
Reference Visual Schema: Use DATABASE_SCHEMA_VISUAL.md for table structures and relationships
Deep Dive When Needed: Consult DATABASE_ARCHITECTURE_FOR_LLMS.md for complex features

For Understanding the System

Read Architecture First: DATABASE_ARCHITECTURE_FOR_LLMS.md provides complete context
Visual Reference: DATABASE_SCHEMA_VISUAL.md helps visualize relationships
Code Examples: DATABASE_QUICKSTART_FOR_LLMS.md shows practical usage

For Troubleshooting

Check Quick Start Errors: Common errors and solutions in DATABASE_QUICKSTART_FOR_LLMS.md
Review Schema: Confirm field names and constraints in DATABASE_SCHEMA_VISUAL.md
Understand Design: Read relevant sections in DATABASE_ARCHITECTURE_FOR_LLMS.md

Key Concepts Summary

What is vCon?

vCon (Virtual Conversation) is an IETF standard for representing conversations in a portable, interoperable format. It's like "PDF for conversations" - a standardized container for:

Conversations from any medium (voice, video, text, email)
Participants with identity and privacy controls
AI analysis results (transcripts, summaries, sentiment)
Attachments (documents, images, files)
Privacy markers for consent and redaction

Database Architecture

The vCon database is a normalized relational database (not a document database):

8 core tables: vcons, parties, dialog, analysis, attachments, groups, party_history, vcon_embeddings
4 extension tables: vcon_tags_mv, privacy_requests, embedding_queue, s3_sync_tracking
25+ strategic indexes for performance
Row Level Security (RLS) for multi-tenant isolation
pgvector extension for semantic search
pg_trgm extension for fuzzy text search

Key Features

Multiple Search Types:
- Keyword search (full-text with trigram matching)
- Semantic search (AI-powered vector similarity)
- Hybrid search (combines keyword and semantic)
- Tag-based filtering
Multi-Tenant Support:
- Row Level Security (RLS) on all tables
- Tenant ID extracted from vCon attachments
- JWT or app setting based tenant context
Performance:
- Strategic indexes for fast queries
- Optional Redis caching (20-50x faster reads)
- Materialized view for tag queries
- HNSW index for vector search
IETF Compliance:
- Implements draft-ietf-vcon-vcon-core-00
- Correct field names (e.g., schema not schema_version)
- Required fields enforced (e.g., analysis.vendor)
- Proper data types (e.g., body as TEXT)
Extensions:
- Async embedding generation
- S3 sync for external storage
- GDPR compliance features
- Privacy request tracking

Common Mistakes to Avoid

These are documented in detail in the guides, but here's a quick reference:

Field Name Mistakes

❌ analysis.schema_version → ✅ analysis.schema
❌ analysis.vendor optional → ✅ analysis.vendor REQUIRED
❌ analysis.body as JSONB → ✅ analysis.body as TEXT

Data Type Mistakes

❌ Setting default encoding values → ✅ Explicitly set or leave NULL
❌ parties as parties[] → ✅ parties as INTEGER[]

Query Mistakes

❌ No LIMIT on queries → ✅ Always use LIMIT
❌ Using LIKE for full-text → ✅ Use search RPCs
❌ Missing indexes → ✅ Filter by indexed fields

Technology Stack

Database: PostgreSQL 15+
Extensions:
- pgvector (semantic search)
- pg_trgm (fuzzy text search)
- uuid-ossp (UUID generation)
Vector Dimensions: 384 (optimized for OpenAI text-embedding-3-small)
Caching: Optional Redis
Platform: Supabase (PostgreSQL hosting)
Client Libraries:
- @supabase/supabase-js (JavaScript/TypeScript)
- Direct PostgreSQL clients

Database Statistics (Example)

The database is designed to scale to:

Millions of vCons
Billions of dialog messages
Hundreds of millions of embeddings
Multiple tenants with isolation

Performance characteristics:

UUID lookups: < 10ms
Keyword search: 50-500ms
Semantic search: 100-1000ms (depends on corpus size)
Hybrid search: 200-1500ms
Tag filtering: < 50ms (via materialized view)

With Redis caching:

Cached reads: < 5ms (20-50x improvement)

Quick Reference: Table List

Core Tables (IETF Spec)

vcons - Main conversation container
parties - Participants
dialog - Conversation segments
analysis - AI/ML results
attachments - Files and metadata
groups - vCon aggregation
party_history - Party events

Extension Tables

vcon_embeddings - Semantic search vectors
vcon_tags_mv - Materialized view for tags
privacy_requests - GDPR compliance
embedding_queue - Async processing
s3_sync_tracking - External storage

Quick Reference: RPC Functions

Search Functions

search_vcons_keyword() - Full-text keyword search
search_vcons_semantic() - Vector similarity search
search_vcons_hybrid() - Combined search
search_vcons_by_tags() - Tag-based filtering

Utility Functions

get_current_tenant_id() - Get tenant context
extract_tenant_from_attachments() - Extract tenant ID
populate_tenant_ids_batch() - Batch populate tenant IDs

Getting Help

Documentation Priority

Quick question about field names or data types?
→ Check DATABASE_SCHEMA_VISUAL.md
Need code examples?
→ See DATABASE_QUICKSTART_FOR_LLMS.md
Understanding a feature or design decision?
→ Read relevant section in DATABASE_ARCHITECTURE_FOR_LLMS.md
Want to see production code?
→ Check src/db/queries.ts and test scripts in scripts/

Example Code Locations

Creating vCons: src/db/queries.ts - createVCon()
Searching: src/tools/search-tools.ts
Tags: src/tools/tag-tools.ts
Multi-tenant: src/config/tenant-config.ts

Test Scripts

Run these to understand how the database works:

scripts/test-database-tools.ts - Basic CRUD operations
scripts/test-search-tools.ts - Search functionality
scripts/test-semantic-search.ts - Semantic search
scripts/test-tags.ts - Tag system

Version Information

Database Schema Version: 0.3.0 (IETF vCon spec version)
Documentation Created: November 19, 2025
Latest Migration: 20251119140000_sync_most_recent_first.sql
Vector Dimension: 384 (migrated from 1536)

Maintenance

Keeping Documentation Current

When making database changes:

Update migration files in supabase/migrations/
Update type definitions in src/types/vcon.ts
Update these LLM documentation files if structure changes
Run tests to verify changes

Documentation Updates Needed When:

Adding new tables
Adding new fields to existing tables
Changing indexes
Adding new RPC functions
Changing multi-tenant configuration
Updating search algorithms

Summary

These three documentation files provide everything an LLM or AI system needs to:

Understand the vCon database architecture
Write applications that interact with the database
Query and search conversation data
Implement multi-tenant applications
Optimize performance
Comply with IETF specifications
Handle privacy and GDPR requirements

Start with: DATABASE_QUICKSTART_FOR_LLMS.md
Reference: DATABASE_SCHEMA_VISUAL.md
Deep dive: DATABASE_ARCHITECTURE_FOR_LLMS.md

Happy coding! 🚀

FilesExpand file tree

DATABASE_DOCUMENTATION_INDEX.md

Latest commit

History

DATABASE_DOCUMENTATION_INDEX.md

File metadata and controls

vCon Database Documentation Index

Overview

Authoritative schema for coding agents (start here)

Documentation Structure

1. Complete Architecture Guide

2. Quick Start Guide

3. Visual Schema Reference

How to Use This Documentation

For Building Applications

For Understanding the System

For Troubleshooting

Key Concepts Summary

What is vCon?

Database Architecture

Key Features

Common Mistakes to Avoid

Field Name Mistakes

Data Type Mistakes

Query Mistakes

Technology Stack

Database Statistics (Example)

Quick Reference: Table List

Core Tables (IETF Spec)

Extension Tables

Quick Reference: RPC Functions

Search Functions

Utility Functions

Related Documentation

In This Repository

IETF Specifications

Getting Help

Documentation Priority

Example Code Locations

Test Scripts

Version Information

Maintenance

Keeping Documentation Current

Documentation Updates Needed When:

Summary