Created: November 19, 2025
Purpose: Guide for LLMs and AI systems to understand and interact with the vCon database
This project includes comprehensive documentation designed specifically for LLMs and AI systems to understand how the vCon database is organized, how it works, and how to build applications that interact with it.
File: docs/reference/AGENT_DATABASE_SCHEMA.md
Purpose: Describes the deployed PostgreSQL schema as defined by supabase/migrations/ (tables, tenant columns, embeddings, materialized views, RLS, legacy dual columns). Prefer this over older monolithic schema pages that may drift.
The vCon database implements the IETF vCon (Virtual Conversation) specification in a PostgreSQL database with advanced features like semantic search, multi-tenant isolation, and GDPR compliance.
File: DATABASE_ARCHITECTURE_FOR_LLMS.md
Size: ~48KB
Purpose: Comprehensive deep-dive into database architecture
Contents:
- Overview of vCon and database technology stack
- Complete data model explanation
- Detailed table reference (12 tables)
- Index and performance strategies
- Search capabilities (keyword, semantic, hybrid, tag-based)
- Multi-tenant architecture with RLS
- Data relationships and foreign keys
- Query patterns and best practices
- Extensions (embeddings, caching, S3 sync)
- GDPR compliance features
Best For: Understanding the complete system architecture, designing new features, troubleshooting complex issues
File: DATABASE_QUICKSTART_FOR_LLMS.md
Size: ~25KB
Purpose: Rapid onboarding with practical code examples
Contents:
- 5-minute TL;DR overview
- Complete code examples for:
  - Creating a vCon (TypeScript and SQL)
  - Retrieving vCons
  - All 4 search types
  - Updating vCons
  - Deleting vCons
  - Multi-tenant setup
- Critical field name reference
- Common error patterns and solutions
- Performance tips
- Testing script template
Best For: Getting started quickly, finding code snippets, avoiding common mistakes
File: DATABASE_SCHEMA_VISUAL.md
Size: ~20KB
Purpose: Visual diagrams and quick lookup reference
Contents:
- Complete entity relationship diagram (ASCII art)
- Table structures with all fields
- Relationship summary
- Unique constraints
- Data type reference (enums, arrays, JSONB, vectors)
- Index strategy overview
- RLS policy structure
- Search RPC function signatures
- Common query patterns (SQL)
- Database statistics queries
- Performance monitoring queries
- Migration history
Best For: Quick lookup, understanding relationships, finding SQL patterns
- Start with Quick Start: Read DATABASE_QUICKSTART_FOR_LLMS.md to understand basic operations
- Reference Visual Schema: Use DATABASE_SCHEMA_VISUAL.md for table structures and relationships
- Deep Dive When Needed: Consult DATABASE_ARCHITECTURE_FOR_LLMS.md for complex features
- Read Architecture First: DATABASE_ARCHITECTURE_FOR_LLMS.md provides complete context
- Visual Reference: DATABASE_SCHEMA_VISUAL.md helps visualize relationships
- Code Examples: DATABASE_QUICKSTART_FOR_LLMS.md shows practical usage
- Check Quick Start Errors: Common errors and solutions in DATABASE_QUICKSTART_FOR_LLMS.md
- Review Schema: Confirm field names and constraints in DATABASE_SCHEMA_VISUAL.md
- Understand Design: Read relevant sections in DATABASE_ARCHITECTURE_FOR_LLMS.md
vCon (Virtual Conversation) is an IETF standard for representing conversations in a portable, interoperable format. It's like "PDF for conversations" - a standardized container for:
- Conversations from any medium (voice, video, text, email)
- Participants with identity and privacy controls
- AI analysis results (transcripts, summaries, sentiment)
- Attachments (documents, images, files)
- Privacy markers for consent and redaction
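To make the container idea concrete, here is a minimal TypeScript sketch of a vCon with two parties, one text dialog, and one analysis entry. The interfaces cover only a reduced subset of fields and are illustrative, not the project's actual definitions in src/types/vcon.ts; the UUID, vendor name, and the helper `dialogPartiesAreValid` are made up for this example.

```typescript
// Illustrative subset of vCon fields; not the full specification.
type Encoding = "base64url" | "json" | "none";

interface Party {
  name?: string;
  tel?: string;
  mailto?: string;
}

interface Dialog {
  type: "recording" | "text" | "transfer" | "incomplete";
  start: string;      // ISO 8601 timestamp
  parties: number[];  // indexes into the top-level parties array
  body?: string;
  encoding?: Encoding;
}

interface Analysis {
  type: string;       // e.g. "summary" or "transcript"
  dialog?: number[];  // indexes of the dialogs this analysis covers
  vendor: string;     // required
  schema?: string;    // named "schema", not "schema_version"
  body?: string;      // carried as text
  encoding?: Encoding;
}

interface VCon {
  vcon: string;       // spec version string
  uuid: string;
  created_at: string;
  subject?: string;
  parties: Party[];
  dialog?: Dialog[];
  analysis?: Analysis[];
}

// Placeholder values throughout; the UUID and vendor are made up.
const exampleVCon: VCon = {
  vcon: "0.3.0",
  uuid: "00000000-0000-4000-8000-000000000001",
  created_at: "2025-11-19T12:00:00Z",
  subject: "Billing question",
  parties: [
    { name: "Agent", mailto: "agent@example.com" },
    { name: "Customer", tel: "+15555550100" },
  ],
  dialog: [
    {
      type: "text",
      start: "2025-11-19T12:00:05Z",
      parties: [1, 0],
      body: "Hi, I have a question about my invoice.",
      encoding: "none",
    },
  ],
  analysis: [
    {
      type: "summary",
      dialog: [0],
      vendor: "example-vendor",
      schema: "v1",
      body: "Customer asked about an invoice.",
      encoding: "none",
    },
  ],
};

// Returns true when every dialog party index points at an existing party.
function dialogPartiesAreValid(v: VCon): boolean {
  return (v.dialog ?? []).every((d) =>
    d.parties.every((i) => Number.isInteger(i) && i >= 0 && i < v.parties.length),
  );
}
```

Dialogs refer to participants by index rather than by embedding party objects, which is why a cheap index check like this is worth running before writing anything to the database.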
The vCon database is a normalized relational database (not a document database):
- 8 core tables: vcons, parties, dialog, analysis, attachments, groups, party_history, vcon_embeddings
- 4 extension tables: vcon_tags_mv, privacy_requests, embedding_queue, s3_sync_tracking
- 25+ strategic indexes for performance
- Row Level Security (RLS) for multi-tenant isolation
- pgvector extension for semantic search
- pg_trgm extension for fuzzy text search
- Multiple Search Types:
  - Keyword search (full-text with trigram matching)
  - Semantic search (AI-powered vector similarity)
  - Hybrid search (combines keyword and semantic)
  - Tag-based filtering
- Multi-Tenant Support:
  - Row Level Security (RLS) on all tables
  - Tenant ID extracted from vCon attachments
  - JWT or app setting based tenant context
- Performance:
  - Strategic indexes for fast queries
  - Optional Redis caching (20-50x faster reads)
  - Materialized view for tag queries
  - HNSW index for vector search
- IETF Compliance:
  - Implements draft-ietf-vcon-vcon-core-00
  - Correct field names (e.g., schema, not schema_version)
  - Required fields enforced (e.g., analysis.vendor)
  - Proper data types (e.g., body as TEXT)
- Extensions:
  - Async embedding generation
  - S3 sync for external storage
  - GDPR compliance features
  - Privacy request tracking
These are documented in detail in the guides, but here's a quick reference:
- ❌ analysis.schema_version → ✅ analysis.schema
- ❌ analysis.vendor optional → ✅ analysis.vendor REQUIRED
- ❌ analysis.body as JSONB → ✅ analysis.body as TEXT
- ❌ Setting default encoding values → ✅ Explicitly set or leave NULL
- ❌ parties as parties[] → ✅ parties as INTEGER[]
- ❌ No LIMIT on queries → ✅ Always use LIMIT
- ❌ Using LIKE for full-text → ✅ Use search RPCs
- ❌ Missing indexes → ✅ Filter by indexed fields
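The analysis field rules in this quick reference can be checked in application code before a row is written. The following is a minimal sketch, assuming analysis records arrive as plain objects; `validateAnalysis` is a hypothetical helper, not a function from this project.

```typescript
// Hypothetical pre-insert check for the analysis field rules listed above.
type AnalysisInput = Record<string, unknown>;

function validateAnalysis(a: AnalysisInput): string[] {
  const errors: string[] = [];

  // The field is named "schema"; "schema_version" is a common mistake.
  if ("schema_version" in a) {
    errors.push("use 'schema', not 'schema_version'");
  }

  // vendor is required and must be a non-empty string.
  if (typeof a.vendor !== "string" || a.vendor.length === 0) {
    errors.push("'vendor' is required");
  }

  // body is stored as TEXT, so objects must be serialized first.
  if (a.body !== undefined && a.body !== null && typeof a.body !== "string") {
    errors.push("'body' must be a string (TEXT), not an object");
  }

  return errors;
}

// Usage: a structured result is serialized before validation.
const okErrors = validateAnalysis({
  type: "summary",
  vendor: "example-vendor",
  schema: "v1",
  body: JSON.stringify({ text: "Customer asked about an invoice." }),
});

const badErrors = validateAnalysis({
  type: "summary",
  schema_version: "v1",
  body: { text: "not serialized" },
});
```

Returning a list of messages rather than throwing on the first problem lets a caller report every field issue in one pass.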
- Database: PostgreSQL 15+
- Extensions:
  - pgvector (semantic search)
  - pg_trgm (fuzzy text search)
  - uuid-ossp (UUID generation)
- Vector Dimensions: 384 (intended for OpenAI text-embedding-3-small; note that this model returns 1536 dimensions unless a shorter output is requested)
- Caching: Optional Redis
- Platform: Supabase (PostgreSQL hosting)
- Client Libraries:
  - @supabase/supabase-js (JavaScript/TypeScript)
  - Direct PostgreSQL clients
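The 384-dimension setting matters in client code because a pgvector column declared with a fixed dimension rejects vectors of any other length. The sketch below checks the length up front and computes cosine similarity the way semantic search conceptually ranks results (pgvector's cosine distance is 1 minus this value). The constant and function names are illustrative assumptions, not identifiers from this project.

```typescript
// Assumed to match the vector column dimension described in this document.
const EMBEDDING_DIMENSIONS = 384;

// Throws if a vector does not have the expected number of components.
function assertDimension(vec: number[], expected: number = EMBEDDING_DIMENSIONS): void {
  if (vec.length !== expected) {
    throw new Error(`expected ${expected} dimensions, got ${vec.length}`);
  }
}

// Cosine similarity in [-1, 1]; returns 0 when either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  assertDimension(a);
  assertDimension(b);
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance, the quantity a cosine-based vector index orders by.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}

// Builds a unit vector with a single non-zero component, for demonstration.
function unitVector(index: number): number[] {
  const v = new Array<number>(EMBEDDING_DIMENSIONS).fill(0);
  v[index] = 1;
  return v;
}
```

Checking the length before sending an embedding to the database turns a server-side insert error into an early, local one, which is useful after a dimension migration such as the 1536 to 384 change noted in this document.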
The database is designed to scale to:
- Millions of vCons
- Billions of dialog messages
- Hundreds of millions of embeddings
- Multiple tenants with isolation
Performance characteristics:
- UUID lookups: < 10ms
- Keyword search: 50-500ms
- Semantic search: 100-1000ms (depends on corpus size)
- Hybrid search: 200-1500ms
- Tag filtering: < 50ms (via materialized view)
With Redis caching:
- Cached reads: < 5ms (20-50x improvement)
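The caching gain comes from the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result with an expiry. The sketch below uses an in-memory Map as a stand-in for Redis so that it runs without any services; `TtlCache` and its method names are hypothetical, and the project's real caching layer may differ.

```typescript
// Cache-aside sketch. An in-memory Map stands in for Redis here.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // "now" is injectable so that expiry can be exercised deterministically.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value if still fresh; otherwise calls load() and stores the result.
  getOrLoad(key: string, load: () => V): V {
    const t = this.now();
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > t) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: t + this.ttlMs });
    return value;
  }

  // Call after an update or delete so that readers do not see stale data.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Usage with a fake loader that counts how often the "database" is hit.
let fakeClock = 0;
let databaseReads = 0;
const vconCache = new TtlCache<string>(1000, () => fakeClock);
const loadVcon = (): string => {
  databaseReads += 1;
  return "vcon-payload";
};

vconCache.getOrLoad("vcon:1", loadVcon); // miss: reads the database
vconCache.getOrLoad("vcon:1", loadVcon); // hit: served from cache
fakeClock = 2000;                        // advance past the TTL
vconCache.getOrLoad("vcon:1", loadVcon); // expired: reads the database again
```

With an asynchronous loader, V can be a Promise, in which case concurrent callers share one in-flight read. Invalidate on every write path, since a TTL alone only bounds how long stale data can be served.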
- vcons - Main conversation container
- parties - Participants
- dialog - Conversation segments
- analysis - AI/ML results
- attachments - Files and metadata
- groups - vCon aggregation
- party_history - Party events
- vcon_embeddings - Semantic search vectors
- vcon_tags_mv - Materialized view for tags
- privacy_requests - GDPR compliance
- embedding_queue - Async processing
- s3_sync_tracking - External storage
- search_vcons_keyword() - Full-text keyword search
- search_vcons_semantic() - Vector similarity search
- search_vcons_hybrid() - Combined search
- search_vcons_by_tags() - Tag-based filtering
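To illustrate what "combined" means for hybrid search, the sketch below merges a keyword result list and a semantic result list into one ranking by normalizing each score set and taking a weighted sum. This is one common approach and only an assumption about the general idea; it does not reproduce the actual logic of search_vcons_hybrid(), and the `Hit` shape, `mergeHybrid` name, and default weight are hypothetical.

```typescript
interface Hit {
  vconId: string;
  score: number; // higher is better; scales may differ between search types
}

// Scales scores to [0, 1] by dividing by the largest score in the list.
function normalizeScores(hits: Hit[]): Map<string, number> {
  const max = Math.max(0, ...hits.map((h) => h.score));
  return new Map(hits.map((h): [string, number] => [h.vconId, max > 0 ? h.score / max : 0]));
}

// Weighted merge: semanticWeight applies to semantic scores, the rest to keyword scores.
function mergeHybrid(
  keywordHits: Hit[],
  semanticHits: Hit[],
  semanticWeight: number = 0.6, // illustrative default, not a project setting
  limit: number = 10,
): Hit[] {
  const keyword = normalizeScores(keywordHits);
  const semantic = normalizeScores(semanticHits);
  const ids = new Set<string>([...keyword.keys(), ...semantic.keys()]);

  const merged: Hit[] = [];
  for (const vconId of ids) {
    const k = keyword.get(vconId) ?? 0;
    const s = semantic.get(vconId) ?? 0;
    merged.push({ vconId, score: (1 - semanticWeight) * k + semanticWeight * s });
  }

  // Best first, then apply the limit so result sets stay bounded.
  merged.sort((x, y) => y.score - x.score);
  return merged.slice(0, limit);
}

// Usage: "b" appears in both lists, so it outranks single-source hits.
const hybridResults = mergeHybrid(
  [{ vconId: "a", score: 2 }, { vconId: "b", score: 1 }],
  [{ vconId: "b", score: 0.8 }, { vconId: "c", score: 0.4 }],
  0.5,
);
```

Normalizing first matters because keyword ranks and vector similarities live on different scales; without it, one search type would dominate the sum regardless of the weight.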
- get_current_tenant_id() - Get tenant context
- extract_tenant_from_attachments() - Extract tenant ID
- populate_tenant_ids_batch() - Batch populate tenant IDs
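As a rough client-side illustration of deriving a tenant ID from attachments, the sketch below scans a vCon's attachments for one that carries tenant information. The attachment shape is an assumption made for this example (an attachment whose type is "tenant" and whose body is a JSON string with an "id" field); the real convention is defined by the database function and the tenant configuration, not by this code. `findTenantId` is a hypothetical helper.

```typescript
// Assumed attachment shape for illustration only.
interface AttachmentLike {
  type?: string;
  encoding?: string;
  body?: string;
}

// Returns the tenant ID from the first matching attachment, or null if none is usable.
// ASSUMPTION: tenant data lives in an attachment of the given type whose body is
// JSON containing the given field. Both defaults are placeholders.
function findTenantId(
  attachments: AttachmentLike[],
  attachmentType: string = "tenant",
  idField: string = "id",
): string | null {
  for (const att of attachments) {
    if (att.type !== attachmentType || typeof att.body !== "string") continue;
    try {
      const parsed: unknown = JSON.parse(att.body);
      if (parsed !== null && typeof parsed === "object") {
        const value = (parsed as Record<string, unknown>)[idField];
        if (typeof value === "string" && value.length > 0) return value;
      }
    } catch {
      // Malformed JSON: skip this attachment and keep looking.
    }
  }
  return null;
}

// Usage
const tenantId = findTenantId([
  { type: "note", body: "unrelated" },
  { type: "tenant", encoding: "json", body: JSON.stringify({ id: "acme" }) },
]);
```

Returning null rather than throwing lets the caller decide whether a vCon without tenant data should be rejected or handled by a fallback. Client-side extraction is a convenience only; RLS in the database remains the enforcement point.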
- README.md - Project overview and features
- BUILD_GUIDE.md - Step-by-step implementation
- supabase/migrations/ - Database migration files
- src/types/vcon.ts - TypeScript type definitions
- src/db/queries.ts - Query implementation
- docs/guide/ - User guides
- background_docs/draft-ietf-vcon-vcon-core-00.txt - Official vCon spec
- background_docs/draft-howe-vcon-consent-00.txt - Privacy and consent
- background_docs/draft-howe-vcon-lifecycle-00.txt - Lifecycle management
- Quick question about field names or data types?
  → Check DATABASE_SCHEMA_VISUAL.md
- Need code examples?
  → See DATABASE_QUICKSTART_FOR_LLMS.md
- Understanding a feature or design decision?
  → Read the relevant section in DATABASE_ARCHITECTURE_FOR_LLMS.md
- Want to see production code?
  → Check src/db/queries.ts and the test scripts in scripts/
- Creating vCons: src/db/queries.ts - createVCon()
- Searching: src/tools/search-tools.ts
- Tags: src/tools/tag-tools.ts
- Multi-tenant: src/config/tenant-config.ts
Run these to understand how the database works:
- scripts/test-database-tools.ts - Basic CRUD operations
- scripts/test-search-tools.ts - Search functionality
- scripts/test-semantic-search.ts - Semantic search
- scripts/test-tags.ts - Tag system
- Database Schema Version: 0.3.0 (IETF vCon spec version)
- Documentation Created: November 19, 2025
- Latest Migration: 20251119140000_sync_most_recent_first.sql
- Vector Dimension: 384 (migrated from 1536)
When making database changes:
- Update migration files in supabase/migrations/
- Update type definitions in src/types/vcon.ts
- Update these LLM documentation files if structure changes
- Run tests to verify changes
- Adding new tables
- Adding new fields to existing tables
- Changing indexes
- Adding new RPC functions
- Changing multi-tenant configuration
- Updating search algorithms
These three documentation files provide everything an LLM or AI system needs to:
- Understand the vCon database architecture
- Write applications that interact with the database
- Query and search conversation data
- Implement multi-tenant applications
- Optimize performance
- Comply with IETF specifications
- Handle privacy and GDPR requirements
Start with: DATABASE_QUICKSTART_FOR_LLMS.md
Reference: DATABASE_SCHEMA_VISUAL.md
Deep dive: DATABASE_ARCHITECTURE_FOR_LLMS.md
Happy coding! 🚀