langchain-cockroachdb

Requires Python 3.10+

LangChain integration for CockroachDB with native vector support

Quick Start • Features • Documentation • Examples • Contributing


Overview

Build LLM applications with CockroachDB's distributed SQL database and native vector search capabilities. This integration provides:

  • 🎯 Native Vector Support - CockroachDB's VECTOR type
  • 🚀 C-SPANN Indexes - Distributed vector indexes optimized for scale
  • 🔄 Automatic Retries - Handles serialization errors transparently
  • ⚡ Async & Sync APIs - Choose based on your use case
  • 🏗️ Distributed by Design - Built for CockroachDB's architecture

Quick Start

Installation

pip install langchain-cockroachdb

Basic Usage

import asyncio
from langchain_cockroachdb import AsyncCockroachDBVectorStore, CockroachDBEngine
from langchain_openai import OpenAIEmbeddings

async def main():
    # Initialize
    engine = CockroachDBEngine.from_connection_string(
        "cockroachdb://user:pass@host:26257/db"
    )
    
    await engine.ainit_vectorstore_table(
        table_name="documents",
        vector_dimension=1536,
    )
    
    vectorstore = AsyncCockroachDBVectorStore(
        engine=engine,
        embeddings=OpenAIEmbeddings(),
        collection_name="documents",
    )
    
    # Add documents
    await vectorstore.aadd_texts([
        "CockroachDB is a distributed SQL database",
        "LangChain makes building LLM apps easy",
    ])
    
    # Search
    results = await vectorstore.asimilarity_search(
        "Tell me about databases",
        k=2
    )
    
    for doc in results:
        print(doc.page_content)
    
    await engine.aclose()

asyncio.run(main())

Features

Vector Store

  • Native VECTOR type support with C-SPANN indexes
  • Advanced metadata filtering ($and, $or, $gt, $in, etc.)
  • Hybrid search (full-text + vector similarity)
  • Multi-tenant index support with prefix columns
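To make the filter operators above concrete, here is a minimal, self-contained sketch of how MongoDB-style filter dictionaries are commonly evaluated against document metadata. It is plain Python for illustration only, not this library's implementation (which presumably translates filters into SQL). The `matches` function, the `_COMPARATORS` table, and the exact operator set shown are assumptions.

```python
from typing import Any

# Hypothetical comparison operators; the library's actual operator set may differ.
_COMPARATORS = {
    "$eq": lambda value, target: value == target,
    "$gt": lambda value, target: value is not None and value > target,
    "$lt": lambda value, target: value is not None and value < target,
    "$in": lambda value, target: value in target,
}


def matches(metadata: dict[str, Any], filter_: dict[str, Any]) -> bool:
    """Return True if `metadata` satisfies a MongoDB-style filter dict."""
    for key, condition in filter_.items():
        if key == "$and":
            # Every sub-filter must match
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            # At least one sub-filter must match
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif isinstance(condition, dict):
            # Field with operator conditions, e.g. {"year": {"$gt": 2020}}
            value = metadata.get(key)
            for op, target in condition.items():
                if not _COMPARATORS[op](value, target):
                    return False
        else:
            # A bare value means equality, e.g. {"source": "docs"}
            if metadata.get(key) != condition:
                return False
    return True


# Example filter: source is docs or blog, AND (year > 2022 OR pinned)
example_filter = {
    "$and": [
        {"source": {"$in": ["docs", "blog"]}},
        {"$or": [{"year": {"$gt": 2022}}, {"pinned": True}]},
    ]
}
```

Nesting `$and` and `$or` this way lets one dictionary express arbitrary boolean combinations. Check the project documentation for the operators the vector store actually accepts.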

Reliability

  • Automatic retry logic with exponential backoff
  • Connection pooling with health checks
  • Configurable for different workloads
  • Built for SERIALIZABLE isolation
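Under SERIALIZABLE isolation, CockroachDB may ask a client to retry a transaction by returning SQLSTATE 40001. The sketch below shows the general shape of retry with exponential backoff and jitter. It is an illustration under stated assumptions, not the library's retry code: `SerializationError`, `run_with_retry`, and all parameter names and defaults are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SerializationError(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""


def run_with_retry(
    operation: Callable[[], T],
    max_retries: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying on SerializationError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except SerializationError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller
            # Delay doubles on each attempt and is capped at max_delay
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter spreads out clients that failed at the same moment
            sleep(delay * random.uniform(0.5, 1.0))
    raise AssertionError("unreachable")
```

The operation passed in should be safe to run more than once, since a retried transaction is re-executed from the start.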

Developer Experience

  • Async-first design for high concurrency
  • Sync wrapper for simple scripts
  • Type-safe with full type hints
  • Comprehensive test suite (92 tests)
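One common way to offer a sync wrapper over an async-first API is a thin facade that runs each coroutine to completion. The sketch below illustrates that pattern with hypothetical `AsyncStore` and `SyncStore` classes. The library's actual sync classes may be implemented differently, for example with a background event loop.

```python
import asyncio
from typing import List


class AsyncStore:
    """Hypothetical async store; stands in for an async-first API."""

    def __init__(self) -> None:
        self._texts: List[str] = []

    async def aadd_texts(self, texts: List[str]) -> int:
        await asyncio.sleep(0)  # Placeholder for real async I/O
        self._texts.extend(texts)
        return len(self._texts)  # Total number of stored texts


class SyncStore:
    """Hypothetical sync facade that drives the async store to completion."""

    def __init__(self) -> None:
        self._inner = AsyncStore()

    def add_texts(self, texts: List[str]) -> int:
        # asyncio.run creates an event loop, runs the coroutine, then closes the loop.
        # It cannot be called from inside an already running event loop.
        return asyncio.run(self._inner.aadd_texts(texts))
```

A facade like this suits short scripts. Code that already runs inside an event loop (a web server, a notebook) should call the async methods directly.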

Documentation

📚 Complete Documentation

Getting Started:

Guides:

Examples

🔧 Working Examples

Development

Setup

# Clone repository
git clone https://github.com/cockroachdb/langchain-cockroachdb.git
cd langchain-cockroachdb

# Install dependencies
pip install -e ".[dev]"

# Start CockroachDB
docker-compose up -d

# Run tests
make test

Documentation

# Install docs dependencies
pip install -e ".[docs]"

# Serve documentation locally
mkdocs serve

# Open http://127.0.0.1:8000

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

Why CockroachDB?

  • Distributed SQL - Scale horizontally across regions
  • Native Vector Support - First-class VECTOR type and C-SPANN indexes
  • Strong Consistency - SERIALIZABLE isolation by default
  • Cloud Native - Deploy anywhere (IBM, AWS, GCP, Azure, on-prem)
  • PostgreSQL Compatible - Familiar SQL with distributed superpowers

License

Apache License 2.0 - see LICENSE for details.

Acknowledgments

Built for the CockroachDB and LangChain communities.

Links