Build domain-specific AI assistants that connect chat, voice, knowledge search, CRM workflows, and business automation in one observable runtime.
This repository is a production-oriented foundation for AI assistants that answer from private knowledge, search domain catalogs, qualify leads, update CRM/workflow systems with human approval, and operate across Telegram, voice, API, and Mini App surfaces.
The current codebase includes one working sales/catalog automation domain, but the platform is intentionally modular: replace the domain prompts, search tools, and business integrations while keeping the orchestration, retrieval, ingestion, caching, observability, and runtime contracts.
Most business bots stop at scripted replies. Most AI demos stop at a prompt. This project demonstrates the full operational loop: users ask in natural language, the system retrieves grounded knowledge, routes workflow intent, calls business tools, asks for human approval when needed, and leaves traces operators can inspect.
| Business need | Platform capability |
|---|---|
| Reduce repeated manual answers | RAG over private documents with citation-aware generation |
| Convert conversations into action | CRM/workflow tools, lead scoring, tasks, notes, and manager handoff |
| Let users search naturally | Domain-specific extraction plus hybrid vector/search pipelines |
| Support more than text chat | Telegram bot, Telegram voice input, LiveKit voice-agent path, API, and Mini App surface |
| Keep AI behavior inspectable | LangGraph state flow, Langfuse traces, structured logs, quality scores, and runbooks |
| Control latency and cost | Redis-backed semantic, embedding, search, rerank, and extraction caches |
The reusable part is the platform: channels, graph orchestration, retrieval, ingestion, cache, observability, and Docker runtime. The replaceable part is the domain layer: prompts, tools, search schema, CRM/workflow integration, and UI copy.
| Domain | Replaceable module examples |
|---|---|
| Sales and CRM | Lead qualification, deal lookup, tasks, notes, manager handoff |
| Customer support | Policy answers, ticket lookup, escalation workflows |
| E-commerce | Product catalog search, recommendations, order status tools |
| Education | Course search, student FAQ, onboarding flows |
| Internal operations | Knowledge-base assistant, document search, approval workflows |
| Capability | What it means in this repo | Evidence |
|---|---|---|
| Stateful AI orchestration | LangGraph routes classification, guard, cache, retrieval, grading, reranking, generation, response, and optional summarization | telegram_bot/graph/ |
| Typed workflow state | One state contract tracks query, routing, retrieval, filters, cache, scoring, policy, latency, and response metadata | telegram_bot/graph/state.py |
| Cross-channel assistant runtime | Telegram text, Telegram voice, LiveKit voice agent, FastAPI RAG API, and Mini App handoff reuse the same core ideas | src/voice/, src/api/, mini_app/ |
| Self-hosted retrieval | BGE-M3 + Qdrant support dense, sparse, and ColBERT-style retrieval paths | docs/QDRANT_STACK.md |
| Deterministic ingestion | CocoIndex and Docling parse, chunk, embed, upsert/delete, retry, and track DLQ state | docs/INGESTION.md |
| Business tool actions | CRM/domain tools can create workflow actions with HITL confirmation for sensitive writes | telegram_bot/agents/ |
| Cost and latency controls | Redis caches semantic answers, embeddings, search results, rerank results, and extraction outputs | telegram_bot/integrations/cache.py |
| Observability | Langfuse traces/scores, trace validation, Loki/Promtail/Alertmanager local monitoring, and runbooks | docs/PIPELINE_OVERVIEW.md |
| Compose-first runtime | Docker Compose profiles cover core services, bot, ingestion, voice, ML observability, monitoring, and full stack | DOCKER.md |
A simple chatbot receives a message and calls an LLM. This repository treats the assistant as an operating system for business workflows:
- The graph can decide whether to answer, retrieve, rewrite, rerank, call tools, ask for approval, or hand off.
- Retrieval is not a single vector query; it includes dense/sparse search, optional reranking, cache policy, grading, and fallbacks.
- Domain behavior lives behind tools and prompts, so catalog/search/CRM logic can be replaced without rewriting the runtime.
- Runtime behavior is observable through traces, logs, health checks, validation commands, and documented runbooks.
graph TB
subgraph "Channels"
TG["Telegram Bot"]
TV["Telegram Voice"]
MA["Telegram Mini App"]
VA["LiveKit Voice Agent"]
API["FastAPI RAG API"]
end
subgraph "AI Workflow Core"
ROUTER["Classifier / Router"]
GRAPH["LangGraph Workflow"]
HITL["Human Approval"]
TOOLS["Domain + CRM Tools"]
end
subgraph "Knowledge Runtime"
INGEST["CocoIndex + Docling Ingestion"]
RAG["RAG Pipeline"]
CACHE["Redis Semantic Cache"]
EVAL["Evaluation + Trace Validation"]
end
subgraph "Data + Model Services"
QD[("Qdrant")]
RD[("Redis")]
PG[("PostgreSQL")]
BGE["BGE-M3 Embeddings"]
LLM["LiteLLM Providers"]
end
subgraph "Operations"
LF["Langfuse"]
MON["Loki / Promtail / Alertmanager"]
DOCKER["Docker Compose Profiles"]
end
TG --> ROUTER
TV --> ROUTER
MA --> TG
VA --> API --> ROUTER
ROUTER --> GRAPH
GRAPH --> RAG
GRAPH --> TOOLS
TOOLS --> HITL
RAG --> CACHE
RAG --> QD
INGEST --> QD
CACHE --> RD
GRAPH --> PG
RAG --> BGE
GRAPH --> LLM
GRAPH --> LF
EVAL --> LF
DOCKER --> QD
DOCKER --> BGE
DOCKER --> MON
If you are evaluating the project for collaboration, hiring, or client work, read it in this order:
docs/portfolio/resume-case-study.md- concise narrative, feature cards, trade-offs, and limitations.docs/review/PROJECT_GUIDE.md- folder map and high-signal files.telegram_bot/graph/- LangGraph orchestration and state flow.telegram_bot/agents/andtelegram_bot/services/- tools, business logic, cache, search, CRM, scoring, handoff.src/ingestion/unified/- deterministic ingestion and Qdrant writes.compose.yml,compose.dev.yml, andDOCKER.md- runtime architecture.
Safe review notes:
- Do not run production deploy scripts, use real credentials, or trigger real CRM write paths without a dedicated review environment.
- Start with
docs/review/ACCESS_FOR_REVIEWERS.mdbefore executing commands. - Repository presentation checklist lives in
docs/review/GITHUB_REPO_SETUP.md.
- Workflow architecture: LangGraph graph, typed state, conditional edges, tool boundaries, and HITL confirmation.
- Retrieval infrastructure: BGE-M3, Qdrant dense/sparse/ColBERT vectors, aliases, strict mode, and reranking path.
- Ingestion reliability: stable file identity, Docling parsing, chunking, Qdrant upsert/delete, PostgreSQL state, retries, and DLQ.
- Runtime discipline: Compose profiles, pinned images, health checks, preflight checks, remote Docker helpers, and operational docs.
- Cost controls: Redis semantic answer cache, embedding cache, search cache, rerank cache, and extraction cache.
- Observability: Langfuse traces/scores, prompt management, trace validation, local monitoring, and runbooks.
- Quality gates: Ruff, MyPy, pytest tiers, CI guardrails, and documented local validation.
The repository currently includes a working sales/catalog automation module: catalog-style search, lead workflows, manager handoff, scoring, and Kommo CRM integration. Treat that as the first domain implementation, not the boundary of the platform.
To adapt the system, replace the domain prompts, extraction logic, catalog/search tools, CRM/tool integrations, and UI copy while keeping the common runtime: channels, LangGraph orchestration, RAG, ingestion, cache, observability, and Docker profiles.
Choose the path that matches your goal:
| Goal | Start here |
|---|---|
| Review safely before running commands | docs/review/ACCESS_FOR_REVIEWERS.md |
| Run core local services | make local-up |
| Run the bot natively for fast iteration | make test-bot-health then make run-bot |
| Run the Compose bot stack | make docker-bot-up |
| Run the full Compose stack | make docker-full-up |
| Understand runtime profiles and ports | DOCKER.md |
- Python 3.11+; Python 3.12 is recommended for local development.
uv- Docker with Compose support.
.envcopied from.env.exampleand filled with local/test credentials.
cp .env.example .env.env.local is not loaded automatically; use .env for local runs.
For full setup, validation ladder, environment behavior, and troubleshooting, use docs/LOCAL-DEVELOPMENT.md.
Docker Compose is the primary local/VPS runtime. Profiles split the system by operational surface:
| Profile | Services |
|---|---|
| default/core | PostgreSQL, Redis, Qdrant, BGE-M3, Docling, user-base, Mini App API/frontend |
bot |
LiteLLM and Telegram bot |
ingest |
unified ingestion service |
ml |
Langfuse, ClickHouse, MinIO, Redis Langfuse, worker |
obs |
Loki, Promtail, Alertmanager |
voice |
RAG API, LiveKit, SIP, voice agent |
full |
all profile-gated services |
The repo also includes remote Docker helpers for running Compose on a remote host. Treat this as a development convenience, not a required public setup path; see docs/LOCAL-DEVELOPMENT.md and DOCKER.md for details.
Use docs/review/PROJECT_GUIDE.md for the maintained folder map and high-signal files.
High-level entry points:
| Area | Path |
|---|---|
| Telegram assistant runtime | telegram_bot/ |
| LangGraph workflow | telegram_bot/graph/ |
| Business/domain tools | telegram_bot/agents/ and telegram_bot/services/ |
| RAG API and voice | src/api/ and src/voice/ |
| Unified ingestion | src/ingestion/unified/ |
| Mini App | mini_app/ |
| Runtime | compose.yml, compose.dev.yml, DOCKER.md |
make check # Ruff lint + MyPy strict type checking
make test-unit # Unit tests (parallel via pytest-xdist)
make test-full # Full suite: parallel-safe tiers first, live/stateful tiers afterLocal verification is the release authority for this repo. Run focused checks for the touched area before merging to dev or deploying. CI is intentionally lightweight: it runs static/lint guardrails, not pytest suites or the authoritative full-suite signal.
- Docker Compose is the primary local runtime path.
- k3s manifests exist for core services but are not full parity with Compose.
- Monitoring services are local/dev unless production evidence is added.
- HITL confirmation protects CRM/write workflows, not every possible state transition.
- The Mini App is a lightweight entry surface and not full parity with the Telegram bot.
- Some UI/i18n strings are still being migrated into Fluent bundles.
- Domain-specific names, prompts, and catalog fields remain in the current implementation and should be replaced for a different customer domain.
| Document | Use it for |
|---|---|
docs/portfolio/resume-case-study.md |
portfolio narrative and feature cards |
docs/review/ACCESS_FOR_REVIEWERS.md |
safe review path |
docs/review/PROJECT_GUIDE.md |
folder map and high-signal files |
docs/README.md |
full documentation index |
DOCKER.md |
Compose services, profiles, ports, env, runtime contracts |
docs/LOCAL-DEVELOPMENT.md |
local setup and validation ladder |
docs/PIPELINE_OVERVIEW.md |
query, retrieval, ingestion, voice, observability flows |
docs/INGESTION.md |
unified ingestion operations |
docs/QDRANT_STACK.md |
vector schema and Qdrant operations |
This project is licensed under the MIT License.