InsightAI is a production-oriented platform that turns natural language questions into safe, read-only SQL, runs queries against your operational database, and returns grounded natural-language answers backed by real row data.
Ask in plain English — get SQL, results, and a summary you can trust. The system is designed for multi-table operational schemas (schools, classrooms, enrollments, staff, and similar domains) without exposing write access to the database.
Natural language question
→ Route: SQL | RAG (documents) | both (when INSIGHTAI_RAG_ENABLED=true)
→ Relevant schema context and/or Knowledge/ vector search
→ LLM generates read-only SQL (analytics path)
→ Multi-layer SQL safety validation
→ Execute SELECT on a readonly connection
→ LLM summarizes results (no invented numbers)
→ JSON answer via REST API (sync chat, SSE stream, or debug endpoints)
Typical latency for a full question (SQL + DB + answer) is on the order of 2–30 seconds, depending on schema size, query complexity, and LLM provider.
| Phase | Capability | Status |
|---|---|---|
| 1 | Foundation — FastAPI, config, LLM providers, DB layer, Docker | Complete |
| 2 | Schema intelligence — markdown schema → context for the LLM | Complete |
| 3 | SQL generation — NL → SQL with Groq / OpenAI / OpenRouter | Complete |
| 4 | SQL safety — sqlglot AST + composite validator | Complete |
| 5 | Query execution — timeouts, row caps, MSSQL/Postgres/SQLite | Complete |
| 6 | Answer generation — grounded summaries from query results | Complete |
| 7 | Product API — chat, sessions, auth, rate limits, SSE streaming | Complete |
| 8 | Observability — audit logs, LLM usage, OTEL tracing, Prometheus /metrics (optional) |
Complete — see optional insightai[otel,prometheus] |
| 9 | Performance — Redis caching | Complete — see docs/PERFORMANCE.md |
| 10 | Hybrid RAG — vectors + SQL + cited answers | Complete — see docs/RAG_INGEST.md |
| 11 | Trusted semantic layer — approved metrics & example SQL | Complete — docs/PHASE_11_TRUSTED_SEMANTIC.md |
| 16 | App database & API key auth (platform Postgres/SQLite) | Complete — docs/PHASE_16_APP_DB_AUTH.md |
| 12 | Governance & data policy (YAML scope, masks) | Complete — docs/GOVERNANCE.md, SECURITY.md |
| 13 | Explainability — explainability on ask/chat (route, schema picks, warnings) |
Complete — FUTURE_PHASES.md § Phase 13 |
Roadmap detail: AGENT_PHASES.md. Maintainer guide: AGENT.md.
Global platform (11+): FUTURE_PHASES.md — Phases 11–20 (trusted SQL, governance, API key auth, evals, catalog; one instance per customer).
Global learning (future): BRAIN_PHASES.md — Knowledge now; lesson store + UI later.
When INSIGHTAI_RAG_ENABLED=true, the chat pipeline can:
- Answer policy / help / product questions from
Knowledge/(semantic search) - Run SQL for counts, lists, and trends
- Combine both (
route: "both") with document citations
Put policies, help text, security notes, and reference material in Knowledge/ (.md, .txt, .pdf). The API ingests on startup (or via CLI) so questions like "What is this system for?" or "When is campus closed?" use your docs—not guessed SQL.
Trusted SQL (Phase 11): Author approved metrics and golden queries in config/semantic/. Education samples: config/semantic/examples/education/. Set INSIGHTAI_SEMANTIC_ENABLED=true for trusted matching. Chat/API: use_llm: false skips the LLM when YAML matches; mode: dry_run validates SQL without running it. Validate locally: insightai-semantic-validate, insightai-semantic-test-match. See docs/PHASE_11_TRUSTED_SEMANTIC.md.
Platform app database & API keys (Phase 16): Separate readonly analytics DB and app DB (keys only — never your warehouse password). Workflow: insightai-app-db upgrade → insightai-keys create → call API with iai_... token. Settings: INSIGHTAI_API_AUTH_MODE=api_key, INSIGHTAI_API_KEY_AUTH_SOURCE=database|both. Admin: insightai-keys create --roles admin then GET /api/v1/admin/keys. Full runbook: docs/PHASE_16_APP_DB_AUTH.md.
Governance (Phase 12): Row scope, table allow/deny, and column masks in config/governance/. Operator guide: docs/GOVERNANCE.md. Production review: SECURITY.md § Governance. Enable with INSIGHTAI_GOVERNANCE_ENABLED=true; validate with insightai-governance-validate; issue scoped keys with --attributes campus_ids=1,2.
# After editing Knowledge/:
insightai-knowledge-sync --force
# PDFs need the RAG extra:
pip install -e ".[rag]"See Knowledge/README.md and docs/RAG_INGEST.md.
- Loads table/column/join metadata from exports produced by django-db-schema-doc (see Schema from Django below).
- Prefers
schema/schema.jsonwhen present (INSIGHTAI_SCHEMA_SOURCE=auto); falls back toschema/database_schema.md. - Injects only relevant tables into the prompt (configurable cap).
- Outputs structured JSON: SQL, explanation, confidence, tables used; responses can include an
explainabilitypayload (Phase 13).
- Keyword blocklist for obvious write operations.
- sqlglot AST validation — only
SELECT(and safeWITH…SELECT) accepted. - Composite validator — AST is authoritative; keyword layer is fail-closed backup.
- Rejects multi-statement batches, dangerous functions, and policy violations before execution.
- Read-only SQLAlchemy executor with dialect support: Microsoft SQL Server, PostgreSQL, SQLite.
- Configurable row cap (
INSIGHTAI_SQL_MAX_ROWS, default 1000). - Per-query timeout (
INSIGHTAI_SQL_QUERY_TIMEOUT_SECONDS, default 120s). - Truncation detection when results exceed the cap.
- Summarizes query results in plain English.
- Prompt rules: cite real column names and row counts; do not invent data.
- Row sampling (head/tail/spread) for large result sets in the LLM prompt.
- Handles empty results and truncated sets explicitly.
| Endpoint | Description |
|---|---|
POST /api/v1/chat |
Main product endpoint — one question → answer (sync JSON) |
POST /api/v1/chat/stream |
Same pipeline via SSE — status, answer tokens, then done |
POST /api/v1/chat/sessions |
Create a conversation session |
GET /api/v1/chat/sessions/{id} |
Session metadata |
GET /api/v1/chat/sessions/{id}/messages |
Conversation history |
DELETE /api/v1/chat/sessions/{id} |
Delete session |
Debug / power-user:
| Endpoint | Description |
|---|---|
POST /api/v1/ask |
Full pipeline payload — SQL, rows, token usage, timings |
POST /api/v1/sql/generate |
SQL generation only |
GET /api/v1/schema/context |
Schema context for a question |
POST /api/v1/ai/complete |
Raw LLM smoke test (not the product API) |
POST /api/v1/ai/complete/stream |
Raw LLM SSE stream (token → done; public, no auth) |
Public (no auth):
| Endpoint | Description |
|---|---|
GET /api/v1/health |
Liveness |
GET /api/v1/health/ready |
Readiness (includes DB check) |
- Optional
session_idon chat requests (orX-Session-IDheader). - Each turn stores user + assistant messages (in-memory by default; Redis optional).
- Session TTL and message limits are configurable.
INSIGHTAI_API_AUTH_MODE:none|api_key|jwt- API keys via
X-API-KeyorAuthorization: Bearer <key> - Production requires auth mode other than
none(settings validator). - Today: comma-separated secrets in
INSIGHTAI_API_KEYS(Phase 7). - DB keys (16.4):
insightai-keys createthen sendX-API-Key: iai_...(or Bearer).INSIGHTAI_API_KEY_AUTH_SOURCE=both(default) also acceptsINSIGHTAI_API_KEYS. - Protected routes:
/chat,/chat/stream,/ask,/sql,/schema. Health and/ai/completestay public.
| Phase | What to configure | Commands / docs |
|---|---|---|
| 7 | INSIGHTAI_API_AUTH_MODE=api_key, INSIGHTAI_API_KEYS, rate limits |
SECURITY.md |
| 10 | RAG: INSIGHTAI_RAG_ENABLED, pgvector URL, Knowledge/ |
insightai-knowledge-sync — docs/RAG_INGEST.md |
| 11 | Trusted YAML in config/semantic/, INSIGHTAI_SEMANTIC_ENABLED=true |
insightai-semantic-validate — docs/PHASE_11_TRUSTED_SEMANTIC.md |
| 16 | App DB, DB-backed keys, optional admin key | insightai-app-db upgrade, insightai-keys create, INSIGHTAI_API_KEY_AUTH_SOURCE=database — docs/PHASE_16_APP_DB_AUTH.md |
| 12 | INSIGHTAI_GOVERNANCE_ENABLED=true, policies.yaml + key --attributes |
docs/GOVERNANCE.md, insightai-governance-validate, insightai-keys create |
POST /api/v1/chat/stream uses Server-Sent Events (text/event-stream):
| Event | When | Payload |
|---|---|---|
status |
SQL / query / answer phase starts | {"phase": "generating_sql" | "applying_governance" | "validating_sql" | "executing_query" | "generating_answer"} |
token |
Answer text delta | {"text": "..."} |
done |
Pipeline finished | Full chat JSON (same fields as sync POST /chat) |
error |
Failure | {"error_message", "error_code", "request_id"} |
- Toggle:
INSIGHTAI_CHAT_STREAMING_ENABLED(defaulttrue;false→ HTTP 404). - Session history is written on
done(not on each token). - Auth and rate limits match sync
/chat.
curl -N -X POST http://localhost:8000/api/v1/chat/stream \
-H 'Content-Type: application/json' \
-d '{"question": "How many active classrooms are there?"}'- Sliding window per authenticated principal or client IP.
- Returns HTTP 429 with
Retry-Afterandretry_after_secondsin JSON. - Configurable:
INSIGHTAI_RATE_LIMIT_ENABLED,_REQUESTS,_WINDOW_SECONDS,_STORE(memory|redis).
- Structured logging (
structlog) with request ID on every HTTP call (X-Request-ID). - Pipeline timings logged: SQL generation, query execution, answer generation.
InsightAI is read-only by design:
- Generated SQL must pass parser + policy validation.
- Only
SELECTstatements are executed. - Database credentials should use a read-only DB user.
- API keys and JWT protect product endpoints in production.
Do not point the app at a write-capable database role in production.
Hexagonal (ports & adapters):
api/ → FastAPI routes, schemas, auth, rate limits
application/ → Use cases (ask, chat sessions, generate SQL, run query, …)
domain/ → Models, ports, exceptions
infrastructure/ → LLM, DB, schema parser, prompts, safety validators
prompts/ → LLM system/user templates
schema/ → schema.json and/or database_schema.md (from django-db-schema-doc)
context/ → optional deployment plugins (see context/README.md)
Core use case: AskUseCase orchestrates schema context → SQL → validate → execute → answer.
InsightAI does not introspect your Django app at runtime. Schema metadata is exported once from your Django project with django-db-schema-doc (PyPI), then copied into this repo’s schema/ folder (or mounted in Docker). The two projects are independent: django-db-schema-doc generates files; InsightAI consumes them.
| Capability | Management command | Typical InsightAI use |
|---|---|---|
| Human-readable schema doc | generate_database_doc |
schema/database_schema.md — markdown fallback, FK index, query examples |
Structured snapshot (schema_version: 1) |
export_schema_json |
schema/schema.json — preferred for NL→SQL context (INSIGHTAI_SCHEMA_SOURCE=auto) |
| SQL / ORM examples per table | export_schema_examples |
schema/schema_examples.json — optional; boosts example queries in context |
| RAG-sized chunks | export_ai_schema |
Split into Knowledge/*.md for hybrid document search |
| ERD explorer | generate_erd |
Documentation / analysts (not required by InsightAI) |
Install in your Django project (not in InsightAI):
pip install django-db-schema-doc# settings.py
INSTALLED_APPS = [
# ...
"db_schema_doc",
]Run after migrations (or in CI on every deploy):
cd /path/to/your/django/project
python manage.py migrate
python manage.py generate_database_doc -o DATABASE.md --project-name "Your Product"
python manage.py export_schema_json -o schema.json
python manage.py export_schema_examples -o schema_examples.json # optional
python manage.py export_ai_schema -o ai_schema.json --project-name "Your Product" # optional, for Knowledge/Use model docstrings, verbose_name, and help_text so exports include business meaning (enabled by default in django-db-schema-doc). Re-run exports whenever models or migrations change.
More commands (DBML, schema diff, CLI search): see the django-db-schema-doc README.
cp /path/to/your/django/project/DATABASE.md /path/to/InsightAI/schema/database_schema.md
cp /path/to/your/django/project/schema.json /path/to/InsightAI/schema/schema.json
cp /path/to/your/django/project/schema_examples.json /path/to/InsightAI/schema/schema_examples.json # optionalConfigure .env (see .env.example):
INSIGHTAI_SCHEMA_SOURCE=auto
INSIGHTAI_SCHEMA_JSON_PATH=schema/schema.json
INSIGHTAI_SCHEMA_MARKDOWN_PATH=schema/database_schema.md
# INSIGHTAI_SCHEMA_EXAMPLES_JSON_PATH=schema/schema_examples.jsonRestart the API (or clear schema cache in tests) so the registry reloads. Details: schema/README.md.
- Hybrid RAG: copy business
.mdintoKnowledge/; optionally turnexport_ai_schemadocuments[]into topic-focused files, theninsightai-knowledge-sync --force(docs/RAG_INGEST.md). - Trusted SQL: use
schema_examples.jsonand domain knowledge when authoringconfig/semantic/. - Extended table ranking: optional
context/plugins/+INSIGHTAI_SCHEMA_CONTEXT_PLUGIN(context/README.md) — keep customer-specific table names out of coresrc/.
On schema changes: export in Django CI → artifact → copy into InsightAI deploy → insightai-knowledge-sync if Knowledge changed. Same readonly DB credentials InsightAI uses for SELECT only (Security model).
- Python 3.12+, FastAPI, Pydantic v2, Uvicorn
- SQLAlchemy 2 — MSSQL (pyodbc), PostgreSQL, SQLite
- LlamaIndex (primary AI framework adapter)
- LLM providers (config:
INSIGHTAI_LLM_PROVIDER): Groq, OpenAI, OpenRouter (300+ models via openrouter.ai) - OpenRouter uses the official
openrouterPython SDK; model slugs look likeopenai/gpt-4o-minioranthropic/claude-3.5-haiku - sqlglot — SQL parsing and safety
- Redis (optional) — sessions and rate limits
- Docker Compose — local API + Postgres demo DB
cd InsightAI
python3.12 -m venv .venv
source .venv/bin/activate
cp .env.example .env
# Set LLM API key (GROQ_API_KEY, OPENAI_API_KEY, or OPENROUTER_API_KEY) and database settings
pip install -e ".[dev,mssql,rag]"
# or: pip install -r requirements-dev.txt && pip install -r requirements-rag.txt
uvicorn insightai.main:create_app --factory --reload
# or: insightaiIf your operational schema lives in Django, use Schema from Django: export with django-db-schema-doc, copy into schema/, set INSIGHTAI_SCHEMA_* in .env, add Knowledge/, then start the API.
Short checklist:
- In Django:
pip install django-db-schema-doc, adddb_schema_doctoINSTALLED_APPS, rungenerate_database_doc+export_schema_json(and optionalexport_schema_examples/export_ai_schema). - Copy
DATABASE.md→schema/database_schema.md,schema.json→schema/schema.json. - In InsightAI:
INSIGHTAI_SCHEMA_SOURCE=auto, LLM + readonly DB settings in.env. insightai-knowledge-sync --forceafter populatingKnowledge/.uvicorn insightai.main:create_app --factory --reload
Verify:
curl http://localhost:8000/api/v1/health
open http://localhost:8000/docsCLI (easiest):
# With the API running (see above):
python scripts/ask.py "How many active classrooms are there?"
python scripts/ask.py --stream "How many students per classroom?"
python scripts/ask.py # interactive — type questions until you press Enter on empty lineBrowser UI: python apps/serve_demo.py → open http://127.0.0.1:8765 (ChatGPT-style chat with sessions; no auth required in dev)
See apps/README.md for options (--include-sql, API key, etc.).
curl -s -X POST http://localhost:8000/api/v1/chat \
-H 'Content-Type: application/json' \
-d '{"question": "How many active classrooms are there?"}'SESSION=$(curl -s -X POST http://localhost:8000/api/v1/chat/sessions \
-H 'Content-Type: application/json' \
-d '{"title": "Enrollment questions"}' | jq -r .id)
curl -s -X POST http://localhost:8000/api/v1/chat \
-H 'Content-Type: application/json' \
-d "{\"question\": \"How many students per classroom?\", \"session_id\": \"$SESSION\"}"
curl -s "http://localhost:8000/api/v1/chat/sessions/$SESSION/messages"curl -N -X POST http://localhost:8000/api/v1/chat/stream \
-H 'Content-Type: application/json' \
-d "{\"question\": \"How many students per classroom?\", \"session_id\": \"$SESSION\"}"You will see event: status, event: token, and a final event: done with the complete JSON answer.
cp .env.example .env # set GROQ_API_KEY
docker compose up --buildThe Compose stack uses a sample Postgres database for demos. For Microsoft SQL Server in Docker, run the API on the host (or extend the image with ODBC) and set DB_HOST=localhost — see docker/README.md.
Copy .env.example to .env. Important variables:
| Variable | Purpose |
|---|---|
INSIGHTAI_LLM_PROVIDER |
groq | openai | openrouter |
GROQ_API_KEY |
Groq Cloud (when provider is groq) |
OPENAI_API_KEY |
OpenAI (when provider is openai) |
OPENROUTER_API_KEY |
OpenRouter (when provider is openrouter) |
INSIGHTAI_OPENROUTER_MODEL |
Model slug, e.g. openai/gpt-4o-mini |
INSIGHTAI_RAG_ENABLED |
Enable hybrid document + SQL routing |
INSIGHTAI_RAG_VECTOR_BACKEND |
memory (dev) or pgvector |
INSIGHTAI_RAG_SYNC_KNOWLEDGE_ON_STARTUP |
Ingest Knowledge/ on API start |
INSIGHTAI_SCHEMA_SOURCE |
auto | json | markdown |
INSIGHTAI_SCHEMA_JSON_PATH |
Path to export_schema_json output |
INSIGHTAI_SCHEMA_EXAMPLES_JSON_PATH |
Optional export_schema_examples output |
INSIGHTAI_SCHEMA_CONTEXT_PLUGIN |
Optional context.plugins.module:ClassName |
INSIGHTAI_DATABASE_KIND |
mssql | postgresql | sqlite |
DB_READONLY_USER / DB_READONLY_PASSWORD |
Preferred for MSSQL (auto URL-encoding) |
INSIGHTAI_SQL_MAX_ROWS |
Max rows per query (default 1000) |
INSIGHTAI_SQL_QUERY_TIMEOUT_SECONDS |
Query timeout (default 120) |
INSIGHTAI_APP_DATABASE_URL |
Platform DB (default SQLite data/insightai_app.db) |
INSIGHTAI_API_AUTH_MODE |
none | api_key | jwt |
INSIGHTAI_API_KEY_AUTH_SOURCE |
env | database | both (default both) |
INSIGHTAI_API_KEYS |
Comma-separated env keys (when source is env or both) |
INSIGHTAI_RATE_LIMIT_ENABLED |
Enable rate limiting |
INSIGHTAI_CHAT_STREAMING_ENABLED |
Enable POST /api/v1/chat/stream (default true) |
INSIGHTAI_CHAT_SESSION_STORE |
memory | redis |
Full list: .env.example.
LLM tips:
- For NL→SQL on Groq, prefer
INSIGHTAI_GROQ_MODEL=llama-3.3-70b-versatile—groq/compoundcan return HTTP 413 on large prompts. - OpenRouter model list: openrouter.ai/models.
| Command | Purpose |
|---|---|
insightai |
Run the API (same as uvicorn factory) |
insightai-ingest |
Build JSONL embedding index from documents |
insightai-rag-load |
Load JSONL into pgvector |
insightai-knowledge-sync |
Ingest Knowledge/ into the vector store |
insightai-app-db |
App DB migrations (upgrade, current, revision) |
insightai-semantic-validate |
Validate trusted semantic YAML + SQL parse |
insightai-semantic-test-match |
Test question match against semantic catalog |
insightai-governance-validate |
Validate config/governance/policies.yaml |
insightai-keys |
Create/list/revoke API keys (create, list, revoke) |
The examples below are adapted from real end-to-end runs against a multi-school operational database (classrooms, enrollments, users). Names and identifiers are anonymized; behavior and row counts reflect actual system responses.
| Question | Typical outcome (summary) |
|---|---|
| How many classrooms do I have? | Returns dozens of classrooms (e.g. ~40 rows). Answer lists sample names such as Sunrise Room, AAA Studio, MMM Wing, OOO Building, Music Studio. |
| How many children are in each classroom? | One row per classroom with counts (e.g. 19 classrooms). Example: Building A ~48 students, Sunrise Room ~1, River Campus ~28, OOO Building ~31. |
| Question | Typical outcome (summary) |
|---|---|
| Who is in Sunrise Room classroom? (extra word in name) | Often 0 rows — classroom stored as Sunrise Room without the extra word. |
| Who is in Sunrise Room? (ambiguous) | May 0 rows if the model searches the wrong table (e.g. activity titles instead of enrollments). |
| Who is in Sunrise Room? It is a classroom name. | 1 row — e.g. user id 122, username student001@example.org, classroom Sunrise Room. |
| Who is in Sunrise Room? Tell me the student's name. | 1 row — e.g. first name Alex, last name Rivera. |
Takeaway: Clear questions that match exact classroom names in the schema work best. The debug /ask endpoint returns generated SQL so you can refine wording.
| API | Best for |
|---|---|
POST /api/v1/chat |
Production — single JSON response when you do not need live tokens |
POST /api/v1/chat/stream |
Production UI — progressive status + answer text via SSE |
POST /api/v1/ask |
Debugging — full SQL, row payload, token usage, schema tables used |
Example chat response shape (abbreviated):
{
"question": "How many classrooms do I have?",
"answer": "The query returned 40 rows, which means you have 40 classrooms...",
"row_count": 40,
"truncation_noted": false,
"request_id": "a6f6bfdf-f684-4bbc-a50b-34051a489dd9",
"session_id": "d8a044a4-46fc-4530-8f78-8489ed1ccab9",
"timings": {
"sql_generation_ms": 2435.5,
"query_execution_ms": 56.66,
"answer_generation_ms": 871.54,
"total_ms": 3365.41
}
}Optional flags on chat: include_sql, include_data, timeout_seconds, route (sql | rag | both).
| Question type | Example | Route |
|---|---|---|
| Policy / product | What is this system for? | RAG → Knowledge/ |
| Analytics | How many students per classroom? | SQL |
| Combined | Per handbook, how many classrooms are required? | both |
pip install -e ".[dev]"
./scripts/test.sh # ruff + mypy + pytest
pytest tests -q # all tests
pytest tests/unit -q # unit only
pytest tests/integration/test_chat_product_e2e.py -m integration
pytest tests/integration/test_chat_stream_e2e.py -m streamingTest layout: tests/README.md, tests/integration/README.md.
export INSIGHTAI_MSSQL_INTEGRATION_URL="mssql+pyodbc://..."
pytest -m mssqlsrc/insightai/ Application code
Knowledge/ Business docs for RAG
schema/ Database metadata (`schema.json` and/or `database_schema.md`)
context/ Optional deployment-local plugins (gitignored in `context/plugins/*`)
prompts/ LLM prompts (SQL, answer, RAG, hybrid)
tests/ Unit and integration tests
docker/ Compose + Postgres init scripts
config/ Per-instance YAML (semantic, future governance)
docs/ RAG ingest, performance, Phase 11 implementation log
AGENT.md Maintainer / agent guide
AGENT_PHASES.md Phase roadmap (1–10)
FUTURE_PHASES.md Global platform roadmap (11–20+)
BRAIN_PHASES.md Global learning / lessons
REAL_TEST.md Raw curl transcripts (developer reference)
Contributors and agents: when you add or change user-facing behavior, API routes, configuration, or phase completion, update this README in the same change set. Keep AGENT.md and AGENT_PHASES.md in sync for phase status.
Copyright (c) 2025 MrHiB. Released under the MIT License.
Attribution: If you use, modify, or redistribute this project (including building an app on top of it), you must credit InsightAI by MrHiB and link to the canonical repository. See NOTICE for details.
See SECURITY.md for how to report vulnerabilities and a production deployment checklist.