Base URL: http://127.0.0.1:8012
Interactive Docs:
- Swagger UI: http://127.0.0.1:8012/docs
- ReDoc: http://127.0.0.1:8012/redoc
All endpoints return JSON unless otherwise specified. The API is designed for local-first use and binds to 127.0.0.1 by default.
- RAG Operations
- Configuration & Management
- Indexing & Data
- Cost & Performance
- Evaluation
- Observability
- MCP Wrapper Endpoints
Full RAG pipeline: retrieval → reranking → generation with citations.
Query Parameters:
- q (string, required) - The question to answer
- repo (string, optional) - Repository name (defaults to the REPO env var)
- top_k (integer, optional) - Number of results to retrieve (default: 10)
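A minimal Python sketch of building this request with the standard library. The `/answer` path and the parameter names come from this reference; the helper names `build_answer_url` and `ask` and the 60-second timeout are illustrative assumptions, not part of the API.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8012"  # base URL stated at the top of this reference


def build_answer_url(question, repo=None, top_k=None, base_url=BASE_URL):
    """Build a GET URL for /answer, omitting optional parameters left as None."""
    params = {"q": question}
    if repo is not None:
        params["repo"] = repo
    if top_k is not None:
        params["top_k"] = top_k
    # quote_via=quote encodes spaces as %20, matching the curl example.
    query = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return f"{base_url}/answer?{query}"


def ask(question, repo=None, top_k=None, timeout=60):
    """Call /answer and return the decoded JSON body (needs a running server).

    The timeout value is an assumption; generation may take longer in practice.
    """
    url = build_answer_url(question, repo, top_k)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

With a server running locally, `ask("Where is OAuth validated", repo="agro")["citations"]` would return the citation list described in the response for this endpoint.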
Response:
{
"answer": "[repo: agro]\n\nOAuth tokens are validated in the `auth/middleware.py` file...",
"citations": [
"auth/middleware.py:45-67",
"server/auth.py:120-145"
],
"repo": "agro",
"confidence": 0.78,
"retrieval_count": 5
}
Example:
curl "http://127.0.0.1:8012/answer?q=Where%20is%20OAuth%20validated&repo=agro"
Retrieval only (no generation). Returns ranked code chunks with rerank scores.
Query Parameters:
- q (string, required) - Search query
- repo (string, optional) - Repository name
- top_k (integer, optional) - Number of results (default: 10)
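Once a search response has been decoded, the hits can be reduced to the same `path:start-end` form that the answer endpoint uses for citations. The sketch below relies only on the field names shown in the response for this endpoint; the function name `format_hits` and the client-side `min_score` filter are hypothetical.

```python
def format_hits(payload, min_score=0.0):
    """Return 'path:start-end (score)' strings for hits at or above min_score, best first."""
    hits = [r for r in payload.get("results", []) if r["rerank_score"] >= min_score]
    hits.sort(key=lambda r: r["rerank_score"], reverse=True)
    return [
        f"{r['file_path']}:{r['start_line']}-{r['end_line']} ({r['rerank_score']:.2f})"
        for r in hits
    ]


# Trimmed copy of the sample response documented for this endpoint.
sample = {
    "results": [
        {
            "file_path": "auth/middleware.py",
            "start_line": 45,
            "end_line": 67,
            "rerank_score": 0.85,
        }
    ],
    "repo": "agro",
}
print(format_hits(sample))
```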
Response:
{
"results": [
{
"file_path": "auth/middleware.py",
"start_line": 45,
"end_line": 67,
"language": "python",
"rerank_score": 0.85,
"layer": "server",
"repo": "agro",
"code": "def validate_oauth_token(token: str):\n ..."
}
],
"repo": "agro",
"count": 5,
"query": "Where is OAuth validated"
}
Example:
curl "http://127.0.0.1:8012/search?q=authentication&repo=agro&top_k=5"
Multi-turn conversational chat with memory and context.
Request Body:
{
"message": "How does the indexer work?",
"repo": "agro",
"thread_id": "user-session-123",
"stream": false
}
Response:
{
"answer": "The indexer in AGRO works by...",
"citations": ["indexer/index_repo.py:45-120"],
"thread_id": "user-session-123",
"turn_count": 3
}
Streaming Response (SSE):
Set "stream": true to get Server-Sent Events:
curl -X POST http://127.0.0.1:8012/api/chat \
-H "Content-Type: application/json" \
-d '{"message":"Explain indexing","repo":"agro","stream":true}' \
--no-buffer
Get current environment configuration and repository settings.
Response:
{
"env": {
"GEN_MODEL": "gpt-4o-mini",
"EMBEDDING_TYPE": "openai",
"RERANK_BACKEND": "cohere",
"REPO": "agro",
"MQ_REWRITES": 4
},
"repos": {
"default_repo": "agro",
"repos": [
{
"name": "agro",
"path": ["/Users/user/agro"],
"enabled": true,
"keywords": ["rag", "retrieval", "hybrid"]
}
]
}
}
Example:
curl http://127.0.0.1:8012/api/config
Update configuration (writes to .env and repos.json).
Request Body:
{
"env": {
"GEN_MODEL": "gpt-4o",
"RERANK_BACKEND": "local",
"MQ_REWRITES": 6
},
"repos": {
"default_repo": "agro"
}
}
Response:
{
"ok": true,
"updated": ["GEN_MODEL", "RERANK_BACKEND", "MQ_REWRITES"],
"requires_restart": false,
"requires_reindex": false
}
List all saved configuration profiles.
Response:
{
"profiles": [
{
"name": "fast-local",
"description": "BM25-only, local models",
"settings": {
"GEN_MODEL": "qwen3-coder:14b",
"EMBEDDING_TYPE": "local",
"RERANK_BACKEND": "local"
}
},
{
"name": "high-quality",
"description": "Full hybrid, OpenAI models",
"settings": {
"GEN_MODEL": "gpt-4o",
"EMBEDDING_TYPE": "openai",
"RERANK_BACKEND": "cohere"
}
}
]
}
Save current config as a named profile.
Request Body:
{
"name": "my-profile",
"description": "Custom settings for X",
"settings": {
"GEN_MODEL": "gpt-4o-mini",
"RERANK_BACKEND": "cohere"
}
}
Apply a saved profile (updates env vars).
Request Body:
{
"name": "high-quality"
}
Response:
{
"ok": true,
"applied": "high-quality",
"updated_keys": ["GEN_MODEL", "EMBEDDING_TYPE", "RERANK_BACKEND"]
}
Start indexing a repository (async operation).
Query Parameters:
- repo (string, optional) - Repository name
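Because indexing runs in the background, a client usually polls the job status until it stops reporting an active state. The sketch below is written under several assumptions: this reference only shows the statuses "started" and "running", so any other value is treated as terminal; the status request itself is injected as a callable (`fetch_status`) because the status path is not restated here; and `wait_for_index` and `percent_done` are hypothetical helper names.

```python
import time

ACTIVE_STATUSES = {"started", "running"}  # the only statuses shown in this reference


def wait_for_index(fetch_status, interval=2.0, timeout=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the job leaves an active state or timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status.get("status") not in ACTIVE_STATUSES:
            return status  # assumed terminal: anything other than started/running
        if clock() >= deadline:
            raise TimeoutError(f"indexing still {status.get('status')!r} after {timeout}s")
        sleep(interval)


def percent_done(status):
    """Percentage of files processed, from the 'progress' object of a status payload."""
    progress = status.get("progress", {})
    total = progress.get("files_total") or 0
    return 100.0 * progress.get("files_processed", 0) / total if total else 0.0
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.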
Response:
{
"status": "started",
"repo": "agro",
"job_id": "idx-20251013-123456",
"message": "Indexing started in background"
}
Example:
curl -X POST "http://127.0.0.1:8012/api/index/start?repo=agro"
Check indexing job status.
Response:
{
"status": "running",
"repo": "agro",
"progress": {
"files_processed": 234,
"files_total": 567,
"chunks_created": 1234,
"elapsed_seconds": 45
},
"message": "Processing files..."
}
Build semantic cards (high-level summaries) for a repo.
Query Parameters:
- repo (string, optional) - Repository name
- enrich (integer, optional) - Enable enrichment with LLM (1=yes, 0=no, default: 1)
Response:
{
"job_id": "cards-abc123",
"status": "started",
"repo": "agro",
"stream_url": "/api/cards/build/stream/cards-abc123"
}
Stream card building progress (SSE).
Response (Server-Sent Events):
event: progress
data: {"files_processed": 10, "total": 100, "message": "Processing auth/"}
event: card
data: {"file": "auth/oauth.py", "card": "OAuth token validation..."}
event: complete
data: {"total_cards": 45, "elapsed": 123}
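A stream like this can be consumed with a small parser. The sketch below follows the standard Server-Sent Events framing, in which a blank line terminates each event, and assumes every `data:` payload is JSON, as in the sample above. The function name `parse_sse` is hypothetical.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:  # a blank line terminates the current event
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip(" "))
    if data:  # stream ended without a trailing blank line
        yield event, json.loads("\n".join(data))


stream = [
    'event: progress\n',
    'data: {"files_processed": 10, "total": 100, "message": "Processing auth/"}\n',
    '\n',
    'event: complete\n',
    'data: {"total_cards": 45, "elapsed": 123}\n',
    '\n',
]
events = list(parse_sse(stream))
```

The same function accepts any iterable of decoded text lines, for example lines read from an open HTTP response.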
Example:
curl -N http://127.0.0.1:8012/api/cards/build/stream/cards-abc123
List all semantic cards for a repo.
Query Parameters:
- repo (string, optional) - Repository name
Response:
{
"repo": "agro",
"cards": [
{
"file_path": "auth/oauth.py",
"summary": "OAuth 2.0 token validation and refresh logic",
"keywords": ["oauth", "token", "validation", "auth"]
}
],
"count": 45
}
Estimate costs for a given configuration.
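The totals in the sample response for this endpoint are consistent with simple arithmetic over the per-component breakdown: the daily cost is the sum of the components, the monthly cost is the daily cost times 30, and the per-request cost is the daily cost divided by `requests_per_day`. The sketch below reproduces that relationship; the 30-day month is inferred from the sample figures rather than confirmed by the server code, and `summarize_costs` is a hypothetical name.

```python
def summarize_costs(breakdown, requests_per_day, days_per_month=30):
    """Derive daily, monthly, and per-request totals from a daily cost breakdown."""
    if requests_per_day <= 0:
        raise ValueError("requests_per_day must be positive")
    daily = round(sum(breakdown.values()), 2)
    return {
        "daily_cost": daily,
        # Assumption: a month is counted as 30 days, as the sample figures suggest.
        "monthly_cost": round(daily * days_per_month, 2),
        "per_request": round(daily / requests_per_day, 4),
    }


summary = summarize_costs(
    {"generation": 1.20, "embeddings": 0.80, "reranking": 0.45},
    requests_per_day=100,
)
```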
Request Body:
{
"gen_provider": "openai",
"gen_model": "gpt-4o-mini",
"embed_provider": "openai",
"embed_model": "text-embedding-3-large",
"rerank_provider": "cohere",
"rerank_model": "rerank-3.5",
"tokens_in": 1000,
"tokens_out": 500,
"embeds": 100,
"reranks": 50,
"requests_per_day": 100
}
Response:
{
"daily_cost": 2.45,
"monthly_cost": 73.50,
"breakdown": {
"generation": 1.20,
"embeddings": 0.80,
"reranking": 0.45
},
"per_request": 0.0245
}
Full pipeline cost estimate based on actual usage patterns.
Request Body:
{
"repo": "agro",
"queries_per_day": 50,
"avg_chunks_per_query": 10,
"avg_output_tokens": 300
}
Response:
{
"daily": 5.67,
"monthly": 170.10,
"yearly": 2069.55,
"breakdown": {
"retrieval": 1.20,
"reranking": 0.80,
"generation": 3.67
}
}
Get model pricing database.
Response:
{
"models": [
{
"provider": "openai",
"model": "gpt-4o-mini",
"unit": "1k_tokens",
"input_cost": 0.000150,
"output_cost": 0.000600
},
{
"provider": "cohere",
"model": "rerank-3.5",
"unit": "1k_searches",
"rerank_per_1k": 2.00
}
]
}
Add or update model pricing.
Request Body:
{
"provider": "openai",
"model": "gpt-4o",
"unit": "1k_tokens",
"input_cost": 0.0025,
"output_cost": 0.010
}
List all golden test questions.
Response:
{
"tests": [
{
"q": "Where is OAuth validated?",
"repo": "agro",
"expect_paths": ["auth", "oauth", "token"]
}
],
"count": 10
}
Add a new golden test.
Request Body:
{
"q": "How does the reranker work?",
"repo": "agro",
"expect_paths": ["rerank", "retrieval"]
}
Update an existing golden test.
Request Body:
{
"q": "Where is OAuth token validated?",
"repo": "agro",
"expect_paths": ["auth", "oauth", "token", "validation"]
}
Delete a golden test by index.
Response:
{
"ok": true,
"deleted_index": 2,
"remaining_count": 9
}
Test a single question without adding to golden set.
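The sample request and response for this endpoint suggest that a test counts as a hit when at least one `expect_paths` entry appears as a substring of a retrieved `file_path`. The sketch below illustrates that reading. It is an assumption drawn from the sample data, not a description of the server's actual matching rule, and `matched_paths` and `is_hit` are hypothetical names.

```python
def matched_paths(expect_paths, results):
    """Return the expected fragments found in at least one result's file_path."""
    # Assumption: matching is a case-sensitive substring test.
    return [p for p in expect_paths if any(p in r["file_path"] for r in results)]


def is_hit(expect_paths, results):
    """A test is treated as a hit if any expected fragment matched."""
    return bool(matched_paths(expect_paths, results))


results = [{"file_path": "indexer/index_repo.py", "rerank_score": 0.85}]
print(matched_paths(["index", "chunk"], results), is_hit(["index", "chunk"], results))
```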
Request Body:
{
"q": "How does indexing work?",
"repo": "agro",
"expect_paths": ["index", "chunk"]
}
Response:
{
"hit": true,
"results": [
{
"file_path": "indexer/index_repo.py",
"rerank_score": 0.85
}
],
"matched_paths": ["index"]
}
Run full evaluation suite.
Request Body (optional):
{
"save_baseline": false,
"compare_to_baseline": false
}
Response:
{
"total_questions": 10,
"top1_accuracy": 0.70,
"top5_accuracy": 0.90,
"duration_seconds": 15.4,
"results": [
{
"question": "Where is OAuth validated?",
"hit_top1": true,
"hit_top5": true,
"top_result": "auth/oauth.py"
}
]
}
Get latest evaluation results.
Response:
{
"timestamp": "2025-10-13T12:34:56Z",
"accuracy": {
"top1": 0.70,
"top5": 0.90
},
"duration": 15.4,
"total": 10
}
Save current eval results as baseline for regression tracking.
Response:
{
"ok": true,
"saved_at": "2025-10-13T12:34:56Z",
"baseline_file": "eval_baseline.json"
}
Compare current eval against saved baseline.
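The comparison amounts to subtracting baseline accuracy from current accuracy and listing the questions that used to hit but now miss. The sketch below reproduces the `diff` and `regressions` fields of the response for this endpoint. The per-question hit maps passed to `find_regressions` are a hypothetical input shape, and both function names are illustrative.

```python
def compare_accuracy(baseline, current, ndigits=2):
    """Return per-metric deltas (current minus baseline), rounded to ndigits."""
    return {k: round(current[k] - baseline[k], ndigits) for k in baseline if k in current}


def find_regressions(baseline_hits, current_hits):
    """List questions that hit in the baseline run but miss in the current run."""
    return [
        {"question": q, "was_hit": True, "now_hit": False}
        for q, was_hit in baseline_hits.items()
        if was_hit and not current_hits.get(q, False)
    ]


diff = compare_accuracy({"top1": 0.70, "top5": 0.90}, {"top1": 0.65, "top5": 0.88})
regressions = find_regressions(
    {"How does reranking work?": True, "Where is OAuth validated?": True},
    {"How does reranking work?": False, "Where is OAuth validated?": True},
)
```

Rounding the deltas avoids floating-point noise such as -0.04999999999999993 in the reported diff.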
Response:
{
"baseline": {
"top1": 0.70,
"top5": 0.90
},
"current": {
"top1": 0.65,
"top5": 0.88
},
"diff": {
"top1": -0.05,
"top5": -0.02
},
"regressions": [
{
"question": "How does reranking work?",
"was_hit": true,
"now_hit": false
}
]
}
Service health check.
Response:
{
"status": "healthy",
"graph_loaded": true,
"ts": "2025-10-13T12:34:56Z"
}
LangSmith integration status.
Response:
{
"enabled": true,
"installed": true,
"project": "agro-rag",
"endpoint": "https://api.smith.langchain.com",
"key_present": true,
"can_connect": true,
"identity": {
"user_id": "abc123",
"org_id": "org-xyz"
}
}
List recent retrieval traces.
Query Parameters:
- repo (string, optional) - Filter by repository
- limit (integer, optional) - Number of traces (default: 20)
Response:
{
"traces": [
{
"query": "Where is OAuth validated?",
"repo": "agro",
"timestamp": "2025-10-13T12:34:56Z",
"retrieval_count": 5,
"top_score": 0.85,
"duration_ms": 234
}
],
"count": 10
}
Get the most recent trace.
Response:
{
"query": "How does indexing work?",
"repo": "agro",
"results": [
{
"file_path": "indexer/index_repo.py",
"rerank_score": 0.85,
"layer": "indexer"
}
],
"duration_ms": 234
}
Get latest LangSmith runs.
Query Parameters:
- limit (integer, optional) - Number of runs (default: 10)
Response:
{
"runs": [
{
"id": "run-abc123",
"name": "rag_search",
"status": "success",
"start_time": "2025-10-13T12:34:56Z",
"end_time": "2025-10-13T12:34:58Z",
"duration_ms": 2000
}
]
}
Query LangSmith runs with filters.
Query Parameters:
- project (string, optional) - Project name
- status (string, optional) - Filter by status (success, error)
- limit (integer, optional) - Number of results (default: 20)
These endpoints provide HTTP access to MCP tools for remote agents.
MCP rag_search tool via HTTP.
Query Parameters:
- repo (string, required) - Repository name
- question (string, required) - Search query
- top_k (integer, optional) - Number of results (default: 10)
Response:
{
"results": [
{
"file_path": "auth/oauth.py",
"start_line": 45,
"end_line": 67,
"rerank_score": 0.85
}
],
"count": 5,
"repo": "agro"
}
Example:
curl "http://127.0.0.1:8012/api/mcp/rag_search?repo=agro&question=OAuth%20validation&top_k=5"
- Interactive API Docs: http://127.0.0.1:8012/docs (Swagger UI)
- Alternative Docs: http://127.0.0.1:8012/redoc (ReDoc)
- GUI Settings API: API_GUI.md
- MCP Integration: MCP_README.md
- Performance & Cost: PERFORMANCE_AND_COST.md
By default, all endpoints are unauthenticated and bind to 127.0.0.1 (localhost only).
For remote access or production deployment, consider:
- Reverse proxy with authentication (Caddy, Nginx)
- OAuth 2.0 integration (see GUI settings)
- VPN or SSH tunnel for secure access
No built-in rate limiting. For production use:
- Add reverse proxy with rate limiting
- Use API gateway (Kong, Tyk)
- Monitor with observability tools
All errors follow this format:
{
"detail": "Repository 'invalid-repo' not found",
"error_code": "REPO_NOT_FOUND",
"status_code": 404
}
Common Error Codes:
- 400 - Bad Request (invalid parameters)
- 404 - Not Found (repo, profile, or resource missing)
- 500 - Internal Server Error (check logs)
- 503 - Service Unavailable (dependencies down)
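A client can turn the error body described above into a typed exception. The sketch below uses only the three documented fields (`detail`, `error_code`, `status_code`). The class `ApiError`, the helper `error_from_body`, the fallback values, and the choice to treat only 503 as retryable are assumptions made for illustration.

```python
import json


class ApiError(Exception):
    """Client-side wrapper around the documented error body."""

    def __init__(self, status_code, error_code, detail):
        super().__init__(f"{status_code} {error_code}: {detail}")
        self.status_code = status_code
        self.error_code = error_code
        self.detail = detail

    @property
    def retryable(self):
        # Assumption: only 503 (dependencies down) is worth retrying.
        return self.status_code == 503


def error_from_body(body, fallback_status=500):
    """Build an ApiError from a JSON error body, tolerating non-JSON payloads."""
    try:
        payload = json.loads(body)
    except (ValueError, TypeError):
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    return ApiError(
        payload.get("status_code", fallback_status),
        payload.get("error_code", "UNKNOWN"),
        payload.get("detail", "unrecognized error response"),
    )


err = error_from_body(
    '{"detail": "Repository \'invalid-repo\' not found", '
    '"error_code": "REPO_NOT_FOUND", "status_code": 404}'
)
```

When using `urllib.request`, the body passed to `error_from_body` would typically come from `HTTPError.read()`.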
Version: 2.1.0
Last Updated: October 2025