
api models

aakash-anko edited this page May 28, 2026 · 2 revisions

api/models.py

Pydantic request and response models that define the shape of every JSON body the API accepts and returns.


Key Concepts

| Term | Definition | Example |
| --- | --- | --- |
| blast radius | All files that would be affected if a given file changes, found by following reverse import edges transitively. | If A imports B and C imports A, changing B has blast radius = {A, C}. |
| embedding | A numerical vector (list of numbers) that represents the meaning of text. Similar text → similar vectors. | The code def add(a, b): return a+b might become [0.12, -0.45, 0.78, ...] (1536 numbers with OpenAI's text-embedding-3-small model). |
| ChromaDB | An open-source vector database for storing and searching embeddings. Used here to store code chunks. | collection.query(query_texts=["scan files"], n_results=5) returns the 5 closest code chunks. |
| chunk | A piece of source code (usually one function or class) stored as a unit for search. | The function def scan_directory(root): ... (20 lines) is one chunk. |
| AST | Abstract Syntax Tree: a tree representation of source code structure, where each node is a language construct (function, class, if-statement, etc.). | def add(a, b): return a+b becomes a tree: FunctionDef → [args: a, b] → [body: Return → BinOp(a + b)]. |
| LLM | Large Language Model: an AI model (like GPT-4, Claude) that generates text given a prompt. | get_llm() returns a ChatOpenAI instance that can answer questions about code. |
| Pydantic | A Python library for data validation using type hints. Defines schemas as classes with typed fields. | class QueryRoute(BaseModel): route: str; target: str. Any instance is guaranteed to have string route and target fields. |
| diff | The set of changes between two versions of code, showing added (+) and removed (-) lines. | - old_line\n+ new_line shows old_line was replaced with new_line. |
| hunk | A contiguous block of changes within a diff. One diff can contain multiple hunks (changes in different parts of a file). | A diff might have hunk 1 (lines 10-15 changed) and hunk 2 (lines 80-85 changed). |
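
The blast radius definition above can be sketched as a breadth-first walk over reversed import edges. This is a minimal illustration, not the project's implementation: the function name blast_radius and the dict-of-lists graph shape are assumptions made for the example.

```python
from collections import defaultdict, deque


def blast_radius(imports, changed):
    """Return every file that directly or transitively imports `changed`.

    `imports` maps each file to the files it imports (hypothetical shape).
    """
    # Reverse the edges: for each file, record who imports it.
    imported_by = defaultdict(set)
    for importer, targets in imports.items():
        for target in targets:
            imported_by[target].add(importer)

    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for importer in imported_by.get(current, ()):
            # Skip files already seen and the changed file itself (import cycles).
            if importer not in affected and importer != changed:
                affected.add(importer)
                queue.append(importer)
    return affected


# The example from the table: A imports B, and C imports A.
imports = {"A": ["B"], "C": ["A"], "B": []}
print(sorted(blast_radius(imports, "B")))  # ['A', 'C']
```

Tracking visited files keeps the walk finite even when the import graph contains cycles.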

AnalyzeRequest — line 5

Request body for POST /analyze. Tells the API which repo to index and how.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| repo_path | str | "" | Absolute path to the repo on disk |
| collection_name | str | "" | ChromaDB collection name (derived from repo if empty) |
| index_mode | str | "auto" | "auto" / "reindex" / "full" |

Example

Input JSON: {"repo_path": "/home/user/my-app", "collection_name": "", "index_mode": "full"}

Line 7: repo_path = "/home/user/my-app"
Line 8: collection_name = "" (empty; the API endpoint will later derive "my-app" from the path)
Line 9: index_mode = "full" (wipe the existing index and re-embed everything)

The resulting AnalyzeRequest object:

AnalyzeRequest(repo_path="/home/user/my-app", collection_name="", index_mode="full")
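
The fallback for an empty collection_name is not shown on this page, so the following is only a sketch of one way it could work: take the last path component of repo_path. The helper name resolve_collection_name is hypothetical, and the real endpoint may normalize the name differently.

```python
from pathlib import PurePosixPath


def resolve_collection_name(repo_path: str, collection_name: str = "") -> str:
    """Use the explicit name if given, else the repo's folder name (assumed rule)."""
    if collection_name:
        return collection_name
    # PurePosixPath ignores a trailing slash, so "/x/my-app/" also yields "my-app".
    return PurePosixPath(repo_path).name


print(resolve_collection_name("/home/user/my-app"))            # my-app
print(resolve_collection_name("/home/user/my-app", "custom"))  # custom
```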

ChatRequest — line 11

Request body for POST /chat. Carries the user's question and a thread ID for conversation memory.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| message | str | (required) | The question to ask the agent |
| thread_id | str | "default" | Conversation thread for multi-turn memory |

Example

Input JSON: {"message": "How does authentication work?", "thread_id": "session-42"}

Line 13: message = "How does authentication work?"
Line 14: thread_id = "session-42"

The resulting ChatRequest object:

ChatRequest(message="How does authentication work?", thread_id="session-42")
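
To illustrate how a required field and a defaulted field behave, here is a sketch of ChatRequest reconstructed from the fields table. It is not copied from api/models.py, and it assumes Pydantic is installed.

```python
from pydantic import BaseModel, ValidationError


class ChatRequest(BaseModel):
    # Reconstructed from the fields table; the real class may differ in detail.
    message: str                 # no default, so the field is required
    thread_id: str = "default"   # optional, falls back to "default"


# Omitting thread_id falls back to the default.
req = ChatRequest(message="How does authentication work?")
print(req.thread_id)  # default

# Omitting the required message raises a validation error.
try:
    ChatRequest(thread_id="session-42")
    missing_message_rejected = False
except ValidationError:
    missing_message_rejected = True

print(missing_message_rejected)  # True
```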

AnalyzeResponse — line 18

Response body for POST /analyze. Reports what the indexing pipeline produced.

Fields

| Field | Type | Purpose |
| --- | --- | --- |
| status | str | Always "complete" on success |
| repo_path | str | The repo that was analyzed |
| files_scanned | int | Number of source files found |
| chunks_created | int | Number of text chunks embedded |
| modules | list[str] | Detected module names |

Example

Input (constructed by the /analyze endpoint):

AnalyzeResponse(
    status="complete",
    repo_path="/home/user/my-app",
    files_scanned=127,
    chunks_created=843,
    modules=["api", "models", "services", "utils"]
)

Serialized JSON:

{
  "status": "complete",
  "repo_path": "/home/user/my-app",
  "files_scanned": 127,
  "chunks_created": 843,
  "modules": ["api", "models", "services", "utils"]
}

ChatResponse — line 26

Response body for POST /chat. Contains the agent's answer.

Fields

| Field | Type | Purpose |
| --- | --- | --- |
| answer | str | The agent's response text |
| thread_id | str | Echoed back for client-side correlation |

Example

Input:

ChatResponse(answer="Auth uses JWT tokens issued by the /login endpoint.", thread_id="session-42")

Serialized JSON:

{
  "answer": "Auth uses JWT tokens issued by the /login endpoint.",
  "thread_id": "session-42"
}

ModuleResponse — line 31

Response body for GET /modules/{name}. Full details about one module.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| name | str | | Module name (or "users (inside 'features')" if matched as sub-folder) |
| file_count | int | | Number of files in the module |
| files | list[str] | | Sorted list of file paths |
| languages | dict[str, int] | | Language → file count mapping |
| depends_on | list[str] | | Modules this one imports from |
| depended_by | list[str] | | Modules that import from this one |
| blast_radius | list[dict] | [] | Per-file risk data |
| module_risk | str | "low" | Highest risk level across all files |

Example

Input:

ModuleResponse(
    name="api",
    file_count=3,
    files=["src/api/main.py", "src/api/models.py", "src/api/state.py"],
    languages={"python": 3},
    depends_on=["services", "models"],
    depended_by=[],
    blast_radius=[{"file": "src/api/main.py", "risk_level": "high", "affected_files": 12}],
    module_risk="high",
)

Serialized JSON:

{
  "name": "api",
  "file_count": 3,
  "files": ["src/api/main.py", "src/api/models.py", "src/api/state.py"],
  "languages": {"python": 3},
  "depends_on": ["services", "models"],
  "depended_by": [],
  "blast_radius": [{"file": "src/api/main.py", "risk_level": "high", "affected_files": 12}],
  "module_risk": "high"
}
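
The module_risk field is described as the highest risk level across all files. A minimal sketch of that reduction follows. The ordering low < medium < high < critical is an assumption based on the level names used in this page's examples, and highest_risk is a hypothetical helper.

```python
# Assumed ordering, from least to most severe.
RISK_ORDER = ["low", "medium", "high", "critical"]


def highest_risk(blast_radius, default="low"):
    """Return the most severe risk_level found in the per-file entries."""
    levels = [entry["risk_level"] for entry in blast_radius]
    if not levels:
        return default  # mirrors the documented default of "low"
    return max(levels, key=RISK_ORDER.index)


entries = [{"file": "src/api/main.py", "risk_level": "high", "affected_files": 12}]
print(highest_risk(entries))  # high
print(highest_risk([]))       # low
```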

OverviewResponse — line 42

Response body for GET /overview. Full project summary with diagram and risk data.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| tech_stack | list[str] | | Detected technologies (e.g. ["Python", "FastAPI"]) |
| total_files | int | | Total source files in the repo |
| total_modules | int | | Number of detected modules |
| modules | list[str] | | Module names |
| diagram | str | | Mermaid diagram markup of module graph |
| overview_text | str | | LLM-generated project summary |
| riskiest_files | list[dict] | [] | Top 30 riskiest files with blast radius data |

Example

OverviewResponse(
    tech_stack=["Python", "FastAPI", "ChromaDB"],
    total_files=45,
    total_modules=5,
    modules=["api", "embeddings", "analysis", "ingestion", "generation"],
    diagram="graph LR\n  api --> analysis\n  api --> embeddings",
    overview_text="This project is an AI-powered codebase onboarding tool...",
    riskiest_files=[{"file": "src/config.py", "risk_level": "critical", "affected_files": 30}],
)
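
The diagram string in the example is Mermaid flowchart markup. As a rough sketch of how such a string could be assembled from module dependencies, assuming a plain dict that maps each module to the modules it depends on (the function name and input shape are hypothetical):

```python
def mermaid_module_graph(depends_on):
    """Render module dependencies as a left-to-right Mermaid flowchart."""
    lines = ["graph LR"]
    for module, dependencies in depends_on.items():
        for dependency in dependencies:
            # One edge per dependency, indented as in the example above.
            lines.append(f"  {module} --> {dependency}")
    return "\n".join(lines)


diagram = mermaid_module_graph({"api": ["analysis", "embeddings"]})
print(diagram)
# graph LR
#   api --> analysis
#   api --> embeddings
```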

BlastRadiusResponse — line 53

Response body for GET /blast-radius/{module_name}. Shows which files would be affected if files in a module change.

Fields

| Field | Type | Purpose |
| --- | --- | --- |
| module | str | Module name, or "all" for whole repo |
| module_risk | str | Highest risk across all files |
| total_files | int | Number of files in scope |
| files | list[dict] | Per-file risk details |

Example

BlastRadiusResponse(
    module="analysis",
    module_risk="high",
    total_files=6,
    files=[
        {"file": "src/analysis/dependency_graph.py", "risk_level": "high", "affected_files": 18},
        {"file": "src/analysis/module_detector.py", "risk_level": "medium", "affected_files": 7},
    ],
)
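
A client might want to rank the files list by impact. The sketch below sorts a response payload by affected_files. It assumes the payload has already been parsed into a dict with the keys shown in the example, and most_impactful is a hypothetical helper.

```python
def most_impactful(response, n=1):
    """Return the n file paths with the largest affected_files count."""
    ranked = sorted(
        response["files"],
        key=lambda entry: entry["affected_files"],
        reverse=True,
    )
    return [entry["file"] for entry in ranked[:n]]


response = {
    "module": "analysis",
    "module_risk": "high",
    "total_files": 6,
    "files": [
        {"file": "src/analysis/module_detector.py", "risk_level": "medium", "affected_files": 7},
        {"file": "src/analysis/dependency_graph.py", "risk_level": "high", "affected_files": 18},
    ],
}
print(most_impactful(response))  # ['src/analysis/dependency_graph.py']
```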

ReviewRequest — line 60

Request body for POST /review. Controls which git diff to review.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| staged | bool | False | If True, review only staged changes (--staged) |
| target_branch | str \| None | None | Diff against a branch (e.g. "main" for full PR review) |

Example

Input JSON: {"staged": true, "target_branch": "main"}

Line 62: staged = True (review only files staged with git add)
Line 63: target_branch = "main" (diff the current branch against main)
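
How the endpoint turns these two fields into a git invocation is not documented here. One plausible mapping, shown purely as an assumption, appends --staged and the branch name to git diff. The helper build_diff_command is hypothetical, and the real code may use a different revision syntax.

```python
from typing import List, Optional


def build_diff_command(staged: bool = False, target_branch: Optional[str] = None) -> List[str]:
    """Translate ReviewRequest fields into git diff arguments (assumed mapping)."""
    cmd = ["git", "diff"]
    if staged:
        cmd.append("--staged")      # compare the index instead of the working tree
    if target_branch:
        cmd.append(target_branch)   # compare against this branch's tip
    return cmd


print(build_diff_command(staged=True, target_branch="main"))
# ['git', 'diff', '--staged', 'main']
```

Returning an argument list rather than a shell string lets the caller pass it to subprocess.run without shell quoting concerns.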


ReviewFileRequest — line 65

Request body for POST /review/file. Points to a single file to review.

Fields

| Field | Type | Purpose |
| --- | --- | --- |
| file_path | str | Absolute or relative path to the file |

Example

Input JSON: {"file_path": "src/codewalk/api/main.py"}

Line 67: file_path = "src/codewalk/api/main.py"


GuidelinesRequest — line 69

Request body for POST /review/guidelines. Loads team coding standards.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| docs_path | str \| None | None | Path to directory with .md/.txt guideline files |

Example

Input JSON: {"docs_path": "/home/user/my-app/docs/guidelines"}

Line 71: docs_path = "/home/user/my-app/docs/guidelines"
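
As an illustration of what loading guideline files from docs_path could involve, the sketch below collects .md and .txt files from a directory. Whether the real loader recurses into sub-folders is not stated, so the recursive search is an assumption, as is the helper name find_guideline_files.

```python
import tempfile
from pathlib import Path

GUIDELINE_SUFFIXES = {".md", ".txt"}  # extensions named in the field description


def find_guideline_files(docs_path):
    """Return guideline files under docs_path, sorted by path (recursive by assumption)."""
    root = Path(docs_path)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in GUIDELINE_SUFFIXES
    )


# Demo against a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("style.md", "naming.txt", "helper.py"):
        Path(tmp, name).write_text("example", encoding="utf-8")
    found = [p.name for p in find_guideline_files(tmp)]

print(found)  # ['naming.txt', 'style.md']
```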


DocsIndexRequest

Request body for POST /docs/index. Points to a folder of documents to index.

Fields

| Field | Type | Purpose |
| --- | --- | --- |
| docs_path | str | Absolute path to directory with .md/.pdf/.txt files |

Example

Input JSON: {"docs_path": "/Users/me/team-docs"}


DocsSearchRequest

Request body for POST /docs/search. Semantic search across indexed documents.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| query | str | | Search query text |
| n_results | int | 5 | Number of results to return |

Example

Input JSON: {"query": "deployment process", "n_results": 3}
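
For a client-side view, the following sketch builds the POST /docs/search request with the standard library. The base URL http://localhost:8000 is a placeholder, not something this page specifies, and the request is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical; adjust to where the API runs


def build_docs_search_request(query, n_results=5):
    """Build (but do not send) a POST /docs/search request."""
    body = json.dumps({"query": query, "n_results": n_results}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/docs/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_docs_search_request("deployment process", n_results=3)
print(req.get_method(), req.full_url)  # POST http://localhost:8000/docs/search

# To actually send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     results = json.load(resp)
```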


DocsAskRequest

Request body for POST /docs/ask. Ask a question answered from indexed documents.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| question | str | | The question to answer |
| n_results | int | 5 | Number of doc chunks to include as context |

Example

Input JSON: {"question": "How do we deploy to production?", "n_results": 5}


ErrorResponse — line 73

Generic error response model, used across all endpoints.

Fields

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| error | str | | Short error message |
| detail | str | "" | Optional extended explanation |

Example

ErrorResponse(error="Module not found", detail="Available: api, models, services")

Serialized JSON:

{
  "error": "Module not found",
  "detail": "Available: api, models, services"
}
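
To show the effect of the detail default, here is a sketch of ErrorResponse reconstructed from the fields table (not copied from api/models.py), assuming Pydantic is installed. The to_dict helper is hypothetical and exists only so the example runs on both Pydantic v1 and v2.

```python
from pydantic import BaseModel


class ErrorResponse(BaseModel):
    # Reconstructed from the fields table; the real class may differ in detail.
    error: str
    detail: str = ""


def to_dict(model):
    """Dump a model to a dict on either Pydantic v2 (model_dump) or v1 (dict)."""
    dump = getattr(model, "model_dump", None) or model.dict
    return dump()


payload = to_dict(ErrorResponse(error="Module not found"))
print(payload)  # {'error': 'Module not found', 'detail': ''}
```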
