JIDRA (Java Intelligent Diagnostic & Reasoning Agent) is a focused CLI for Java codebase graph indexing, call tracing, context extraction, prompt generation, and optional LLM-based diagnosis.
This project is intentionally minimal and graph-driven.
- Index once → get a deterministic call graph you can trace and query offline.
- Trace from a method or HTTP route to see likely execution flow and unresolved edges.
- Generate prompt-ready context and prompts (`context`, `prompt`) for LLM workflows.
- Produce deterministic investigation docs from stack traces (`error-doc`) and flows (`flow-doc`).
- Optional LiteLLM-based diagnosis on top of graph-grounded context.
- Builds a graph from Java source (`index`)
- Traces method flow (`trace`)
- Traces by route entry (`trace-route`)
- Builds prompt-ready method context (`context`)
- Generates LLM prompt text (`prompt`)
- Runs LLM diagnosis with structured metrics (`diagnose`)
- No enrichment agents
- No multiprocessing/async pipelines
- No UI/debug dashboards
- No graph format mutation at runtime
jidra/
├── pyproject.toml
├── requirements.txt
├── README.md
└── jidra/
    ├── __init__.py
    ├── cli.py
    ├── config.yaml
    ├── llm_client.py
    ├── models.py
    ├── graph_io.py
    ├── selector.py
    ├── trace_engine.py
    ├── context_builder.py
    ├── extractor.py
    ├── exporter.py
    ├── filters.py
    └── cache.py
JIDRA operates as a graph-grounded reasoning backend for Enterprise Java. It decouples deterministic static code ingestion from upstream execution entrypoints (CLI and Model Context Protocol Server) using a unified engine service layer (JidraEngine).
graph LR
%% Global Styling Classes for High Contrast Rendering
classDef StorageClass fill:#FFF7ED,stroke:#EA580C,stroke-width:2px;
classDef EngineClass fill:#F8FAFC,stroke:#475569,stroke-width:2px;
classDef InterfaceClass fill:#EFF6FF,stroke:#1D4ED8,stroke-width:2px;
classDef RuntimeClass fill:#F0FDF4,stroke:#15803D,stroke-width:2px;
%% 1. INGESTION PIPELINE
subgraph INGESTION [1. Ingestion System]
direction TB
JavaSource[Java Source Files <br> *.java]
Extractor(extractor.py <br> Tree-Sitter AST)
Exporter(exporter.py <br> Records Generator)
JavaSource --> Extractor --> Exporter
end
%% 2. ARTIFACT DB STORAGE
subgraph ARTIFACTS [2. Graph DB Storage]
GraphJSONL[(graph.jsonl <br> graph_test.jsonl)]
end
Exporter -->|Writes| GraphJSONL
%% 3. JIDRA ENGINE CORE SERVICE LAYER
subgraph ENGINE_LAYER [3. Unified Engine Service Layer]
direction LR
subgraph Preparation [Ingestion & Resolution]
direction TB
GraphIO[graph_io.py <br> Load Graph]
Selector[selector.py <br> Match Method ID]
end
subgraph Scripts [Core Processing Engines]
direction TB
Trace[trace_engine.py]
Context[context_builder.py]
Stitcher[flow_stitcher.py]
end
JidraEngine[["engine.py <br> (JidraEngine App Facade)"]]
%% Linear Processing Core Dataflow
GraphIO --> Selector
Selector --> Trace & Context & Stitcher
Trace & Context & Stitcher --> JidraEngine
end
class ARTIFACTS StorageClass;
class ENGINE_LAYER EngineClass;
%% Global Inter-Subgraph Links
GraphJSONL -->|Reads| GraphIO
%% 4. SYSTEM ENTRYPOINTS & ENTRY ARTIFACTS
subgraph INTERFACES [4. System Gateways]
direction TB
CLI[cli.py <br> CLI Controller]
MCPServer[mcp_server.py <br> Stdio Protocol Server]
StackTrace[User Runtime Data <br> Raw Stack Traces]
end
class INTERFACES InterfaceClass;
%% Binding Gateways directly to the Unified Engine Class
JidraEngine ==>|Exposes Engine Facade API| CLI
JidraEngine ==>|Exposes Tools Ecosystem| MCPServer
StackTrace -.->|Parsed By| CLI
StackTrace -.->|Analyzed By Tool| MCPServer
%% 5. EXECUTION & AI AGENT ENVIRONMENT
subgraph ECOSYSTEM [5. Downstream Consumers]
direction TB
LLMClient(llm_client.py <br> LiteLLM Execution)
MarkdownDoc[Deterministic Reports <br> flow-doc / error-doc]
ExternalAgent[AI Coding Agent <br> Claude / Windsurf / Codex]
end
class ECOSYSTEM RuntimeClass;
%% Routing Outputs & Agent Execution Loops
CLI -->|`diagnose` command| LLMClient
CLI -->|Generates| MarkdownDoc
MCPServer <==>|Model Context Protocol Binding| ExternalAgent
ExternalAgent -.->|Context-Pruned Selection| JavaSource
This project is released under the MIT License (see LICENSE).
From project root:
pip install -e .

If you use the local venv:

.venv/bin/pip install -e .

Some features (like `error-doc` choosing the first "project" stack frame as an anchor) can use package prefixes to distinguish your code from third-party libraries.
Set a comma-separated list:

export JIDRA_PROJECT_PREFIXES="com.myco.,org.example."

If unset, JIDRA treats any package as project code for anchoring.
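
The prefix rule above can be illustrated with a minimal sketch. The helper names `project_prefixes` and `is_project_class` are hypothetical and not taken from the JIDRA source; only the environment variable name and the "unset means everything is project code" rule come from this README.

```python
import os


def project_prefixes(env=None):
    """Parse JIDRA_PROJECT_PREFIXES into a tuple of non-empty prefixes.

    `env` is any mapping; it defaults to os.environ. (Hypothetical helper.)
    """
    env = os.environ if env is None else env
    raw = env.get("JIDRA_PROJECT_PREFIXES", "")
    return tuple(part.strip() for part in raw.split(",") if part.strip())


def is_project_class(class_name, prefixes):
    """Return True when the class counts as project code.

    With no prefixes configured, any package is treated as project code.
    """
    if not prefixes:
        return True
    # str.startswith accepts a tuple of candidate prefixes.
    return class_name.startswith(prefixes)
```

For example, with `com.myco.` configured, `com.myco.app.Foo` would be treated as project code while `org.springframework.web.Bar` would not.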
python -m jidra.cli index \
--codebase /path/to/java/repo \
--output /tmp/graph.jsonl

When output is a directory, JIDRA writes:
- `graph.jsonl` (main)
- `graph_test.jsonl` (test)
python -m jidra.cli trace \
--graph /tmp/graph.jsonl \
--method com.example.Controller.search

python -m jidra.cli context \
--graph /tmp/graph.jsonl \
--method com.example.Controller.search

python -m jidra.cli prompt \
--graph /tmp/graph.jsonl \
--method com.example.Controller.search \
--target codex

python -m jidra.cli diagnose \
--graph /tmp/graph.jsonl \
--method com.example.Controller.search \
--target codex \
--llm-profile local

For `trace`, `context`, `trace-route`, `prompt`, `diagnose`:
- `--graph` provided: used directly
- `--graph` omitted: selected by `--graph-type` (`main` default)
  - `main` -> `jidra/output/graph.jsonl`
  - `test` -> `jidra/output/graph_test.jsonl`
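
The graph selection rules can be sketched as a small lookup. This is an illustration under stated assumptions, not the project's code: `resolve_graph_path` and `DEFAULT_GRAPHS` are hypothetical names; the two default paths are the ones listed above.

```python
from pathlib import Path

# Default locations per graph type, as listed in this README.
DEFAULT_GRAPHS = {
    "main": Path("jidra/output/graph.jsonl"),
    "test": Path("jidra/output/graph_test.jsonl"),
}


def resolve_graph_path(graph=None, graph_type="main"):
    """Pick the graph file: an explicit --graph wins, else --graph-type decides."""
    if graph:
        return Path(graph)
    try:
        return DEFAULT_GRAPHS[graph_type]
    except KeyError:
        raise ValueError(f"unknown graph type: {graph_type!r}") from None
```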
Supported method selectors:
- method id
- full signature
- full class + method (`com.example.Class.method`)
- short class + method (`Class.method`)
- bare method name (if unique)
Ambiguous selector output includes candidate ids you can use directly.
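
A rough sketch of how the selector forms above might resolve against graph methods. The `Method` record and its field names are hypothetical, and the sketch checks all forms at once; the real `selector.py` may apply a precedence order that this README does not describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    # Hypothetical record shape; the real graph schema may differ.
    id: str
    signature: str
    class_name: str  # fully qualified, e.g. com.example.Class
    name: str


def selector_keys(method):
    """All strings that would select this method under the supported forms."""
    short_class = method.class_name.rsplit(".", 1)[-1]
    return {
        method.id,
        method.signature,
        f"{method.class_name}.{method.name}",
        f"{short_class}.{method.name}",
        method.name,
    }


def select(methods, selector):
    """Return (status, candidates); status is matched, ambiguous, or unmatched."""
    hits = [m for m in methods if selector in selector_keys(m)]
    if len(hits) == 1:
        return "matched", hits
    if hits:
        return "ambiguous", hits
    return "unmatched", []
```

With two `search` methods in different classes, the bare name `search` comes back as ambiguous together with both candidates, and either candidate's id then selects it unambiguously.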
Purpose: generate deterministic flow investigation markdown from indexed graph data (no LLM calls).
jidra flow-doc \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
--output <markdown-path> \
[--depth 4] \
[--top-n 8] \
[--max-subflows 8] \
[--mind-map] \
[--max-nodes 200] \
[--include-details] \
[--include-utility]

Behavior:
- Normal mode (no `--mind-map`): prioritized flow slices using `top_n` and `max_subflows`.
- `--mind-map` mode: recursive resolved-edge traversal using `depth` + `max_nodes`; it does not use `top_n`/`max_subflows` for traversal.
- `--include-details`: in `--mind-map` mode, appends legacy detailed expanded sections that still use prioritized slicing (`top_n`/`max_subflows`).
- Output is deterministic for the same graph + method + flags.
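
To make the `depth` + `max_nodes` bounds concrete, here is a minimal sketch of a bounded traversal over resolved edges. It uses an iterative breadth-first walk rather than the recursive traversal the tool describes, and the `edges` mapping is an assumed shape, not the graph JSONL format. Callees are visited in sorted order so that the same input always yields the same output.

```python
from collections import deque


def mind_map(edges, root, depth, max_nodes):
    """Walk resolved call edges from root, bounded by depth and node count.

    edges: dict mapping a method id to an iterable of callee ids (assumed shape).
    Returns the visited ids in visit order.
    """
    visited = [root]
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        if level >= depth:
            continue  # do not expand nodes at the depth limit
        for callee in sorted(edges.get(node, ())):
            if callee in seen:
                continue  # cycles and repeated callees are visited once
            if len(visited) >= max_nodes:
                return visited  # node budget exhausted
            seen.add(callee)
            visited.append(callee)
            queue.append((callee, level + 1))
    return visited
```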
Examples:
python -m jidra.cli flow-doc \
--method SearchServiceController.search \
--output flow_docs/verify_SearchServiceController_search.md \
--depth 10 \
--top-n 10 \
--max-subflows 10 \
--show-agents

python -m jidra.cli flow-doc \
--method SearchServiceController.search \
--output flow_docs/mindmap_SearchServiceController_search.md \
--mind-map \
--depth 6 \
--max-nodes 120

Purpose: generate deterministic error investigation markdown from a Java stack trace text file and indexed graph.
jidra error-doc \
--stack-trace <stack-trace.txt> \
--output <markdown-path> \
[--graph <path>] \
[--graph-type main|test] \
[--depth 6] \
[--max-nodes 200] \
[--mind-map]

Stack frame parsing:
- Parses lines in format: `at package.Class.method(File.java:123)`.
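
As an illustration of the frame format above, a regular expression along these lines would extract the four parts. The pattern and the `parse_frames` helper are assumptions made for this sketch; the actual parser may accept more variants (for example module-qualified frames).

```python
import re

# Matches: at package.Class.method(File.java:123)
FRAME_RE = re.compile(
    r"^\s*at\s+(?P<cls>[\w.$]+)\.(?P<method>[\w$<>]+)"
    r"\((?P<file>[^:()]+):(?P<line>\d+)\)\s*$"
)


def parse_frames(text):
    """Return one dict per stack-trace line that matches the frame format."""
    frames = []
    for raw in text.splitlines():
        match = FRAME_RE.match(raw)
        if match:
            frames.append(
                {
                    "class": match["cls"],
                    "method": match["method"],
                    "file": match["file"],
                    "line": int(match["line"]),
                }
            )
    return frames
```

Lines without a `File.java:line` location, such as `(Native Method)` frames or the exception message itself, are skipped by this sketch.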
Frame-to-method matching:
- class full name
- method name
- file name
- line in method `[start_line, end_line]`

Match semantics:
- `matched`: exactly one graph method candidate.
- `ambiguous`: multiple candidates (reported as ambiguity).
- `unmatched`: no candidate.
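
The matching criteria and the three outcomes can be sketched as follows. The dictionary keys are hypothetical, and the sketch assumes all four criteria must hold together; this README does not say whether the tool relaxes some of them as a fallback.

```python
def match_frame(frame, methods):
    """Classify a parsed stack frame against graph methods.

    frame: dict with class, method, file, line (assumed shape).
    methods: dicts with class_name, name, file, start_line, end_line (assumed shape).
    """
    candidates = [
        m
        for m in methods
        if m["class_name"] == frame["class"]
        and m["name"] == frame["method"]
        and m["file"] == frame["file"]
        and m["start_line"] <= frame["line"] <= m["end_line"]
    ]
    if len(candidates) == 1:
        status = "matched"
    elif candidates:
        status = "ambiguous"
    else:
        status = "unmatched"
    return status, candidates
```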
Anchor + focused map:
- primary failure anchor: first matched/ambiguous project frame.
- focused flow map: generated via deterministic `flow-doc` mind-map traversal around anchor.
- upstream/downstream behavior:
  - downstream-focused when anchor has meaningful downstream callees.
  - upstream-focused fallback when downstream is weak.
Examples:
python -m jidra.cli error-doc \
--stack-trace examples/error_1.txt \
--output flow_docs/error_doc_verify_clean.md \
--mind-map \
--depth 6 \
--max-nodes 80

- Static analysis only; runtime dispatch is not guaranteed.
- Unresolved calls may remain in outputs.
- External library frames/methods may be unmatched.
- Graph quality directly affects output quality.
- No runtime correctness claims; output is investigation guidance.
## Suggested Debug Locations
| priority | location | reason |
|---:|---|---|
| 1 | `com.example.app.health.HealthIndicator#doHealthCheck(Health.Builder)` | failing project frame |
| 2 | `org.opensearch.client.opensearch.cluster.OpenSearchClusterClient#health:360` | caller frame above failure |
| 3 | `this.client.cluster().health` | unresolved external call near failure |

jidra index --codebase <path> --output <path-or-dir>

Builds the graph JSONL from Java source using the tree-sitter parser pipeline.
jidra trace \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--max-depth 5] \
[--business-only] \
[--output <file-or-dir>]

- `--business-only` filters support/metrics/logging from flow output
- root node is always preserved
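
A minimal sketch of the `--business-only` rule, assuming each flow node carries a `kind` label. That field, the label values, and the `business_only` helper are hypothetical; how `filters.py` actually classifies nodes is not described here.

```python
# Assumed category labels for non-business nodes.
NOISE_KINDS = {"support", "metrics", "logging"}


def business_only(nodes, root_id):
    """Drop support/metrics/logging nodes but always keep the root node."""
    return [
        node
        for node in nodes
        if node["id"] == root_id or node.get("kind") not in NOISE_KINDS
    ]
```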
jidra context \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--max-chars 12000] \
[--max-tokens <int>] \
[--business-only] \
[--output <file-or-dir>]

Includes:
- method signature/source
- endpoint metadata
- resolved callee summary
- unresolved call summary
Context output is deduped/grouped for prompt readiness.
jidra trace-route \
[--graph <path>] \
[--graph-type main|test] \
--route <path> \
[--max-depth 5] \
[--output <file-or-dir>]

jidra prompt \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--max-chars 12000] \
[--max-tokens <int>] \
[--business-only|--no-business-only] \
[--target claude|codex|generic] \
[--output <file-or-dir>]

Default: `--business-only` is enabled.
jidra diagnose \
[--graph <path>] \
[--graph-type main|test] \
--method <selector> \
[--target claude|codex|generic] \
[--model <model>] \
[--max-chars 12000] \
[--max-tokens <int>] \
[--business-only|--no-business-only] \
[--llm-profile local|enterprise] \
[--config <path-to-config.yaml>] \
[--show-prompt] \
[--quiet] \
[--output <file-or-dir>]

Behavior:
- No `--output` + interactive TTY + not `--quiet`: ANSI-readable report
- No `--output` + non-TTY or `--quiet`: JSON printed
- With `--output`: JSON written to file
- `--show-prompt`: includes prompt text in result JSON
- `--max-chars`: controls method context/source size sent into prompt construction
- `--max-tokens`: overrides model output token limit for this run (when omitted, config profile default is used)
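
The three output cases above reduce to a small decision function. This is a sketch only; `output_mode` and its return labels are hypothetical names chosen for illustration.

```python
import sys


def output_mode(output=None, quiet=False, is_tty=None):
    """Decide how a diagnose result is emitted.

    is_tty defaults to whether stdout is an interactive terminal.
    """
    if is_tty is None:
        is_tty = sys.stdout.isatty()
    if output:
        return "json-file"  # --output given: JSON written to file
    if is_tty and not quiet:
        return "ansi-report"  # interactive terminal: readable report
    return "json-stdout"  # piped output or --quiet: JSON printed
```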
When --output is a directory:
- trace: `trace_<graph_type>_<method>.json`
- trace + business-only: `trace_business_<graph_type>_<method>.json`
- context: `context_<graph_type>_<method>.json`
- context + business-only: `context_business_<graph_type>_<method>.json`
- trace-route: `trace_route_<graph_type>_<route_or_entry>.json`
- prompt: `prompt_<target>_<graph_type>_<method>.txt`
- diagnose: `diagnose_<target>_<graph_type>_<method>.json`
Names are normalized to lowercase snake-style safe parts.
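
One plausible reading of the normalization, shown for the `trace` pattern only. The exact rule is an assumption: this sketch lowercases the text and collapses every run of non-alphanumeric characters into a single underscore, and the real implementation may differ (for instance in how it treats camelCase). `safe_part` and `trace_filename` are hypothetical helpers.

```python
import re


def safe_part(text):
    """Lowercase and collapse runs of non-alphanumerics into one underscore."""
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def trace_filename(graph_type, method, business_only=False):
    """Build a trace output name following the pattern listed above."""
    prefix = "trace_business" if business_only else "trace"
    return f"{prefix}_{safe_part(graph_type)}_{safe_part(method)}.json"
```

Under this assumed rule, tracing `com.example.Controller.search` on the main graph would produce `trace_main_com_example_controller_search.json`.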
JIDRA uses jidra/config.yaml.
Example:
llm:
  provider: litellm
  profile: local
  profiles:
    local:
      api_base: "http://localhost:4000"
      api_key_env: "LITELLM_PROXY_API_KEY"
      default_model: "ollama/gemma4:e4b"
      timeout_seconds: 120
      temperature: 0.2
      max_tokens: 1200
    enterprise:
      api_base: "https://your-enterprise-litellm.example.com"
      api_key_env: "ENTERPRISE_LITELLM_API_KEY"
      default_model: "gpt-4o-mini"
      timeout_seconds: 120
      temperature: 0.2
      max_tokens: 2000

Rules:
- Default profile comes from `llm.profile`
- CLI override: `--llm-profile`
- If `api_key_env` is set, the named env var is read
- Missing config falls back to safe local defaults
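
The profile rules can be sketched against an already-parsed config dictionary. `resolve_profile` is a hypothetical helper, and the contents of `LOCAL_DEFAULTS` are an assumption that simply mirrors the `local` profile from the example; this README does not state the actual fallback values.

```python
import os

# Assumed fallback values; copied from the example local profile.
LOCAL_DEFAULTS = {
    "api_base": "http://localhost:4000",
    "api_key_env": "LITELLM_PROXY_API_KEY",
    "default_model": "ollama/gemma4:e4b",
    "timeout_seconds": 120,
    "temperature": 0.2,
    "max_tokens": 1200,
}


def resolve_profile(config=None, cli_profile=None, env=None):
    """Pick the active LLM profile and read its API key from the environment."""
    env = os.environ if env is None else env
    llm = (config or {}).get("llm", {})
    # --llm-profile overrides llm.profile; "local" is the assumed last resort.
    name = cli_profile or llm.get("profile", "local")
    profile = dict(llm.get("profiles", {}).get(name, LOCAL_DEFAULTS))
    key_env = profile.get("api_key_env")
    profile["api_key"] = env.get(key_env) if key_env else None
    profile["name"] = name
    return profile
```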
`diagnose` returns JSON with:
{
  "method": "...",
  "analysis": "...",
  "llm": {
    "provider": "litellm",
    "profile": "local",
    "model": "...",
    "usage": {
      "input_tokens": 0,
      "output_tokens": 0,
      "total_tokens": 0,
      "reasoning_tokens": 0
    },
    "latency_seconds": 0.0,
    "limits": {
      "max_chars": 12000,
      "max_tokens": null
    }
  },
  "context_summary": {
    "business_flow_count": 0,
    "unresolved_count": 0
  }
}

If provider usage is unavailable, token counts are estimated and `"estimated": true` is added under `llm.usage`.
- `--max-chars` (`context`, `prompt`, `diagnose`):
  - default `12000`
  - passed directly to context building to constrain context payload size
- `--max-tokens` (`context`, `prompt`, `diagnose`):
  - optional CLI override
  - primarily used by `diagnose` to cap LLM output tokens
  - if omitted, profile default from `jidra/config.yaml` is used
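
A sketch of how the `limits` and `usage` fields of the diagnose result might be filled in, following the rules above. Both helpers are hypothetical. The four-characters-per-token estimate is a common rough heuristic used here as a stand-in, not the estimator JIDRA uses, and reporting `max_tokens` as `null` when the flag is omitted is inferred from the JSON example.

```python
def resolve_limits(cli_max_chars=None, cli_max_tokens=None, profile_max_tokens=None):
    """Return (limits as reported in the result, effective output-token cap)."""
    limits = {
        "max_chars": 12000 if cli_max_chars is None else cli_max_chars,
        "max_tokens": cli_max_tokens,  # stays None when the flag is omitted
    }
    effective = cli_max_tokens if cli_max_tokens is not None else profile_max_tokens
    return limits, effective


def usage_or_estimate(provider_usage, prompt, completion):
    """Use provider-reported usage when present, else a flagged rough estimate."""
    if provider_usage:
        return dict(provider_usage)
    est_in = max(1, len(prompt) // 4)  # assumed heuristic: ~4 chars per token
    est_out = max(1, len(completion) // 4)
    return {
        "input_tokens": est_in,
        "output_tokens": est_out,
        "total_tokens": est_in + est_out,
        "reasoning_tokens": 0,
        "estimated": True,
    }
```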
Likely LLM connectivity issue:
- verify LiteLLM endpoint in config
- verify API key/env key
- verify network access to endpoint
Use a stronger selector:
- class+method or exact method id from ambiguity output
No endpoint matched that route in graph. Validate route annotations and graph source set.
Check Python/venv and package index/network availability.
This repo includes a `jidra/experiments/` package with exploratory agent-style components:
- `enrichment_agent.py`, `enrichment_judge.py`, `enrichment_orchestrator.py`, `enrichment_ui.py`
- `method_prompt.py`, `token_count.py`
These modules are optional and not required for the core deterministic CLI workflow.
They are currently used only when you enable agent visibility in flow-doc via --show-agents.
If you are vendoring JIDRA or aiming for a minimal footprint, you can ignore this folder.
- `cli.py` handles command orchestration only.
- `llm_client.py` owns provider/config/use-metrics behavior.
- graph extraction and graph format are intentionally unchanged.