16 changes: 16 additions & 0 deletions .github/workflows/test.yaml
@@ -52,6 +52,12 @@ jobs:
cache: "npm"
cache-dependency-path: libs/openant-core/parsers/javascript/package-lock.json

- name: Set up Go
uses: actions/setup-go@v6
with:
go-version-file: libs/openant-core/parsers/go/go_parser/go.mod
cache-dependency-path: libs/openant-core/parsers/go/go_parser/go.mod

- name: Install Python dependencies
working-directory: libs/openant-core
run: pip install -r requirements.txt && pip install ".[dev]"
@@ -68,6 +74,16 @@ jobs:
working-directory: libs/openant-core/parsers/javascript
run: npm ci

- name: Build go_parser binary (Linux/macOS)
if: runner.os != 'Windows'
working-directory: libs/openant-core/parsers/go/go_parser
run: go build -o go_parser .

- name: Build go_parser binary (Windows)
if: runner.os == 'Windows'
working-directory: libs/openant-core/parsers/go/go_parser
run: go build -o go_parser.exe .

- name: Run Python and parser tests
working-directory: libs/openant-core
run: python -m pytest tests/ -v
3 changes: 3 additions & 0 deletions .gitignore
@@ -6,4 +6,7 @@ __pycache__/
node_modules/
apps/openant-cli/bin/
libs/openant-core/parsers/go/go_parser/go_parser
libs/openant-core/parsers/javascript/.openant-npm-install.lock
_docs/
docs/
.worktrees/
95 changes: 95 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,101 @@

All notable changes to OpenAnt are documented in this file.

## [2026-05-12] — Parser depth, dependency UX, and LLM reachability (opt-in)

### Fixed

- **`openant parse` now defaults `--level` to `reachable`.** The Go CLI's
`parse` command previously defaulted to `--level all`, contradicting
`scan` and the Python CLI which both default to `reachable`. The
documentation has always said the default is `reachable`. Anyone running
`openant parse <repo>` without `-l` now gets the same dataset as
`openant scan <repo> --steps parse` — the documented behavior. Set
`--level all` explicitly to restore the previous output. (#35)

- **JS parser dependencies are now auto-installed on first use.**
`openant parse` on a JavaScript/TypeScript repository previously failed
out of the box with `Cannot find module 'ts-morph'` because nothing in
the install flow ran `npm install` for `parsers/javascript/`. The Python
parser adapter now runs `npm install` once, on the first JS parse, and
treats `node_modules/.package-lock.json` as the completion sentinel, so
an install interrupted with Ctrl+C is not mistaken for a complete one.
Python/Go-only users still never need `npm`. A cross-platform file lock
guards against corruption from concurrent installs. Closes #6. (#37)

- **TypeScript parser now resolves dependency-injected service calls.**
NestJS-style `this.userService.findById()` calls were previously
unresolved in the call graph because the parser didn't extract
constructor parameter types. Adds DI-aware resolution covering
constructor injection (`constructor(private svc: SvcType)`),
field-decorator injection (`@Inject` / `@InjectRepository` / etc.), and
Angular's functional `inject()` API. Resolution priority: exact type →
nominal (`implements`/`extends`) → unambiguous prefix (e.g.
`CallService` → `CallServiceV1`). All steps return `null` on ambiguity
to preserve the resolver's no-false-positive guarantee. Class-level
metadata is keyed by `relativePath:className` so multi-module monorepos
with same-named classes work. (#39)

- **Express anonymous route handler callbacks are now extracted as units.**
`router.post('/orders', authenticateToken, async (req, res) => {...})` —
the anonymous handler callback was previously invisible to the analyzer
because the call-expression argument list wasn't walked. Synthesized units
now carry `route_handler` (last callback) or `route_middleware` (earlier
callbacks) with HTTP method/path metadata. Both unit types are now in
`ENTRY_POINT_TYPES` so the reachability filter doesn't drop them. The
receiver filter (`app` / `router` / `routes` / `server` / `web` / `api`
/ `endpoints` / `controller`) prevents false positives on
`myCache.get(...)` style calls. Named middleware identifiers become
call-graph edges so `authenticateToken` shows up as an upstream
dependency of the handler. Closes #21. (#49)
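
The resolution priority in the dependency-injection entry above (exact type, then nominal, then unambiguous prefix) can be illustrated with a small Go sketch. This is not the TypeScript parser's code: `classInfo`, `hasParent`, and `resolveInjected` are hypothetical names, the `relativePath:className` keying is left out, and the reading that an ambiguous step stops the search instead of falling through to the next step is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// classInfo is a hypothetical record for one parsed class.
type classInfo struct {
	Name    string
	Parents []string // names listed after implements/extends
}

// hasParent reports whether c implements or extends the named type.
func hasParent(c classInfo, name string) bool {
	for _, p := range c.Parents {
		if p == name {
			return true
		}
	}
	return false
}

// resolveInjected maps an injected type name to a concrete class name.
// It tries exact, nominal, then prefix matching, and reports ok=false as
// soon as any step finds more than one candidate (assumed behavior).
func resolveInjected(typeName string, classes []classInfo) (name string, ok bool) {
	steps := []func(classInfo) bool{
		func(c classInfo) bool { return c.Name == typeName },
		func(c classInfo) bool { return hasParent(c, typeName) },
		func(c classInfo) bool { return strings.HasPrefix(c.Name, typeName) },
	}
	for _, matches := range steps {
		var found []string
		for _, c := range classes {
			if matches(c) {
				found = append(found, c.Name)
			}
		}
		if len(found) == 1 {
			return found[0], true
		}
		if len(found) > 1 {
			return "", false // ambiguous: refuse to guess
		}
	}
	return "", false // no candidate at any step
}

func main() {
	classes := []classInfo{
		{Name: "CallServiceV1"},
		{Name: "SqlUserRepo", Parents: []string{"UserRepo"}},
		{Name: "MailerA"},
		{Name: "MailerB"},
	}
	fmt.Println(resolveInjected("CallService", classes)) // prefix step: CallServiceV1 true
	fmt.Println(resolveInjected("UserRepo", classes))    // nominal step: SqlUserRepo true
	fmt.Println(resolveInjected("Mailer", classes))      // two prefix candidates: empty name, false
}
```

Refusing to pick between `MailerA` and `MailerB` leaves the call unresolved, which is the trade the entry describes: a missing edge is preferred over a wrong one.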

### Added

- **Auto-reinstall when `pyproject.toml` changes.** The Go CLI now hashes
`libs/openant-core/pyproject.toml` (SHA-256) and stores the hash at
`~/.openant/venv/.deps-hash`. Every `EnsureRuntime` call compares the
stored hash against the current file and re-runs `pip install -e <core>`
automatically when they differ. Eliminates the "user did `git pull`,
dependencies changed, but venv is stale" silent failure mode that
previously required manual reinstall. Best-effort: hash read/write
failures degrade gracefully with stderr warnings rather than crashing
the CLI. (#36)

- **`openant init` no longer requires a git repository for local paths.**
Init on a non-git directory (tarball download, generated code, locally
modified tree) now succeeds with `commit_sha` set to the `"nogit"`
placeholder. `--commit` on a non-git directory warns and is ignored
rather than hard-failing. Adds a shared `config/languages.json`
consumed by both the Go CLI and the Python parser adapter — single
source of truth for file-extension mappings and skip directories,
eliminating Go↔Python drift. Language auto-detection is exposed as
opt-in via `-l auto` (experimental dominance heuristic — see #61 for
the validation work needed before it becomes the default). (#40)

- **`--llm-reachability` opt-in stage on `openant scan`.** A new optional
review pass that uses Opus (default) to surface reachability signals
the structural analysis misses — likely entry points (framework
handlers, plugin/CLI registrations, message queues), external content
ingestion sites (HTTP request bodies, file/network reads, env/argv,
IPC), and async/cross-process data flows. Promote-only semantics:
signals can mark units as entry points but never demote a unit the
structural pass kept. When enabled, parse runs with `processing_level
= "all"` so the LLM sees the full unfiltered codebase, then the
structural reachability filter re-runs with LLM-promoted entry points
added as additional BFS seeds. Output: `llm_reachability.json` plus
per-unit `llm_reachability_signals` field on `dataset.json`.
Cost-conscious: calls are batched (default 25 units per Opus call), and
cost scales with total repo size rather than the filtered unit count. Off
by default. (#50)

- **All parsers now write `call_graph.json`.** Previously only the Python
and Zig parsers persisted this file; JS, Go, C, Ruby, and PHP did
reachability filtering internally and didn't expose the graph. Required
for the new `--llm-reachability` re-filter to work across all
languages. Defensive WARNING in `scanner.py` fires with a cost-impact
message if the file is ever missing for a language that should support
it. (#50)
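
The promote-only re-filter described in the `--llm-reachability` entry above amounts to re-running a breadth-first search over the call graph with extra seeds. The Go sketch below is illustrative only: the real filter lives in the Python core, and `reachable`, `refilter`, and the `caller -> callees` map shape are assumptions, not the `call_graph.json` schema.

```go
package main

import "fmt"

// reachable runs a breadth-first search over a call graph (caller -> callees)
// from every seed and returns the set of visited units.
func reachable(graph map[string][]string, seeds []string) map[string]bool {
	seen := make(map[string]bool)
	queue := make([]string, 0, len(seeds))
	for _, s := range seeds {
		if !seen[s] {
			seen[s] = true
			queue = append(queue, s)
		}
	}
	for len(queue) > 0 {
		u := queue[0]
		queue = queue[1:]
		for _, v := range graph[u] {
			if !seen[v] {
				seen[v] = true
				queue = append(queue, v)
			}
		}
	}
	return seen
}

// refilter re-runs the search with LLM-promoted units appended to the
// structural seeds. Seeds are only ever added, so the result is a superset
// of the structural-only result: promotion can never demote a unit.
func refilter(graph map[string][]string, structuralSeeds, promoted []string) map[string]bool {
	seeds := append(append([]string{}, structuralSeeds...), promoted...)
	return reachable(graph, seeds)
}

func main() {
	graph := map[string][]string{
		"main":        {"parseArgs"},
		"queueWorker": {"handleJob"},
		"handleJob":   {"writeRow"},
	}
	before := reachable(graph, []string{"main"})
	after := refilter(graph, []string{"main"}, []string{"queueWorker"})
	fmt.Println(len(before), len(after)) // prints: 2 5
}
```

In this toy graph, `queueWorker` has no structural caller, so the first pass drops it along with everything it calls; promoting it as a seed brings all three units back.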

## [2026-05-10] — Windows compatibility & CI hardening

### Fixed
176 changes: 160 additions & 16 deletions apps/openant-cli/cmd/init.go
@@ -1,7 +1,9 @@
package cmd

import (
"encoding/json"
"fmt"
"io/fs"
"os"
"os/exec"
"path/filepath"
@@ -26,6 +28,7 @@ After init, all commands (parse, scan, etc.) work without path arguments.
Examples:
openant init https://github.com/grafana/grafana -l go
openant init https://github.com/grafana/grafana -l go --commit 591ceb2eec0
openant init https://github.com/grafana/grafana -l auto
openant init ./repos/grafana -l go
openant init ./repos/grafana -l go --name myorg/grafana`,
Args: cobra.ExactArgs(1),
@@ -44,7 +47,7 @@ var (
)

func init() {
initCmd.Flags().StringVarP(&initLanguage, "language", "l", "", "Language to analyze: python, javascript, go, c, ruby, php (required)")
initCmd.Flags().StringVarP(&initLanguage, "language", "l", "", "Language to analyze: python, javascript, go, c, ruby, php, zig, auto (auto = experimental dominance heuristic; see #61)")
initCmd.Flags().StringVar(&initCommit, "commit", "", "Specific commit SHA (default: HEAD)")
initCmd.Flags().StringVar(&initName, "name", "", "Override project name (default: derived from URL/path)")
initCmd.Flags().BoolVar(&initFull, "full", false, "Force full scan (rejects --incremental/--diff-base/--pr)")
@@ -118,7 +121,7 @@ func runInit(cmd *cobra.Command, args []string) {
}
}
} else {
// Local: verify it's a git repo and resolve absolute path
// Local: resolve absolute path
source = "local"

absPath, err := filepath.Abs(input)
@@ -127,29 +130,48 @@ func runInit(cmd *cobra.Command, args []string) {
os.Exit(1)
}

if _, err := os.Stat(filepath.Join(absPath, ".git")); err != nil {
output.PrintError(fmt.Sprintf("%s is not a git repository (no .git directory)", absPath))
repoPath = absPath
}

// Auto-detect language if not specified
if initLanguage == "" || initLanguage == "auto" {
fmt.Fprintf(os.Stderr, "Auto-detecting language...\n")
detected, err := detectLanguage(repoPath)
if err != nil {
output.PrintError(fmt.Sprintf("Language auto-detection failed: %s\nSpecify manually with -l/--language", err))
os.Exit(1)
}
initLanguage = detected
fmt.Fprintf(os.Stderr, "Detected language: %s\n", initLanguage)
}

repoPath = absPath
// Get commit SHA (best-effort — not all local paths are git repos)
isGit := false
if _, err := os.Stat(filepath.Join(repoPath, ".git")); err == nil {
isGit = true
}

// Get commit SHA
commitSHA := initCommit
if commitSHA == "" {
out, err := exec.Command("git", "-C", repoPath, "rev-parse", "HEAD").Output()
if err != nil {
output.PrintError(fmt.Sprintf("Failed to get HEAD commit: %s", err))
os.Exit(1)
if isGit {
if commitSHA == "" {
out, err := exec.Command("git", "-C", repoPath, "rev-parse", "HEAD").Output()
if err != nil {
output.PrintError(fmt.Sprintf("Failed to get HEAD commit: %s", err))
os.Exit(1)
}
commitSHA = strings.TrimSpace(string(out))
} else {
// Resolve short SHA to full SHA
out, err := exec.Command("git", "-C", repoPath, "rev-parse", commitSHA).Output()
if err == nil {
commitSHA = strings.TrimSpace(string(out))
}
}
commitSHA = strings.TrimSpace(string(out))
} else {
// Resolve short SHA to full SHA
out, err := exec.Command("git", "-C", repoPath, "rev-parse", commitSHA).Output()
if err == nil {
commitSHA = strings.TrimSpace(string(out))
if commitSHA != "" {
output.PrintWarning("--commit ignored: not a git repository")
}
commitSHA = "nogit"
}

// Create project
@@ -224,3 +246,125 @@ func runInit(cmd *cobra.Command, args []string) {
output.PrintSuccess("Set as active project")
fmt.Println()
}

// languagesConfig is the structure of config/languages.json.
type languagesConfig struct {
SkipDirs []string `json:"skip_dirs"`
Extensions map[string]string `json:"extensions"`
}

// findLanguagesConfig locates config/languages.json by walking up from the
// executable path and then the current working directory.
func findLanguagesConfig() (string, error) {
rel := filepath.Join("config", "languages.json")

// Strategy 1: walk up from the executable.
if exePath, err := os.Executable(); err == nil {
exePath, _ = filepath.EvalSymlinks(exePath)
dir := filepath.Dir(exePath)
for range 6 {
candidate := filepath.Join(dir, rel)
if info, err := os.Stat(candidate); err == nil && !info.IsDir() {
return candidate, nil
}
parent := filepath.Dir(dir)
if parent == dir {
break
}
dir = parent
}
}

// Strategy 2: walk up from CWD.
if cwd, err := os.Getwd(); err == nil {
dir := cwd
for range 6 {
candidate := filepath.Join(dir, rel)
if info, err := os.Stat(candidate); err == nil && !info.IsDir() {
return candidate, nil
}
parent := filepath.Dir(dir)
if parent == dir {
break
}
dir = parent
}
}

return "", fmt.Errorf("could not find config/languages.json from executable or working directory")
}

// loadLanguagesConfig loads the shared language detection config.
func loadLanguagesConfig() (*languagesConfig, error) {
path, err := findLanguagesConfig()
if err != nil {
return nil, err
}
data, err := os.ReadFile(path)
if err != nil {
return nil, fmt.Errorf("failed to read %s: %w", path, err)
}
var cfg languagesConfig
if err := json.Unmarshal(data, &cfg); err != nil {
return nil, fmt.Errorf("failed to parse %s: %w", path, err)
}
return &cfg, nil
}

// detectLanguage walks a repository and returns the dominant language by file count.
// Extension mappings and skip directories are loaded from config/languages.json
// (shared with libs/openant-core/core/parser_adapter.py::detect_language()).
func detectLanguage(repoPath string) (string, error) {
cfg, err := loadLanguagesConfig()
if err != nil {
return "", fmt.Errorf("failed to load language config: %w", err)
}

skipDirs := make(map[string]bool, len(cfg.SkipDirs))
for _, d := range cfg.SkipDirs {
skipDirs[d] = true
}

counts := make(map[string]int)

err = filepath.WalkDir(repoPath, func(path string, d fs.DirEntry, err error) error {
if err != nil {
return nil // skip inaccessible paths
}
if d.IsDir() {
if skipDirs[d.Name()] {
return filepath.SkipDir
}
return nil
}

ext := strings.ToLower(filepath.Ext(d.Name()))
if lang, ok := cfg.Extensions[ext]; ok {
counts[lang]++
}
return nil
})
if err != nil {
return "", fmt.Errorf("failed to walk repository: %w", err)
}

// Find the dominant language
bestLang := ""
bestCount := 0
for lang, count := range counts {
if count > bestCount {
bestCount = count
bestLang = lang
}
}

if bestLang == "" {
return "", fmt.Errorf(
"no supported source files found in %s. "+
"Supported languages: Python, JavaScript/TypeScript, Go, C/C++, Ruby, PHP, Zig",
repoPath,
)
}

return bestLang, nil
}
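
Go does not define map iteration order, so when two languages have the same file count, the `for lang, count := range counts` loop in `detectLanguage` may pick a different winner from one run to the next. A minimal sketch of a deterministic variant follows, assuming alphabetical tie-breaking is acceptable. The PR does not state a tie-break rule, and `dominantLanguage` is a hypothetical helper, not part of this change.

```go
package main

import (
	"fmt"
	"sort"
)

// dominantLanguage returns the language with the highest file count.
// Ties go to the alphabetically first name, so repeated runs on the same
// tree agree. It returns "" when counts is empty.
func dominantLanguage(counts map[string]int) string {
	langs := make([]string, 0, len(counts))
	for lang := range counts {
		langs = append(langs, lang)
	}
	sort.Strings(langs) // fixed visiting order, independent of map layout

	best := ""
	bestCount := 0
	for _, lang := range langs {
		// Strict ">" keeps the earlier (alphabetically first) name on a tie.
		if counts[lang] > bestCount {
			best, bestCount = lang, counts[lang]
		}
	}
	return best
}

func main() {
	counts := map[string]int{"python": 3, "go": 3, "c": 1}
	fmt.Println(dominantLanguage(counts)) // prints: go
}
```

Sorting the keys costs little here because the map holds one entry per supported language, not one per file.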