Description
Summary
Implement a new MCP tool analyze_health that provides code health metrics including coupling analysis, instability detection, centrality scoring, god object identification, and two types of duplication detection (token-based and semantic).
Motivation
This feature implements the Code Health Meter (CHM) framework described by Khalfallah (2025, see References), giving RPG users actionable insight into the quality of their code architecture. The tool enables:
- Identifying unstable modules that depend on too many others
- Finding god objects with excessive coupling
- Detecting code duplicates (both exact copies and semantic/conceptual duplicates)
- Supporting evidence-based refactoring decisions
Implementation Details
New Files
- crates/rpg-nav/src/health.rs (587 lines)
  - HealthReport, EntityHealth, HealthSummary structs
  - Graph metrics computation: in-degree (Ca), out-degree (Ce), instability index (I = Ce / (Ca + Ce))
  - Degree centrality calculation (normalized)
  - God object detection (high degree + extreme instability)
  - Hub entity identification
  - compute_health() and compute_health_full() functions
- crates/rpg-nav/src/duplication.rs (1071 lines)
  - Rabin-Karp rolling hash fingerprinting for token-based clone detection
  - Per-entity tokenization using entity line ranges
  - Language-agnostic tokenization (supports Rust, TypeScript, Python, Go, Java, C/C++)
  - Semantic duplication detection via Jaccard similarity on lifted features
  - Configuration structs: DuplicationConfig, SemanticDuplicationConfig
  - detect_duplication() and detect_semantic_duplicates() functions
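To illustrate the token-based clone detection listed for duplication.rs, here is a minimal sketch of Rabin-Karp rolling-hash fingerprinting over a token stream. It is not the code from the crate: the names `token_hash`, `rolling_fingerprints`, and `shared_windows`, the constants `BASE` and `MODULUS`, and the window size are all assumptions chosen for illustration.

```rust
use std::collections::HashMap;

// Illustrative hash parameters (assumed, not taken from the crate).
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Polynomial hash of one token's bytes, reduced modulo MODULUS.
fn token_hash(token: &str) -> u64 {
    token.bytes().fold(0u64, |h, b| (h * BASE + b as u64) % MODULUS)
}

/// Returns (window_start, fingerprint) for every run of `window` consecutive
/// tokens. After the first window, each fingerprint is derived from the
/// previous one in constant time.
fn rolling_fingerprints(tokens: &[&str], window: usize) -> Vec<(usize, u64)> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }
    let hashes: Vec<u64> = tokens.iter().map(|t| token_hash(t)).collect();

    // BASE^(window - 1) mod MODULUS: the weight of the token leaving the window.
    let mut high = 1u64;
    for _ in 1..window {
        high = high * BASE % MODULUS;
    }

    // Hash of the first window, computed directly.
    let mut h = 0u64;
    for &x in &hashes[..window] {
        h = (h * BASE + x) % MODULUS;
    }
    let mut out = vec![(0, h)];

    // Slide the window: drop the oldest token, shift, add the newest token.
    for i in window..hashes.len() {
        let leaving = hashes[i - window] * high % MODULUS;
        h = (h + MODULUS - leaving) % MODULUS;
        h = (h * BASE + hashes[i]) % MODULUS;
        out.push((i + 1 - window, h));
    }
    out
}

/// Start positions (in `a`, in `b`) of windows that the two token streams share.
/// Fingerprint hits are re-checked token by token to rule out hash collisions.
fn shared_windows(a: &[&str], b: &[&str], window: usize) -> Vec<(usize, usize)> {
    let mut index: HashMap<u64, Vec<usize>> = HashMap::new();
    for (pos, fp) in rolling_fingerprints(a, window) {
        index.entry(fp).or_default().push(pos);
    }
    let mut matches = Vec::new();
    for (pos_b, fp) in rolling_fingerprints(b, window) {
        if let Some(starts) = index.get(&fp) {
            for &pos_a in starts {
                if a[pos_a..pos_a + window] == b[pos_b..pos_b + window] {
                    matches.push((pos_a, pos_b));
                }
            }
        }
    }
    matches
}
```

Calling `shared_windows` on the token streams of two entities yields the window positions they have in common; a real detector would presumably merge adjacent hits into longer clone regions and derive a similarity percentage from them.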
Modified Files
- crates/rpg-mcp/src/params.rs
  - Added AnalyzeHealthParams struct with parameters:
    - instability_threshold (default: 0.7)
    - god_object_threshold (default: 10)
    - include_duplication (default: false)
    - include_semantic_duplication (default: false)
    - semantic_similarity_threshold (default: 0.6)
- crates/rpg-mcp/src/tools.rs
  - Added analyze_health tool handler
- crates/rpg-nav/src/lib.rs
  - Added pub mod health; and pub mod duplication;
- crates/rpg-nav/src/toon.rs
  - Added health report formatting for LLM-friendly output
- crates/rpg-nav/Cargo.toml
  - Added rayon dependency for parallel file processing
- crates/rpg-nav/src/search.rs
  - Exposed jaccard_similarity for use by duplication module
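For readers unfamiliar with the measure behind semantic duplication detection, the following is a rough sketch of Jaccard similarity over feature sets and of a pairwise comparison against a threshold. The function names `jaccard` and `semantic_duplicates` and their signatures are hypothetical; the actual jaccard_similarity in search.rs may differ, and whether the threshold comparison is inclusive is an assumption.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Jaccard similarity |A ∩ B| / |A ∪ B|; defined here as 0.0 when both sets are empty.
fn jaccard<T: Eq + Hash>(a: &HashSet<T>, b: &HashSet<T>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// Compares every pair of entities and keeps those whose feature sets reach
/// `threshold` (inclusive comparison is an assumption of this sketch).
fn semantic_duplicates<'a>(
    entities: &[(&'a str, HashSet<&'a str>)],
    threshold: f64,
) -> Vec<(&'a str, &'a str, f64)> {
    let mut pairs = Vec::new();
    for i in 0..entities.len() {
        for j in (i + 1)..entities.len() {
            let similarity = jaccard(&entities[i].1, &entities[j].1);
            if similarity >= threshold {
                pairs.push((entities[i].0, entities[j].0, similarity));
            }
        }
    }
    pairs
}
```

The all-pairs loop is quadratic in the number of entities, which is fine for a sketch; on larger graphs an implementation would likely prune candidates first or parallelize the comparison.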
API
MCP Tool: analyze_health
Parameters:
{
"instability_threshold": 0.7,
"god_object_threshold": 10,
"include_duplication": false,
"include_semantic_duplication": false,
"semantic_similarity_threshold": 0.6
}
Output:
# Code Health Analysis
entities: 888 (845 analyzed)
dependency_edges: 2267
avg_instability: 0.374
avg_centrality: 0.0060
god_objects: 12
highly_unstable: 58
highly_stable: 57
hubs: 168
## Top Unstable Entities (I > 0.7)
- entity_id | instability=X.XXX | in=X out=X
## God Object Candidates
- entity_id | degree=X | instability=X.XXX
## Duplication Hotspots (when enabled)
- similarity=X% | tokens=N | entities=2
## Semantic Duplication (Conceptual Clones) (when enabled)
- similarity=X% | shared: [feature1, feature2]
entity_id (file_path)
entity_id (file_path)
## Recommendations
1. Refactor god objects
2. Review hub entities
3. Extract shared abstractions
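The per-entity figures in the report (in, out, instability, degree) follow from the graph metrics described under Implementation Details. As a rough sketch of how they could be derived from a list of dependency edges, assuming each edge `(from, to)` means that `from` depends on `to`: the type `EntityMetrics` and the functions `compute_metrics`, `highly_unstable`, and `high_degree` are hypothetical and are not the crate's compute_health() API. Normalizing centrality by n - 1 is also an assumption, and the instability condition for god objects is left out because the description does not define "extreme".

```rust
use std::collections::BTreeMap;

/// Metrics for one entity (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq)]
struct EntityMetrics {
    ca: usize,        // afferent coupling: incoming edges (in-degree)
    ce: usize,        // efferent coupling: outgoing edges (out-degree)
    instability: f64, // I = Ce / (Ca + Ce)
    centrality: f64,  // (Ca + Ce) / (n - 1), an assumed normalization
}

/// Computes metrics for every entity that appears in at least one edge.
fn compute_metrics(edges: &[(&str, &str)]) -> BTreeMap<String, EntityMetrics> {
    // Entity id -> (in-degree, out-degree).
    let mut degrees: BTreeMap<String, (usize, usize)> = BTreeMap::new();
    for &(from, to) in edges {
        degrees.entry(from.to_string()).or_insert((0, 0)).1 += 1;
        degrees.entry(to.to_string()).or_insert((0, 0)).0 += 1;
    }
    let n = degrees.len();
    degrees
        .into_iter()
        .map(|(id, (ca, ce))| {
            // Every entity here has at least one edge, so `total` is never zero.
            let total = ca + ce;
            let instability = ce as f64 / total as f64;
            // Guard against a graph made only of self-loops on one entity.
            let centrality = if n > 1 { total as f64 / (n - 1) as f64 } else { 0.0 };
            (id, EntityMetrics { ca, ce, instability, centrality })
        })
        .collect()
}

/// Entities with I strictly above the threshold, in id order.
fn highly_unstable(metrics: &BTreeMap<String, EntityMetrics>, threshold: f64) -> Vec<&str> {
    metrics
        .iter()
        .filter(|(_, m)| m.instability > threshold)
        .map(|(id, _)| id.as_str())
        .collect()
}

/// Entities whose total degree reaches `min_degree`: the degree half of a
/// god-object check.
fn high_degree(metrics: &BTreeMap<String, EntityMetrics>, min_degree: usize) -> Vec<&str> {
    metrics
        .iter()
        .filter(|(_, m)| m.ca + m.ce >= min_degree)
        .map(|(id, _)| id.as_str())
        .collect()
}
```

With the tool's defaults, `highly_unstable(&metrics, 0.7)` would correspond to the "Top Unstable Entities (I > 0.7)" section and `high_degree(&metrics, 10)` to the degree criterion for god object candidates.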
Testing
- 17 unit tests in duplication module
- 6 unit tests in health module
- Tests cover: tokenization, Rabin-Karp fingerprinting, Jaccard similarity, per-entity detection, edge cases
References
Khalfallah, B. H. (2025). Code Health Meter: A Quantitative and Graph-Theoretic Foundation for Automated Code Quality and Architecture Assessment. ACM Transactions on Software Engineering and Methodology. https://doi.org/10.1145/3737670