Feature: Implement Code Health Analysis Tool (rpg_analyze_health) #58

@VooDisss

Summary

Implement a new MCP tool, analyze_health, that reports code health metrics: coupling analysis, instability detection, centrality scoring, god object identification, and two types of duplication detection (token-based and semantic).

Motivation

This feature implements the Code Health Meter (CHM) framework described by Khalfallah (2025; see References), giving RPG users actionable insights into code architecture quality. The tool enables:

  • Identifying unstable modules that depend on too many others
  • Finding god objects with excessive coupling
  • Detecting code duplicates (both exact copies and semantic/conceptual duplicates)
  • Supporting evidence-based refactoring decisions

Implementation Details

New Files

  1. crates/rpg-nav/src/health.rs (587 lines)

    • HealthReport, EntityHealth, HealthSummary structs
    • Graph metrics computation: in-degree (afferent coupling, Ca), out-degree (efferent coupling, Ce), instability index (I = Ce / (Ca + Ce))
    • Degree centrality calculation (normalized)
    • God object detection (high degree + extreme instability)
    • Hub entity identification
    • compute_health() and compute_health_full() functions
  2. crates/rpg-nav/src/duplication.rs (1071 lines)

    • Rabin-Karp rolling hash fingerprinting for token-based clone detection
    • Per-entity tokenization using entity line ranges
    • Language-agnostic tokenization (supports Rust, TypeScript, Python, Go, Java, C/C++)
    • Semantic duplication detection via Jaccard similarity on lifted features
    • Configuration structs: DuplicationConfig, SemanticDuplicationConfig
    • detect_duplication() and detect_semantic_duplicates() functions
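
To illustrate the token-based approach listed for duplication.rs, here is a minimal sketch of Rabin-Karp rolling-hash fingerprinting over token windows. It is not the code from the module: the names `fingerprints`, `shared_windows`, and `token_value`, the constants `BASE` and `MODULUS`, and the window size are all assumptions made for this example.

```rust
use std::collections::HashSet;

// Hypothetical constants for this sketch; the real module may use others.
const BASE: u64 = 257;
const MODULUS: u64 = 1_000_000_007;

/// Map a token to an integer below MODULUS (toy hash for illustration).
fn token_value(token: &str) -> u64 {
    token
        .bytes()
        .fold(0u64, |acc, b| (acc * 31 + b as u64) % MODULUS)
}

/// Rolling hash of every window of `window` consecutive tokens.
fn fingerprints(tokens: &[&str], window: usize) -> Vec<u64> {
    if window == 0 || tokens.len() < window {
        return Vec::new();
    }
    let values: Vec<u64> = tokens.iter().map(|t| token_value(t)).collect();

    // BASE^(window - 1) mod MODULUS, used to remove the outgoing token.
    let mut high = 1u64;
    for _ in 1..window {
        high = high * BASE % MODULUS;
    }

    // Hash of the first window, computed directly.
    let mut hash = 0u64;
    for &v in &values[..window] {
        hash = (hash * BASE + v) % MODULUS;
    }

    let mut out = vec![hash];
    for i in window..values.len() {
        // Slide by one token: drop the oldest value, append the newest.
        let outgoing = values[i - window] * high % MODULUS;
        hash = (hash + MODULUS - outgoing) % MODULUS;
        hash = (hash * BASE + values[i]) % MODULUS;
        out.push(hash);
    }
    out
}

/// Count windows of `b` whose fingerprint also occurs in `a`.
/// Equal hashes only suggest a clone; a real detector would
/// compare the tokens themselves to rule out collisions.
fn shared_windows(a: &[&str], b: &[&str], window: usize) -> usize {
    let seen: HashSet<u64> = fingerprints(a, window).into_iter().collect();
    fingerprints(b, window)
        .into_iter()
        .filter(|h| seen.contains(h))
        .count()
}
```

Because each slide updates the hash in constant time, fingerprinting a token stream costs time linear in its length, which is what makes the technique practical for per-entity clone detection.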

Modified Files

  1. crates/rpg-mcp/src/params.rs

    • Added AnalyzeHealthParams struct with parameters:
      • instability_threshold (default: 0.7)
      • god_object_threshold (default: 10)
      • include_duplication (default: false)
      • include_semantic_duplication (default: false)
      • semantic_similarity_threshold (default: 0.6)
  2. crates/rpg-mcp/src/tools.rs

    • Added analyze_health tool handler
  3. crates/rpg-nav/src/lib.rs

    • Added pub mod health; and pub mod duplication;
  4. crates/rpg-nav/src/toon.rs

    • Added health report formatting for LLM-friendly output
  5. crates/rpg-nav/Cargo.toml

    • Added rayon dependency for parallel file processing
  6. crates/rpg-nav/src/search.rs

    • Exposed jaccard_similarity for use by duplication module
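
For context on the semantic duplication check, the sketch below shows Jaccard similarity on two feature sets. The function name `jaccard_sketch` and its signature are assumptions for illustration; the actual `jaccard_similarity` in search.rs may take different types. The feature strings are invented examples.

```rust
use std::collections::HashSet;

/// Jaccard similarity |A ∩ B| / |A ∪ B| between two feature sets.
/// Returning 0.0 when both sets are empty is an assumption of this sketch.
fn jaccard_sketch(a: &HashSet<&str>, b: &HashSet<&str>) -> f64 {
    let union = a.union(b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(b).count() as f64 / union as f64
}

/// True when two entities would count as semantic duplicates
/// under the given similarity threshold.
fn is_semantic_duplicate(a: &HashSet<&str>, b: &HashSet<&str>, threshold: f64) -> bool {
    jaccard_sketch(a, b) >= threshold
}
```

As a worked case, two entities that share 2 of 4 distinct features score 0.5, which falls below the default semantic_similarity_threshold of 0.6, while sharing 3 of 4 scores 0.75 and would be reported. Whether the real comparison uses `>=` or `>` is not stated in this issue.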

API

MCP Tool: analyze_health

Parameters:

{
  "instability_threshold": 0.7,
  "god_object_threshold": 10,
  "include_duplication": false,
  "include_semantic_duplication": false,
  "semantic_similarity_threshold": 0.6
}
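
The defaults above could be modeled on the Rust side roughly as follows. This is a sketch, not the struct from params.rs: the name `AnalyzeHealthParamsSketch` and the field types are assumptions, and the real struct presumably also carries deserialization attributes that are omitted here.

```rust
/// Hypothetical mirror of the tool parameters; field types are assumed.
#[derive(Debug, Clone, PartialEq)]
struct AnalyzeHealthParamsSketch {
    instability_threshold: f64,
    god_object_threshold: usize,
    include_duplication: bool,
    include_semantic_duplication: bool,
    semantic_similarity_threshold: f64,
}

impl Default for AnalyzeHealthParamsSketch {
    /// Defaults taken from the parameter list in this issue.
    fn default() -> Self {
        Self {
            instability_threshold: 0.7,
            god_object_threshold: 10,
            include_duplication: false,
            include_semantic_duplication: false,
            semantic_similarity_threshold: 0.6,
        }
    }
}

/// Example override: enable both duplication passes, keep other defaults.
fn with_all_duplication_checks() -> AnalyzeHealthParamsSketch {
    AnalyzeHealthParamsSketch {
        include_duplication: true,
        include_semantic_duplication: true,
        ..Default::default()
    }
}
```

Keeping both duplication passes off by default matches the parameter list, so a caller opts in to the more expensive analyses explicitly.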

Output:

# Code Health Analysis

entities: 888 (845 analyzed)
dependency_edges: 2267
avg_instability: 0.374
avg_centrality: 0.0060
god_objects: 12
highly_unstable: 58
highly_stable: 57
hubs: 168

## Top Unstable Entities (I > 0.7)
- entity_id | instability=X.XXX | in=X out=X

## God Object Candidates
- entity_id | degree=X | instability=X.XXX

## Duplication Hotspots (when enabled)
- similarity=X% | tokens=N | entities=2

## Semantic Duplication (Conceptual Clones) (when enabled)
- similarity=X% | shared: [feature1, feature2]
    entity_id (file_path)
    entity_id (file_path)

## Recommendations
1. Refactor god objects
2. Review hub entities
3. Extract shared abstractions
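
To make the "Top Unstable Entities" section concrete, the following sketch computes Ca, Ce, and I = Ce / (Ca + Ce) from a list of dependency edges and filters by the instability threshold. The types and function names (`Metrics`, `degree_metrics`, `unstable_entities`) are assumptions for this example, as is the choice to treat an entity with no edges as having no defined instability; health.rs may handle that case differently.

```rust
use std::collections::HashMap;

/// Coupling counts for one entity.
#[derive(Debug, Default, Clone, PartialEq)]
struct Metrics {
    ca: usize, // in-degree: how many entities depend on this one
    ce: usize, // out-degree: how many entities this one depends on
}

impl Metrics {
    /// Instability I = Ce / (Ca + Ce); None when the entity has no edges.
    fn instability(&self) -> Option<f64> {
        let total = self.ca + self.ce;
        if total == 0 {
            None
        } else {
            Some(self.ce as f64 / total as f64)
        }
    }
}

/// Build per-entity counts from (dependent, dependency) edges.
fn degree_metrics<'a>(edges: &[(&'a str, &'a str)]) -> HashMap<&'a str, Metrics> {
    let mut metrics: HashMap<&str, Metrics> = HashMap::new();
    for &(from, to) in edges {
        metrics.entry(from).or_default().ce += 1;
        metrics.entry(to).or_default().ca += 1;
    }
    metrics
}

/// Entity ids whose instability exceeds `threshold`, sorted by name.
fn unstable_entities<'a>(
    metrics: &HashMap<&'a str, Metrics>,
    threshold: f64,
) -> Vec<&'a str> {
    let mut ids: Vec<&str> = metrics
        .iter()
        .filter(|(_, m)| m.instability().map_or(false, |i| i > threshold))
        .map(|(id, _)| *id)
        .collect();
    ids.sort();
    ids
}
```

For example, with the edges cli→parser, cli→config, cli→logger, parser→logger, and config→logger, cli has Ca = 0 and Ce = 3, so I = 1.0 and it exceeds the default threshold of 0.7. logger has Ca = 3 and Ce = 0, so I = 0.0, which marks it as stable.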

Testing

  • 17 unit tests in duplication module
  • 6 unit tests in health module
  • Tests cover: tokenization, Rabin-Karp fingerprinting, Jaccard similarity, per-entity detection, edge cases

References

Khalfallah, B. H. (2025). Code Health Meter: A Quantitative and Graph-Theoretic Foundation for Automated Code Quality and Architecture Assessment. ACM Transactions on Software Engineering and Methodology. https://doi.org/10.1145/3737670
