GenEC: Safe and Explainable Extract Class Refactoring

A hybrid framework for automated Extract Class refactoring in Java. GenEC fuses static dependency analysis with evolutionary coupling mined from Git history, uses a constrained LLM (Claude Sonnet) for semantic artifacts only (names, rationales, confidence), and delegates all structural edits to deterministic code generation built on Eclipse JDT's AST rewriting infrastructure. Multi-tier verification ensures no unsafe suggestion reaches the developer.

Paper: Uditanshu Tomar, Vijay Kumar Poloju, Danny Dig. "GenEC: A Hybrid Framework for Safe and Explainable Extract Class Refactoring." Targeting ICSE 2027.

Key Results

178 candidate suggestions across 23 large God Classes from 6 open-source projects
58.4% verification rate (104 of 178 pass multi-tier verification)
41.6% of proposals (74 of 178) blocked as unsafe before reaching developers
Macro F1 = 0.478 on the HECS ECAccEval benchmark (21 instances with evolutionary context)
4.8x more extraction opportunities than metric-only baselines (Wilcoxon p=0.0005)

Features

Hybrid Analysis: Fuses static dependency graphs with evolutionary co-change coupling from Git history
Constrained LLM: Claude Sonnet generates only semantic artifacts (class names, rationales, confidence scores); never generates or modifies code
Deterministic Code Generation: Eclipse JDT AST rewriting handles all structural transformations
Multi-Tier Verification: Compilation (with stub generation), structural integrity, and behavioral test validation
Structural Transformation Plans: Actionable guidance when automatic extraction is not safe
VS Code Extension: Interactive UI for reviewing and applying suggestions
Reproducibility: Cached LLM outputs, deterministic replay mode, configurable random seeds
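
The Hybrid Analysis feature can be pictured as a weighted blend of two method-level graphs. The sketch below is illustrative only: the Jaccard-style normalization of co-change counts, the `alpha` weight, and the function names `co_change_weights` and `fuse` are assumptions made for this example, not GenEC's actual fusion formula.

```python
from collections import Counter
from itertools import combinations


def co_change_weights(commits):
    """Jaccard-style co-change weight per method pair (assumed normalization).

    commits: iterable of sets, each holding the methods changed in one commit.
    """
    pair_counts = Counter()
    method_counts = Counter()
    for changed in commits:
        method_counts.update(changed)
        for a, b in combinations(sorted(changed), 2):
            pair_counts[(a, b)] += 1
    weights = {}
    for (a, b), both in pair_counts.items():
        either = method_counts[a] + method_counts[b] - both
        weights[(a, b)] = both / either
    return weights


def fuse(static_w, evo_w, alpha=0.5):
    """Blend edge weights: alpha * static + (1 - alpha) * evolutionary."""
    edges = set(static_w) | set(evo_w)
    return {
        e: alpha * static_w.get(e, 0.0) + (1 - alpha) * evo_w.get(e, 0.0)
        for e in edges
    }


# Toy data: three commits and two static dependency edges (keys are sorted pairs).
commits = [{"load", "parse"}, {"load", "parse", "render"}, {"render"}]
static = {("load", "parse"): 1.0, ("parse", "render"): 0.2}
fused = fuse(static, co_change_weights(commits), alpha=0.5)
```

In this toy input, `load` and `parse` always change together and also call each other, so their fused edge stays at the maximum weight, while `load` and `render` are linked only through history.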

Prerequisites

Python 3.10+
Java 17+ (for the JDT code generation wrapper)
Git (required for evolutionary coupling analysis)
Maven (optional, needed for behavioral verification)

Installation

cd genec
pip install -e .

Configuration

Set your Anthropic API key:

export ANTHROPIC_API_KEY='your-api-key-here'

Or pass it directly via --api-key on the command line.
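
As a rough illustration of the precedence between the two options (the flag wins over the environment variable, as noted in the CLI flag table), a resolver could look like the following. The helper name `resolve_api_key` and the error message are hypothetical; GenEC's own argument handling may differ.

```python
import argparse
import os


def resolve_api_key(argv, env=None):
    """Return the API key, preferring --api-key over ANTHROPIC_API_KEY."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--api-key", default=None)
    # Ignore every other flag; only the key matters for this sketch.
    args, _unknown = parser.parse_known_args(argv)
    key = args.api_key or env.get("ANTHROPIC_API_KEY")
    if not key:
        raise SystemExit("No API key: set ANTHROPIC_API_KEY or pass --api-key")
    return key
```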

Edit config/config.yaml to customize parameters.

Validation Configuration

The pipeline includes three-tier validation that prevents invalid extractions:

verification:
  enable_extraction_validation: true     # Static analysis before code generation
  suggest_pattern_transformations: true  # LLM design pattern suggestions
  enable_semantic: true                  # Semantic AST validation

refactoring_application:
  enabled: true
  dry_run: false                         # Set to true to preview only
  auto_apply: true                       # Automatically apply verified suggestions
  enable_git: true                       # Create branches and commits
  transactional: true                    # All-or-nothing application
  backup: true                           # Create filesystem backups

Validation Tiers:

Static Validation (~instant): Detects abstract methods, inner classes, missing private dependencies
LLM Semantic (~3-5s per cluster): Validates borderline cases and overrides a static rejection when the LLM's confidence is >= 0.7
Pattern Transformation (~3-5s per cluster): Suggests design patterns to enable blocked extractions
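
One way to read the three tiers is as a short decision cascade. The sketch below is a simplification under stated assumptions: it treats every statically flagged cluster as a borderline case, and `llm_confidence` and `suggest_patterns` are placeholder callables standing in for the two LLM calls. Only the 0.7 threshold comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List

OVERRIDE_THRESHOLD = 0.7  # threshold quoted in the tier description


@dataclass
class Decision:
    accepted: bool
    tier: str  # tier that made the final call
    notes: List[str] = field(default_factory=list)


def validate_cluster(
    static_issues: List[str],
    llm_confidence: Callable[[], float],
    suggest_patterns: Callable[[List[str]], List[str]],
) -> Decision:
    # Tier 1: no static issues means the cluster can go to code generation.
    if not static_issues:
        return Decision(True, "static")
    # Tier 2: the LLM may override the static rejection if confident enough.
    if llm_confidence() >= OVERRIDE_THRESHOLD:
        return Decision(True, "llm_semantic", list(static_issues))
    # Tier 3: still blocked, so return guidance instead of an extraction.
    return Decision(False, "pattern_transformation", suggest_patterns(static_issues))
```

The LLM calls are only made for clusters that fail the instant static check, which keeps the ~3-5s cost off the common path.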

Output:

Valid suggestions → data/outputs/{ClassName}/suggestion_{N}/
Pattern guidance → data/outputs/{ClassName}/transformation_guidance/
Structural plans → data/outputs/structural_plans/{ClassName}/
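
The `transactional` and `backup` options in `refactoring_application` describe all-or-nothing application with filesystem backups. A minimal sketch of that idea, assuming edits arrive as a mapping from file path to new content, is shown below. The function name `apply_transactionally` and the backup layout are hypothetical and do not describe GenEC's implementation.

```python
import shutil
import tempfile
from pathlib import Path


def apply_transactionally(edits):
    """Write all files or none. edits: dict of path -> new content (str).

    Returns the backup directory on success; restores originals on failure.
    """
    backup_dir = Path(tempfile.mkdtemp(prefix="genec_backup_"))
    backups = {}  # pre-existing file -> its backup copy
    created = []  # files that did not exist before this call
    try:
        for i, (path, content) in enumerate(edits.items()):
            path = Path(path)
            if path.exists():
                backup = backup_dir / f"{i}_{path.name}"
                shutil.copy2(path, backup)
                backups[path] = backup
            else:
                created.append(path)
            path.write_text(content, encoding="utf-8")
    except Exception:
        # Roll back: restore overwritten files and delete newly created ones.
        for path, backup in backups.items():
            shutil.copy2(backup, path)
        for path in created:
            path.unlink(missing_ok=True)
        shutil.rmtree(backup_dir, ignore_errors=True)
        raise
    return backup_dir
```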

Usage

Command Line

# Basic usage
genec --target path/to/Class.java --repo path/to/repo

# Preview changes without applying them
genec --target path/to/Class.java --repo path/to/repo --dry-run

# JSON output (for tool integration)
genec --target path/to/Class.java --repo path/to/repo --json

# Save reports and use LLM cache for reproducibility
genec --target path/to/Class.java --repo path/to/repo \
  --report-dir ./reports --cache-dir ./cache --use-cache
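
For tool integration, the same commands can be driven from a script. The wrapper below is a sketch that uses only the flags shown above. It assumes that `--json` writes a single JSON document to stdout and makes no assumption about that document's schema; `build_command` and `run_genec` are hypothetical helper names.

```python
import json
import subprocess


def build_command(target, repo, dry_run=True, cache_dir=None):
    """Assemble a genec invocation from documented flags."""
    cmd = ["genec", "--target", target, "--repo", repo, "--json"]
    if dry_run:
        cmd.append("--dry-run")
    if cache_dir:
        cmd += ["--cache-dir", cache_dir, "--use-cache"]
    return cmd


def run_genec(target, repo, **kwargs):
    """Run genec and parse stdout as JSON (assumed output channel)."""
    proc = subprocess.run(
        build_command(target, repo, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)
```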

CLI Flags

Flag	Description
`--target`	(required) Path to the Java class file to refactor
`--repo`	(required) Path to the repository root
`--config`	Path to config file (default: `config/config.yaml`)
`--dry-run`	Show what would be applied without making changes
`--json`	Output results in JSON format
`--api-key`	Anthropic API key (overrides `ANTHROPIC_API_KEY` env var)
`--report-dir`	Directory to save pipeline reports
`--cache-dir`	Directory for LLM response cache (reproducibility)
`--use-cache`	Use cached LLM responses if available
`--max-suggestions`	Maximum number of suggestions (default: 5)
`--apply-all`	Automatically apply all verified refactorings
`--verbose`	Enable DEBUG-level logging
`--min-cluster-size`	Override minimum cluster size
`--max-cluster-size`	Override maximum cluster size
`--min-cohesion`	Override minimum cohesion threshold
`--check-coverage`	Verify test coverage of extracted classes (requires JaCoCo)
`--seed INT`	Random seed for reproducible clustering
`--max-passes INT`	Maximum decomposition passes (default: 1)
`--no-build`	Disable automatic building of dependencies
`--websocket PORT`	Enable WebSocket progress server on the given port
`--multi-file`	Enable multi-file dependency analysis mode
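
The `--cache-dir` and `--use-cache` flags support deterministic replay of LLM responses. One plausible way to build such a cache, shown purely as an illustration, is to key each response by a hash of the model name and prompt. The class name `ResponseCache`, the key scheme, and the on-disk JSON layout are assumptions and may not match GenEC's cache format.

```python
import hashlib
import json
from pathlib import Path


class ResponseCache:
    """File-backed cache: one JSON file per (model, prompt) pair."""

    def __init__(self, cache_dir):
        self.dir = Path(cache_dir)
        self.dir.mkdir(parents=True, exist_ok=True)

    def _path(self, model, prompt):
        digest = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
        return self.dir / f"{digest}.json"

    def get_or_call(self, model, prompt, call):
        """Return the cached response, or invoke call(prompt) and store it."""
        path = self._path(model, prompt)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))["response"]
        response = call(prompt)
        record = {"model": model, "prompt": prompt, "response": response}
        path.write_text(json.dumps(record), encoding="utf-8")
        return response
```

With a warm cache, a second run never reaches the API, so names and rationales are identical across runs.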

Python API

from genec.core.pipeline import GenECPipeline

# Initialize pipeline
pipeline = GenECPipeline('config/config.yaml')

# Run on a single Java class
result = pipeline.run_full_pipeline(
    class_file='src/main/java/com/example/GodClass.java',
    repo_path='/path/to/git/repo'
)

# View verified suggestions
for suggestion in result.verified_suggestions:
    print(f"New Class: {suggestion.proposed_class_name}")
    print(f"Rationale: {suggestion.rationale}")
    print(f"Confidence: {suggestion.confidence_score}")
    print(f"Code:\n{suggestion.new_class_code}")
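
Results can be post-processed like any other Python objects. The sketch below ranks suggestions by confidence and keeps the strongest ones. To stay runnable without a GenEC installation it uses a stand-in `Suggestion` dataclass that mirrors only the attribute names used above. The helper `top_suggestions`, its default threshold, and the assumption that `confidence_score` is a float between 0 and 1 are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """Stand-in with the attribute names used in the example above."""
    proposed_class_name: str
    rationale: str
    confidence_score: float


def top_suggestions(suggestions, min_confidence=0.7, limit=5):
    """Keep suggestions at or above min_confidence, highest first."""
    kept = [s for s in suggestions if s.confidence_score >= min_confidence]
    kept.sort(key=lambda s: s.confidence_score, reverse=True)
    return kept[:limit]


candidates = [
    Suggestion("OrderValidator", "validation helpers", 0.91),
    Suggestion("Misc", "unclear grouping", 0.40),
    Suggestion("PriceCalculator", "pricing logic", 0.78),
]
best = top_suggestions(candidates)
```

With a real run, `result.verified_suggestions` would take the place of `candidates`.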

Architecture

genec/
├── core/
│   ├── stages/                   # Pipeline stages
│   │   ├── analysis_stage.py
│   │   ├── clustering_stage.py
│   │   ├── graph_processing_stage.py
│   │   ├── naming_stage.py
│   │   └── refactoring_stage.py
│   ├── pipeline_runner.py        # Pipeline orchestrator
│   ├── dependency_analyzer.py    # Static Java analysis
│   ├── evolutionary_miner.py     # Git history mining
│   ├── graph_builder.py          # Graph construction & fusion
│   ├── cluster_detector.py       # Leiden clustering
│   ├── llm_interface.py          # Claude API integration
│   ├── verification_engine.py    # Multi-layer verification
│   └── pipeline.py               # Main orchestration
├── llm/
│   ├── anthropic_client.py       # Centralized LLM client with retry
│   └── __init__.py               # LLM utilities
├── parsers/
│   └── java_parser.py            # Java AST parsing
├── metrics/
│   ├── cohesion_calculator.py    # LCOM5 calculation
│   └── coupling_calculator.py    # CBO calculation
├── verification/
│   ├── extraction_validator.py   # Static extraction validation
│   ├── llm_semantic_validator.py # LLM-based semantic validation
│   ├── llm_pattern_transformer.py # Design pattern suggestions
│   ├── syntactic_verifier.py     # Compilation checks
│   ├── semantic_verifier.py      # AST validation
│   └── behavioral_verifier.py    # Test execution
├── structural/
│   ├── transformer.py            # Structural scaffolding plans
│   └── compile_validator.py      # Build validation
└── evaluation/
    ├── ground_truth_builder.py   # RefactoringMiner integration
    └── comparator.py             # Precision/Recall/F1

Pipeline Stages

AnalysisStage: Extract method calls and field accesses from the Java AST, and analyze Git history for co-changing methods.
GraphProcessingStage: Create and fuse static + evolutionary graphs, calculate metrics, and export data.
ClusteringStage: Apply Leiden algorithm, filter and rank clusters.
- Extraction Validation: Static analysis for abstract methods, inner classes, private dependencies
- Auto-fix: Iterative transitive closure for missing private method dependencies
- LLM Semantic Validation: Intelligent override for borderline cases (confidence >= 0.7)
NamingStage: Generate refactoring suggestions via Claude API (LLM).
RefactoringStage: Apply and verify refactorings.
- Verification: Validate through syntactic, semantic, and behavioral layers
- Transformation Guidance (for rejected clusters): Pattern suggestions and structural plans
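
The Auto-fix step under ClusteringStage is described as an iterative transitive closure over missing private method dependencies. A minimal worklist version of that idea is sketched below. The call-graph representation (a dict from method name to callees) and the function name `close_over_private_deps` are assumptions made for illustration.

```python
def close_over_private_deps(cluster, calls, private_methods):
    """Expand a cluster until every private callee it reaches is included.

    cluster: iterable of method names selected for extraction
    calls: dict mapping a method name to the methods it calls
    private_methods: set of method names that are private in the source class
    """
    closed = set(cluster)
    worklist = list(closed)
    while worklist:
        method = worklist.pop()
        for callee in calls.get(method, ()):
            # Only private callees are pulled in; they would be unreachable
            # from the extracted class otherwise.
            if callee in private_methods and callee not in closed:
                closed.add(callee)
                worklist.append(callee)
    return closed


calls = {
    "save": ["validate", "log"],
    "validate": ["normalize"],
    "normalize": [],
    "render": ["format"],
}
private = {"validate", "normalize", "format"}
expanded = close_over_private_deps({"save"}, calls, private)
```

Here `save` pulls in `validate`, which in turn pulls in `normalize`; the non-private `log` and the unrelated `format` stay behind. A private helper that is also used by methods remaining in the original class would need extra handling that this sketch does not cover.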

Metrics

LCOM5: Lack of Cohesion of Methods (lower is better)
CBO: Coupling Between Objects (lower is better)
Modularity: Graph modularity score (higher is better)
Precision/Recall/F1: Against ground truth refactorings
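
For readers unfamiliar with these metrics, the sketch below computes LCOM5 using the Henderson-Sellers formula, (m - (1/a) * sum of mu(A_j)) / (m - 1), where m is the number of methods, a the number of fields, and mu(A_j) the number of methods that access field j. It also computes precision, recall, and F1 over sets of method names. Treating ground truth as a set of extracted methods, and returning 0.0 for degenerate classes, are conventions chosen for this example rather than GenEC's documented behavior.

```python
def lcom5(accesses, fields):
    """LCOM5 for one class. accesses: dict of method -> set of fields it uses."""
    m, a = len(accesses), len(fields)
    if m <= 1 or a == 0:
        return 0.0  # formula is undefined here; 0.0 is a convention of this sketch
    total = sum(
        sum(1 for used in accesses.values() if f in used) for f in fields
    )
    return (m - total / a) / (m - 1)


def precision_recall_f1(predicted, truth):
    """Set-based precision, recall, and F1."""
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(truth) if truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Two methods that share no fields: minimal cohesion.
split = lcom5({"a": {"x"}, "b": {"y"}}, {"x", "y"})
# Two methods that both use every field: maximal cohesion.
cohesive = lcom5({"a": {"x", "y"}, "b": {"x", "y"}}, {"x", "y"})
p, r, f1 = precision_recall_f1({"m1", "m2", "m3"}, {"m2", "m3", "m4", "m5"})
```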

Testing

pytest tests/

Documentation

Architecture Guide: Detailed architecture documentation including the three-tier validation system
Evaluation README: Comprehensive evaluation methodology, benchmarks, and reproduction instructions

Citation

If you use GenEC in your research, please cite:

@inproceedings{tomar2027genec,
  title     = {GenEC: A Hybrid Framework for Safe and Explainable Extract Class Refactoring},
  author    = {Tomar, Uditanshu and Poloju, Vijay Kumar and Dig, Danny},
  booktitle = {Proceedings of the International Conference on Software Engineering (ICSE)},
  year      = {2027},
  note      = {Under submission}
}

Related Work

GenEC extends the "LLM for semantics, IDE for mechanics" paradigm established by:

EM-Assist (FSE 2024) - Extract Method refactoring with LLMs
MM-Assist (ICSME 2025) - Move Method refactoring with LLMs and semantic embeddings
PyCraft (FSE 2024) - LLM + Transformation by Example for Python

License

MIT License - see LICENSE for details.

Acknowledgments

This work is conducted at the University of Colorado Boulder under the supervision of Prof. Danny Dig. GenEC uses Anthropic's Claude Sonnet API for semantic artifact generation.
