# LLM-Generated Variable Names vs. Human-Written Variable Names in Java

Replication package for the CCSW 325 (Software Construction) course project at the University of Jeddah.

Authors: Joury Ghorab (2312080), Jana Akkad (2313200), Enas Alqarni (2314104)

Instructor: Qamar Naith

Department: Software Engineering, College of Computer Science and Engineering, University of Jeddah

## Overview

This study evaluates whether ChatGPT (GPT-4) can generate variable names in Java that match the clarity, readability, and meaningfulness of names written by human developers. We curated 25 Java snippets containing 132 variables, replaced the original identifiers with generic placeholders (v1, v2, …), asked GPT-4 to suggest new names, and then compared the LLM suggestions to the originals using both automatic metrics and a blinded human survey.
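The placeholder substitution described above can be illustrated with a minimal Python sketch. This is not the repository's `scripts/obfuscate.py`: the function name `obfuscate` and the caller-supplied list of variable names are assumptions made for illustration, and the sketch uses a whole-word regex rather than a Java parser, so it would also rewrite matches inside string literals or comments.

```python
from __future__ import annotations

import re


def obfuscate(java_source: str, variable_names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each listed identifier with v1, v2, ... in listing order.

    Hypothetical helper: the variable names are supplied by the caller;
    nothing here parses Java.
    """
    mapping = {name: f"v{i}" for i, name in enumerate(variable_names, start=1)}
    if not mapping:
        return java_source, mapping
    # Whole-word match so that renaming 'sum' does not touch 'summary'.
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in mapping) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], java_source), mapping


snippet = "int total = 0; for (int index = 0; index < 5; index++) { total += index; }"
obfuscated, mapping = obfuscate(snippet, ["total", "index"])
print(obfuscated)
# int v1 = 0; for (int v2 = 0; v2 < 5; v2++) { v1 += v2; }
```

Returning the mapping alongside the rewritten source keeps the original names available for the later comparison against the LLM suggestions.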

## Repository Structure

.
├── README.md                 # This file
├── snippets/
│   ├── original/             # 25 Java snippets with original (human) variable names
│   └── obfuscated/           # Same 25 snippets with names replaced by v1, v2, ...
├── scripts/
│   ├── obfuscate.py          # Script that produces obfuscated/ from original/
│   └── evaluate.py           # Computes exact match and edit-distance metrics
├── llm_outputs/
│   └── gpt4_responses.json   # Raw API responses from GPT-4 (3 runs per snippet)
├── survey/
│   ├── survey_instrument.md  # The blinded survey shown to participants
│   └── responses.csv         # Anonymized responses from 8 participants
├── results/
│   ├── automatic_metrics.csv # Per-variable exact match and NEDS scores
│   ├── survey_summary.csv    # Aggregated Likert and preference data
│   └── prompt.txt            # The final prompt used for the experiment
└── report/
    └── CCSW325_Final_Report.pdf   # Full course project report

## How to Reproduce

1. **Obfuscate the dataset:**
   ```bash
   python scripts/obfuscate.py snippets/original snippets/obfuscated
   ```

2. **Run GPT-4** on each obfuscated snippet using the prompt in results/prompt.txt. Generation parameters: model=gpt-4, temperature=0.2, three runs per snippet. For each variable, the modal (most frequent) name across the three runs is taken as the LLM suggestion.

3. **Compute automatic metrics:**
   ```bash
   python scripts/evaluate.py snippets/original llm_outputs/gpt4_responses.json
   ```

4. **Survey results** in survey/responses.csv were collected from 8 Software Engineering students using the instrument in survey/survey_instrument.md.
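
The modal-name selection across the three GPT-4 runs, mentioned above, could look like the following sketch. The dictionary layout of `runs` is hypothetical (the actual structure of llm_outputs/gpt4_responses.json is not described here), and the tie-breaking rule, keeping the earliest suggestion when all runs disagree, is an assumption rather than the study's documented procedure.

```python
from __future__ import annotations

from collections import Counter


def modal_name(suggestions: list[str]) -> str:
    """Return the most frequent suggestion.

    Assumed tie-break: Counter.most_common keeps first-seen order among
    equal counts, so the earliest run wins when all runs disagree.
    """
    if not suggestions:
        raise ValueError("at least one suggestion is required")
    return Counter(suggestions).most_common(1)[0][0]


# Hypothetical layout: placeholder -> names suggested in runs 1, 2, 3.
runs = {
    "v1": ["total", "sum", "total"],
    "v2": ["i", "index", "idx"],
}
chosen = {placeholder: modal_name(names) for placeholder, names in runs.items()}
print(chosen)
# {'v1': 'total', 'v2': 'i'}
```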

## Headline Results

| Metric | Value |
|---|---|
| Exact Match Rate | 23.9% (27/113) |
| Semantic Similarity (NEDS ≥ 0.6 or shared head noun) | 70.8% (80/113) |
| Mean NEDS | 0.51 |
| Convention Compliance | 96.2% |
| Survey Preference (LLM vs. Human vs. None) | 47.5% / 38.0% / 14.5% |

See the project report (CCSW325_Final_Report.pdf) for the full discussion.
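
As a rough illustration of how the automatic metrics in this section might be computed, the sketch below makes two assumptions that are not confirmed by this README: NEDS is taken to be 1 minus the Levenshtein distance divided by the length of the longer name, and the "head noun" is approximated as the last camelCase word of an identifier. The function names are hypothetical and this is not the code in `scripts/evaluate.py`.

```python
import re


def levenshtein(a: str, b: str) -> int:
    """Dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def neds(a: str, b: str) -> float:
    """Assumed definition: 1 - distance / length of the longer name."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def head_noun(name: str) -> str:
    """Assumed heuristic: the last camelCase (or snake_case) word, lowercased."""
    words = re.findall(r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])", name)
    return words[-1].lower() if words else name.lower()


def semantically_similar(original: str, suggested: str, threshold: float = 0.6) -> bool:
    """Criterion from the results table: NEDS >= 0.6 or a shared head noun."""
    return (neds(original, suggested) >= threshold
            or head_noun(original) == head_noun(suggested))


# Hypothetical (original, suggested) pairs, not taken from the dataset.
pairs = [("total", "total"), ("studentCount", "count"), ("total", "index")]
exact_rate = sum(o == s for o, s in pairs) / len(pairs)
similar_rate = sum(semantically_similar(o, s) for o, s in pairs) / len(pairs)
print(f"exact match: {exact_rate:.1%}, semantically similar: {similar_rate:.1%}")
```

In this example `studentCount` and `count` have a low NEDS (8 edits over 12 characters) but still count as semantically similar because they share the head noun `count`, which is why the two criteria are combined with "or".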

## License

This repository uses a dual license:

- Code (scripts/): MIT License
- Data, results, and report (snippets/, llm_outputs/, results/, survey/, report/): CC BY 4.0

The Java snippets in snippets/original/ are short, standard implementations of common introductory programming exercises, used here for research purposes.
