feat(eval): cweval by zhewang2001 · Pull Request #9 · purpcode-uiuc/purpcode

zhewang2001 · 2025-08-07T21:44:27Z

No description provided.

gemini-code-assist

Summary of Changes

Hello @zhewang2001, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

I've implemented a new feature to integrate CWEval into our evaluation framework. This allows us to generate code responses for CWEval tasks using our models and then prepare these generated files in the format expected by the external CWEval repository for subsequent evaluation. The README.md has also been updated with clear instructions on how to set up and run these evaluations.

Highlights

CWEval Integration: I've added a dedicated evaluate_cweval function to process model-generated responses for CWEval tasks.
Code Extraction & Formatting: This new function extracts code blocks from assistant messages in our generation outputs and saves them into the directory structure required by the CWEval repository.
Configurable Paths: I've introduced a new --cweval_path argument across the evaluation scripts (main.py, eval_only.py, evaluate.py) to allow users to specify the custom location of their CWEval repository.
Documentation: The README.md has been updated with a detailed section providing step-by-step instructions for setting up and running CWEval evaluations, including how to clone the repository and execute the evaluation script.
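
As a rough illustration of the code-extraction step described in the highlights, here is a minimal sketch, not the code from this PR. It assumes each generation is a list of chat messages with `role` and `content` keys and that the code sits in the first fenced block of the last assistant message. The names `extract_code` and `save_response` are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Build the fence marker programmatically so this example can itself live in a fence.
TICKS = "`" * 3
# Matches an opening fence with an optional language tag, then captures the body.
FENCE = re.compile(TICKS + r"[^\n]*\n(.*?)" + TICKS, re.DOTALL)


def extract_code(message: str) -> Optional[str]:
    """Return the body of the first fenced code block, or None if there is none."""
    match = FENCE.search(message)
    return match.group(1) if match else None


def save_response(messages, out_dir, filename) -> Optional[Path]:
    """Write the code from the last assistant message to out_dir/filename.

    The message schema and the output layout are assumptions for illustration;
    the layout CWEval actually expects is defined by the CWEval repository.
    """
    for msg in reversed(messages):
        if msg.get("role") == "assistant":
            code = extract_code(msg.get("content", ""))
            if code is None:
                return None
            path = Path(out_dir) / filename
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(code, encoding="utf-8")
            return path
    return None
```

Returning `None` when no fenced block is found lets the caller decide whether to skip the task or record an empty submission.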

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double-check it and use code with caution.

Copilot

Pull Request Overview

This PR adds support for CWEval evaluation to the codebase by implementing the cweval_path parameter throughout the evaluation pipeline and providing a complete implementation of the CWEval evaluator.

Adds cweval_path parameter to main evaluation functions for custom CWEval repository path specification
Implements evaluate_cweval function to process generation files and extract code blocks for CWEval evaluation
Updates documentation with CWEval setup and usage instructions
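
To make the parameter threading concrete, the sketch below shows one way a `--cweval_path` option could flow from a command-line entry point into the evaluator. It is an assumption-laden illustration: the PR does not show which CLI library the scripts use, the real signature of `evaluate_cweval` is not given here, and `--generation_path` and the relative `CWEval` default are hypothetical.

```python
import argparse
from pathlib import Path
from typing import Optional


def evaluate_cweval(generation_path: str, cweval_path: Optional[str] = None) -> Path:
    """Stand-in evaluator that only resolves which CWEval checkout to use."""
    # Hypothetical default: a "CWEval" directory relative to the working directory.
    return Path(cweval_path) if cweval_path else Path("CWEval")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run a CWEval evaluation (sketch).")
    parser.add_argument("--generation_path", required=True,
                        help="File containing model generations (hypothetical flag).")
    parser.add_argument("--cweval_path", default=None,
                        help="Custom location of the CWEval repository.")
    return parser


def main(argv=None) -> Path:
    args = build_parser().parse_args(argv)
    # The entry point forwards the option unchanged to the evaluator.
    return evaluate_cweval(args.generation_path, cweval_path=args.cweval_path)
```

Keeping the default as `None` at the CLI layer and resolving it inside the evaluator means every entry point shares a single fallback.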

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| eval/main.py | Adds `cweval_path` parameter to main function signature |
| eval/evaluate.py | Updates `evaluate_main` to accept and pass `cweval_path` parameter |
| eval/eval_only.py | Adds `cweval_path` parameter for evaluation-only workflow |
| eval/cweval.py | Implements complete CWEval evaluation logic with code extraction and file generation |
| README.md | Adds comprehensive CWEval setup and usage documentation |

gemini-code-assist

Code Review

This pull request introduces evaluation support for CWEval. The changes include a new script eval/cweval.py to handle the evaluation logic, updates to the main evaluation entry points to include the new functionality, and documentation in the README on how to use it.

My review focuses on improving the robustness and efficiency of the new cweval.py script. Key suggestions include:

Making path manipulations more portable.
Ensuring the script is robust against exceptions when changing directories.
Improving memory efficiency when processing large files.
A minor fix for the README markdown formatting.
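
The first two suggestions above (portable paths and exception-safe directory changes) can be sketched as follows. This is a generic pattern under stated assumptions, not the change that was applied in this PR; `working_directory` and `output_file` are hypothetical helper names.

```python
import os
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(path):
    """Temporarily change the working directory, restoring it even on error."""
    previous = Path.cwd()
    os.chdir(path)
    try:
        yield Path(path)
    finally:
        # Runs whether the body finished normally or raised.
        os.chdir(previous)


def output_file(cweval_root, *parts) -> Path:
    """Join path components with pathlib instead of hard-coded separators."""
    return Path(cweval_root).joinpath(*parts)
```

On Python 3.11 and later, `contextlib.chdir` in the standard library provides the same restore-on-exit behaviour.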

Overall, the changes are a good addition, and with these improvements, the new evaluation script will be more reliable.
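
For the memory-efficiency suggestion, one common approach is to stream records instead of loading the whole file at once. The sketch below assumes the generation file is in JSON Lines format (one JSON object per line), which this PR page does not confirm; `iter_jsonl` is a hypothetical helper.

```python
import json
from typing import Iterator


def iter_jsonl(path) -> Iterator[dict]:
    """Yield one parsed record at a time so only a single line is held in memory."""
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

A caller can then write `for record in iter_jsonl(generation_path): ...` and process each generation as it is read.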

zhewang2001 added 3 commits August 7, 2025 21:24

docs: CWEval evaluation setup guide

a0dc0c8

feat(eval): cweval

4a6a914

refactor: simplify code block extraction logic

904cbd7

Copilot AI review requested due to automatic review settings August 7, 2025 21:44

gemini-code-assist bot reviewed Aug 7, 2025

Copilot AI reviewed Aug 7, 2025

eval/cweval.py (resolved)

eval/cweval.py (resolved)

eval/cweval.py (resolved)

eval/cweval.py (resolved)

gemini-code-assist bot reviewed Aug 7, 2025

eval/cweval.py (resolved)

README.md (outdated, resolved)

eval/cweval.py (resolved)

eval/cweval.py (resolved)

fix: gemini comments

9a47fd0

zhewang2001 requested a review from ganler August 7, 2025 21:50

ganler approved these changes Aug 7, 2025

ganler merged commit 2fc2447 into main Aug 7, 2025
2 checks passed

ganler deleted the eval/cweval branch August 7, 2025 22:06

ganler merged 4 commits into main from eval/cweval


3 participants
