-
Notifications
You must be signed in to change notification settings - Fork 47
feat: add commit splitting feature for cleaner git history #24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
- Add --split option to analyze and split commits into smaller, focused commits - Add --max-splits option to limit maximum splits per commit (default: 5) - Use AI to analyze commit diffs and suggest logical splits based on: - Different files serving different purposes - Different types of changes (feat, fix, refactor, etc.) - Unrelated functionality grouped together - Generate appropriate commit messages for each split - Support dry-run mode to preview splits without applying - Create backup branch before splitting for safety - Update README with documentation and examples Co-Authored-By: Fatih Kadir Akın <fatihkadirakin@gmail.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds a commit splitting feature that uses AI to analyze commits and split them into smaller, more focused commits for cleaner git history. The feature integrates with existing OpenAI/Ollama providers and follows the tool's pattern of AI-powered git operations with safety measures like dry-run mode and backup branches.
Key changes:
- Added
--splitmode to analyze and split commits using AI-based analysis - Introduced two new interfaces (
SplitSuggestionandCommitSplitAnalysis) to structure split proposals - Implemented commit splitting logic that handles both HEAD and non-HEAD commits with history rewriting
Critical Issues Found:
The implementation has several critical bugs that will prevent it from working correctly:
- Stale commit hashes: After splitting the first commit, all subsequent commit hashes become invalid, causing splits to fail
- Invalid git rev-list reference: Using a commit hash that was reset away causes git commands to fail
- Shell operator issues: Redirection operators won't work with execSync
- Insufficient input validation: AI responses are not validated, and shell commands don't properly escape special characters
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| src/index.ts | Adds core splitting functionality including commit analysis, diff extraction, and history rewriting logic; introduces new interfaces for split suggestions and analysis results |
| src/cli.ts | Adds CLI options --split and --max-splits to enable the commit splitting feature from command line |
| README.md | Documents the new commit splitting feature with usage examples and explanation of when commits should be split |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Stage and commit remaining changes | ||
| this.execCommand('git add -A'); | ||
| const remainingMessage = `chore: remaining changes from split of "${commitInfo.message}"`; | ||
| this.execCommand(`git commit -m "${remainingMessage}"`); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same shell escaping issue as line 784-785. The remainingMessage variable includes commitInfo.message which could contain shell metacharacters. This needs proper escaping or use of git commit -F <file>.
| this.execCommand(`git commit -m "${remainingMessage}"`); | |
| // Write commit message to a temp file and use git commit -F <file> | |
| const os = await import('os'); | |
| const tmpDir = os.tmpdir(); | |
| const tmpFile = path.join(tmpDir, `split_commit_msg_${Date.now()}_${Math.random().toString(36).slice(2)}.txt`); | |
| fs.writeFileSync(tmpFile, remainingMessage, { encoding: 'utf8' }); | |
| try { | |
| this.execCommand(`git commit -F "${tmpFile}"`); | |
| } finally { | |
| fs.unlinkSync(tmpFile); | |
| } |
| // Stage only the files for this split | ||
| for (const file of suggestion.files) { | ||
| try { | ||
| this.execCommand(`git add "${file}"`); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File paths with spaces, quotes, or other special characters may not be properly handled. While the double quotes provide some protection, consider using proper shell escaping or a more robust method. Also, this command uses the filename from AI output which could potentially be manipulated.
| // Get the commits that were after the split commit | ||
| const commitsAfter = this.execCommand(`git rev-list ${commitInfo.hash}..${currentBranch}`).trim().split('\n').filter(Boolean); | ||
|
|
||
| if (commitsAfter.length > 0) { |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The git rev-list ${commitInfo.hash}..${currentBranch} command will fail because commitInfo.hash no longer exists after the reset on line 761. This commit was reset away, so git can't find it in the history. You need to store the parent commit hash before the reset (e.g., git rev-parse ${commitInfo.hash}^) and use that as the starting point for the rev-list.
| // Get the commits that were after the split commit | |
| const commitsAfter = this.execCommand(`git rev-list ${commitInfo.hash}..${currentBranch}`).trim().split('\n').filter(Boolean); | |
| if (commitsAfter.length > 0) { | |
| // Store the parent commit hash before the reset, so we can use it for rev-list | |
| const parentHash = this.execCommand(`git rev-parse ${commitInfo.hash}^`).trim(); | |
| // Get the commits that were after the split commit | |
| const commitsAfter = this.execCommand(`git rev-list ${parentHash}..${currentBranch}`).trim().split('\n').filter(Boolean); |
| for (const file of files) { | ||
| if (section.includes(`a/${file}`) || section.includes(`b/${file}`)) { |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file matching logic is fragile. It checks if the section contains a/${file} or b/${file}, but this could lead to false positives. For example, a file named "test.ts" would match a diff section for "my-test.ts" or even content that contains "a/test.ts" in a diff hunk. Use more precise matching by checking the diff header line specifically: diff --git a/${file} b/${file}.
| for (const file of files) { | |
| if (section.includes(`a/${file}`) || section.includes(`b/${file}`)) { | |
| // Get the first line of the section (the diff header) | |
| const headerLine = section.split('\n', 1)[0].trim(); | |
| for (const file of files) { | |
| const expectedHeader = `diff --git a/${file} b/${file}`; | |
| if (headerLine === expectedHeader) { |
| const suggestions: SplitSuggestion[] = (analysis.suggestions || []).map((s: { files: string[]; description: string; commitMessage: string }) => { | ||
| const diffContent = this.extractDiffForFiles(commitInfo.diff, s.files); | ||
| return { | ||
| files: s.files, | ||
| description: s.description, | ||
| commitMessage: s.commitMessage, | ||
| diffContent | ||
| }; | ||
| }); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code doesn't enforce the maxSplits limit on the AI response. The AI is told via the prompt to limit splits to maxSplits, but there's no validation to ensure it actually does. If the AI returns more suggestions than maxSplits, they will all be processed. Add validation: suggestions.slice(0, maxSplits) or check and log a warning if the limit is exceeded.
| this.execCommand(`git checkout ${commitInfo.hash}`); | ||
| this.execCommand(`git checkout -b ${tempBranch}`); | ||
| this.execCommand('git reset --soft HEAD~1'); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When checking out to a specific commit hash, git enters detached HEAD state. The subsequent git checkout -b ${tempBranch} creates a branch from the commit, but then git reset --soft HEAD~1 unstages the commit. However, for non-HEAD commits, this logic is problematic because commitInfo.hash in the rev-list command on line 808 will no longer be valid after the reset. The rev-list will fail because that commit hash doesn't exist in the new branch history anymore.
| const escapedMessage = suggestion.commitMessage.replace(/"/g, '\\"'); | ||
| this.execCommand(`git commit -m "${escapedMessage}"`); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Escaping double quotes alone is insufficient for shell injection protection. Commit messages could contain other problematic characters like backticks, $(), newlines, or backslashes that could break the command or enable injection. Consider using a safer approach like writing the message to a temporary file and using git commit -F <file>, or use a library that properly escapes shell arguments.
| const prompt = `You are a git commit analyzer. Analyze the following commit and determine if it should be split into multiple smaller, more focused commits. | ||
| Original commit message: "${commitInfo.message}" |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The commit message is directly interpolated into the prompt without escaping. If the commit message contains double quotes, it could break out of the string and potentially manipulate the AI prompt. Consider escaping double quotes in commitInfo.message before including it in the prompt.
| const prompt = `You are a git commit analyzer. Analyze the following commit and determine if it should be split into multiple smaller, more focused commits. | |
| Original commit message: "${commitInfo.message}" | |
| const escapedMessage = commitInfo.message.replace(/"/g, '\\"'); | |
| const prompt = `You are a git commit analyzer. Analyze the following commit and determine if it should be split into multiple smaller, more focused commits. | |
| Original commit message: "${escapedMessage}" |
| const suggestions: SplitSuggestion[] = (analysis.suggestions || []).map((s: { files: string[]; description: string; commitMessage: string }) => { | ||
| const diffContent = this.extractDiffForFiles(commitInfo.diff, s.files); | ||
| return { | ||
| files: s.files, | ||
| description: s.description, | ||
| commitMessage: s.commitMessage, | ||
| diffContent | ||
| }; | ||
| }); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The AI response is parsed without validation. The code doesn't verify that s.files is actually an array, or that s.description and s.commitMessage are strings. If the AI returns malformed data (e.g., files as a string instead of an array), this will cause runtime errors. Add validation to ensure the response structure matches expectations before using it.
| const suggestions: SplitSuggestion[] = (analysis.suggestions || []).map((s: { files: string[]; description: string; commitMessage: string }) => { | |
| const diffContent = this.extractDiffForFiles(commitInfo.diff, s.files); | |
| return { | |
| files: s.files, | |
| description: s.description, | |
| commitMessage: s.commitMessage, | |
| diffContent | |
| }; | |
| }); | |
| // Validate each suggestion before mapping | |
| function isValidSuggestion(s: any): s is { files: string[]; description: string; commitMessage: string } { | |
| return ( | |
| s && | |
| Array.isArray(s.files) && | |
| s.files.every((f: any) => typeof f === 'string') && | |
| typeof s.description === 'string' && | |
| typeof s.commitMessage === 'string' | |
| ); | |
| } | |
| const suggestions: SplitSuggestion[] = (Array.isArray(analysis.suggestions) ? analysis.suggestions : []) | |
| .filter(isValidSuggestion) | |
| .map((s) => { | |
| const diffContent = this.extractDiffForFiles(commitInfo.diff, s.files); | |
| return { | |
| files: s.files, | |
| description: s.description, | |
| commitMessage: s.commitMessage, | |
| diffContent | |
| }; | |
| }); |
| // Try to recover | ||
| try { | ||
| this.execCommand(`git checkout ${currentBranch}`); | ||
| this.execCommand(`git branch -D ${tempBranch} 2>/dev/null || true`); |
Copilot
AI
Nov 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shell redirection operators (2>/dev/null || true) will not work with execCommand as it uses execSync which doesn't process shell operators. This command will fail. Use Node.js error handling instead, or wrap the command in a shell: sh -c "git branch -D ${tempBranch} 2>/dev/null || true". However, since the catch block already ignores errors, you can simply remove the shell operators and just call git branch -D ${tempBranch}.
| this.execCommand(`git branch -D ${tempBranch} 2>/dev/null || true`); | |
| this.execCommand(`git branch -D ${tempBranch}`); |
feat: add commit splitting feature for cleaner git history
Summary
This PR adds a new
--splitoption that allows users to split large commits into smaller, more focused commits for cleaner git history and easier rollbacks.The feature uses AI to analyze commit diffs and suggest logical splits based on:
Key additions:
--splitflag to enable commit splitting mode--max-splitsoption to limit splits per commit (default: 5)Review & Testing Checklist for Human
Test the
performCommitSplitmethod with real commits - This method performs destructive git operations (git reset --soft,git checkout,git cherry-pick). Verify the error recovery works if something fails mid-split.Test non-HEAD commit splitting - The cherry-pick logic for commits after the split target could fail with merge conflicts. Verify the warning message is appropriate and the repo isn't left in a broken state.
Verify AI JSON parsing - The code expects specific JSON structure from the AI. Test with both OpenAI and Ollama to ensure responses parse correctly. Edge cases: malformed JSON, missing fields, markdown-wrapped responses.
Test with special characters in filenames - The
git add "${file}"command may have issues with unusual filenames.Recommended test plan:
npx git-rewrite-commits --split --dry-runto previewnpx git-rewrite-commits --spliton a feature branchgit log --onelineto confirm splits look correctNotes
Link to Devin run: https://app.devin.ai/sessions/cdf71c5996d741918aff6b5f70f2bf62
Requested by: Fatih Kadir Akın (fatihkadirakin@gmail.com) / @f