review diff_parser
Runs git diff and parses raw unified diff text into structured DiffFile objects using the unidiff library.
| Term | Definition | Example |
|---|---|---|
| diff | The set of changes between two versions of code, showing added (+) and removed (-) lines. | A `- old_line` line followed by a `+ new_line` line shows that old_line was replaced with new_line. |
| hunk | A contiguous block of changes within a diff. One diff can contain multiple hunks (changes in different parts of a file). | A diff might have hunk 1 (lines 10-15 changed) and hunk 2 (lines 80-85 changed). |
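To make the two terms concrete, here is a small sketch that builds a unified diff with Python's standard `difflib` module rather than git. The file names and line contents are made up for illustration. Two edits that sit far apart in the same file end up in two separate hunks.

```python
import difflib

# Hypothetical 100-line file; contents are illustrative only.
old = [f"line {i}\n" for i in range(1, 101)]
new = list(old)
new[11] = "line 12 (edited)\n"  # first change, near the top
new[84] = "line 85 (edited)\n"  # second change, far enough away for its own hunk

# n=5 asks for five context lines around each change.
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="a/demo.txt", tofile="b/demo.txt", n=5)
)

# Each hunk starts with an "@@ ... @@" header line.
hunk_headers = [line for line in diff_lines if line.startswith("@@")]
added = [line for line in diff_lines if line.startswith("+") and not line.startswith("+++")]
removed = [line for line in diff_lines if line.startswith("-") and not line.startswith("---")]

print("".join(diff_lines))
print(f"{len(hunk_headers)} hunks, {len(added)} added, {len(removed)} removed")
```

If the two edits were within a few lines of each other, their context would overlap and they would merge into a single hunk.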
Runs git diff as a subprocess and returns the raw unified diff text as a string.
| Param | Type | Default | Purpose |
|---|---|---|---|
| staged | bool | False | If True, diffs only staged changes |
| target_branch | str \| None | None | If set, diffs against that branch |
| commit | str \| None | None | If set, shows diff for that specific commit (SHA or ref) |
| repo_path | str \| None | None | Working directory for the git command |
Priority: commit > target_branch > staged > unstaged (default).
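The priority rule can be sketched as a single if/elif chain. This is a reconstruction from the walkthroughs in this section, not the module's actual source: the function name `build_diff_command` is hypothetical, and the truthiness checks are an assumption.

```python
from typing import List, Optional


def build_diff_command(
    staged: bool = False,
    target_branch: Optional[str] = None,
    commit: Optional[str] = None,
) -> List[str]:
    """Hypothetical helper: assemble the git diff argument list."""
    cmd = ["git", "diff", "--unified=5"]
    if commit:
        # Highest priority: compare the commit's parent with the commit itself.
        cmd.extend([f"{commit}~1", commit])
    elif target_branch:
        # Next: diff from the merge base with the branch up to HEAD.
        cmd.append(f"{target_branch}...HEAD")
    elif staged:
        # Lowest explicit option: only staged changes.
        cmd.append("--staged")
    # With no options set, plain "git diff" shows unstaged changes.
    return cmd


print(build_diff_command(staged=True, target_branch="main"))
```

Because the branches are mutually exclusive, at most one extra argument group is ever appended, which is why staged=True has no effect once target_branch or commit is set.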
Input: staged=False, target_branch=None, commit=None, repo_path="/home/user/myapp"
Line 9: cmd = ["git", "diff", "--unified=5"] → ["git", "diff", "--unified=5"]
Line 10–11: commit is None → skip
Line 12–13: target_branch is None → skip branch arg
Line 14–15: staged is False → skip --staged
Line 17–22: subprocess.run(["git", "diff", "--unified=5"], cwd="/home/user/myapp") runs in that directory
Line 24: returns result.stdout → raw diff string like "diff --git a/foo.py b/foo.py\n..."
Input: staged=True, target_branch="main", commit=None, repo_path=None
Line 9: cmd = ["git", "diff", "--unified=5"]
Line 12–13: target_branch is "main" → cmd.append("main...HEAD") → ["git", "diff", "--unified=5", "main...HEAD"]
Note: target_branch takes priority over staged since it comes first in the elif chain.
Line 17–22: runs in current directory (cwd=None)
Line 24: returns stdout
Input: commit="abc1234", staged=True, target_branch="main", repo_path="/project"
Line 10–11: commit is "abc1234" → cmd = ["git", "diff", "--unified=5", "abc1234~1", "abc1234"]
Note: commit takes highest priority — staged and target_branch are ignored.
Line 17–22: subprocess.run(cmd, cwd="/project")
Line 24: returns the diff showing what that commit changed (parent → commit)
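The subprocess step shared by all three walkthroughs might look like the sketch below. The helper name `run_and_capture` and the `capture_output`, `text`, and `check` arguments are assumptions; the source only confirms that the command runs with `cwd` set to repo_path and that `result.stdout` is returned. The demo uses the Python interpreter as a stand-in for git so that it runs anywhere.

```python
import subprocess
import sys
from typing import List, Optional


def run_and_capture(cmd: List[str], cwd: Optional[str] = None) -> str:
    """Hypothetical helper: run a command and return its stdout as text."""
    result = subprocess.run(
        cmd,
        cwd=cwd,              # None means "use the current working directory"
        capture_output=True,  # assumption: collect stdout and stderr
        text=True,            # assumption: decode bytes to str
        check=True,           # assumption: raise CalledProcessError on failure
    )
    return result.stdout


# Stand-in command that prints a fake diff header instead of calling git.
captured = run_and_capture(
    [sys.executable, "-c", "print('diff --git a/foo.py b/foo.py')"]
)
print(captured)
```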
Parses a raw unified diff string into a list of DiffFile objects using the unidiff.PatchSet parser.
| Param | Type | Purpose |
|---|---|---|
| diff_text | str | Raw unified diff text from git diff |
Input:
diff_text = """diff --git a/math.py b/math.py
--- a/math.py
+++ b/math.py
@@ -10,3 +10,3 @@
 def add(a, b):
-    return a - b
+    return a + b
"""
Walkthrough:
- Line 28–29: diff_text.strip() is non-empty → proceed
- Line 31: patch = PatchSet(diff_text) → parses into a PatchSet with 1 PatchedFile
- Line 32: diff_files = []
- Loop over files: patched_file = first (and only) PatchedFile, for math.py
  - Line 35: hunks = []
  - Loop over hunks: hunk = first hunk (the @@ block)
    - Line 37: lines = []
    - Loop over lines: line = first line (def add(a, b):)
      - Line 38–43: line.is_added → False, line.is_removed → False
      - Line 44–45: change_type = "context", line_no = 10 (target_line_no)
      - Line 47–50: appends ChangedLine(line_number=10, content="def add(a, b):\n", change_type="context")
    - Loop over lines: line = second line (-    return a - b)
      - Line 41–43: line.is_removed → True → change_type = "removed", line_no = 11 (source_line_no)
      - Line 47–50: appends ChangedLine(line_number=11, content="    return a - b\n", change_type="removed")
    - Loop over lines: line = third line (+    return a + b)
      - Line 38–40: line.is_added → True → change_type = "added", line_no = 11 (target_line_no)
      - Line 47–50: appends ChangedLine(line_number=11, content="    return a + b\n", change_type="added")
    - Line 52–55: appends DiffHunk(start_line=10, end_line=13, lines=[...3 ChangedLines...])
  - Line 57–65: appends DiffFile(...):
DiffFile(
    file_path="math.py",
    language=detect_language(Path("math.py")),  # → "python"
    hunks=[DiffHunk(start_line=10, end_line=13, lines=[...])],
    is_new_file=False,
    is_deleted=False,
    added_lines=1,
    removed_lines=1,
)
Return: [DiffFile(file_path="math.py", language="python", added_lines=1, removed_lines=1, ...)]
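The line-number bookkeeping in the walkthrough can be illustrated without unidiff. The sketch below is a hand-rolled stand-in, not the module's implementation: it handles only a single-file diff, the function name `classify_hunk_lines` is hypothetical, and the `ChangedLine` fields are taken from the walkthrough above. Added and context lines are numbered from the target (new) file, while removed lines are numbered from the source (old) file.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChangedLine:
    line_number: int
    content: str
    change_type: str  # "added", "removed", or "context"


# Captures the source and target start lines from "@@ -10,3 +10,3 @@".
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def classify_hunk_lines(diff_text: str) -> List[ChangedLine]:
    """Hypothetical stand-in for the unidiff loop; single-file diffs only."""
    lines: List[ChangedLine] = []
    source_no: Optional[int] = None
    target_no: Optional[int] = None
    for raw in diff_text.splitlines(keepends=True):
        header = HUNK_HEADER.match(raw)
        if header:
            source_no, target_no = int(header.group(1)), int(header.group(2))
            continue
        if source_no is None or target_no is None:
            continue  # still in the file header (diff --git, ---, +++)
        marker, content = raw[:1], raw[1:]
        if marker == "+":
            lines.append(ChangedLine(target_no, content, "added"))
            target_no += 1
        elif marker == "-":
            lines.append(ChangedLine(source_no, content, "removed"))
            source_no += 1
        elif marker == " ":
            lines.append(ChangedLine(target_no, content, "context"))
            source_no += 1
            target_no += 1
    return lines


diff_text = (
    "diff --git a/math.py b/math.py\n"
    "--- a/math.py\n"
    "+++ b/math.py\n"
    "@@ -10,3 +10,3 @@\n"
    " def add(a, b):\n"
    "-    return a - b\n"
    "+    return a + b\n"
)
result = classify_hunk_lines(diff_text)
for changed in result:
    print(changed.line_number, changed.change_type, repr(changed.content))
```

The removed and added lines both report line 11 because each is counted in its own file: line 11 of the old math.py was replaced by line 11 of the new one.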