Improve base-branch detection for branches with ambiguous fork points #5635

@stefanhaller


Background: what's a base branch, and where does lazygit use one?

Lazygit lets users configure a list of "main branches" via git.mainBranches (default: [master, main]). For any other branch, lazygit can derive a base branch — the configured main branch the given branch is most closely related to — and uses it in four places:

  1. Rebase onto base branch — rebases the checked-out branch onto its detected base. Reached via the rebase menu (r from the branches view, then b).
  2. View divergence from base branch — opens a left/right view of the commits that are on the branch but not the base, and vice versa. Reached via the upstream menu (u then b).
  3. Move commits to new branch — moves the unpushed commits on the current branch onto a new branch stacked off the base. Reached via N from the branches view.
  4. The behind-base arrow in the branches list — when gui.showDivergenceFromBaseBranch is onlyArrow or arrowAndNumber, lazygit renders a ↓ (or ↓N) per branch to show how far it has fallen behind its base.

The first three are user-initiated actions; the fourth is passive display rendered on every branches refresh.

Scope

For the rest of this writeup I'll use "rebase onto base branch" as a stand-in for all three action commands. They share the same base-detection logic, and rebase is both the most frequently used and the one with the biggest impact if it gets the wrong answer (a rebase onto the wrong base rewrites history). So the two surfaces under discussion are:

  1. Rebase onto base branch (representative of all three actions).
  2. The behind-base arrow in the branches list.

A general requirement that ties them together: the arrow column and the commands must always agree on what a branch's base is. A user looking at "↓5 behind develop" in the column and then running rebase-onto-base needs the rebase to target the same develop. Disagreement between the two surfaces, even if one of them happens to give the right answer, is a bug.

The scenario

A common workflow at projects with parallel release branches:

  • main is the current release line.
  • develop is the next-release line.
  • main is merged into develop periodically (say, once per sprint).
  • Feature branches are forked from main (for current-release fixes) or from develop (for next-release work).
  • git.mainBranches is configured as [main, develop].

Visualised:

   M1 ── M2 ── M3 ── M4 ── M5            (main)
          \           \
           D1 ── D2 ── D3 ── D4          (develop)
                       ↑
                   (D3 = merge of M4 into develop)

D1 is develop's first commit, branched off main at M2. D3 is the merge commit that brings main's state at M4 into develop. So at this snapshot, develop contains:

  • everything in main up to and including M4 (via the merge at D3),
  • develop's own commits D1, D2, D3, D4.

It does not contain M5, which arrived on main after the last merge.
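
To make the containment claims concrete, here is a minimal Python sketch that models the diagram as a parent map and walks full ancestry. The names `PARENTS`, `TIPS`, and `ancestors` are illustrative assumptions for this writeup, not lazygit code.

```python
# Parent map for the diagram above; the merge commit D3 lists two parents.
PARENTS = {
    "M1": [], "M2": ["M1"], "M3": ["M2"], "M4": ["M3"], "M5": ["M4"],
    "D1": ["M2"], "D2": ["D1"], "D3": ["D2", "M4"], "D4": ["D3"],
}
TIPS = {"main": "M5", "develop": "D4"}


def ancestors(commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


in_develop = ancestors(TIPS["develop"])
print(sorted(in_develop))
print("M4 in develop:", "M4" in in_develop)
print("M5 in develop:", "M5" in in_develop)
```

In this model develop reaches M1 through M4 plus its own D1 through D4, and M5 is the only commit on main that it lacks.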

All the interesting fork points

 F0          F1    F2    F3
  \           \     \     \
   M1 ── M2 ── M3 ── M4 ── M5            (main)
          \           \
           D1 ── D2 ── D3 ── D4          (develop)
            \           \     \
             F4          F5    F6

F0-F6 denote feature branches in interesting positions. Walking through each one:

  • F0 (forked from main at M1, before develop was even branched off main): M1 is reachable from main directly and from develop via develop's first-parent chain extending back through D1 → M2 → M1. Both branches contain M1. Git cannot tell whether the fork came from main or from a state develop reached as it inherited the early shared history. Ambiguous.
  • F1 (forked from main at M3): M3 is reachable from main directly and from develop via the merge at D3 (which pulled in M4 and all its ancestors, including M3). Ambiguous.
  • F2 (forked from main at M4): same — M4 is in both because develop's last merge ended at M4. Ambiguous.
  • F3 (forked from main at M5): M5 is on main's tip but not yet in develop (develop's last merge only brought in up to M4). Only main qualifies. Unambiguous; main wins.
  • F4 (forked from develop at D1): D1 is on develop's unique history — it's not on main. Only develop qualifies. Unambiguous; develop wins.
  • F5 (forked from the merge commit D3 on develop): D3 itself is a develop commit (the merge commit lives on develop's first-parent chain, not on main's). Only develop qualifies. Unambiguous; develop wins.
  • F6 (forked from develop at D4): D4 is develop-only. Only develop qualifies. Unambiguous; develop wins.

The three ambiguous cases are F0, F1, and F2 (forked off main at a commit that later got pulled into develop, either via the original branch-off or via a subsequent merge). For these, the commit graph alone doesn't have the information needed to decide whether the user forked from main or from a state of develop that happened to inherit that same commit.

Aside — does it matter whether the feature branch has its own commits past the fork point? No. The fork point's reachability from each candidate (which is what determines ambiguity) only depends on the fork point, not on what's been added on top. The branch's own commits change the magnitudes of ahead/behind values but not which candidates qualify.
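
The walkthrough can be reproduced mechanically. The sketch below is a toy model, not lazygit's implementation: it lists, for each fork point, which configured mains contain it, and flags the fork points with more than one match. `FORK_POINTS` and `containing_mains` are hypothetical names.

```python
PARENTS = {
    "M1": [], "M2": ["M1"], "M3": ["M2"], "M4": ["M3"], "M5": ["M4"],
    "D1": ["M2"], "D2": ["D1"], "D3": ["D2", "M4"], "D4": ["D3"],
}
TIPS = {"main": "M5", "develop": "D4"}
# Commit each feature branch was forked from, as in the diagram.
FORK_POINTS = {
    "F0": "M1", "F1": "M3", "F2": "M4", "F3": "M5",
    "F4": "D1", "F5": "D3", "F6": "D4",
}


def ancestors(commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def containing_mains(fork_point):
    """Configured mains whose full ancestry contains the fork point."""
    return [name for name, tip in TIPS.items() if fork_point in ancestors(tip)]


classification = {b: containing_mains(fp) for b, fp in FORK_POINTS.items()}
ambiguous = [b for b, mains in classification.items() if len(mains) > 1]

for branch, mains in classification.items():
    print(branch, mains)
print("ambiguous:", ambiguous)
```

Only the fork point enters the computation, which mirrors the aside: commits added on top of the fork point do not change which candidates qualify.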

How ambiguity bites in practice

A typical timeline that produces a nasty surprise:

  1. Sprint 1. User forks fix-bug off main at M5. New commits arrive both on main and on develop, and they rebase the branch onto the latest main a few times during the sprint, which works as expected every time; the branch's base is unambiguously main.
  2. End of sprint 1. main gets merged into develop again.
  3. Sprint 2. The user does "rebase onto base branch" on fix-bug. The fork point with main is still the same, but now it is reachable from develop too. The base is suddenly ambiguous, and lazygit silently picks one candidate. If it picks develop, the branch is silently rebased onto the wrong release line.

The behind-base arrow has the same problem. Lazygit picks one candidate silently and shows the behind count against that one — which may be 0 ("up to date") when the other candidate would have shown a non-zero count.

What's a bug, and what's inherent ambiguity?

Two threads to keep separate.

Thread A — implementation defect (candidate set is too loose, and surfaces disagree)

Lazygit currently has two slightly different rules for determining a branch's base, depending on which surface is asking.

The commands (rebase-onto-base, view-divergence, move-commits-to-new-branch) all share a code path that uses
git for-each-ref --contains <merge-base> main develop: every configured main branch that contains the merge-base is a candidate, and the first ref returned is the answer. for-each-ref sorts its output alphabetically by refname (regardless of the order they were passed), so the "first" candidate is alphabetical — for [main, develop] that's develop.

The behind-base arrow behaves differently depending on git version:

  • Git ≥ 2.41 ("fast path"): the column uses %(ahead-behind:<base>) to compute ahead/behind against each configured main branch separately and picks the candidate with the smallest ahead (with config-order tiebreak on ties). This is a better rule — it correctly identifies the closest base when one candidate is clearly closer than the others — but it's not the same rule the commands use.
  • Git < 2.41 ("legacy path"): the column shares the commands' code path and inherits the same alphabetical-first behaviour.

So under git ≥ 2.41 a single branch can show "↓5 behind main" in the arrow column while "rebase onto base branch" targets develop — the two surfaces literally disagree about which configured main is the branch's base. Under git < 2.41 the surfaces at least agree on the same wrong rule.
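
The disagreement can be shown on the toy graph. The sketch below models the two rules as described above; it does not call git, and the function names, the extra commit `X1` (a single commit on an F1-style branch forked at M3), and the simplified merge-base helper are all assumptions made for illustration.

```python
PARENTS = {
    "M1": [], "M2": ["M1"], "M3": ["M2"], "M4": ["M3"], "M5": ["M4"],
    "D1": ["M2"], "D2": ["D1"], "D3": ["D2", "M4"], "D4": ["D3"],
    "X1": ["M3"],  # hypothetical: one commit on an F1-style branch
}
MAINS = ["main", "develop"]  # config order of git.mainBranches
TIPS = {"main": "M5", "develop": "D4"}


def ancestors(commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def youngest_common_ancestor(commits):
    """Simplification valid for this small graph: the common ancestor
    with the largest ancestry is the youngest one."""
    common = set.intersection(*(ancestors(c) for c in commits))
    return max(common, key=lambda c: len(ancestors(c)))


def commands_rule(branch_tip):
    """Model of the command path: alphabetically first main that
    contains the merge-base."""
    base = youngest_common_ancestor([branch_tip] + [TIPS[m] for m in MAINS])
    candidates = [m for m in MAINS if base in ancestors(TIPS[m])]
    return sorted(candidates)[0]


def arrow_fast_path_rule(branch_tip):
    """Model of the Git >= 2.41 column: smallest ahead, config order on ties."""
    def ahead(main):
        return len(ancestors(branch_tip) - ancestors(TIPS[main]))
    return min(MAINS, key=ahead)  # min() keeps the first of equal minima


print("commands pick:", commands_rule("X1"))
print("arrow picks:  ", arrow_fast_path_rule("X1"))
```

For this branch the modelled command rule lands on develop while the modelled column rule lands on main, which is the surface disagreement described above.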

Both the commands rule and the legacy path of the behind-base arrow rule also share a more fundamental flaw: "contains the merge-base" is too loose an equivalence class for "is the branch's actual base." It groups branches together when one is dramatically closer than another.

A concrete example is F4 in the picture above: a branch forked off develop at D1. Its merge-base with [main, develop] is M2 (the youngest common ancestor of all three). Both main and develop contain M2 (main directly, and develop via the original branch-off), so both end up in the candidate set, treated as if they were equally good bases. By any reasonable measure of closeness they aren't: assuming the branch has a single commit of its own, it is only one commit ahead of develop but two ahead of main (its own commit plus D1).

The fix is to discriminate within the candidate set using ahead values (commits on the branch but not on the candidate base). The candidate with the smallest ahead is the closest, and that's the right answer when one exists. The cases the commands' loose rule couldn't tell apart (F4-style branches above) become unambiguous; the cases that are genuinely tied (F0-, F1-, and F2-style branches, where both candidates have the same ahead value) get correctly identified as ambiguous instead of collapsed into a silent pick. And critically, every surface uses the same rule, so the arrow column and the rebase command can no longer disagree about a branch's base. The fast path of the behind-base arrow logic already did this, but the other two code paths did not.

Thread B — inherent ambiguity (a property of git history)

Even with the candidate set narrowed using ahead values, F0, F1, and F2 still have no topologically-correct answer. The information needed to decide "did the user fork from main or develop?" is just not in the commit graph once main has been merged into develop (or, for F0, once develop was branched off main and thereby inherited main's history).

These two threads need different fixes.

Step 1 — Fix Thread A: ahead-based selection

Replace "take the first ref out of for-each-ref --contains" with an ahead-based comparison, and use it for every surface that needs a base branch:

  • For each configured main branch that contains the branch's merge-base, compute ahead(feature_branch, candidate).
  • Pick the candidate with the smallest ahead value — the branch the feature has diverged from least, i.e. the closest base.
  • When more than one candidate is tied at the minimum ahead, report the full tied set so the caller (action or display) can handle the ambiguity rather than collapse it.
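
The three bullets above can be sketched on the toy graph as follows. This is an illustration under stated assumptions, not the proposed lazygit code: `closest_bases` is a hypothetical name, the merge-base helper is a simplification that only holds for this small graph, and `X1`, `Y1`, `Z1` are hypothetical single commits on F1-, F3-, and F4-style branches.

```python
PARENTS = {
    "M1": [], "M2": ["M1"], "M3": ["M2"], "M4": ["M3"], "M5": ["M4"],
    "D1": ["M2"], "D2": ["D1"], "D3": ["D2", "M4"], "D4": ["D3"],
    "X1": ["M3"],  # hypothetical F1-style branch commit
    "Y1": ["M5"],  # hypothetical F3-style branch commit
    "Z1": ["D1"],  # hypothetical F4-style branch commit
}
TIPS = {"main": "M5", "develop": "D4"}


def ancestors(commit):
    """Return every commit reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def youngest_common_ancestor(commits):
    """Simplified merge-base: common ancestor with the largest ancestry."""
    common = set.intersection(*(ancestors(c) for c in commits))
    return max(common, key=lambda c: len(ancestors(c)))


def closest_bases(branch_tip, mains):
    """Return every main tied at the smallest ahead value, in config order."""
    merge_base = youngest_common_ancestor([branch_tip] + [TIPS[m] for m in mains])
    candidates = [m for m in mains if merge_base in ancestors(TIPS[m])]
    # ahead = commits on the branch that the candidate does not have
    aheads = {m: len(ancestors(branch_tip) - ancestors(TIPS[m])) for m in candidates}
    smallest = min(aheads.values())
    return [m for m in candidates if aheads[m] == smallest]


for tip in ("X1", "Y1", "Z1"):
    print(tip, closest_bases(tip, ["main", "develop"]))
```

Returning a list rather than a single name keeps the tie visible to the caller, so an action can prompt and the column can signal uncertainty instead of either one guessing.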

Walking the fork points with this fix:

| Fork point | Today | After Step 1 (ahead-based) |
| --- | --- | --- |
| F0, F1, F2 | Silently picks one of the candidates (alphabetical, config-order, or surface-dependent) | Identified as a genuine tie (ambiguous) |
| F3 | main | main |
| F4, F5, F6 | Sometimes correct by accident | develop ✓ (correctly identified) |

Step 2 — Handle Thread B: prompt for actions, signal uncertainty in the column

For actions: prompt when ambiguous

When the user invokes "rebase onto base branch" on a branch where more than one candidate is tied at the smallest ahead, show a small menu listing the tied candidates and let the user pick. The chosen base drives the action; the menu only appears when the answer is genuinely ambiguous, so unambiguous branches behave exactly as before — no extra keypress, no friction.

The menu item label itself signals ambiguity:

  • Unambiguous: Rebase onto base branch (main).
  • Ambiguous: Rebase onto base branch (pick: main, develop).

So users know before pressing the key that they'll be asked.

The same pattern is wired into "view divergence from base branch" and "move commits to new branch."
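
A minimal sketch of how an action might consume the tied set: prompt only on a genuine tie, and build the menu label from the same list. `resolve_base`, `menu_label`, and the `prompt` callback are hypothetical names; the label strings follow the two examples above.

```python
def menu_label(tied_bases):
    """Label for the rebase menu item; signals ambiguity before the keypress."""
    if len(tied_bases) == 1:
        return f"Rebase onto base branch ({tied_bases[0]})"
    return f"Rebase onto base branch (pick: {', '.join(tied_bases)})"


def resolve_base(tied_bases, prompt):
    """Return the base to act on; call `prompt` only when the tie is genuine."""
    if len(tied_bases) == 1:
        return tied_bases[0]
    return prompt(tied_bases)


# Stand-in for the menu: records that it was shown and picks the first entry.
shown = []


def fake_prompt(options):
    shown.append(list(options))
    return options[0]


print(menu_label(["main"]), "->", resolve_base(["main"], fake_prompt))
print(menu_label(["main", "develop"]), "->", resolve_base(["main", "develop"], fake_prompt))
```

The unambiguous path never touches the prompt, which is what keeps existing behaviour free of extra keypresses.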

For the behind-base arrow: show uncertainty

The arrow column can't prompt — it's rendered on every branches refresh, unprompted by the user. So instead of silently committing to a count we aren't sure about, the column renders one of these:

| Display | Meaning |
| --- | --- |
| (nothing) | Either unambiguous & up to date, or ambiguous & every candidate agrees the branch is up to date. |
| ↓N | Unambiguous & behind by N, or ambiguous & every candidate agrees on N. |
| ↓? | Ambiguous; every candidate has the branch behind by some non-zero amount, but they disagree on how much. We know it's behind, we don't know by how much. |
| ? | Ambiguous; some candidates have the branch up to date, others have it behind. We don't even know if it's behind. |

The principle is never display information we're not sure about. Specifically: never show "nothing" (which the user reads as "up to date") unless we're confident the branch is up to date against every candidate. Ambiguous branches lose specificity in their display, but they were ambiguous to begin with — silently picking one and showing its count was confidently misleading.

(In onlyArrow mode, ↓? collapses to ↓ since we still know the branch is behind. ? remains ? since we don't know.)
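
The display rules reduce to a small function over the behind counts of the tied candidates. The sketch below is one possible reading of those rules; `render_behind` and its parameters are hypothetical names, and the input is assumed to hold one behind value per tied candidate (a single value for an unambiguous branch).

```python
def render_behind(behind_counts, only_arrow=False):
    """Render the behind-base cell from one behind value per tied candidate."""
    if all(n == 0 for n in behind_counts):
        return ""    # every candidate agrees: up to date
    if any(n == 0 for n in behind_counts):
        return "?"   # candidates disagree on whether it is behind at all
    if only_arrow:
        return "↓"   # behind for sure; onlyArrow mode never shows a number
    if len(set(behind_counts)) == 1:
        return f"↓{behind_counts[0]}"  # every candidate agrees on N
    return "↓?"      # behind for sure, but by an unknown amount


print(repr(render_behind([0])))
print(repr(render_behind([3])))
print(repr(render_behind([2, 5])))
print(repr(render_behind([0, 5])))
print(repr(render_behind([2, 5], only_arrow=True)))
```

Checking "all zero" first keeps the empty cell reserved for branches that are up to date against every candidate.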

Possible refinement: first-parent ancestry

Going back to the walkthrough of fork points, F1 (forked from main at M3) deserves a second look. We called it ambiguous because M3 is reachable from both main and develop. But intuitively, develop could never possibly have pointed at M3. Develop has its own linear history of commits; M3 only ever existed as a tip of main. The fact that M3 is reachable from develop today is purely a consequence of the later merge that pulled it in — develop's tip never sat at M3, so nobody could have branched off develop at M3. A branch whose fork point is M3 must have forked from main.

The same argument applies to F2 (forked at M4): develop merged M4 in, but M4 itself never lived on develop's line of commits. So F1 and F2 should both resolve to main unambiguously.

F0 is the case the intuition can't rescue: M1 predates develop's existence, so it sits on the original common history that both branches inherit. Develop genuinely passed through M1 on its way to D1, and so did main on its way to M5. Whoever forked F0 might have branched from either side. That ambiguity is real.

The technical formulation behind this is first-parent ancestry: a branch's "linear history" is the chain you get by walking back along first parents only, ignoring commits that arrived as second parents of merge commits. Git supports it directly (git log --first-parent). The candidate rule becomes: a configured main qualifies only if its linear history contains the fork point. Under that rule the ambiguous set in our diagram shrinks from {F0, F1, F2} to just {F0}.
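
On the toy graph, the first-parent rule can be sketched as below. The parent lists put the first parent first, so D3's first parent is D2 and the merged-in M4 is its second parent. `first_parent_chain` and `first_parent_candidates` are hypothetical names, and the model assumes the one-directional merge workflow discussed in the next section.

```python
# First entry in each list is the first parent.
PARENTS = {
    "M1": [], "M2": ["M1"], "M3": ["M2"], "M4": ["M3"], "M5": ["M4"],
    "D1": ["M2"], "D2": ["D1"], "D3": ["D2", "M4"], "D4": ["D3"],
}
TIPS = {"main": "M5", "develop": "D4"}
FORK_POINTS = {
    "F0": "M1", "F1": "M3", "F2": "M4", "F3": "M5",
    "F4": "D1", "F5": "D3", "F6": "D4",
}


def first_parent_chain(tip):
    """Walk back along first parents only, like `git log --first-parent`."""
    chain, current = [], tip
    while current is not None:
        chain.append(current)
        parents = PARENTS[current]
        current = parents[0] if parents else None
    return chain


def first_parent_candidates(fork_point):
    """Mains whose linear history contains the fork point."""
    return [name for name, tip in TIPS.items()
            if fork_point in first_parent_chain(tip)]


result = {b: first_parent_candidates(fp) for b, fp in FORK_POINTS.items()}
ambiguous = [b for b, mains in result.items() if len(mains) > 1]

print(first_parent_chain("D4"))
print("ambiguous:", ambiguous)
```

Develop's linear history skips M3 and M4 because they are reachable only through D3's second parent, so F1 and F2 resolve to main and only F0 stays tied.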

When the intuition fails

"Develop never pointed at M3" relies on a workflow property: the configured main branches are only ever merged in one direction, and never hard-reset. Two cases don't fit:

  • Bidirectional merges. If develop is also occasionally merged back into main, develop's commits show up on main's linear history, and fork points on that shared region are ambiguous in both directions — the same shape as F0.
  • Hard resets between mains. Git's own workflow periodically runs git switch -C next master at release time, which replaces next's linear history with a copy of master's. After such a rewind, any branch forked off master before the rewind ends up on the shared history and is ambiguous again.

And in any long-lived repo, branches that predate the creation of the second main branch sit in the F0-shaped original-common-history region. They stay ambiguous regardless of workflow.

Why this isn't part of the proposal

  • The intuition is only sometimes right. Where the workflow assumption holds it works beautifully; where it doesn't (the cases above), the auto-pick is silently wrong — exactly the failure mode the rest of the proposal is built to eliminate.
  • The column and the commands would disagree. The column can't afford per-branch refinement on every refresh, so only the commands would get the first-parent rule. An F1-style branch would then show ? in the column while the rebase silently picked main — exactly the surface-disagreement bug the proposal exists to eliminate.
  • Implementation cost. The fast path's %(ahead-behind:<base>) token has no first-parent variant, so the bulk pass can only produce full-ancestry results. Disambiguating with first-parent would need a two-pass design — bulk pass to find candidates, then per-branch git rev-list --first-parent refinement for ambiguous ones. That refinement is pure waste for F0-style branches anyway.

Worth revisiting if we get evidence that the workflow assumption holds widely and that users value a more specific column display enough to justify the implementation cost.

Possible refinement: caching the user's pick

A natural follow-on to Step 2 is to add a session-scoped cache: once the user has disambiguated a branch via the prompt, lazygit could remember the pick and use it for subsequent actions on the same branch — and the behind-base arrow could then show a concrete count instead of ? / ↓?.

For:

  • No re-prompting on subsequent actions for the same branch.
  • The arrow column becomes precise once the user has disambiguated.

Against:

  • No undo path for a wrong pick. Once the user accidentally picks the wrong base, every subsequent action on the branch silently uses it. The only way to recover is to restart lazygit, which is non-obvious, and even that will stop working once we decide to persist the cache beyond the current session.
  • The arrow regresses to "confidently misleading." The whole point of the ? / ↓? display is to refuse to assert a number we don't know. Honouring a cached pick puts us back in the business of asserting a specific number against one of several equally-valid candidates — and if the user picked wrong (point above), that number is wrong.
  • Staleness after manual rebase. If the user git rebases a branch from main onto develop to move it to the next release, the cache might or might not invalidate automatically depending on whether the candidate set changed. The cases where it silently doesn't are subtle.
  • Complexity cost. The cache plumbing isn't trivial — per-branch map on MainBranches, mutex, lazy-init, invalidation on config change, auto-population from both fast and legacy loader paths, helper plumbing on the GUI side.

The proposal therefore omits the cache: ambiguous branches re-prompt on each action (cheap; users can rebase to make ambiguous branches unambiguous if they tire of being asked), and the arrow column always shows ? / ↓? for ambiguous branches.

Summary

| Fork point | Today | After Step 1 | After Step 2 (full proposal) |
| --- | --- | --- | --- |
| F0, F1, F2 (ambiguous) | Silently picks one candidate (sometimes inconsistently between column and commands) | Tied set identified, still picked silently | Action prompts; arrow shows ? / ↓? |
| F3 (main only) | main | main | main ✓ (no prompt) |
| F4–F6 (develop only) | Sometimes wrong by accident | develop | develop ✓ (no prompt) |

F0, F1, and F2 cannot be resolved automatically under full-ancestry analysis — git just doesn't have the information. The user does, and the prompt is the lowest-friction way to ask them. A possible future refinement using first-parent ancestry could disambiguate F1 and F2 (but not F0), at the cost of an assumption about the project's branching workflow.
