|
| 1 | +Merge-Base Computation and paint_down_to_common() |
| 2 | +================================================== |
| 3 | + |
| 4 | +The function `paint_down_to_common()` in `commit-reach.c` computes merge |
| 5 | +bases by walking the commit graph backwards from two sets of tips and |
| 6 | +finding where their ancestry meets. |
| 7 | + |
| 8 | +Use cases |
| 9 | +--------- |
| 10 | + |
| 11 | +Merge-base computation is used in two different ways: |
| 12 | + |
| 13 | + 1. *Finding all merge bases* (`merge-base --all`, `merge-tree`, |
| 14 | + `merge`, `rebase`). A merge base is a common ancestor that is |
| 15 | + not itself an ancestor of another common ancestor. |
| 16 | + |
| 17 | + 2. *Ancestry checks* (`in_merge_bases`, used by `merge-base |
| 18 | + --is-ancestor`, `branch -d`, `fetch`). These ask: "is commit A |
| 19 | + an ancestor of commit B?" If A is an ancestor of B, then A is a |
| 20 | + common ancestor and every other common ancestor is an ancestor of |
| 21 | + A, so A is necessarily the only merge base. |
| 22 | + |
| 23 | +Both use cases share the same algorithm and implementation. |
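The two definitions above can be checked by brute force on a toy DAG. The sketch below is not how `commit-reach.c` works: `toy_graph`, `toy_ancestors()`, `toy_merge_bases()` and `toy_is_ancestor()` are hypothetical names, commits are small integers, and reachability is stored as a bitmask. It only illustrates that an ancestry check is the special case where the single merge base equals one of the inputs.

```c
#include <stdint.h>

#define MAX_COMMITS 32

/*
 * Hypothetical toy DAG: parents[i] is a bitmask of the parents of
 * commit i. Commits are numbered so every parent has a smaller index.
 */
struct toy_graph {
	int nr;
	uint32_t parents[MAX_COMMITS];
};

/* Bitmask of all commits reachable from c, including c itself. */
uint32_t toy_ancestors(const struct toy_graph *g, int c)
{
	uint32_t seen = UINT32_C(1) << c;

	/* Parents have smaller indices, so one downward pass suffices. */
	for (int i = c; i >= 0; i--)
		if (seen & (UINT32_C(1) << i))
			seen |= g->parents[i];
	return seen;
}

/* Common ancestors that are not an ancestor of another common ancestor. */
uint32_t toy_merge_bases(const struct toy_graph *g, int a, int b)
{
	uint32_t common = toy_ancestors(g, a) & toy_ancestors(g, b);
	uint32_t bases = common;

	for (int i = 0; i < g->nr; i++) {
		if (!(common & (UINT32_C(1) << i)))
			continue;
		/* Everything strictly below common ancestor i is redundant. */
		bases &= ~(toy_ancestors(g, i) & ~(UINT32_C(1) << i));
	}
	return bases;
}

/* A is an ancestor of B exactly when A is the only merge base. */
int toy_is_ancestor(const struct toy_graph *g, int a, int b)
{
	return toy_merge_bases(g, a, b) == (UINT32_C(1) << a);
}
```

In a criss-cross history (two merges that each combine the same two branches), `toy_merge_bases()` returns two bits, which is the situation `merge-base --all` reports.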
| 24 | + |
| 25 | +Algorithm |
| 26 | +--------- |
| 27 | + |
| 28 | +Given a commit `one` and a set of commits `twos[]`, the walk paints |
| 29 | +commits with two colors: |
| 30 | + |
| 31 | + - PARENT1: reachable from `one` |
| 32 | + - PARENT2: reachable from any commit in `twos[]` |
| 33 | + |
| 34 | +The walk uses a priority queue ordered by generation number (falling |
| 35 | +back to commit date when generation numbers are unavailable). Each |
| 36 | +step dequeues the highest-priority commit (this is when we say a |
| 37 | +commit is "visited") and propagates its paint flags to its parents, |
| 38 | +enqueuing them if they gained new flags. A commit that carries |
| 39 | +both PARENT1 and PARENT2 is a merge-base candidate. A candidate |
| 40 | +gains the STALE flag, which propagates to its ancestors -- any |
| 41 | +deeper common ancestor is necessarily redundant. |
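The walk described above can be sketched on a toy graph. This is a simplified illustration under stated assumptions, not the code in `commit-reach.c`: `toy_commit`, `pop_highest()` and `toy_paint_down()` are hypothetical, the flag bit values are made up, every commit has a finite generation number, a single `two` tip stands in for `twos[]`, the queue is a bitmask scanned linearly, and the loop runs until the queue is empty with no early exit.

```c
#include <stdint.h>

/* Hypothetical bit values, chosen for the sketch only. */
#define PARENT1 (1u << 0)
#define PARENT2 (1u << 1)
#define STALE   (1u << 2)

/* Hypothetical toy commit: up to two parents by index, -1 for none. */
struct toy_commit {
	int parent[2];
	unsigned generation;
	unsigned flags;
};

/* Remove and return the queued commit with the highest generation. */
int pop_highest(const struct toy_commit *c, uint32_t *queue, int nr)
{
	int best = -1;

	for (int i = 0; i < nr; i++)
		if ((*queue & (1u << i)) &&
		    (best < 0 || c[i].generation > c[best].generation))
			best = i;
	if (best >= 0)
		*queue &= ~(1u << best);
	return best; /* -1 when the queue is empty */
}

/* Returns a bitmask of merge-base candidates between `one` and `two`. */
uint32_t toy_paint_down(struct toy_commit *c, int nr, int one, int two)
{
	uint32_t queue = 0, candidates = 0;
	int cur;

	c[one].flags |= PARENT1;
	c[two].flags |= PARENT2;
	queue |= (1u << one) | (1u << two);

	while ((cur = pop_highest(c, &queue, nr)) >= 0) {
		unsigned flags = c[cur].flags & (PARENT1 | PARENT2 | STALE);

		if (flags == (PARENT1 | PARENT2)) {
			/* Both paints and not stale: record a candidate. */
			candidates |= 1u << cur;
			flags |= STALE; /* its ancestors are redundant */
		}
		for (int k = 0; k < 2; k++) {
			int p = c[cur].parent[k];

			/* Skip missing parents and parents that gain nothing. */
			if (p < 0 || (c[p].flags & flags) == flags)
				continue;
			c[p].flags |= flags;
			queue |= 1u << p;
		}
	}
	return candidates;
}
```

For two branches forking from a shared commit, the result has exactly one bit set (the fork point) and the fork point's own ancestors end the walk carrying STALE.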
| 42 | + |
| 43 | +INFINITY and finite generation regions |
| 44 | +-------------------------------------- |
| 45 | + |
| 46 | +The commit-graph stores a generation number for each commit. Commits |
| 47 | +not in the commit-graph have generation `GENERATION_NUMBER_INFINITY`. The |
| 48 | +graph is closed under reachability: if a commit is in the graph, all |
| 49 | +its ancestors are too. This partitions the commit graph into two regions: |
| 50 | + |
| 51 | +.... |
| 52 | + +---------------------------------------+ |
| 53 | + | INFINITY region | |
| 54 | + | generation = INFINITY | |
| 55 | + | queue order: heuristic (commit date) | |
| 56 | + +---------------------------------------+ |
| 57 | + | |
| 58 | + v |
| 59 | + +---------------------------------------+ |
| 60 | + | Finite region | |
| 61 | + | generation = finite | |
| 62 | + | queue order: topological | |
| 63 | + +---------------------------------------+ |
| 64 | +.... |
| 65 | + |
| 66 | +When the commit-graph is enabled, the INFINITY region is typically |
| 67 | +very small -- it only contains commits added since the last |
| 68 | +commit-graph refresh. |
| 69 | + |
| 70 | +All reachable INFINITY-generation commits are visited before any |
| 71 | +finite-generation commit, because INFINITY is larger than any finite |
| 72 | +value. Once the walk crosses into the finite region, it stays there. |
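The queue order can be illustrated with a small comparator. This is a sketch under assumptions: `toy_entry`, `toy_compare()` and `TOY_GENERATION_INFINITY` are hypothetical stand-ins, and the 32-bit generation type is chosen only to keep the example short. It shows why every INFINITY entry pops before any finite one, and that ties on generation fall back to commit date.

```c
#include <stdint.h>

/* Hypothetical stand-in for GENERATION_NUMBER_INFINITY. */
#define TOY_GENERATION_INFINITY UINT32_MAX

struct toy_entry {
	uint32_t generation;
	int64_t date; /* commit date, seconds since the epoch */
};

/*
 * Returns a negative value when `a` should be dequeued before `b`,
 * a positive value for the opposite order, and 0 on a full tie.
 */
int toy_compare(const struct toy_entry *a, const struct toy_entry *b)
{
	/* Higher generation first; INFINITY beats every finite value. */
	if (a->generation != b->generation)
		return a->generation > b->generation ? -1 : 1;
	/* Same generation (always the case inside INFINITY): newer first. */
	if (a->date != b->date)
		return a->date > b->date ? -1 : 1;
	return 0;
}
```

With two INFINITY entries only the date decides, so a parent whose date is later than its child's would pop first under this comparator.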
| 73 | + |
| 74 | +In the finite region, generation ordering guarantees topological |
| 75 | +traversal: children are always visited before their parents. This |
| 76 | +means that paint on already-visited commits is final -- no future |
| 77 | +traversal step can add paint to them. |
| 78 | + |
| 79 | +In the INFINITY region, commit-date ordering can violate this: a |
| 80 | +parent with a later date can be visited before a child with an earlier |
| 81 | +date. Paint flags are therefore NOT final at visit time, and a |
| 82 | +commit visited with only one side's paint may later gain the other. |
| 83 | + |
| 84 | +Paint flags are only added, never removed. A commit is re-enqueued |
| 85 | +only when it gains a flag, so with the three flags described here |
| 86 | +(PARENT1, PARENT2, STALE) it is enqueued at most three times. |
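The bound on re-enqueuing follows from the "add only" rule, which a minimal helper makes concrete. `toy_add_paint()` and the flag bit values are hypothetical; the sketch assumes, as the text states, that a parent is enqueued only when it gains at least one new flag.

```c
/* Hypothetical bit values, chosen for the sketch only. */
#define PARENT1 (1u << 0)
#define PARENT2 (1u << 1)
#define STALE   (1u << 2)

/*
 * Merge `incoming` paint into a commit's flags. Returns 1 if the
 * commit gained at least one flag and must be (re-)enqueued, 0 if
 * it already carried everything in `incoming`.
 */
int toy_add_paint(unsigned *commit_flags, unsigned incoming)
{
	if ((*commit_flags & incoming) == incoming)
		return 0; /* nothing new: do not enqueue again */
	*commit_flags |= incoming; /* flags are only ever added */
	return 1;
}
```

Because each successful call sets at least one of three bits that are never cleared, the helper can return 1 at most three times for any one commit.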
| 87 | + |
| 88 | +Termination |
| 89 | +----------- |
| 90 | + |
| 91 | +The walk terminates when we can prove that no further progress is |
| 92 | +possible. The main loop exits when one of the following conditions |
| 93 | +holds: |
| 94 | + |
| 95 | + 1. The queue is empty. |
| 96 | + 2. The queue only contains STALE entries. |
| 97 | + 3. Side-exhaustion: the walk has reached the finite region and one |
| 98 | + of the sides is fully exhausted. |
| 99 | + |
| 100 | +The loop waits for all pending merge-base candidates to be popped |
| 101 | +and recorded before any early exit fires, so no separate drain phase |
| 102 | +is needed after termination. |
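One way to picture the three exit conditions is a predicate over per-category counts of queued entries. This is only a sketch: `toy_queue_stats`, its fields, `toy_walk_is_done()` and the `in_finite_region` parameter are hypothetical, and the source does not say how the real loop tracks this state.

```c
/* Hypothetical bookkeeping: how many queued entries fall in each class. */
struct toy_queue_stats {
	int exclusive_p1; /* PARENT1 only, not STALE */
	int exclusive_p2; /* PARENT2 only, not STALE */
	int pending_both; /* both paints, not STALE: unrecorded candidates */
	int stale;        /* entries carrying STALE */
};

/* Returns 1 when the main loop may stop, 0 when it must keep going. */
int toy_walk_is_done(const struct toy_queue_stats *q, int in_finite_region)
{
	int non_stale = q->exclusive_p1 + q->exclusive_p2 + q->pending_both;

	/* Conditions 1 and 2: the queue is empty or holds only STALE entries. */
	if (!non_stale)
		return 1;
	/* Pending candidates must be popped and recorded before any exit. */
	if (q->pending_both)
		return 0;
	/* Condition 3: side-exhaustion, trusted only in the finite region. */
	if (in_finite_region && (!q->exclusive_p1 || !q->exclusive_p2))
		return 1;
	return 0;
}
```

Checking `pending_both` before the side-exhaustion test is what removes the need for a separate drain phase in this sketch.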
| 103 | + |
| 104 | +Stale entry condition |
| 105 | +~~~~~~~~~~~~~~~~~~~~~ |
| 106 | +If all queued entries are stale, no new merge base can be found: |
| 107 | +that would require a non-stale queued commit from one side to meet |
| 108 | +the other side. Continuing could still invalidate existing merge |
| 109 | +bases (if there is more than one), but that is unnecessary because |
| 110 | +`remove_redundant()` cleans them up as a post-processing step. |
| 111 | + |
| 112 | +Side-exhaustion |
| 113 | +~~~~~~~~~~~~~~~ |
| 114 | +A commit is *exclusive* to one side if it carries that side's paint |
| 115 | +but not the other (e.g. PARENT1 without PARENT2). |
| 116 | + |
| 117 | +If we have reached the finite region of the graph, no future |
| 118 | +traversal step can add paint to an already-visited commit. Thus if |
| 119 | +there are no exclusive PARENT2 commits in the queue, no additional |
| 120 | +PARENT2 paint can be introduced into the walk. Even if exclusive |
| 121 | +PARENT1 commits remain, no new merge-base candidates can be |
| 122 | +discovered. The same holds symmetrically for PARENT1. |
| 123 | + |
| 124 | +This invariant is only valid in the finite region of the graph. |
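The notion of an exclusive commit and the finite-region guard can be written down directly. In this sketch `toy_exclusive_side()`, `toy_side_exhausted()`, the enum and the flag bit values are all hypothetical, the queue is modelled as a plain array of flag words, and the wait for pending candidates is deliberately left out.

```c
/* Hypothetical bit values, chosen for the sketch only. */
#define PARENT1 (1u << 0)
#define PARENT2 (1u << 1)
#define STALE   (1u << 2)

enum toy_side { SIDE_NONE, SIDE_ONE, SIDE_TWO };

/* Which side, if any, a commit with these flags is exclusive to. */
enum toy_side toy_exclusive_side(unsigned flags)
{
	switch (flags & (PARENT1 | PARENT2)) {
	case PARENT1:
		return SIDE_ONE;
	case PARENT2:
		return SIDE_TWO;
	default:
		return SIDE_NONE; /* unpainted, or painted by both sides */
	}
}

/*
 * Returns 1 when at least one side has no exclusive commit left in
 * the queue. Always returns 0 outside the finite region, where paint
 * on visited commits is not final and the argument does not hold.
 */
int toy_side_exhausted(const unsigned *queued_flags, int nr,
		       int in_finite_region)
{
	int nr_one = 0, nr_two = 0;

	if (!in_finite_region)
		return 0;
	for (int i = 0; i < nr; i++) {
		enum toy_side side = toy_exclusive_side(queued_flags[i]);

		nr_one += (side == SIDE_ONE);
		nr_two += (side == SIDE_TWO);
	}
	return !nr_one || !nr_two;
}
```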
| 125 | + |
| 126 | +Related documentation |
| 127 | +--------------------- |
| 128 | + |
| 129 | + - `Documentation/technical/commit-graph.adoc` -- generation numbers |
| 130 | + and the reachability closure property. |