Skip to content

Drop mimalloc#6231

Open
dscho wants to merge 6 commits intogit-for-windows:mainfrom
dscho:drop-mimalloc
Open

Drop mimalloc#6231
dscho wants to merge 6 commits intogit-for-windows:mainfrom
dscho:drop-mimalloc

Conversation

@dscho
Copy link
Copy Markdown
Member

@dscho dscho commented May 5, 2026

When 6a29c2d ("mingw: use mimalloc", 2019-06-24) introduced the vendored mimalloc, the comparison was against nedmalloc (which by then had not seen an upstream commit since 2014, and whose repository was archived in 2019). The two were essentially at parity in that benchmark; mimalloc was chosen because it was actively developed. I do not really recall whether the platform's default allocator was not part of the comparison; If it was, the performance was still worse than mimalloc, if it wasn't, I forgot to test ;-)

Six years on, with nedmalloc safely on its way to being dropped from the upstream codebase entirely (gitgitgadget#2104, currently in seen as e576abb), the question is no longer "mimalloc vs nedmalloc" but "mimalloc vs the OS allocator". Re-running the same git repack -adfq benchmark against each platform's current default allocator finds no measurable speedup from mimalloc on any of Windows, macOS, or Linux.

Methods

I recapitulated the same benchmark as cited in 6a29c2d (the original comparison was nedmalloc vs mimalloc on git repack -adfq over a subset of linux.git), now extended to the three GitHub-hosted runners (ubuntu-latest, macos-latest, windows-latest). Each job built two git binaries from the same source tree, vanilla and USE_MIMALLOC=YesPlease, then prepared a fresh bare clone of linux.git to a fixed SHA, and ran the repacks with both built gits in randomized order for five iterations. Each iteration ran both binaries exactly once on a freshly copytree-ed copy of the immutable template repository; the order within an iteration was randomized so any per-iteration confounder (cache state, runner warm-up, neighbour-VM contention) would be shared symmetrically between variants. Timings excluded the copytree. The full driver is the Python script ci/bench-mimalloc.py on the mimalloc-benchmark branch.

Results: original linux v2.6.20-era workload (49,917 commits, 431,605 objects, ~204 MB pack)

Platform vanilla mean ± stdev mimalloc mean ± stdev Δ (mimalloc − vanilla)
ubuntu-latest 27.089s ± 0.060s 27.041s ± 0.065s −0.048s (−0.18%)
macos-latest 23.259s ± 1.206s 25.076s ± 2.279s +1.817s (+7.8%)
windows-latest 29.828s ± 1.651s 30.329s ± 2.428s +0.501s (+1.7%)

Workflow run: https://github.com/dscho/git/actions/runs/25374127848

Results: 4x larger linux v3.0 workload (255,039 commits, 2,059,429 objects, ~788 MB pack)

Platform vanilla mean ± stdev mimalloc mean ± stdev Δ (mimalloc − vanilla)
ubuntu-latest 134.723s ± 0.329s 134.801s ± 0.191s +0.078s (+0.06%)
macos-latest 130.183s ± 19.098s 133.292s ± 18.991s +3.109s (+2.4%)
windows-latest 145.183s ± 1.272s 146.271s ± 4.161s +1.088s (+0.75%)

Workflow run: https://github.com/dscho/git/actions/runs/25376885309

Discussion

The Linux numbers on the larger workload are particularly clear: stdev is below 0.3% of the mean for both variants, and the difference is well inside that floor. Glibc's allocator and the vendored mimalloc are statistically indistinguishable for git repack -adfq here.

windows-latest runners are noisier (per-run variance ~1-4%, mostly neighbour-VM scheduling), but mimalloc never beats vanilla in either workload. With the original justification for keeping a custom allocator gone (the modern Windows segment-heap is no longer the slow Windows-XP-era HeapAlloc that drove the original 2009 nedmalloc adoption), there is nothing left to motivate the maintenance cost of a vendored allocator.

macos-latest is too noisy at n=5 (stdev 14% of the mean) to draw a firm conclusion, but the visible point-estimate goes the wrong way and there is no plausible mechanism by which Apple's libsystem_malloc would be slower than mimalloc.

What this PR does not do

It does not by itself remove nedmalloc from the tree; that is still promised as a follow-up of the in-flight upstream patch gitgitgadget#2104, presently in seen as e576abb. The first commit here is an amend! whose autosquashed result is byte-identical to that upstream commit, so once the next merging-rebase picks up the upstream patch the two will collapse cleanly.

The five remaining fixup! reverts target each of the original mimalloc-vendoring commits in reverse chronological order; once autosquashed, the pairs cancel out to empty commits which the rebase will drop, leaving the tree free of compat/mimalloc/, the USE_MIMALLOC build infrastructure, and the supporting changes (compat/posix.h _DEFAULT_SOURCE guard, win32_pthread_* renames) that only existed to support the vendored allocator.

dscho added 6 commits May 5, 2026 15:35
mingw: stop using nedmalloc

The vendored nedmalloc allocator under compat/nedmalloc/ has been
unmaintained upstream for a very long time: the original repository at
https://github.com/ned14/nedmalloc received its last commit on July 5,
2014, and was archived (made read-only) by its owner on March 15, 2019.
Our copy has been carried forward unchanged ever since.

The Git for Windows commit that introduced mimalloc as a replacement
on Windows ("mingw: use mimalloc", 2019-06-24, present in the Git for
Windows branch thicket but not upstream) already observed at that time
that nedmalloc had ceased to see any updates for several years.

This came to a head when the Git for Windows SDK upgraded to GCC 16:
the `add_segment()` function in `compat/nedmalloc/malloc.c.h` declares
`int nfences = 0` and only references it inside an `assert()`, which
GCC 16 now flags as `-Wunused-but-set-variable`. Combined with the
`-Werror` enabled by `DEVELOPER=1`, this turns into a hard build
failure:

	compat/nedmalloc/malloc.c.h: In function 'add_segment':
	compat/nedmalloc/malloc.c.h:3897:7: error: variable 'nfences' set but not used [-Werror=unused-but-set-variable=]
	 3897 |   int nfences = 0;
	      |       ^~~~~~~
	cc1.exe: all warnings being treated as errors

The same source built without complaint under GCC 15.2.0; the
regression was bisected to the SDK package update at
git-for-windows/git-sdk-64@188d93dd455
(`mingw-w64-x86_64-gcc 15.2.0-14 -> 16.1.0-1`), with the failing CI
run captured at
https://github.com/git-for-windows/git-sdk-64/actions/runs/25244795074.

Rather than patch the unmaintained vendored sources to silence the
warning, stop opting into nedmalloc altogether on MINGW. The platform
allocator is what every non-MINGW build already uses, and a fresh
build of git.git's master against a minimal Git for Windows SDK
upgraded to GCC 16, with `USE_NED_ALLOCATOR` removed from the MINGW
section, completes successfully.

The compat/nedmalloc/ subtree itself is left in place to keep this
change minimal; nothing in the build links against it any longer, so
it can be removed in a follow-up if desired.

Assisted-by: Claude Opus 4.7
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Re-running the `git repack -adfq` benchmark from
6a29c2d ("mingw: use mimalloc",
2019-06-24) against the platform's *current* default allocator (so
without `nedmalloc` in the picture at all) shows mimalloc is no longer
faster than the system allocator on any of Windows, macOS, or Linux,
neither for the original ~30-second `linux v2.6.20` workload nor for a
4x larger `linux v3.0` workload where each individual run takes ~2
minutes (and the noise floor on Linux is below 0.3% of the mean, so
even small differences would be visible if any existed).

`mimalloc` was originally chosen over nedmalloc, not over the system
allocator. Six years on, with nedmalloc now being dropped from the
codebase entirely, the allocator that mimalloc has to beat is whatever
the OS ships by default; modern Windows segment-heap, glibc malloc, and
the macOS libsystem allocator have all closed the gap, and there is no
longer a measurable benefit to keep maintaining a custom allocator.

The actual benchmark methodology, the per-platform numbers, and links
to the workflow runs that produced them are spelled out in the PR
description rather than repeated across each fixup.

The `fixup!` subject is so that the next rebase against an upstream
Git that already lacks this commit will autosquash this revert into
the original (which becomes empty and is dropped), leaving the tree
free of `mimalloc`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Part of the series that drops the vendored `mimalloc` from this fork;
the rationale (no measurable speedup over the platform allocator on
any of Windows, macOS, or Linux) is in the second commit of the
series and the PR description. The `fixup!` subject is so the next
rebase against an upstream Git that already lacks the target commit
autosquashes this revert into it, dropping the original cleanly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Part of the series that drops the vendored `mimalloc` from this fork;
the rationale (no measurable speedup over the platform allocator on
any of Windows, macOS, or Linux) is in the second commit of the
series and the PR description. The `fixup!` subject is so the next
rebase against an upstream Git that already lacks the target commit
autosquashes this revert into it, dropping the original cleanly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Part of the series that drops the vendored `mimalloc` from this fork;
the rationale (no measurable speedup over the platform allocator on
any of Windows, macOS, or Linux) is in the second commit of the
series and the PR description. The `fixup!` subject is so the next
rebase against an upstream Git that already lacks the target commit
autosquashes this revert into it, dropping the original cleanly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Part of the series that drops the vendored `mimalloc` from this fork;
the rationale (no measurable speedup over the platform allocator on
any of Windows, macOS, or Linux) is in the second commit of the
series and the PR description. The original commit was a preparation
step for vendoring `mimalloc` in (which forces C11 mode under
mingw-w64 GCC and so implicitly links libwinpthread, clashing with
Git's own emulation). With `mimalloc` gone the rename is no longer
needed, so this revert restores the plain `pthread_create` /
`pthread_self` names. The `fixup!` subject is so the next rebase
against an upstream Git that already lacks the target commit
autosquashes this revert into it, dropping the original cleanly.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@dscho
Copy link
Copy Markdown
Member Author

dscho commented May 5, 2026

Gaaaaah:

  [...]
  In file included from hook.c:1:
  Error: ./git-compat-util.h:353:11: '_Generic' is a C11 extension [-Werror,-Wc11-extensions]
    353 |         return !!strchr(path, '/');
        |                  ^
  /usr/include/string.h:265:3: note: expanded from macro 'strchr'
    265 |   __glibc_const_generic (S, const char *, strchr (S, C))
        |   ^
  /usr/include/x86_64-linux-gnu/sys/cdefs.h:838:3: note: expanded from macro '__glibc_const_generic'
    838 |   _Generic (0 ? (PTR) : (void *) 1,                     \
        |   ^
  [...]

That is obviously independent of this here PR's changes. At least the ever-productive @pks-gitlab saves our day by providing a fix already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant