lockfile: add holder info file for debugging stale locks #2011

pcasaretto · 2025-11-27T13:20:16Z

Changes in v2

Changed file naming: From foo.lock.pid to foo-pid.lock to avoid
collision with refs namespace (where refs/foo.lock.pid would be valid)
Replaced environment variable with config: Now uses core.lockfilePid
configuration instead of GIT_LOCK_PID_INFO environment variable
Added per-component configuration: Similar to core.fsync, users can
enable PID files for specific components:
- none (default), all
- index, config, refs, commit-graph, midx, shallow, gc, other
Caller-specified components: Each call site specifies which component it
belongs to, giving users fine-grained control. Please help by reviewing the categories
specially the catch all OTHER
Always read existing PID files: When displaying lock errors, existing
PID files are read regardless of config setting, ensuring helpful diagnostics
PID-only for now: Not adding hostname for simplicity.
Should we maybe make the file contents extensible?

CC: Taylor Blau me@ttaylorr.com
cc: "D. Ben Knoble" ben.knoble@gmail.com
cc: Torsten Bögershausen tboegi@web.de
cc: Jeff King peff@peff.net
cc: "Paulo Casaretto (Shopify)" paulo.casaretto@shopify.com
cc: Patrick Steinhardt ps@pks.im

hcmaATshopify

I would argue for using pid instead of holder everywhere.

Pros of pid

Precise: The only payload in the file is the PID. Using “pid” makes it 100% clear and avoids ambiguity.
Conventional: Many tools use .pid files or similar conventions, so the meaning is familiar.
Unambiguous Code: Variables like pid_file or read_lock_pid() are explicitly about PIDs, not other info.

Documentation/git.adoc

hcmaATshopify · 2025-12-01T16:38:35Z

lockfile.c

+		return NULL;
+
+	strbuf_addf(&holder_path, "%s%s", lock_path, LOCK_HOLDER_SUFFIX);
+	fd = open(holder_path.buf, O_WRONLY | O_CREAT | O_TRUNC, mode);


Please make sure mode is restrictive enough (match whatever the .lock has and is readable by the creator).

It does use the same mode parameter as the lock file:

// In lock_file(): lk->tempfile = create_tempfile_mode(filename.buf, mode); // lock file lk->pid_tempfile = create_lock_pid_file(filename.buf, mode); // PID file - same mode

lockfile.c

hcmaATshopify · 2025-12-01T16:46:36Z

lockfile.c

+
+		if (lock_holder_info_enabled() &&
+		    !read_lock_holder_pid(lock_path.buf, &holder_pid))
+			strbuf_addf(buf, _("Lock is held by process %"PRIuMAX".\n\n"),


Lock (acquired via file '%s') is being held by process .

Given this will print after the "Unable to create '/path/to/.git/index.lock': File exists." message, I think it would be redundant.

hcmaATshopify · 2025-12-01T16:48:27Z

lockfile.c

+			strbuf_addf(buf, _("Lock is held by process %"PRIuMAX".\n\n"),
+				    holder_pid);
+
+		strbuf_addstr(buf, _("Another git process seems to be running in this repository, e.g.\n"


If the feature is enabled, we can actually improve this now and be assertive (for instance, if the pid is no longer around, we can say this "lock file is stale and can be removed" or "lock file is valid and held by process "

turns out kill was already implement and cross platform. Added a liveness check!

hcmaATshopify · 2025-12-01T16:50:42Z

lockfile.c

 	return 0;
 }
+
+int rollback_lock_file(struct lock_file *lk)


I recommend renaming rollback_lock_file to abort_lock_file for clarity and consistency with Git’s conventions. abort_lock_file is likely the best fit for the Git codebase, as this matches the standard “commit/abort” terminology for transactional semantics.

This function is not new, I just needed to change it from being an inline function to add new functionality.

hcmaATshopify · 2025-12-01T17:00:23Z

lockfile.c

+	strbuf_addf(&holder_path, "%s%s", lock_path, LOCK_HOLDER_SUFFIX);
+	fd = open(holder_path.buf, O_WRONLY | O_CREAT | O_TRUNC, mode);
+	if (fd >= 0) {
+		strbuf_addf(&content, "%"PRIuMAX"\n", (uintmax_t)getpid());


maybe strbuf_addf(&content, "%ld\n", (long)getpid()); since pid_t is generally int or long; cast to long for portability

Are we sure this will work for non-Unix platforms that might be supported by git? In other words, do we need to ifdef this whole thing so we don't break non-Unix builds?

Looking around the codebase it seems (uintmax_t)getpid() + PRIuMAX pattern is the established Git convention for portable PID handling:
Examples:

daemon.c:1458 - writes PID to file:

write_file(pid_file, "%"PRIuMAX, (uintmax_t) getpid());

builtin/gc.c:789-790 - writes PID to lockfile:

strbuf_addf(&sb, "%"PRIuMAX" %s", (uintmax_t) getpid(), my_host);

In terms of cross platform compatibility, I think I got lucky because it was already a first class citizen.
Did a little research and it works across the board.

getpid() is available on all Git-supported platforms (provided by libc on Unix, emulated on Windows)

pid_t is defined everywhere (native on Unix, typedef int on Windows via mingw-posix.h)

lockfile.c

hcmaATshopify · 2025-12-02T14:04:03Z

Thanks for doing this @pcasaretto - LGTM!

pcasaretto · 2025-12-02T15:00:43Z

/preview

gitgitgadget · 2025-12-02T15:01:54Z

Preview email sent as pull.2011.git.1764687711531.gitgitgadget@gmail.com

pcasaretto · 2025-12-02T15:03:56Z

/submit

gitgitgadget · 2025-12-02T15:07:54Z

Submitted as pull.2011.git.1764688047077.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2011/pcasaretto/pid-holder-file-v1

To fetch this version to local tag pr-2011/pcasaretto/pid-holder-file-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2011/pcasaretto/pid-holder-file-v1

gitgitgadget · 2025-12-02T22:46:34Z

On the Git mailing list, "D. Ben Knoble" wrote (reply to this):

On Tue, Dec 2, 2025 at 10:07 AM Paulo Casaretto via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Paulo Casaretto <pcasaretto@gmail.com>
>
> When a lock file is held, it can be helpful to know which process owns
> it, especially when debugging stale locks left behind by crashed
> processes. Add an optional feature that creates a companion .lock.pid
> file alongside each lock file, containing the PID of the lock holder.
>
> The .lock.pid file is created when a lock is acquired (if enabled), and
> automatically cleaned up when the lock is released (via commit or
> rollback). The file is registered as a tempfile so it gets cleaned up
> by signal and atexit handlers if the process terminates abnormally.
>
> When a lock conflict occurs, the code checks if the PID from the .pid
> file is still running using kill(pid, 0). This allows providing
> context-aware error messages. With PID info enabled:
>
>   Lock is held by process 12345. Wait for it to finish, or remove
>   the lock file to continue.
>
> Or for a stale lock:
>
>   Lock was held by process 12345, which is no longer running.
>   Remove the stale lock file to continue.
>
> Without PID info (default):
>
>   Another git process seems to be running in this repository.
>   Wait for it to finish, or remove the lock file to continue.
>
> The feature is opt-in via GIT_LOCK_PID_INFO=1 environment variable.
>
> Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com>

Sounds interesting. I think by the time I wish I knew what else was
using the lockfile, it's too late for me to alter my environment.
Perhaps (in addition to allowing the environment opt-in) we could
opt-in via configuration? Or is this really only useful, say, on the
server side where the environment is carefully controlled? I don't
relish putting this variable into my environment to take advantage of
something that looks very useful.

Are there downsides that make it necessary to be opt-in? I also
imagine this could be a useful default; occasionally folks at work hit
something similar and ask "what's up with that?"

Only other thing is: just because a process X is running doesn't mean
it was the one holding the lock, right? Since PIDs can be reused.

-- 
D. Ben Knoble

gitgitgadget · 2025-12-02T22:46:36Z

User "D. Ben Knoble" <ben.knoble@gmail.com> has been added to the cc: list.

gitgitgadget · 2025-12-03T19:52:50Z

On the Git mailing list, Torsten Bögershausen wrote (reply to this):

On Tue, Dec 02, 2025 at 03:07:27PM +0000, Paulo Casaretto via GitGitGadget wrote:
> From: Paulo Casaretto <pcasaretto@gmail.com>
> 
> When a lock file is held, it can be helpful to know which process owns
> it, especially when debugging stale locks left behind by crashed
> processes. Add an optional feature that creates a companion .lock.pid
> file alongside each lock file, containing the PID of the lock holder.
> 
> The .lock.pid file is created when a lock is acquired (if enabled), and
> automatically cleaned up when the lock is released (via commit or
> rollback). The file is registered as a tempfile so it gets cleaned up
> by signal and atexit handlers if the process terminates abnormally.
> 
> When a lock conflict occurs, the code checks if the PID from the .pid
> file is still running using kill(pid, 0). This allows providing
> context-aware error messages. With PID info enabled:
> 
>   Lock is held by process 12345. Wait for it to finish, or remove
>   the lock file to continue.
> 
> Or for a stale lock:
> 
>   Lock was held by process 12345, which is no longer running.
>   Remove the stale lock file to continue.
> 
> Without PID info (default):
> 
>   Another git process seems to be running in this repository.
>   Wait for it to finish, or remove the lock file to continue.
> 
> The feature is opt-in via GIT_LOCK_PID_INFO=1 environment variable.
[]

I think that this makes sense.
However, as a frequent user of Git repos hosted on an NFS server
(without any problems in my setup):

Does it make sense to add the hostname here ?
We already have xgethostname() in Git, so that we can diagnose
who/which machine really left a lock.

[]

gitgitgadget · 2025-12-03T19:52:52Z

User Torsten Bögershausen <tboegi@web.de> has been added to the cc: list.

gitgitgadget · 2025-12-03T21:31:13Z

On the Git mailing list, Jeff King wrote (reply to this):

On Tue, Dec 02, 2025 at 03:07:27PM +0000, Paulo Casaretto via GitGitGadget wrote:

> The .lock.pid file is created when a lock is acquired (if enabled), and
> automatically cleaned up when the lock is released (via commit or
> rollback). The file is registered as a tempfile so it gets cleaned up
> by signal and atexit handlers if the process terminates abnormally.

I'm sympathetic to the goal of this series, and the implementation looks
cleanly done. But I wonder if there might be some system-level side
effects that make these .pid files awkward.

Temporarily having an extra .git/index.lock.pid file is probably not a
big deal. But for other namespaces, like refs, we're colliding with
names that have other meanings. So if we want to update refs/heads/foo,
for example, we'll create refs/heads/foo.lock now. And after your patch,
also refs/heads/foo.lock.pid.

The ".lock" suffix is special, in that we disallow it in a refname and
know to skip it when iterating over loose refs. But for the ".pid"
variant, we run the risk of colliding with a real branch named
"foo.lock.pid", both for reading and writing.

On the writing side, creating foo.lock.pid may accidentally overwrite an
existing branch. This can be mitigated by using O_EXCL when creating the
pid file.

But we can see the writes in the opposite order, which I think can also
lead to data loss. Something like:

  - process A wants to write branch "foo", so it holds
    refs/heads/foo.lock and now also the matching foo.lock.pid

  - process B wants to write branch "foo.lock.pid", so it holds
    refs/heads/foo.lock.pid.lock (and the matching pid)

  - process B finishes its write and atomically renames
    foo.lock.pid.lock into foo.lock.pid. It's expected to overwrite
    the existing file there, so now process A's pid file is gone.

  - process A finishes its write and tries to clean up its pid file.  So
    it deletes foo.lock.pid, which contains the actual data from process
    B's write.

And now process B's write has been lost (and maybe even the entire
existence of the foo.lock.pid branch, if it was not also present in the
packed-refs file).

On the reading side, anybody iterating refs/heads/ may racily see the
foo.lock.pid file and think it is supposed to be an actual ref. So they
open it, find it contains garbage, and then flag it as an error.

I think both could be mitigated if we disallowed ".lock.pid" as a suffix
in refnames, but that is a big user-facing change.

In the long run, alternate ref stores like reftable won't suffer from
this issue. It's possible there are other spots where we use lockfiles
alongside arbitrarily-named files, though I couldn't think of any.

So I dunno what that means for your patch. I notice that the user has to
enable the feature manually. But it feels more like it should be
selective based on which subsystem is using the lockfile (so refs would
never want it, but other lockfiles/tempfiles might).

-Peff

gitgitgadget · 2025-12-03T21:31:16Z

User Jeff King <peff@peff.net> has been added to the cc: list.

gitgitgadget · 2025-12-03T22:37:38Z

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <peff@peff.net> writes:

> So I dunno what that means for your patch. I notice that the user has to
> enable the feature manually. But it feels more like it should be
> selective based on which subsystem is using the lockfile (so refs would
> never want it, but other lockfiles/tempfiles might).

Or perhaps the way to opt into the feature is to create an empty
file $GIT_DIR/lockfile-audit, and the lockfile subsystem will append
to it every time a lock is taken?  We need to ensure that a PID and
pathname formatted into a single record is small enough and O_APPEND
would relieve us from worrying about multi writer races, which may
introduce different kind of complications, though.

gitgitgadget · 2025-12-03T22:37:42Z

On the Git mailing list, Jeff King wrote (reply to this):

On Wed, Dec 03, 2025 at 02:21:11PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > So I dunno what that means for your patch. I notice that the user has to
> > enable the feature manually. But it feels more like it should be
> > selective based on which subsystem is using the lockfile (so refs would
> > never want it, but other lockfiles/tempfiles might).
> 
> Or perhaps the way to opt into the feature is to create an empty
> file $GIT_DIR/lockfile-audit, and the lockfile subsystem will append
> to it every time a lock is taken?  We need to ensure that a PID and
> pathname formatted into a single record is small enough and O_APPEND
> would relieve us from worrying about multi writer races, which may
> introduce different kind of complications, though.

I like a single log much better from a management perspective. I agree
that atomicity is a potential issue, though. I think that even if we
kept it small, network filesystems like NFS do not provide great
guarantees for atomic appends. Something like flock() can work there,
but that's not something we've relied on before.

It also raises questions about reading (do we find pid files in the log
in order to provide more directed advice?) and maintenance (do we ever
clean it up, or just let it grow without bound?).

-Peff

gitgitgadget · 2025-12-03T23:26:09Z

On the Git mailing list, Taylor Blau wrote (reply to this):

On Wed, Dec 03, 2025 at 04:16:10PM -0500, Jeff King wrote:
> On Tue, Dec 02, 2025 at 03:07:27PM +0000, Paulo Casaretto via GitGitGadget wrote:
>
> > The .lock.pid file is created when a lock is acquired (if enabled), and
> > automatically cleaned up when the lock is released (via commit or
> > rollback). The file is registered as a tempfile so it gets cleaned up
> > by signal and atexit handlers if the process terminates abnormally.
>
> I'm sympathetic to the goal of this series, and the implementation looks
> cleanly done. But I wonder if there might be some system-level side
> effects that make these .pid files awkward.
>
> Temporarily having an extra .git/index.lock.pid file is probably not a
> big deal. But for other namespaces, like refs, we're colliding with
> names that have other meanings. So if we want to update refs/heads/foo,
> for example, we'll create refs/heads/foo.lock now. And after your patch,
> also refs/heads/foo.lock.pid.
>
> The ".lock" suffix is special, in that we disallow it in a refname and
> know to skip it when iterating over loose refs. But for the ".pid"
> variant, we run the risk of colliding with a real branch named
> "foo.lock.pid", both for reading and writing.

Good point. I don't have a strong opinion on whether or not we should
use an append-only log of which PIDs grabbed which lockfiles when versus
tracking them on a per-lock basis. But I wonder if this would be
mitigated by either:

 - Keeping the ".lock" suffix as-is, so that holding a lockfile at path
   "$GIT_DIR/index.lock" would create "$GIT_DIR/index-pid.lock" or
   something similar.

 - Introducing a new reference name constraint that treats ".lock.pid"
   as a reserved in a manner identical to how we currently treat
   ".lock".

Between the two, I vastly prefer the former, but see below for more on
why.

> But we can see the writes in the opposite order, which I think can also
> lead to data loss. Something like:
>
>   - process A wants to write branch "foo", so it holds
>     refs/heads/foo.lock and now also the matching foo.lock.pid
>
>   - process B wants to write branch "foo.lock.pid", so it holds
>     refs/heads/foo.lock.pid.lock (and the matching pid)

Changing the naming scheme as above would cause us to hold
"foo.pid.lock" in addition to "foo.lock". That would allow process B
here to write branch "foo.lock.pid" (as is the case today). But if the
scenario were instead "process B wants to write branch foo.pid.lock", it
would fail immediately since the ".lock" suffix is reserved.

> I think both could be mitigated if we disallowed ".lock.pid" as a suffix
> in refnames, but that is a big user-facing change.

Yeah, I don't think that we should change the refname constraints here,
especially in a world where reftable deployments are more common. In
that world I think we should err on the side of removing constraints,
not adding them ;-).

> So I dunno what that means for your patch. I notice that the user has to
> enable the feature manually. But it feels more like it should be
> selective based on which subsystem is using the lockfile (so refs would
> never want it, but other lockfiles/tempfiles might).

Yeah, I think that something similar to the "which files do we fsync()
and how?" configuration we have today would be a nice complement here.

(As an aside, I wonder if that interface, too, could be slightly
improved. Right now we have a comma-separated list of values in the
"core.fsync" configuration for listing different "components", and then
a global core.fsyncMethod to either issue a fsync(), or a pagecache
writeback, or writeout-only flushes in batches. It might be nice to have
something like:

  [fsync "loose-object"]
    method = fsync
  [fsync "packfile"]
    method = writeout

, so the analog here would be something like:

  [lockfile "refs"]
    pidfile = false
  [lockfile "index"]
    pidfile = true

or similar. That could also be represented as core.lockfile=index,
omitting "refs" to avoid tracking it. It may be that people don't really
care to ever use different fsync methods for different fsync-able
components, so perhaps the analogy doesn't hold up perfectly.)

Thanks,
Taylor

gitgitgadget · 2025-12-03T23:49:36Z

On the Git mailing list, Taylor Blau wrote (reply to this):

Hi Paulo,

On Tue, Dec 02, 2025 at 03:07:27PM +0000, Paulo Casaretto via GitGitGadget wrote:
> The .lock.pid file is created when a lock is acquired (if enabled), and
> automatically cleaned up when the lock is released (via commit or
> rollback). The file is registered as a tempfile so it gets cleaned up
> by signal and atexit handlers if the process terminates abnormally.

(For the purposes of this review, I'll ignore the naming conventions
that are discussed elsewhere in the thread, which I think can be
resolved separately of any technical concerns.)

> diff --git a/Documentation/git.adoc b/Documentation/git.adoc
> index ce099e78b8..6fdd509d34 100644
> --- a/Documentation/git.adoc
> +++ b/Documentation/git.adoc
> @@ -1010,6 +1010,16 @@ be in the future).
>  	the background which do not want to cause lock contention with
>  	other operations on the repository.  Defaults to `1`.
>
> +`GIT_LOCK_PID_INFO`::
> +	If this Boolean environment variable is set to `1`, Git will create
> +	a `.lock.pid` file alongside each lock file containing the PID of the
> +	process that created the lock. This information is displayed in error
> +	messages when a lock conflict occurs, making it easier to identify
> +	stale locks or debug locking issues. The PID files are automatically
> +	cleaned up via signal and atexit handlers; however, if a process is
> +	terminated abnormally (e.g., SIGKILL), the file may remain as a stale
> +	indicator. Disabled by default.

Regardless of whether or not we expose this functionality behind an
environment variable or configuration, I think it would be nice to be
able to turn PID tracking on and off for different components (e.g., for
scenarios where you care about who is holding open, say, $GIT_DIR/index,
but not who is creating a lock during ref creation).

If we determined this through an environment variable, I think it would
be reasonable to adopt the convention from the "core.fsync"
configuration and use a comma-separated list. Alternatively, we could
adopt that same convention for a configuration variable, say,
"core.lockfile".

But I think that having this exposed as a per-component setting via
configuration is preferable than a global switch, since callers don't
have to remember to set this variable in their environment to get the
desired effect. Callers that want to opt-in or out of this feature on a
one-off basis can still override the configuration via the top-level
"-c" flag.

>  `GIT_REDIRECT_STDIN`::
>  `GIT_REDIRECT_STDOUT`::
>  `GIT_REDIRECT_STDERR`::
> diff --git a/lockfile.c b/lockfile.c
> index 1d5ed01682..4a694b9c3d 100644
> --- a/lockfile.c
> +++ b/lockfile.c
> @@ -6,6 +6,9 @@
>  #include "abspath.h"
>  #include "gettext.h"
>  #include "lockfile.h"
> +#include "parse.h"
> +#include "strbuf.h"
> +#include "wrapper.h"
>
>  /*
>   * path = absolute or relative path name
> @@ -71,6 +74,62 @@ static void resolve_symlink(struct strbuf *path)
>  	strbuf_reset(&link);
>  }
>
> +/*
> + * Lock PID file functions - write PID to a .lock.pid file alongside
> + * the lock file for debugging stale locks. The PID file is registered
> + * as a tempfile so it gets cleaned up by signal/atexit handlers.
> + */
> +
> +static int lock_pid_info_enabled(void)
> +{

With the above suggestion, I think this function would get a little more
complicated to instead take a component argument. That's not a huge deal
in and of itself, but callers that *create* a lockfile will have to
somehow pass in the component name when acquiring the lock.

> +	return git_env_bool(GIT_LOCK_PID_INFO_ENVIRONMENT, 0);
> +}
> +
> +static struct tempfile *create_lock_pid_file(const char *lock_path, int mode)
> +{
> +	struct strbuf pid_path = STRBUF_INIT;
> +	struct strbuf content = STRBUF_INIT;
> +	struct tempfile *pid_tempfile = NULL;
> +	int fd;
> +
> +	if (!lock_pid_info_enabled())
> +		return NULL;
> +
> +	strbuf_addf(&pid_path, "%s%s", lock_path, LOCK_PID_SUFFIX);
> +	fd = open(pid_path.buf, O_WRONLY | O_CREAT | O_TRUNC, mode);

I'm not sure whether or not we should pass O_EXCL here, but I think it
depends on the naming convention we pick.

> +	if (fd >= 0) {

This is a style preference, but I'd suggest handling the happy path
outside of this conditional if possible by inverting it. Perhaps
something like:

    if (fd < 0)
        goto out;

    strbuf_addf(&content, ...);
    if (write_in_full(fd, content.buf, content.len) < 0)
        warning_errno(...);

    close(fd);
    pid_tempfile = register_tempfile(pid_path.buf);

    out:
        strbuf_release(...);
        return pid_tempfile;

> +static int read_lock_pid(const char *lock_path, uintmax_t *pid_out)
> +{
> +	struct strbuf pid_path = STRBUF_INIT;
> +	struct strbuf content = STRBUF_INIT;
> +	int ret = -1;
> +
> +	strbuf_addf(&pid_path, "%s%s", lock_path, LOCK_PID_SUFFIX);
> +	if (strbuf_read_file(&content, pid_path.buf, LOCK_PID_MAXLEN) > 0) {
> +		char *endptr;
> +		*pid_out = strtoumax(content.buf, &endptr, 10);
> +		if (*pid_out > 0 && (*endptr == '\n' || *endptr == '\0'))
> +			ret = 0;

Same note here. I'd suggest setting "ret = 0" during initialization, and
inverting this conditional to:

    if (*pid_out <= 0 || (*endptr != '\n' && *endptr != '\0')) {
        warning(...);
        ret = -1;
        goto out;
    }
> +		else
> +			warning(_("malformed lock pid file '%s'"), pid_path.buf);
> +	}
> +	strbuf_release(&pid_path);
> +	strbuf_release(&content);
> +	return ret;
> +}
> +
>  /* Make sure errno contains a meaningful value on error */
>  static int lock_file(struct lock_file *lk, const char *path, int flags,
>  		     int mode)
> @@ -80,9 +139,12 @@ static int lock_file(struct lock_file *lk, const char *path, int flags,
>  	strbuf_addstr(&filename, path);
>  	if (!(flags & LOCK_NO_DEREF))
>  		resolve_symlink(&filename);
> -
>  	strbuf_addstr(&filename, LOCK_SUFFIX);
> +

These look like stray whitespace changes that were left behind from
development.

> +		if (pid_status == 1)
> +			strbuf_addf(buf, _("Lock is held by process %"PRIuMAX". "
> +			    "Wait for it to finish, or remove the lock file to continue"),
> +			    pid);
> +		else if (pid_status == -1)
> +			strbuf_addf(buf, _("Lock was held by process %"PRIuMAX", "
> +			    "which is no longer running. Remove the stale lock file to continue"),
> +			    pid);
> +		else
> +			strbuf_addstr(buf, _("Another git process seems to be running in this repository. "
> +			    "Wait for it to finish, or remove the lock file to continue"));

On one hand I prefer the new "Another git process" message for when we
don't have a PID lockfile. But on the other hand, I think the "If it
still fails, a git process may have crashed..." message is useful for
users who may not be immediately aware of the consequences of simply
removing the lockfile to continue.

I do think the original message is somewhat verbose, so maybe the change
here in the non-PID case is worth doing. What are your thoughts?

> +
> +		strbuf_release(&lock_path);
>  	} else
>  		strbuf_addf(buf, _("Unable to create '%s.lock': %s"),
>  			    absolute_path(path), strerror(err));
> @@ -207,6 +292,8 @@ int commit_lock_file(struct lock_file *lk)
>  {
>  	char *result_path = get_locked_file_path(lk);
>
> +	delete_tempfile(&lk->pid_tempfile);
> +

Do we want to wait to delete the PID file until after we know that we
successfully committed the lockfile?

> +/* Maximum length for PID file content */
> +#define LOCK_PID_MAXLEN 32

Makes sense ;-). This should be plenty of space, since on my system
maximum PID value is 2^22:

    $ cat /proc/sys/kernel/pid_max
    4194304

> +test_expect_success 'PID info file cleaned up on successful operation when enabled' '
> +	git init repo4 &&
> +	(
> +		cd repo4 &&
> +		echo content >file &&
> +		env GIT_LOCK_PID_INFO=1 git add file &&
> +		# After successful add, no lock or PID files should exist
> +		! test -f .git/index.lock &&
> +		! test -f .git/index.lock.pid

These should be:

    test_path_is_missing .git/index.lock &&
    test_path_is_missing .git/index.lock.pid

instead of bare "! test -f"'s.

Thanks,
Taylor

When a lock file is held, it can be helpful to know which process owns it, especially when debugging stale locks left behind by crashed processes. Add an optional feature that creates a companion PID file alongside each lock file, containing the PID of the lock holder. For a lock file "foo.lock", the PID file is named "foo-pid.lock". The PID file is created when a lock is acquired (if enabled), and automatically cleaned up when the lock is released (via commit or rollback). The file is registered as a tempfile so it gets cleaned up by signal and atexit handlers if the process terminates abnormally. When a lock conflict occurs, the code checks for an existing PID file and, if found, uses kill(pid, 0) to determine if the process is still running. This allows providing context-aware error messages: Lock is held by process 12345. Wait for it to finish, or remove the lock file to continue. Or for a stale lock: Lock was held by process 12345, which is no longer running. Remove the stale lock file to continue. The feature is controlled via core.lockfilePid configuration, which accepts per-component values similar to core.fsync: - none: Disable for all components (default) - all: Enable for all components - index, config, refs, commit-graph, midx, shallow, gc, other: Enable for specific components Multiple components can be combined with commas: git config core.lockfilePid index,config Each lock file call site specifies which component it belongs to, allowing users fine-grained control over which locks create PID files. Existing PID files are always read when displaying lock errors, regardless of the core.lockfilePid setting. This ensures helpful diagnostics even when the feature was previously enabled and later disabled. Signed-off-by: Paulo Casaretto <pcasaretto@gmail.com>

pcasaretto · 2025-12-04T18:07:06Z

/preview

gitgitgadget · 2025-12-04T18:08:25Z

Preview email sent as pull.2011.v2.git.1764871702690.gitgitgadget@gmail.com

pcasaretto · 2025-12-04T18:18:30Z

/submit

gitgitgadget · 2025-12-04T18:19:29Z

Error: 7068ec0 was already submitted

gitgitgadget · 2025-12-05T11:26:11Z

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Dec 03, 2025 at 06:19:46PM -0500, Taylor Blau wrote:
> On Wed, Dec 03, 2025 at 04:16:10PM -0500, Jeff King wrote:
> > On Tue, Dec 02, 2025 at 03:07:27PM +0000, Paulo Casaretto via GitGitGadget wrote:
> >
> > > The .lock.pid file is created when a lock is acquired (if enabled), and
> > > automatically cleaned up when the lock is released (via commit or
> > > rollback). The file is registered as a tempfile so it gets cleaned up
> > > by signal and atexit handlers if the process terminates abnormally.
> >
> > I'm sympathetic to the goal of this series, and the implementation looks
> > cleanly done. But I wonder if there might be some system-level side
> > effects that make these .pid files awkward.
> >
> > Temporarily having an extra .git/index.lock.pid file is probably not a
> > big deal. But for other namespaces, like refs, we're colliding with
> > names that have other meanings. So if we want to update refs/heads/foo,
> > for example, we'll create refs/heads/foo.lock now. And after your patch,
> > also refs/heads/foo.lock.pid.
> >
> > The ".lock" suffix is special, in that we disallow it in a refname and
> > know to skip it when iterating over loose refs. But for the ".pid"
> > variant, we run the risk of colliding with a real branch named
> > "foo.lock.pid", both for reading and writing.
> 
> Good point. I don't have a strong opinion on whether or not we should
> use an append-only log of which PIDs grabbed which lockfiles when versus
> tracking them on a per-lock basis. But I wonder if this would be
> mitigated by either:
> 
>  - Keeping the ".lock" suffix as-is, so that holding a lockfile at path
>    "$GIT_DIR/index.lock" would create "$GIT_DIR/index-pid.lock" or
>    something similar.
> 
>  - Introducing a new reference name constraint that treats ".lock.pid"
>    as a reserved in a manner identical to how we currently treat
>    ".lock".
> 
> Between the two, I vastly prefer the former, but see below for more on
> why.

Yeah, I agree that this is not really a feasible change for the "files"
reference backend. Besides all the issues mentioned in this thread, we
also have to consider alternative implementations of Git and old
versions. Those wouldn't know that the ".lock.pid" files are special, so
they will misbehave if Git started to write them now.

We could of course work around that issue by introducing a new
repository extension. But I doubt that this is something we want to
pursue in this context. The provided benefit just isn't high enough.

The overall idea still has merit though. So if we still want to pursue
it we can likely work around the issue by introducing a toggle that
allows specific callers to opt out of creating the PID files.

That'd raise the question though when we are most likely to need the PID
files for debugging stuff. From my own experience I only ever had issues
with stale locks in the ref subsystem, so if we disable the mechanism in
exactly that subsystem it may be way less useful.

If references are the main culprit one could also think about a slightly
ugly alternative approach: loose refs handle it just fine if they
contain additional lines. So in theory, we could write a second line for
each lock file that contains the PID. We do have an fsck check that
warns about this, but in theory this should just work. Whether we want
to go there is a different question though.

Patrick

gitgitgadget · 2025-12-05T11:26:13Z

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

gitgitgadget · 2025-12-05T18:54:47Z

On the Git mailing list, Jeff King wrote (reply to this):

On Wed, Dec 03, 2025 at 06:19:46PM -0500, Taylor Blau wrote:

> Changing the naming scheme as above would cause us to hold
> "foo.pid.lock" in addition to "foo.lock". That would allow process B
> here to write branch "foo.lock.pid" (as is the case today). But if the
> scenario were instead "process B wants to write branch foo.pid.lock", it
> would fail immediately since the ".lock" suffix is reserved.

I agree that this gets rid of any corruption or race issues, since all
versions understand how to handle ".lock" specially (and assuming we
create foo.pid.lock with O_EXCL). But there is still a namespace
collision; I cannot write refs "foo" and "foo.pid" at the same time.

> > So I dunno what that means for your patch. I notice that the user has to
> > enable the feature manually. But it feels more like it should be
> > selective based on which subsystem is using the lockfile (so refs would
> > never want it, but other lockfiles/tempfiles might).
> 
> Yeah, I think that something similar to the "which files do we fsync()
> and how?" configuration we have today would be a nice complement here.

I'm not sure it makes sense for the user to configure this. I more meant
that the ref files-backend code would set a flag for "no, do not create
a pid file for me ever" (or inversely, other bits of the code would
add a flag for "yes, it's OK to do so").

-Peff

This comment was marked as spam.

Sign in to view

hcmaATshopify reviewed Dec 1, 2025

View reviewed changes

pcasaretto force-pushed the pid-holder-file branch from 5452ae9 to 51dbf28 Compare December 2, 2025 13:03

hcmaATshopify reviewed Dec 2, 2025

View reviewed changes

lockfile.c Outdated Show resolved Hide resolved

lockfile.c Outdated Show resolved Hide resolved

pcasaretto force-pushed the pid-holder-file branch from 51dbf28 to 7068ec0 Compare December 2, 2025 14:13

pcasaretto marked this pull request as ready for review December 2, 2025 14:14

pcasaretto force-pushed the pid-holder-file branch from 7068ec0 to cfbf3a8 Compare December 4, 2025 19:55

lockfile: add holder info file for debugging stale locks #2011

Are you sure you want to change the base?

lockfile: add holder info file for debugging stale locks #2011

Uh oh!

Conversation

pcasaretto commented Nov 27, 2025 • edited by gitgitgadget bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

This comment was marked as spam.

hcmaATshopify left a comment

Choose a reason for hiding this comment

Pros of pid

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

hcmaATshopify commented Dec 2, 2025

Uh oh!

pcasaretto commented Dec 2, 2025

Uh oh!

gitgitgadget bot commented Dec 2, 2025

Uh oh!

pcasaretto commented Dec 2, 2025

Uh oh!

gitgitgadget bot commented Dec 2, 2025

Uh oh!

gitgitgadget bot commented Dec 2, 2025

Uh oh!

gitgitgadget bot commented Dec 2, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

gitgitgadget bot commented Dec 3, 2025

Uh oh!

pcasaretto commented Dec 4, 2025

Uh oh!

gitgitgadget bot commented Dec 4, 2025

Uh oh!

pcasaretto commented Dec 4, 2025

Uh oh!

gitgitgadget bot commented Dec 4, 2025

Uh oh!

gitgitgadget bot commented Dec 5, 2025

pcasaretto commented Nov 27, 2025 •

edited by gitgitgadget bot

Loading