Rebase onto upstream drbd-utils v9.32.0 and sync netlink headers to a6d1604c (36/25) #4

Open
astef wants to merge 85 commits into flant from flant-9.32.0
astef (Member) commented Jun 23, 2026

Moves our userspace-tools fork from v9.31.0 onto upstream v9.32.0, keeps our two drbdsetup options, and re-pins drbd-headers to match the kernel's renumbered netlink ABI.

  • Rebased onto v9.32.0 — the newest utils still ABI-compatible with the DRBD-9.2 flant headers (v9.33+/9.34 moved to an incompatible headers line).
  • Carries drbdsetup --non-voting and --quorum-dynamic-voters.
  • Bumped drbd-headers to a6d1604c (non_voting=36, quorum_dynamic_voters=25) to match 3p-drbd 9.2.18-flant.4+.

Ships with kernel flant-9.2.18 (9.2.18-flant.9) and headers flant — identical a6d1604c on all three sides.

raltnoeder and others added 30 commits April 7, 2025 10:35
Enables loading of event files into DRBDmon that were created
using the --timestamps option with drbdsetup events2 by
preprocessing the event lines before they are passed on to
DRBDmon.
The device config has been neglected so far. Initially, it only
contained the max-bio-size and the intentionally diskless flag. Later
I added the block-size member.
I forgot to output these fields in the output of 'drbdsetup show'.

This patch outputs the device configuration options as part of the
disk options.
This allows dnf/yum updates to succeed without --allowerasing.
Fedora just released version 42, which means some tools were missing from the
test job.

We also pin to fedora 42.
This is most likely a fallout from changes introduced in
f962315

We have "versioned man pages" and create symlinks to them (e.g.,
drbdmeta.8.gz -> drbdmeta-9.0.8.gz). Creating and deleting these is
managed in post and preun scripts. drbdmon is special as there is no
versioned man page, there is only drbdmon.8.gz (the final file, no
symlinks).

As is, we tried to rm the file twice, once in preun and then as part of
"files". This resulted in a completely harmless message about an attempt
to remove a non-existent file.
The long option name is --initialize-bitmap, not --initialize-bitmap-mode.
Fix the man page.
Signed-off-by: Aleksandr Stefurishin <aleksandr.stefurishin@flant.com>
Error out with a useful message. Previously, it would just say 'input in flex scanner failed'.
Allow "dump" to take a specific volume (for example specifying "minor-N").
Also add a new sub command, sh-list-config-file, that will list
<path-to-config-file>:<line-number>:resource-name
When handling supposedly constant strings,
we should use (const char *) more often.
The message "Can not open '%s'.\n.", with that trailing dot, has been
accidentally introduced in the very first attempts of writing drbdadm,
drbdadm_main.c was not even four weeks old at that time.
It has been copy-pasted around since 2002 unnoticed.

Finally drop those trailing dots.
Its counterpart --config-to-exclude is already aggregating.

This should make it possible to "plausibility check" multiple changes to
multiple config files with one invocation.
Source code pattern is:
	r = ctx_by_name(&ctx, "id", ...);
	if (!ctx.res)
		log error
	if (r)
		exit 1

If you specify a number of resources on the command line,
and the first one is defined, we forgot to log the error,
but still do the exit 1.

We need to reset ctx with each call to ctx_by_...().

This also changes the logic:
we no longer exit on the first "non-existing" resource, but log the error,
accumulate the "exit 1", then still continue with all other resource names
that are listed on the command line.
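
The changed control flow can be pictured with a minimal sketch. It does not reproduce drbdadm's code: `struct ctx`, `lookup_resource()`, `check_resources()`, the resource table, and the log text are hypothetical stand-ins that only mirror the pattern quoted above (reset per lookup, log every miss, accumulate the exit status).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for drbdadm's lookup context. */
struct ctx { const char *res; };

/* Hypothetical set of configured resources. */
static const char *known_resources[] = { "r0", "r1" };

/* Resets ctx on every call, then fills it in if the name is known.
 * Returns 0 on success, nonzero if the resource does not exist. */
static int lookup_resource(struct ctx *ctx, const char *name)
{
	size_t i;

	assert(ctx != NULL && name != NULL);
	ctx->res = NULL;	/* never keep the result of a previous lookup */
	for (i = 0; i < sizeof(known_resources) / sizeof(known_resources[0]); i++) {
		if (strcmp(known_resources[i], name) == 0) {
			ctx->res = known_resources[i];
			return 0;
		}
	}
	return 1;
}

/* Checks every name, logs each miss, and returns the accumulated status. */
static int check_resources(int n, const char *const names[])
{
	struct ctx ctx = { NULL };
	int rv = 0;
	int i;

	for (i = 0; i < n; i++) {
		int r = lookup_resource(&ctx, names[i]);

		if (!ctx.res)
			fprintf(stderr, "resource '%s' is not defined\n", names[i]);
		if (r)
			rv = 1;	/* remember the failure, but keep going */
	}
	return rv;
}
```

Because the lookup clears `ctx.res` itself, a hit on the first name can no longer mask a miss on a later one.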
This is a revert/rework of
6407c42 (drbdadm: reset ctx in ctx_by_* functions, 2025-05-15)

We need to reset ctx (v84 drbdadm already did), but must not reset
ctx->arg or ctx->cmd. These do not change while iterating over
resources and are not expected to be reset to NULL.

Also add two command line test cases that would have caught this on the previous commit.
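
A partial reset along these lines might look like the sketch below. The struct is a reduced, hypothetical model: only `cmd` and `arg` are named in the commit message, while `res`, `vol`, and the helper `reset_lookup_fields()` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced context: per-lookup results plus per-invocation fields. */
struct cfg_ctx {
	const char *res;	/* lookup result, changes with every resource */
	const char *vol;	/* lookup result, changes with every resource */
	const char *cmd;	/* fixed for the whole invocation */
	const char *arg;	/* fixed for the whole invocation */
};

/* Clears only the lookup results; cmd and arg are left untouched. */
static void reset_lookup_fields(struct cfg_ctx *ctx)
{
	assert(ctx != NULL);
	ctx->res = NULL;
	ctx->vol = NULL;
}
```

Zeroing the whole struct instead would also wipe `cmd` and `arg`, which is the regression this rework describes.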
These are available on all the distros we care about, the oldest currently
being SLES 12sp5 and RHEL 7.9.
No need to template this in autotools.
No need to set this with autotools. Contrary to previous suggestions, packaging
guidelines do seem to indicate that things like "make" and "gcc" should be
included.
We no longer support any systems using rpm < 4.6. The oldest distros we support
both use RPM 4.11.X
Some files that are packaged as the drbd-pacemaker package are installed
in the wrong make section. This causes rpmbuild errors if run with pacemaker
support disabled.
No need to set a bash completion suffix, as well as weird optional requirements
for bash completions.

Also, move the completions to the system location instead of the admin config
directory.
Our udev rules have been supported for more than 20 years, no need to check
for ancient udev versions to disable the rules. Should it be needed, we already
support optional removal of the udev builds.
raltnoeder and others added 28 commits July 7, 2025 15:45
It appears that phil forgot to add sh-list-adjustable to the is_adjust
condition, which led to a segfault and connection-mesh errors.

I am also adding a test case to highlight the issue. That unveiled
that if there were a running resource that needs to be taken down, the
sh-list-adjustable would output that resource name twice. Also fixing
that corner case.
We prepare an error description string already.  If drbdsetup exits with error,
at least print that error description if available.
Pass options to genl_connect(). For now, that is send- and receive buffer sizes.
"Too much" activity causes many events to be multicast via netlink.
That may overrun the receive buffer.
We want to be able to increase the receive buffer size in large or busy setups.
If we do not immediately succeed in setting SO_RCVBUF, retry with SO_RCVBUFFORCE.
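
As an illustration of the retry, here is a minimal sketch using plain `setsockopt()`/`getsockopt()`. The helper name `set_rcvbuf()` and the way clamping is detected are assumptions, not taken from the patch: on Linux the kernel clamps `SO_RCVBUF` to `net.core.rmem_max` without returning an error and reports twice the stored request, so the sketch compares half of the reported size with the requested size.

```c
#include <assert.h>
#include <sys/socket.h>

/* Tries to raise the receive buffer of fd to at least `want` bytes.
 * Returns the effective size reported by the kernel, or -1 on error. */
static int set_rcvbuf(int fd, int want)
{
	int got = 0;
	socklen_t len = sizeof(got);

	assert(want > 0);
	/* Unprivileged attempt; the kernel may silently clamp the value. */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &want, sizeof(want)) < 0)
		return -1;
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
		return -1;
#ifdef SO_RCVBUFFORCE
	/* Linux reports double the stored value, so got / 2 < want means clamped. */
	if (got / 2 < want) {
		/* Needs CAP_NET_ADMIN; if it fails we keep the clamped size. */
		if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &want, sizeof(want)) == 0) {
			len = sizeof(got);
			if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &got, &len) < 0)
				return -1;
		}
	}
#endif
	return got;
}
```

A caller without `CAP_NET_ADMIN` simply ends up with the clamped size, which the return value makes visible.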
Allow overriding the rcvbuf size from the environment or the command line:
DRBD_GENL_RCVBUF_SZ
events2 --rcvbuf

Uses our "m_strol", so units like 10m are fine.
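
For readers unfamiliar with unit suffixes, a rough sketch of parsing a value such as `10m` follows. It is not the project's parser: `parse_size()` is a hypothetical helper, and treating `k`, `m`, and `g` as binary multiples (1k = 1024) is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Parses "4096", "64k", "10m" or "1g" into bytes.
 * Returns 0 on success, -1 on malformed or out-of-range input. */
static int parse_size(const char *s, long long *out)
{
	char *end;
	long long v;
	int shift = 0;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoll(s, &end, 10);
	if (end == s || errno == ERANGE || v < 0)
		return -1;
	switch (*end) {
	case 'k': case 'K': shift = 10; end++; break;
	case 'm': case 'M': shift = 20; end++; break;
	case 'g': case 'G': shift = 30; end++; break;
	default: break;
	}
	if (*end != '\0')
		return -1;	/* trailing garbage after the number or unit */
	if (v > (LLONG_MAX >> shift))
		return -1;	/* scaling would overflow */
	*out = v << shift;
	return 0;
}
```

Under these assumptions, `DRBD_GENL_RCVBUF_SZ=10m` would correspond to 10485760 bytes.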
drbdsetup(8) man page update about how to
adjust the netlink receive buffer size.
This reverts commit 7cd43be.

I will re-implement this in another way.
Prepare for improving `drbdadm adjust` by giving it the complete
connection state for each connection.
There is no point in having the CFG_STAGE values spread throughout the
code in the callsites of schedule_deferred_cmd(). Centralize them in
struct adm_cmd.
…cies

So far, the CFG_STAGE (enum drbd_cfg_stage) constants allowed a static
ordering of the scheduled commands.
From now on, schedule_deferred_cmd() returns a pointer to a
deferred_cmd and takes such a pointer as an optional dependency as an
argument.
When drbd.conf changes how a diskless node reaches nodes holding the
data, applying the changes might be nontrivial.

Consider a diskless primary, having one peer with an UpToDate
disk. One might remove the connection between the diskless node and
that node with the backing disk and add another connection to another
diskful node at the same time. That is a valid change when the two
diskful nodes have a connection.

So far, `drbdadm adjust` has failed to apply such a change, as it
attempts to remove the connection first and then add the other one.

Fixing that by leveraging an explicit dependency from the del-peer to
a wait-connect for the new peer.

The failure of a disconnect command triggered the previous
implementation, so it relied on the DRBD state engine itself to reject
the first disconnect. The new implementation is overly cautious, so it
attempts to establish the new connection before deleting the old one,
even if it is not strictly necessary.
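
The interplay between static stages and explicit dependencies can be modelled in a few lines. Everything in this sketch is hypothetical (the struct layout, the numeric stages, `schedule_cmd()`, `run_all()`); it only shows how a dependency can pull a later-stage wait-connect ahead of an earlier-stage del-peer.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CMDS 16
#define N_STAGES 3

/* Hypothetical, reduced model of a deferred command. */
struct deferred_cmd {
	const char *name;
	int stage;			/* static ordering, lower runs first */
	struct deferred_cmd *dep;	/* must run before this one, or NULL */
	int done;
};

static struct deferred_cmd queue[MAX_CMDS];
static size_t n_queued;
static const char *run_log[MAX_CMDS];	/* execution order, for inspection */
static size_t n_run;

/* Queues a command and returns a handle that later commands may depend on. */
static struct deferred_cmd *schedule_cmd(const char *name, int stage,
					 struct deferred_cmd *dep)
{
	struct deferred_cmd *c;

	assert(n_queued < MAX_CMDS && stage >= 0 && stage < N_STAGES);
	c = &queue[n_queued++];
	c->name = name;
	c->stage = stage;
	c->dep = dep;
	c->done = 0;
	return c;
}

/* Runs a command once, after its whole dependency chain. */
static void run_cmd(struct deferred_cmd *c)
{
	if (c->done)
		return;
	if (c->dep)
		run_cmd(c->dep);
	run_log[n_run++] = c->name;	/* stand-in for executing the command */
	c->done = 1;
}

/* Walks the stages in order; dependencies may run ahead of their stage. */
static void run_all(void)
{
	int stage;
	size_t i;

	for (stage = 0; stage < N_STAGES; stage++)
		for (i = 0; i < n_queued; i++)
			if (queue[i].stage == stage)
				run_cmd(&queue[i]);
}
```

Scheduling a connect in stage 1, a wait-connect in stage 2 that depends on it, and a del-peer in stage 0 that depends on the wait-connect makes `run_all()` execute connect, wait-connect, del-peer, in that order, even though del-peer belongs to the earliest stage.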
Consider a diskless primary, having two storage servers configured as
peers. When the diskless primary switches to another network, drbdadm
adjust should reconfigure the connections first that are not currently
connected, give the reconfigured connection time to establish and
finally apply changes to the connections that were established
initially.
…just

This is a basic test that verifies the commands emitted by drbdadm
adjust when it starts from an empty state. This was useful for
verifying that the work on the drbdadm adjust code does not break the
existing behaviour.
In this case we do not need to wait for the new connection to be
established before setting the options on the existing connection.
The only user of this script was drbd-reactor in its early days. This
was rewritten in Rust in the drbd-reactor project a long time ago. The
Rust implementation has enough users who have relied on it for years.

The "better" (i.e., non-shell) version can be found here:
https://github.com/LINBIT/drbd-reactor/blob/master/src/bin/ocf-rs-wrapper.rs
https://github.com/LINBIT/drbd-reactor/blob/master/example/ocf.rs%40.service
So far, double_quote_string() has not handled NULL values.

Program received signal SIGSEGV, Segmentation fault.
0x000000000040ea21 in double_quote_string (str=0x0) at config_flags.c:668
(gdb) bt
 0  0x000000000040ea21 in double_quote_string (str=0x0) at config_flags.c:668
 1  0x0000000000404d48 in __print_options (attr=<optimized out>, ctx=0x42bd50 <show_net_options_ctx>,
    sect_name=sect_name@entry=0x41bab2 "net", opened=true, opened@entry=false, close=close@entry=true) at drbdsetup.c:1357
 2  0x0000000000407edb in __print_options (attr=<optimized out>,
    ctx=<error reading variable: Cannot access memory at address 0x0>, sect_name=<optimized out>, opened=<optimized out>,
    close=<optimized out>) at drbdsetup.c:1381
 3  print_options (attr=<optimized out>, ctx=<error reading variable: Cannot access memory at address 0x0>,
    sect_name=0x41bab2 "net") at drbdsetup.c:1383
 4  print_options (attr=<optimized out>, ctx=<error reading variable: Cannot access memory at address 0x0>,
    sect_name=0x41bab2 "net") at drbdsetup.c:1381
 5  show_connection (connection=0x436d40, peer_devices=<optimized out>) at drbdsetup.c:2250
 6  show_resource_list (resources_list=0x436340, old_objname=0x7fffffffe632 "r0") at drbdsetup.c:2440
 7  show_cmd (cm=<optimized out>, argc=<optimized out>, argv=<optimized out>) at drbdsetup.c:2574
 8  0x000000000040c997 in drbdsetup_main (argc=<optimized out>, argv=0x7fffffffe390) at drbdsetup.c:4861
 9  0x00007ffff7c2a1ca in __libc_start_call_main (main=main@entry=0x400f40 <main>, argc=argc@entry=4,
    argv=argv@entry=0x7fffffffe388) at ../sysdeps/nptl/libc_start_call_main.h:58
 10 0x00007ffff7c2a28b in __libc_start_main_impl (main=0x400f40 <main>, argc=4, argv=0x7fffffffe388, init=<optimized out>,
    fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffe378) at ../csu/libc-start.c:360
 11 0x0000000000400f75 in _start ()

Fixes: 8b48c59 ("drbdadm adjust: detect and drop options unsupported by kernel")
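
One way to make such a helper NULL-safe is sketched below. This is not the actual patch: `quote_string()` and its buffer-based interface are hypothetical, and rendering NULL as an empty quoted string is an assumption about the desired behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Writes str in double quotes into buf, escaping '"' and '\\'.
 * A NULL str is rendered as an empty quoted string.
 * Input that does not fit is truncated. Returns buf. */
static char *quote_string(char *buf, size_t size, const char *str)
{
	size_t o = 0;

	assert(buf != NULL && size >= 3);	/* room for "" and the terminator */
	if (str == NULL)
		str = "";	/* the guard: never dereference a NULL option value */
	buf[o++] = '"';
	/* keep room for an escaped pair, the closing quote and the NUL */
	for (; *str && o + 4 <= size; str++) {
		if (*str == '"' || *str == '\\')
			buf[o++] = '\\';
		buf[o++] = *str;
	}
	buf[o++] = '"';
	buf[o] = '\0';
	return buf;
}
```

With the guard in place, the `str=0x0` call from frame 0 of the backtrace would yield `""` instead of faulting.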
For consistency with all other statements.

Fixes: 8b48c59 ("drbdadm adjust: detect and drop options unsupported by kernel")
It's the only distribution that does not support the syntax.
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Our preset file actually depends on pkg-config and systemd.pc. We can add
pkg-config as a general dependency. systemd.pc moved from "systemd" to
"systemd-dev", so we cannot add it here, but downstream distros should
add the one that fits.
Signed-off-by: Roland Kammerer <roland.kammerer@linbit.com>
Expose the new kernel resource option quorum-dynamic-voters as a
boolean in drbdsetup and drbdadm. Defaults to yes, preserving the
existing quorum behavior.

Update drbd-headers submodule to include the new option definition.

Signed-off-by: David Magton <david.magton@flant.com>
Expose the new kernel disk option non-voting as a boolean in drbdsetup
and drbdadm. Defaults to no, preserving existing behavior.

Usage:
  drbdsetup disk-options <minor> --non-voting=yes

  or in drbd.conf:
    on shadow {
        disk { non-voting yes; }
    }

Signed-off-by: David Magton <david.magton@flant.com>
The flant-9.2 branch was renamed to flant in 3p-drbd-headers,
causing submodule checkout failures during build.

Signed-off-by: David Magton <david.magton@flant.com>
Move the drbd-headers submodule from ee91191b to a6d1604c, which renumbers
our two custom netlink fields:
  - non_voting (disk_conf):           28 -> 36
  - quorum_dynamic_voters (res_opts): 17 -> 25

This keeps drbdsetup/drbdadm wire-compatible with the kernel module, which
adopted the same renumbering in 9.2.18-flant.4. Both repos now pin the
identical headers commit a6d1604c.
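
Why both sides must agree on the numbers can be shown with a toy model. The constants are the field numbers quoted above; `struct attr` and `kernel_reads_non_voting()` are hypothetical and do not reflect the real netlink parsing code or how the kernel treats an unknown attribute.

```c
#include <assert.h>
#include <stddef.h>

/* Field numbers quoted in the commit message above. */
enum { NON_VOTING_OLD = 28, NON_VOTING_NEW = 36 };
enum { QUORUM_DYNAMIC_VOTERS_OLD = 17, QUORUM_DYNAMIC_VOTERS_NEW = 25 };

/* Hypothetical, flattened view of one attribute on the wire. */
struct attr { int id; int value; };

/* Toy stand-in for a receiver built against the renumbered headers.
 * Returns 0 and stores the value if the id matches, -1 otherwise. */
static int kernel_reads_non_voting(const struct attr *a, int *out)
{
	assert(a != NULL && out != NULL);
	if (a->id != NON_VOTING_NEW)
		return -1;	/* number not recognized as non_voting */
	*out = a->value;
	return 0;
}
```

In this model a tool still built against ee91191b would send the option under number 28, and the receiver would not recognize it as non_voting.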
astef self-assigned this Jun 23, 2026
flant-9.32.0 is flant rebased onto upstream drbd-utils v9.32.0, with the
drbd-headers submodule re-pinned to a6d1604c (netlink IDs 36/25) to match
3p-drbd 9.2.18-flant.4+. This merge records the pre-rebase branch as an
ancestor so PR #4 can fast-forward; the tree is taken entirely from
flant-9.32.0 (strategy=ours).

The base only added the gitlink ee91191b (a direct ancestor of a6d1604c)
and the two drbdsetup options, both already present in the rebased head
(verified). The GitHub "conflict" was only the submodule gitlink, which
GitHub cannot fast-forward without the headers objects; this merge resolves
it. Only obsolete metadata is superseded.