fix: drbd-module-loader failing due to missing mount on debian trixie #872
BokuNoGF wants to merge 1 commit into piraeusdatastore:v2
6579bb8 to ed63cfe
The DRBD module loader fails on Debian Trixie due to a chain of softlinks from /lib/modules -> /usr/src -> /usr/lib not being available for a dependency of the `make` call, so add it as a volume mount.

Signed-off-by: Boku NoGF <gokunobf@gmail.com>
WanzenBug left a comment
While this might fix the specific issue on Debian, it is very likely that this change would break all other distributions, and in some cases even Debian.
The issue is that /usr/lib already contains libraries used by the programs in the container, such as gcc, patch, etc., and now you are replacing them with some random version from the host OS.
This can and will fail. A lot.
What we have done instead is install linux-kbuild-* in the build containers, which provides the required scripts.
Ah okay, yep that makes sense, good catch. So would the fix then be to dynamically look up the host kernel in the build container and install the appropriate kernel build packages each time on startup (since host kernel upgrades would break build containers if they're behind)? I'm guessing this would require modifying or wrapping the entry script?
Ah, I see now that Debian switched from having one
It's a tricky situation. Installing the right package could work, but then we have to modify the entry.sh script, which is already complicated enough for my taste.
It seems that even something like mounting /usr to /host/user and doing some tricks with setting the right KDIR variable won't work, because the way the symlinks and make targets are set up makes it still search for
One hack I can think of is to have a top-level init container mount the host's root (or the needed dirs), copy the contents of the
But that is dirty, and modifying entry.sh to install the needed package dynamically would probably be better, since the copy would be ~600MB, at least on my machine. We could also add logic to that init container to determine which specific build directory to copy, as that would be significantly smaller, but it would be more complex.
Perhaps this is overkill, and super super hacky, but could we instead "merge" the host
We should be able to create our own temporary mounts: we already have permissions to insert kernel modules, so mounts should not be any worse from the permission perspective. We would probably need to experiment with the order; I guess having the host directory as the lower_dir, the container as the upper_dir and an empty working_dir should work.
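To make the layer order from the comment above concrete, here is a minimal sketch that only builds (and does not run) an overlayfs mount command. The kernel's option names are `lowerdir`, `upperdir` and `workdir`; every path below is a hypothetical placeholder, not something taken from this PR or from entry.sh.

```python
import shlex


def overlay_mount_argv(lower, upper, work, target):
    """Build the argv for an overlayfs mount. Nothing is executed here."""
    # overlayfs option names: lowerdir (bottom layer), upperdir (top layer),
    # workdir (empty scratch directory used internally by overlayfs).
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]


# Hypothetical paths: host /usr assumed to be visible at /host/usr,
# container /usr as the upper layer, merged view at /merged/usr.
argv = overlay_mount_argv("/host/usr", "/usr", "/tmp/overlay-work", "/merged/usr")
print(shlex.join(argv))
```

One caveat worth checking when experimenting: overlayfs expects the work directory to live on the same filesystem as the upper directory, so the placeholder locations above may need adjusting.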
So this is a bit insane, but if we reconfigure the
We basically merge the whole "/usr" from the container on top of "/usr" from the host. This means we guarantee that all the build tools exist: they are all part of the /usr from the container. But we also ensure that all the kernel sources, scripts and tools are present, because they get merged in from the host. I'll have to think a bit more about whether this is actually a workable solution.
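The difference between the original hostPath mount and the merge idea can be illustrated with a toy model. This is only a sketch using dictionaries in place of directory trees: the file names are made up for illustration (apart from the linux-kbuild directory named in the PR description), and real overlayfs behaviour has more cases, such as whiteouts and directory merging.

```python
def bind_mount_view(container_tree, host_tree):
    """A bind mount of the host directory hides the container's files completely."""
    return dict(host_tree)


def overlay_view(lower_tree, upper_tree):
    """An overlay shows both layers; on a name clash the upper layer wins."""
    merged = dict(lower_tree)
    merged.update(upper_tree)
    return merged


# Hypothetical contents, keyed by path relative to /usr.
container_usr = {"bin/gcc": "container", "lib/libexample.so": "container"}
host_usr = {
    "lib/libexample.so": "host",
    "lib/linux-kbuild-6.12.43+deb13/scripts": "host",
}

replaced = bind_mount_view(container_usr, host_usr)
merged = overlay_view(lower_tree=host_usr, upper_tree=container_usr)

print("bin/gcc" in replaced)          # build tool missing after a plain mount
print(merged["lib/libexample.so"])    # container library kept in the merge
print("lib/linux-kbuild-6.12.43+deb13/scripts" in merged)
```

In this model the plain mount loses the container's build tools, while the merge keeps them and still exposes the host's kbuild scripts, which is the property the comment is after.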
Hi, I have manually applied your changes to one of my node's satellite
And here is the result
In other nodes, I'm getting the same
Please try the following configuration:
Unfortunately, I had to set up the cluster quickly, so I have switched back to Debian Bookworm.
@WanzenBug
Can you confirm that it works with the above patch applied? |
I will try it in my R&D cluster |
I can confirm that this works 🎉 |
The DRBD module loader fails on Debian Trixie due to a chain of softlinks from /lib/modules -> /usr/src -> /usr/lib not being available for a dependency of the `make` call, so add it as a volume mount.
The specific error that is seen is:
This is because `scripts` is actually a softlink to the `/usr/lib/linux-kbuild-6.12.43+deb13/scripts/` directory on Debian Trixie and the call fails since the hostPath is not mounted into the pod.
This was replicated on Debian 13.1 using the `quay.io/piraeusdatastore/drbd9-trixie:v9.2.14` image via `podTemplate` overrides. The fix was also tested this way.
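For anyone wanting to confirm the same symlink chain on their own node or inside the loader container, a small diagnostic along these lines may help. It is a sketch, not part of the fix: `trace_symlinks` and `first_missing` are hypothetical helper names, the `/lib/modules/<release>/build/scripts` starting point is an assumption about where the kernel build tree sits, and only links in the final path component are followed.

```python
import os


def trace_symlinks(path, max_hops=32):
    """Follow a symlink hop by hop and return every path visited."""
    hops = [path]
    while os.path.islink(path) and len(hops) <= max_hops:
        target = os.readlink(path)
        # Relative targets are resolved against the directory holding the link.
        path = os.path.normpath(os.path.join(os.path.dirname(path), target))
        hops.append(path)
    return hops


def first_missing(hops):
    """Return the first hop that does not exist at all, or None if all exist."""
    for hop in hops:
        if not os.path.lexists(hop):
            return hop
    return None


if __name__ == "__main__":
    # Assumed location of the kernel build tree for the running kernel.
    start = f"/lib/modules/{os.uname().release}/build/scripts"
    chain = trace_symlinks(start)
    print(" -> ".join(chain))
    print("first missing hop:", first_missing(chain))
```

Run inside the pod, a missing hop under `/usr/lib` would match the failure described above; run on the host, the same chain should resolve fully.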