When running a big container image with a custom userns, it takes a long time to create the id-mapped copy layer, and all of that happens while layers.lock and images.lock are held, so many other commands running at the same time are blocked from doing anything.
$ podman pull ghcr.io/home-assistant/home-assistant:stable
$ podman run --rm --userns keep-id ghcr.io/home-assistant/home-assistant:stable true
@lock_duration[containers.lock]:
[0] 24 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1] 1 |@@ |
@lock_duration[images.lock]:
[0] 228 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1] 1 | |
[2, 4) 0 | |
[4, 8) 0 | |
[8, 16) 0 | |
[16, 32) 0 | |
[32, 64) 0 | |
[64, 128) 0 | |
[128, 256) 0 | |
[256, 512) 0 | |
[512, 1K) 0 | |
[1K, 2K) 0 | |
[2K, 4K) 0 | |
[4K, 8K) 0 | |
[8K, 16K) 0 | |
[16K, 32K) 0 | |
[32K, 64K) 1 | |
@lock_duration[layers.lock]:
[0] 549 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1] 2 | |
[2, 4) 0 | |
[4, 8) 0 | |
[8, 16) 0 | |
[16, 32) 0 | |
[32, 64) 0 | |
[64, 128) 0 | |
[128, 256) 0 | |
[256, 512) 0 | |
[512, 1K) 0 | |
[1K, 2K) 0 | |
[2K, 4K) 0 | |
[4K, 8K) 0 | |
[8K, 16K) 0 | |
[16K, 32K) 0 | |
[32K, 64K) 1 | |
@lock_duration[storage.lock]:
[0] 552 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1] 1 | |
@lock_max[storage.lock]: 1
@lock_max[containers.lock]: 1
@lock_max[layers.lock]: 49194
@lock_max[images.lock]: 49194
Using my bpftrace script from #378 (comment), the output above shows that creating the id-mapped copy layer takes about 50s, and layers.lock and images.lock are held for that whole time.
#378 doesn't touch this part, so it didn't help here. I am not sure how much of the changes there can be reused for this id-map copy code path as well, but I think this is likely the next obvious bottleneck that needs fixing, because holding locks for such long durations is not good.