initramfs: Move UFS and SDHCI storage drivers to initramfs #2286
Kavinaya99 wants to merge 4 commits into
lumag
left a comment
> probe ordering issues and race conditions with other subsystems on some platforms.
Which probe issues? Which race conditions? Which platforms are affected and how? Please be exact.
Also drop the template or AI prompt describing the changes. It's pretty obvious from the commit itself. Focus on what is not obvious: reasons, errors, affected devices.
    initramfs-module-udev \
    kernel-module-governor-simpleondemand \
    kernel-module-ufshcd-core \
    kernel-module-ufshcd-pltfrm \
Aren't those pulled in by module dependencies?
Without adding these modules to the recipe, we are getting "unable to mount root fs" errors.
    kernel-module-ufshcd-core \
    kernel-module-ufshcd-pltfrm \
    kernel-module-ufs-qcom \
    kernel-module-sdhci-msm \
All of the kernel modules should go into the variable MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS, which is more appropriate for the purpose. In its current form, we will get an error if any of the kernel modules are built-in.
Moving kernel modules to MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS ensures they are included in the image when built as modules, but it does not guarantee that they will be loaded during boot.
I have tried using MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS and saw bootup failures as modules were not loaded.
The error: root '/dev/disk/by-partlabel/rootfs' doesn't exist or does not contain a /dev.
> Moving kernel modules to MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS ensures they are included in the image when built as modules
Well. No. Packages listed in MACHINE_ESSENTIAL_EXTRA_RRECOMMENDS don't get included into the initramfs-rootfs-image. The recipe overrides PACKAGE_INSTALL (purposely), so packagegroup-core-boot doesn't get included into the image. That's why you've observed errors.
At the same time, no, you can't list modules here. The image should be generic. Also, it should not fail to build if one changes the kernel config. If you need modules, you need to have a packagegroup which would recommend necessary packages. I'd have said that you should resurrect packagegroup-qcom-boot and initramfs-qcom-image, see commit 05b73a1 ("initramfs-qcom-image: remove the recipe and packagegroup").
Also, this commit should be the first one, otherwise booting of the image would be broken between commits moving the drivers to the modules and this one (and thus breaking git bisect, which is a bad idea).
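To illustrate the difference between listing the modules directly in the image and recommending them through a packagegroup, here is a toy Python model. It is only a sketch of the "required vs. recommended" semantics, not how BitBake or the package manager is implemented; `resolve_image`, `MissingPackageError`, and the sample feed are hypothetical names made up for the example. The package names are the ones from the diff above, and the feed assumes a kernel config where sdhci-msm is built in, so no `kernel-module-sdhci-msm` package exists.

```python
class MissingPackageError(Exception):
    """Raised when a hard-required package does not exist in the feed."""


def resolve_image(required, recommended, available):
    """Toy model: required packages must exist, recommended ones are best effort."""
    missing = sorted(set(required) - set(available))
    if missing:
        raise MissingPackageError(", ".join(missing))
    # Recommended packages are installed only if the feed actually provides them.
    return list(required) + [p for p in recommended if p in available]


# Module packages taken from the diff under review.
MODULES = [
    "kernel-module-ufshcd-core",
    "kernel-module-ufshcd-pltfrm",
    "kernel-module-ufs-qcom",
    "kernel-module-sdhci-msm",
]

# Assumed feed: sdhci-msm is built in (=y), so its module package is absent.
FEED = {
    "initramfs-module-udev",
    "kernel-module-ufshcd-core",
    "kernel-module-ufshcd-pltfrm",
    "kernel-module-ufs-qcom",
}

if __name__ == "__main__":
    # Recommended: the image still resolves, the built-in driver is skipped.
    print(resolve_image(["initramfs-module-udev"], MODULES, FEED))
    # Required: the same list now breaks the build.
    try:
        resolve_image(["initramfs-module-udev"] + MODULES, [], FEED)
    except MissingPackageError as err:
        print("build error, missing:", err)
```

In this model a kernel config change only shrinks the recommended set, while the hard list turns the same change into a build failure, which is the concern raised above.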
Updated the commit message
lumag
left a comment
> On QCS8300 (Monaco), the kernel experiences a race condition
> during boot where UFS storage driver, ARM-SMMU, and GPUCC (GPU
> Clock Controller) all initialize at the same device_initcall
> level (level 6). This creates a dependency chain where UFS
> requires SMMU, SMMU requires GPUCC clocks, but GPUCC may not
> finish initialization before SMMU times out waiting for it.
Hmm, no. The GPU SMMU and the UFS SMMU are two different SMMU instances. So, the fact that 3da0000.iommu has not probed should not affect probing of 15000000.iommu and the UFS.
Please continue and find the actual root cause. What exactly is causing UFS to not probe?
> The race condition manifests as:
> gcc-qcs8300 100000.clock-controller: sync_state() pending due to 3d90000.clock-controller
> arm-smmu 3da0000.iommu: deferred probe timeout, ignoring dependency
> arm-smmu 3da0000.iommu: probe with driver arm-smmu failed with error -110
> Kernel panic in iommu_domain_free() at gmu_core_iommu_init()
The commit message needs to be rewritten. The issue is not the GPU SMMU blocking UFS, but rather a few configs that are built into the static image while their dependencies are configured as modules. This causes repeated probe deferrals, sometimes delaying UFS bring-up and in other cases exceeding the deferred probe timeout, which blocks further re-probes. A few cases identified during the last debug session are below:
I prefer that such modules (which are dependencies for drivers that are part of the static kernel image) be moved to initramfs so boot delays (and the deferred probe timeout) can be better managed.
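A rough way to picture the timing argument, as a toy model only: each device probes once its own driver is available and all of its suppliers have probed, and a consumer gives up if a supplier is still missing when the timeout expires. This does not model fw_devlink or the real driver core. The function `probe_times`, the device labels, the load times, and the 10 s timeout are all illustrative assumptions, not measurements from any board.

```python
def probe_times(devices, driver_ready_at, timeout):
    """Return {device: probe time in seconds, or None if its probe fails}.

    devices: {device: [supplier devices]}, assumed to be acyclic.
    driver_ready_at: {device: seconds after boot when its driver is available}.
    timeout: seconds after which consumers stop waiting for missing suppliers.
    """
    result = {}

    def resolve(name):
        if name not in result:
            suppliers = [resolve(s) for s in devices[name]]
            if any(t is None or t > timeout for t in suppliers):
                # Gave up waiting: the probe fails, like the -110 in the log.
                result[name] = None
            else:
                result[name] = max([driver_ready_at[name], *suppliers])
        return result[name]

    for name in devices:
        resolve(name)
    return result


# Hypothetical chain: a built-in SMMU and its consumer wait on a modular GPUCC.
DEVICES = {"gpucc": [], "gpu-smmu": ["gpucc"], "gmu": ["gpu-smmu"]}

# Assumed load times: module in rootfs (late) vs. module in initramfs (early).
FROM_ROOTFS = {"gpucc": 14.0, "gpu-smmu": 0.0, "gmu": 0.0}
FROM_INITRAMFS = {"gpucc": 2.0, "gpu-smmu": 0.0, "gmu": 0.0}

if __name__ == "__main__":
    print("rootfs:   ", probe_times(DEVICES, FROM_ROOTFS, timeout=10.0))
    print("initramfs:", probe_times(DEVICES, FROM_INITRAMFS, timeout=10.0))
```

With the late load time the supplier itself still probes, but every built-in consumer behind it has already failed; with the early load time the whole chain comes up as soon as the module is loaded.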
Is this verified? Doesn't fw_devlink take care of this? There should be no probe deferrals (as in the probe() callback returning -EPROBE_DEFER).
Ideally fw_devlink should take care of it and prevent extra probes and extra deferrals. Anyway, I don't see changes related to either of your points in this PR.
Currently, some drivers are compiled as part of the static kernel image (=y in the upstream defconfig) while their dependencies are configured as modules (=m) and reside in rootfs. Examples include:
1. CONFIG_ARM_SMMU_QCOM=y (for GPU) depends on CONFIG_SA_GPUCC_8775P=m
2. CONFIG_USB_DWC3_QCOM=y depends on CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2=m

Due to this configuration mismatch, built-in drivers attempt to probe before their module dependencies are loaded from rootfs, causing repeated probe deferrals. Intermittently, the kernel's deferred probe timeout expires, leading to probe failures.

This probe timeout issue has cascading effects. When critical drivers like GPU SMMU or USB controller fail to probe due to missing dependencies, it delays the overall boot process and affects other subsystems, causing secondary failures in storage, camera, and other drivers.

These modules will be packaged in the ramdisk to ensure they are available early during boot, preventing probe timeout issues and boot delays.

Signed-off-by: Kavinaya S <kavinaya@qti.qualcomm.com>
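The mismatch described in this commit message can be checked mechanically. Below is a minimal Python sketch that flags built-in (=y) options whose suppliers are modules (=m). The helper names and the sample config text are hypothetical; the two consumer/supplier pairs are the ones quoted in the commit message and are treated here as probe-time relationships supplied by hand, not something read from Kconfig.

```python
# Consumer -> supplier pairs, copied from the examples in the commit message.
RUNTIME_SUPPLIERS = {
    "CONFIG_ARM_SMMU_QCOM": ["CONFIG_SA_GPUCC_8775P"],
    "CONFIG_USB_DWC3_QCOM": ["CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2"],
}


def parse_config(text):
    """Return {option: value} for the CONFIG_FOO=value lines of a kernel .config."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" start with '#' and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def find_mismatches(options, suppliers=RUNTIME_SUPPLIERS):
    """List (consumer, supplier) pairs where the consumer is =y and the supplier is =m."""
    mismatches = []
    for consumer, deps in suppliers.items():
        if options.get(consumer) != "y":
            continue
        for dep in deps:
            if options.get(dep) == "m":
                mismatches.append((consumer, dep))
    return mismatches


# Assumed sample, mirroring the configuration described above.
SAMPLE = """\
CONFIG_ARM_SMMU_QCOM=y
CONFIG_SA_GPUCC_8775P=m
CONFIG_USB_DWC3_QCOM=y
CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2=m
# CONFIG_SCSI_UFS_QCOM is not set
"""

if __name__ == "__main__":
    for consumer, supplier in find_mismatches(parse_config(SAMPLE)):
        print(f"{consumer}=y waits on {supplier}=m")
```

Any pair it reports is a candidate either for packaging the supplier module in the initramfs or for aligning the two config options.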
Create initramfs-qcom-image that extends initramfs-rootfs-image with QCOM specific boot requirements via packagegroup-qcom-boot. This ensures boot-critical kernel modules (GPUCC, FEMTO PHY, UFS) are included in the initramfs, allowing built-in drivers to find their module dependencies early during boot and avoiding deferred probe timeout issues.

Signed-off-by: Kavinaya S <kavinaya@qti.qualcomm.com>
Remove CONFIG_SCSI_UFS_QCOM=y from bsp-additions.cfg, allowing it to default to module (=m) configuration. The module will be loaded from initramfs during early boot.

The current kernel configuration creates a dependency mismatch where built-in drivers (=y) depend on modules (=m) located in rootfs. Examples include:
1. CONFIG_ARM_SMMU_QCOM=y depends on CONFIG_SA_GPUCC_8775P=m
2. CONFIG_USB_DWC3_QCOM=y depends on CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2=m

This mismatch causes built-in drivers to probe before their module dependencies are available, resulting in repeated probe deferrals. When the kernel's deferred probe timeout expires, these drivers fail to initialize, causing boot delays and cascading failures in dependent subsystems.

Signed-off-by: Kavinaya S <kavinaya@qti.qualcomm.com>
Remove CONFIG_SCSI_UFS_QCOM=y from bsp-additions.cfg, allowing it to default to module (=m) configuration. The module will be loaded from initramfs during early boot.

The current kernel configuration creates a dependency mismatch where built-in drivers (=y) depend on modules (=m) located in rootfs. Examples include:
1. CONFIG_ARM_SMMU_QCOM=y depends on CONFIG_SA_GPUCC_8775P=m
2. CONFIG_USB_DWC3_QCOM=y depends on CONFIG_PHY_QCOM_USB_SNPS_FEMTO_V2=m

This mismatch causes built-in drivers to probe before their module dependencies are available, resulting in repeated probe deferrals. When the kernel's deferred probe timeout expires, these drivers fail to initialize, causing boot delays and cascading failures in dependent subsystems.

Signed-off-by: Kavinaya S <kavinaya@qti.qualcomm.com>
Updated the contents and commit message accordingly.
Test Results: 25 files, 25 suites, 2h 25m 32s. For more details on these failures, see this check. Results for commit 512a9f3.
Move UFS and SDHCI storage drivers from static kernel build to initramfs modules to improve boot initialization timing